From martijnverburg at gmail.com  Tue Jan  1 16:08:18 2019
From: martijnverburg at gmail.com (Martijn Verburg)
Date: Tue, 1 Jan 2019 16:08:18 +0000
Subject: SIGSEGV error on 11.0.1+13, libjvm.dylib
In-Reply-To:
References:
Message-ID:

Hi all,

My bad! I steered Neil here - David and I have discussed the appropriate
path forward for next time (thanks David!) in the adoption group and we've
set up a JBS submit team.

Cheers,
Martijn

On Fri, 28 Dec 2018 at 13:56, Neil Stevenson wrote:

> Thanks and apologies
>
> I'll see if I can find someone with a copy of Oracle JDK to try with, then
> progress
>
> Neil
>
> On Fri, 28 Dec 2018 at 08:07, David Holmes wrote:
>
> > Hi Neil,
> >
> > On 28/12/2018 5:34 pm, Neil Stevenson wrote:
> > > Hi
> > > Hope this is the right way to submit bug notifications, new here
> >
> > If you are an OpenJDK Contributor and have at least Author status then
> > you can file a bug directly at:
> >
> > https://bugs.openjdk.java.net/
> >
> > For AdoptOpenJDK issues I would have expected you to file the bug as you
> > did per:
> >
> > # If you would like to submit a bug report, please visit:
> > #   https://github.com/AdoptOpenJDK/openjdk-build/issues
> >
> > and then I would have expected someone from AdoptOpenJDK to file the
> > upstream bug at bugs.openjdk.java.net! Directing you to the mailing
> > lists is not appropriate as these are for developer discussions not bug
> > reporting.
> >
> > If this reproduces on Oracle JDK then you can file a bug at:
> >
> > https://bugreport.java.com/bugreport/
> >
> > Thanks,
> > David
> > -----
> >
> > > A bug has been found on AdoptOpenJDK and Azul Zulu,
> > > that the AdoptOpenJDK folks thought should be submitted here.
> > > Occurs at least on Darwin
> > >
> > > Steps to recreate and fuller history is here
> > > https://github.com/AdoptOpenJDK/openjdk-build/issues/814. Gives a
> > > segmentation fault fairly consistently.
> > >
> > > # A fatal error has been detected by the Java Runtime Environment:
> > > #
> > > #  SIGSEGV (0xb) at pc=0x0000000104853d23, pid=23222, tid=953467
> > > #
> > > # JRE version: OpenJDK Runtime Environment (11.0.1+13) (build 11.0.1+13)
> > > # Java VM: OpenJDK 64-Bit Server VM (11.0.1+13, mixed mode, tiered,
> > > compressed oops, g1 gc, bsd-amd64)
> > > # Problematic frame:
> > > # V  [libjvm.dylib+0x20dd23]  ClassLoaderData::loader_name_and_id() const+0x7
> > > #
> > >
> > > Neil
> >

From david.holmes at oracle.com  Tue Jan  1 22:55:39 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 2 Jan 2019 08:55:39 +1000
Subject: [PATCH] hsdis installation documentation update
In-Reply-To:
References: <79927d73-9592-4510-0157-93dd01f95536@oracle.com>
Message-ID: <7c4dfeb2-4e76-a2d2-7e01-63f502804bb8@oracle.com>

Hi Sergei,

On 31/12/2018 8:42 am, Sergei Ustimenko wrote:
> Hi David,
>
>  > "server" should still say . While most platforms only have server VM
>  > these days some do still have client and there is also minimal VM
>
> Ah, right. I thought there were no client VMs out there anymore.
> I've updated the patch, hope it is good now.
>
>
> diff --git a/src/utils/hsdis/README b/src/utils/hsdis/README
> --- a/src/utils/hsdis/README
> +++ b/src/utils/hsdis/README
> @@ -114,18 +114,16 @@
>
>   * Installing
>
> -Products are named like build/$OS-$LIBARCH/hsdis-$LIBARCH.so.  You can
> -install them on your LD_LIBRARY_PATH, or inside of your JRE/JDK.  The
> -search path in the JVM is:
> -
> -1. /jre/lib///libhsdis-.so
> -2. /jre/lib///hsdis-.so
> -3. /jre/lib//hsdis-.so
> +Products are named like build/$OS-$LIBARCH/hsdis-$LIBARCH.so. You can
> +install them next to your libjvm.so inside JRE/JDK or alternatively
> +put it anywhere on your LD_LIBRARY_PATH. JVM looks up several paths
> +derived from libjvm.so in the following order:

For this last part I think you should keep the original text:

"The search path in the JVM is:"

The new text isn't quite grammatically correct, nor do I see the
relevance of "derived from libjvm.so" given that the paths are not
relative to the location of libjvm.so.

Thanks,
David

> +
> +1. /lib//libhsdis-.so
> +2. /lib//hsdis-.so
> +3. /lib/hsdis-.so
>   4. hsdis-.so  (using LD_LIBRARY_PATH)
>
> -Note that there's a bug in hotspot versions prior to hs22 that causes
> -steps 2 and 3 to fail when used with JDK7.
> -
>   Now test:
>
>     export LD_LIBRARY_PATH .../hsdis/build/$OS-$LIBARCH:$LD_LIBRARY_PATH
>
>
>
> Thanks,
> Sergei

From david.holmes at oracle.com  Tue Jan  1 23:35:13 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 2 Jan 2019 09:35:13 +1000
Subject: RFR(m): 8214271: Fast primitive to wake many threads
In-Reply-To: <01873a0f-a0fb-18b9-f7d4-98bb638e9b57@oracle.com>
References: <010211e3-93a6-80b9-678c-c84b08812e43@oracle.com>
 <70669453-e317-a30d-8d5a-e5b938b83c41@oracle.com>
 <4fb6cd22-cdd0-2419-c863-24b250ac0b16@oracle.com>
 <2a2679cc-b0e0-f8d0-7336-8666e1a42950@oracle.com>
 <01873a0f-a0fb-18b9-f7d4-98bb638e9b57@oracle.com>
Message-ID: <41f5252b-3eb9-9a9e-70e5-49f6d8f9d670@oracle.com>

Hi Robbin,

On 21/12/2018 7:45 pm, Robbin Ehn wrote:
> Hi David,
>
> On 2018-12-21 04:17, David Holmes wrote:
>>
>> Got it - subtle.
>>
>> Further this sounds like a race that could lead to bugs if not used
>> very carefully, i.e. you can't assume between disarm() and wake() that
>> all threads are blocked.
>
> I didn't realize how subtle this is. I think your original comment that
> disarm/wake should be one operation was spot on.
> Investigating... thinking... testing... yes I think this will work, fixed!
> Sorry for not looking more into this before.

I'm now curious how this will actually work in the context of the
safepoint changes?
>>
>> I think perhaps this needs to be expanded to make this more obvious:
>>
>>    68 //    - A call to wait(tag) will block if the barrier is armed with the value
>>    69 //      'tag'; else it will return immediately.
>>    70 //    - A blocked thread is eligible to execute again once the barrier is
>>    71 //      disarmed and wake() has been called.
>> +          - A call to wait(tag) that would block if it continued, but instead
>> +            is descheduled, may return immediately if scheduled after a
>> +            call to disarm(), but before the call to wake().
>>
>> It also made me realize that in the general case (not when used with
>> safepoints I think due to other state checks) a wake() may stall due
>> to threads with a previous tag entering the wait() late.
>
> I added a double check in the semaphore version; this means both
> implementations should have a progress guarantee.
>
> Making this v5 a bit large due to a lot of comments being changed.
>
> Inc:
> http://cr.openjdk.java.net/~rehn/8214271/5/inc/webrev/

Nit: I would have kept disarm() rather than wake() as I like the
arm/disarm duality.

  void GenericWaitBarrier::wait(int barrier_tag) {
    assert(barrier_tag != 0, "Trying to wait on disarmed value");
+   if (barrier_tag == 0 || barrier_tag != _barrier_tag) {
+     OrderAccess::fence();
+     return;
+   }

I don't understand what the above is doing. A barrier_tag of 0 is a
programming error caught during testing in debug builds. You don't need
to account for it being 0 in product because this isn't something that
can come in from an external source - we have full code control here.
And even if you want to be this paranoid why would you need the fence?

Thanks,
David
-----

> Full:
> http://cr.openjdk.java.net/~rehn/8214271/5/full/webrev/
>
> gtest passes thousands of loops locally and hundreds in mach5.
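The wait(tag) contract discussed above (block only while the barrier is armed with exactly that tag, otherwise return immediately) can be illustrated with a small portable sketch. This is not the HotSpot WaitBarrier: the class name SketchWaitBarrier is hypothetical, it uses a standard mutex and condition variable rather than a futex or cooperative semaphore posting, and it folds disarm and wake into a single disarm() call, which is an assumption about the direction the review was heading. It shows the semantics only and makes no claim about the single-syscall wake-up.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical teaching class; not HotSpot's WaitBarrier implementation.
class SketchWaitBarrier {
 public:
  // Arm the barrier with a non-zero tag; 0 is reserved for "disarmed".
  void arm(int tag) {
    assert(tag != 0 && "0 means disarmed");
    std::lock_guard<std::mutex> guard(_lock);
    assert(_armed_tag == 0 && "barrier is already armed");
    _armed_tag = tag;
  }

  // Disarm and wake all waiters as one operation.
  void disarm() {
    {
      std::lock_guard<std::mutex> guard(_lock);
      _armed_tag = 0;
    }
    _cv.notify_all();
  }

  // Blocks only while the barrier is armed with exactly 'tag'.
  // Returns true if the caller had to block, false if it returned immediately.
  bool wait(int tag) {
    assert(tag != 0 && "cannot wait on the disarmed value");
    std::unique_lock<std::mutex> guard(_lock);
    if (tag != _armed_tag) {
      return false;  // disarmed, or armed with a different tag: do not block
    }
    _cv.wait(guard, [&] { return _armed_tag != tag; });
    return true;
  }

 private:
  std::mutex _lock;
  std::condition_variable _cv;
  int _armed_tag = 0;  // 0 means disarmed
};
```

Because the tag check and the decision to block happen under one lock, a thread that is descheduled before calling wait() and resumes after disarm() simply sees a mismatched tag and returns, which is the late-arrival case the proposed comment describes.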
>
> Thanks, Robbin
>
>>
>> Thanks,
>> David
>>
>>>>
>>>> s/Implementation/Implementations/
>>>
>>> Fixed
>>>
>>>>
>>>> The fourth line is no longer needed.
>>>
>>> Above is the reason I would like to keep the fourth line, since only
>>> if you call both disarm() and wake() do you have the guarantee that
>>> waiter threads will return.
>>>
>>> Thanks, Robbin
>>>
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>
>>>>> Inc:
>>>>> http://cr.openjdk.java.net/~rehn/8214271/4/inc/webrev/
>>>>>
>>>>> Full:
>>>>> http://cr.openjdk.java.net/~rehn/8214271/4/full/webrev/
>>>>>
>>>>> /Robbin
>>>>>
>>>>>>
>>>>>> Otherwise this all looks good!
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>
>>>>>>> Full:
>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/3/full/webrev/
>>>>>>>
>>>>>>> Thanks, Robbin
>>>>>>>
>>>>>>> On 11/23/18 5:55 PM, Robbin Ehn wrote:
>>>>>>>> Forgot RFR in subject.
>>>>>>>>
>>>>>>>> /Robbin
>>>>>>>>
>>>>>>>> On 2018-11-23 17:51, Robbin Ehn wrote:
>>>>>>>>> Hi all, please review.
>>>>>>>>>
>>>>>>>>> When a safepoint is ended we need a way to get back to 100%
>>>>>>>>> utilization as fast as possible. 100% utilization means no idle
>>>>>>>>> cpu in the system if there is a JavaThread that could be executed.
>>>>>>>>> The traditional ways to wake many threads, e.g. semaphore or
>>>>>>>>> pthread_cond, are not implemented with a single syscall; instead
>>>>>>>>> they typically do one syscall per thread to wake.
>>>>>>>>>
>>>>>>>>> This change-set contains that primitive, the WaitBarrier, and a
>>>>>>>>> gtest for it. There are no actual users yet; those are in coming
>>>>>>>>> patches.
>>>>>>>>>
>>>>>>>>> The WaitBarrier solves this by doing cooperative semaphore
>>>>>>>>> posting: threads that are woken will also post. On Linux we can
>>>>>>>>> instead directly use a futex and wake all with one syscall.
>>>>>>>>> Depending on how many threads and cpus there are the performance
>>>>>>>>> varies, but with a good utilization of the machine, just on the
>>>>>>>>> edge of saturated, the time to reach 100% utilization is around
>>>>>>>>> 3 times faster with the WaitBarrier (where futex is faster than
>>>>>>>>> semaphore).
>>>>>>>>>
>>>>>>>>> Webrev:
>>>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/webrev/
>>>>>>>>>
>>>>>>>>> CR:
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8214271
>>>>>>>>>
>>>>>>>>> Passes 100 iterations of gtest on our platforms, both fastdebug
>>>>>>>>> and release. It has also been stable when used in safepoints
>>>>>>>>> (t1-8) (coming patches).
>>>>>>>>>
>>>>>>>>> Thanks, Robbin

From merkel05 at gmail.com  Tue Jan  1 23:39:37 2019
From: merkel05 at gmail.com (Sergei Ustimenko)
Date: Wed, 2 Jan 2019 00:39:37 +0100
Subject: [PATCH] hsdis installation documentation update
In-Reply-To: <7c4dfeb2-4e76-a2d2-7e01-63f502804bb8@oracle.com>
References: <79927d73-9592-4510-0157-93dd01f95536@oracle.com>
 <7c4dfeb2-4e76-a2d2-7e01-63f502804bb8@oracle.com>
Message-ID:

Hi David,

On Tue, Jan 1, 2019, 23:55 David Holmes wrote:

> Hi Sergei,
>
> On 31/12/2018 8:42 am, Sergei Ustimenko wrote:
> > Hi David,
> >
> >  > "server" should still say . While most platforms only have server VM
> >  > these days some do still have client and there is also minimal VM
> >
> > Ah, right. I thought there were no client VMs out there anymore.
> > I've updated the patch, hope it is good now.
> >
> >
> > diff --git a/src/utils/hsdis/README b/src/utils/hsdis/README
> > --- a/src/utils/hsdis/README
> > +++ b/src/utils/hsdis/README
> > @@ -114,18 +114,16 @@
> >
> >   * Installing
> >
> > -Products are named like build/$OS-$LIBARCH/hsdis-$LIBARCH.so.  You can
> > -install them on your LD_LIBRARY_PATH, or inside of your JRE/JDK.  The
> > -search path in the JVM is:
> > -
> > -1. /jre/lib///libhsdis-.so
> > -2. /jre/lib///hsdis-.so
> > -3. /jre/lib//hsdis-.so
> > +Products are named like build/$OS-$LIBARCH/hsdis-$LIBARCH.so. You can
> > +install them next to your libjvm.so inside JRE/JDK or alternatively
> > +put it anywhere on your LD_LIBRARY_PATH. JVM looks up several paths
> > +derived from libjvm.so in the following order:
>
> For this last part I think you should keep the original text:
>

Sure, I just want to make sure libjvm.so is mentioned at all

> "The search path in the JVM is:"
>
> The new text isn't quite grammatically correct, nor do I see the
> relevance of "derived from libjvm.so" given that the paths are not
> relative to the location of libjvm.so.
>
> Thanks,
> David
>
> > +
> > +1. /lib//libhsdis-.so
> > +2. /lib//hsdis-.so
> > +3. /lib/hsdis-.so
> >   4. hsdis-.so  (using LD_LIBRARY_PATH)
> >
> > -Note that there's a bug in hotspot versions prior to hs22 that causes
> > -steps 2 and 3 to fail when used with JDK7.
> > -
> >   Now test:
> >
> >     export LD_LIBRARY_PATH .../hsdis/build/$OS-$LIBARCH:$LD_LIBRARY_PATH
> >
> >
> >
> > Thanks,
> > Sergei
>

Thanks,
Sergei

From david.holmes at oracle.com  Wed Jan  2 00:56:59 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 2 Jan 2019 10:56:59 +1000
Subject: [PATCH] hsdis installation documentation update
In-Reply-To:
References: <79927d73-9592-4510-0157-93dd01f95536@oracle.com>
 <7c4dfeb2-4e76-a2d2-7e01-63f502804bb8@oracle.com>
Message-ID: <7f57621f-138f-21bd-e9e0-7c07a0d98c1f@oracle.com>

Hi Sergei,

I've filed:

https://bugs.openjdk.java.net/browse/JDK-8215977

for this. (Not sure what subcomponent it really belongs to). I've also
hosted the patch at:

http://cr.openjdk.java.net/~dholmes/8215977/webrev/

Please check it and I'll sponsor the change.

Thanks,
David

From merkel05 at gmail.com  Wed Jan  2 01:02:00 2019
From: merkel05 at gmail.com (Sergei Ustimenko)
Date: Wed, 2 Jan 2019 02:02:00 +0100
Subject: [PATCH] hsdis installation documentation update
In-Reply-To: <7f57621f-138f-21bd-e9e0-7c07a0d98c1f@oracle.com>
References: <79927d73-9592-4510-0157-93dd01f95536@oracle.com>
 <7c4dfeb2-4e76-a2d2-7e01-63f502804bb8@oracle.com>
 <7f57621f-138f-21bd-e9e0-7c07a0d98c1f@oracle.com>
Message-ID:

David,

Change looks great, thanks for taking care of it!

Cheers,
Sergei

From david.holmes at oracle.com  Wed Jan  2 01:09:36 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 2 Jan 2019 11:09:36 +1000
Subject: [PATCH] hsdis installation documentation update
In-Reply-To:
References: <79927d73-9592-4510-0157-93dd01f95536@oracle.com>
 <7c4dfeb2-4e76-a2d2-7e01-63f502804bb8@oracle.com>
 <7f57621f-138f-21bd-e9e0-7c07a0d98c1f@oracle.com>
Message-ID:

Pushed.

Cheers,
David

From matthias.baesken at sap.com  Wed Jan  2 09:11:32 2019
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Wed, 2 Jan 2019 09:11:32 +0000
Subject: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd
Message-ID:

Hello, please review the following patch.

Currently, when ThreadPriorityPolicy is set to 1 (so-called "Aggressive
mode"), a root-user check (geteuid() != 0) is done on linux and bsd (+Mac).
See for example prio_init() in jdk/src/hotspot/os/linux/os_linux.cpp.

However the root-user check has a few drawbacks:
- it blocks the capabilities feature available on current Linux distros
  (CAP_SYS_NICE capability) that can be used to allow setting lower
  niceness also for non-root
- setting a higher "niceness" (lower priority) is not possible on Linux
  for non-root because of the geteuid check

We had a discussion about this in "ThreadPriorityPolicy settings for
non-root users", with this suggestion:

https://mail.openjdk.java.net/pipermail/hotspot-dev/2018-December/035986.html

....
> Just drop the root check for ThreadPriorityPolicy=1 and let the underlying system
> permissions control success or failure.
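The suggestion quoted above (attempt the operation and let the operating system decide) can be sketched outside HotSpot as follows. The helper name try_set_nice is hypothetical and this is not the code from the webrev; it only assumes the POSIX setpriority() call and the PRIO_MIN/PRIO_MAX limits from <sys/resource.h>, and it reports the errno value rather than pre-checking geteuid().

```cpp
#include <cassert>
#include <cerrno>
#include <sys/resource.h>
#include <sys/time.h>

// Hypothetical helper (not HotSpot code): ask the OS to apply a nice value
// to the caller and report what the OS decided, with no geteuid() pre-check.
// Returns 0 on success, otherwise the errno value left by setpriority().
int try_set_nice(int nice_value) {
  assert(nice_value >= PRIO_MIN && nice_value <= PRIO_MAX);
  if (setpriority(PRIO_PROCESS, 0, nice_value) == 0) {
    return 0;
  }
  return errno;  // e.g. EACCES or EPERM when the caller lacks permission
}
```

With this shape, a non-root caller that holds the CAP_SYS_NICE capability on Linux would be expected to succeed when lowering niceness, and any caller can raise niceness, without the calling code needing to know which rule applied.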
I did the change in this webrev:

Bug/webrev:

https://bugs.openjdk.java.net/browse/JDK-8215962

http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.0/

Best Regards, Matthias

From sgehwolf at redhat.com  Wed Jan  2 09:12:26 2019
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Wed, 02 Jan 2019 10:12:26 +0100
Subject: Zero JVM segfaulting on linux-sparc
In-Reply-To:
References: <18c4b28e-9e94-9ee0-d2ac-cb8308607630@physik.fu-berlin.de>
 <401d0030-7681-1cf5-c557-024daf6d5e20@oracle.com>
 <48cbf65f09472c2e233a8ee64105467f14679a2d.camel@redhat.com>
 <82335ba5d00fe8f73fde83fa0bb67d4c1d5270f6.camel@redhat.com>
Message-ID: <26530c5d6411edf5a1ced114e15ebfd5c4d21ccb.camel@redhat.com>

Hi Adrian,

On Sat, 2018-12-29 at 13:41 +0100, John Paul Adrian Glaubitz wrote:
> Hi Severin!
>
> On 12/21/18 2:34 PM, John Paul Adrian Glaubitz wrote:
> > On 12/21/18 1:52 PM, Severin Gehwolf wrote:
> > > OK. It's rather curious. What are the different flags being used?
> > > Pre/post patch? It looks as if this happens (on Linux sparc):
> > >
> > > Zero and '-ffp-contract=off -O2' => bad build
> > > Zero and '-ffp-contract=off -O0' => good build
> > > Server and '-ffp-contract=off -O2' => good build
> >
> > Logs here: https://people.debian.org/~glaubitz/openjdk-sparc/
>
> Any suggestions how to move forward with this? Should I open an RFR
> with your suggested patch which works fine for me? Or do you want to
> do that? I have filed a bug report in Jira now [1].

Thanks for getting that bug filed. Before we push the work-around it
would be good to know what's actually happening. You could try
recompiling/running with a UB sanitizer and see whether there is a
problem. If there is, it might be a Zero bug which we should fix. I
have no way of testing this issue myself, though.

> Shall I file a bug in gcc upstream as well?

Possibly. Do you know for sure this is a GCC bug? If so, please do.
Thanks,
Severin

> Adrian
>
> > [1] https://bugs.openjdk.java.net/browse/JDK-8215969
>

From glaubitz at physik.fu-berlin.de  Wed Jan  2 10:02:20 2019
From: glaubitz at physik.fu-berlin.de (John Paul Adrian Glaubitz)
Date: Wed, 2 Jan 2019 11:02:20 +0100
Subject: Zero JVM segfaulting on linux-sparc
In-Reply-To: <26530c5d6411edf5a1ced114e15ebfd5c4d21ccb.camel@redhat.com>
References: <18c4b28e-9e94-9ee0-d2ac-cb8308607630@physik.fu-berlin.de>
 <401d0030-7681-1cf5-c557-024daf6d5e20@oracle.com>
 <48cbf65f09472c2e233a8ee64105467f14679a2d.camel@redhat.com>
 <82335ba5d00fe8f73fde83fa0bb67d4c1d5270f6.camel@redhat.com>
 <26530c5d6411edf5a1ced114e15ebfd5c4d21ccb.camel@redhat.com>
Message-ID: <536c6d08-8338-32df-7e28-8df75097f0ee@physik.fu-berlin.de>

Hi Severin!

On 1/2/19 10:12 AM, Severin Gehwolf wrote:
> Thanks for getting that bug filed. Before we push the work-around it
> would be good to know what's actually happening. You could try
> recompiling/running with an UB sanitizer and see whether there is a
> problem.

I have to admit that I have never used a UB sanitizer before, so it
may take a while until I get a usable result.

> If there is, it might be a Zero bug which we should fix. I
> have no way of testing this issue myself, though.

Actually, you can :-). We have added a SPARC T5 instance to the gcc
compile farm. So, if you want, you can just apply for an account there,
see:

https://gcc.gnu.org/wiki/CompileFarm

and get started. Any developer can apply for such an account. There are
SPARC boxes running Linux (Debian unstable) as well as Solaris. I have
root access to the Linux instance (gc202), so I can install any extra
package if necessary.

>> Shall I file a bug in gcc upstream as well?
>
> Possibly. Do you know for sure this is a GCC bug? If so, please do.

No, no idea yet.

Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaubitz at debian.org
`. `'   Freie Universitaet Berlin - glaubitz at physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913

From sgehwolf at redhat.com  Wed Jan  2 10:16:33 2019
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Wed, 02 Jan 2019 11:16:33 +0100
Subject: SIGSEGV error on 11.0.1+13, libjvm.dylib
In-Reply-To:
References:
Message-ID:

Hi Neil,

On Fri, 2018-12-28 at 07:34 +0000, Neil Stevenson wrote:
> Hi
> Hope this is the right way to submit bug notifications, new here
>
> A bug has been found on AdoptOpenJDK and Azul Zulu,
> that the AdoptOpenJDK folks thought should be submitted here.
> Occurs at least on Darwin
>
> Steps to recreate and fuller history is here
> https://github.com/AdoptOpenJDK/openjdk-build/issues/814. Gives a
> segmentation fault fairly consistently.
>
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x0000000104853d23, pid=23222, tid=953467
> #
> # JRE version: OpenJDK Runtime Environment (11.0.1+13) (build 11.0.1+13)
> # Java VM: OpenJDK 64-Bit Server VM (11.0.1+13, mixed mode, tiered,
> compressed oops, g1 gc, bsd-amd64)
> # Problematic frame:
> # V  [libjvm.dylib+0x20dd23]  ClassLoaderData::loader_name_and_id() const+0x7
> #

Looking at https://github.com/hazelcast/hazelcast/issues/14319 it
appears this could be related to JDK-8212937[1] which got backported to
JDK 11 and should be fixed in 11.0.3[2]. Could you re-test with a fixed
build with [3]?

Is SEGV happening in an error-path by any chance?
Thanks,
Severin

[1] https://bugs.openjdk.java.net/browse/JDK-8212937
[2] https://bugs.openjdk.java.net/browse/JDK-8214528
[3] http://hg.openjdk.java.net/jdk-updates/jdk11u/rev/8687668b33da

From Alan.Bateman at oracle.com  Wed Jan  2 10:44:12 2019
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Wed, 2 Jan 2019 10:44:12 +0000
Subject: SIGSEGV error on 11.0.1+13, libjvm.dylib
In-Reply-To:
References:
Message-ID: <8ca77712-fee9-de83-76fd-2d5473fa888c@oracle.com>

On 02/01/2019 10:16, Severin Gehwolf wrote:
> :
> Looking at https://github.com/hazelcast/hazelcast/issues/14319 it
> appears this could be related to JDK-8212937[1] which got backported to
> JDK 11 and should be fixed in 11.0.3[2]. Could you re-test with a fixed
> build with [3]?

I think you may be right as it does not duplicate with a JDK 12 build
containing this fix.

One other thing about this test case is that something in
com.hazelcast.util.FilteringClassLoader seems to be hacking directly on
the private final ClassLoader.parent field. That may be worthy of a bug
report to the maintainers of that code as it creates the potential for a
lot of side effects and cannot be guaranteed to work from release to
release (even an update release).

-Alan

From david.holmes at oracle.com  Wed Jan  2 11:25:12 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 2 Jan 2019 21:25:12 +1000
Subject: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd
In-Reply-To:
References:
Message-ID:

Hi Matthias,

On 2/01/2019 7:11 pm, Baesken, Matthias wrote:
> Hello , please review the following patch .
>
> Currently, when ThreadPriorityPolicy is set to 1 (so called "Aggressive
> mode"), on linux and bsd(+Mac) a root-user-check (geteuid() != 0)) is done.
> See for example the coding in jdk/src/hotspot/os/linux/os_linux.cpp int
> prio_init().
>
> However the root-user-check has a few drawbacks:
> - it blocks the capabilities feature available on current Linux distros
> (CAP_SYS_NICE capability) that can be used to allow setting lower
> niceness also for non-root
> - setting a higher "niceness" (lower priority) is not possible on Linux
> for non-root because of the geteuid check
>
> We had a discussion about this in "ThreadPriorityPolicy settings for
> non-root users" , with this suggestion :
>
> https://mail.openjdk.java.net/pipermail/hotspot-dev/2018-December/035986.html
>
> ....
>
>> Just drop the root check for ThreadPriorityPolicy=1 and let the underlying system
>> permissions control success or failure.
>
> I did the change in this webrev :
>
> Bug/webrev :
>
> https://bugs.openjdk.java.net/browse/JDK-8215962
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.0/

This seems reasonable. I'm a little unsure about the warning when you
might have the CAP_SYS_NICE capability. Ideally you'd only warn if you
ask for something that can't be done ... but we need libcap it seems to
ask that question.

Thanks,
David

> Best Regards , Matthias
>

From matthias.baesken at sap.com  Wed Jan  2 16:08:52 2019
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Wed, 2 Jan 2019 16:08:52 +0000
Subject: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd
In-Reply-To:
References:
Message-ID:

Hi David, thanks for looking into it.

> ... but we need libcap it seems to ask that question.

Yes, I think so too; we would need libcap and I don't want to introduce
this dependency.
Should I change/remove the warning or keep it as it is in my webrev?

Best regards, Matthias

> -----Original Message-----
> From: David Holmes
> Sent: Wednesday, 2 January 2019 12:25
> To: Baesken, Matthias ; 'hotspot-dev at openjdk.java.net'
> Cc: Lindenmaier, Goetz
> Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root
> users on linux/bsd
>
> Hi Matthias,
>
> On 2/01/2019 7:11 pm, Baesken, Matthias wrote:
> > Hello , please review the following patch .
> >
> > Currently, when ThreadPriorityPolicy is set to 1 (so called "Aggressive
> > mode"), on linux and bsd(+Mac) a root-user-check (geteuid() != 0)) is done.
> > See for example the coding in jdk/src/hotspot/os/linux/os_linux.cpp int
> > prio_init().
> >
> > However the root-user-check has a few drawbacks:
> > - it blocks the capabilities feature available on current Linux distros
> > (CAP_SYS_NICE capability) that can be used to allow setting lower
> > niceness also for non-root
> > - setting a higher "niceness" (lower priority) is not possible on Linux
> > for non-root because of the geteuid check
> >
> > We had a discussion about this in "ThreadPriorityPolicy settings for
> > non-root users" , with this suggestion :
> >
> > https://mail.openjdk.java.net/pipermail/hotspot-dev/2018-December/035986.html
> >
> > ..
> >
> >> Just drop the root check for ThreadPriorityPolicy=1 and let the underlying system
> >> permissions control success or failure.
> >
> > I did the change in this webrev :
> >
> > Bug/webrev :
> >
> > https://bugs.openjdk.java.net/browse/JDK-8215962
> >
> > http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.0/
>
> This seems reasonable. I'm a little unsure about the warning when you
> might have the SYS_CAP_NICE capability. Ideally you'd only warn if you
> ask for something that can't be done ... but we need libcap it seems to
> ask that question.
>
> Thanks,
> David
>
> > Best Regards , Matthias
> >

From daniel.daugherty at oracle.com  Wed Jan  2 16:50:19 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 2 Jan 2019 11:50:19 -0500
Subject: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd
In-Reply-To:
References:
Message-ID: <8f7ff454-ac87-505a-fe9c-15b4d4113093@oracle.com>

On 1/2/19 4:11 AM, Baesken, Matthias wrote:
> Hello , please review the following patch .
>
> Currently, when ThreadPriorityPolicy is set to 1 (so called "Aggressive mode"), on linux and bsd(+Mac) a root-user-check (geteuid() != 0)) is done.
> See for example the coding in jdk/src/hotspot/os/linux/os_linux.cpp int prio_init().
>
> However the root-user-check has a few drawbacks:
> - it blocks the capabilities feature available on current Linux distros (CAP_SYS_NICE capability) that can be used to allow setting lower niceness also for non-root
> - setting a higher "niceness" (lower priority) is not possible on Linux for non-root because of the geteuid check
>
> We had a discussion about this in "ThreadPriorityPolicy settings for non-root users" , with this suggestion :
>
> https://mail.openjdk.java.net/pipermail/hotspot-dev/2018-December/035986.html
>
> ....
>
>> Just drop the root check for ThreadPriorityPolicy=1 and let the underlying system
>> permissions control success or failure.
> I did the change in this webrev :
>
>
> Bug/webrev :
>
> https://bugs.openjdk.java.net/browse/JDK-8215962
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.0/

General
  - Please update Copyright years to 2019 before pushing.

src/hotspot/share/runtime/globals.hpp
    L2045:           "    Linux/BSD/Mac this policy requires root privilege or        "\
        Typo: "Mac" should be "macOS".

    L2046:           "    extended capabilites.")                                      \
        Typo: "capabilites" -> "capabilities."
        Please consider: "an extended capability." since it is normal
        for only a single capability to be required for a specific
        policy override.

src/hotspot/os/bsd/os_bsd.cpp
    L2260: // CAP_SYS_NICE capabilities
        Typo: "capabilities" -> "capability." (note the added period)

    L2306:   if (ThreadPriorityPolicy == 1) {
    L2307:     // root and threads with capability CAP_SYS_NICE can raise the thread priority
    L2308:     // however testing the CAP_SYS_NICE capability would require libcap.so
    L2309:     if (geteuid() != 0) {
    L2310:       if (!FLAG_IS_DEFAULT(ThreadPriorityPolicy)) {
    L2311:         warning("-XX:ThreadPriorityPolicy requires root privilege or CAP_SYS_NICE capability on Bsd");
    L2312:       }
    L2313:     }
    L2314:   }
        Sorry this whole block is the wrong thing to do. It makes the
        assumption that it "knows" the underlying security policy and
        tries to provide a "helpful" message in anticipation that the
        underlying security policy will reject the operation.

        The "(ThreadPriorityPolicy == 1) {" if-block needs to be deleted.

    If you want to output a helpful warning, then you need to do it in
    the code that will actually get a policy failure:

    static void do_set_native_prio_warning() {
        static bool has_warned = false;
        if (!has_warned) {
          warning("-XX:ThreadPriorityPolicy requires root privilege or the CAP_SYS_NICE capability.");
          has_warned = true;
        }
    }

    L2321: OSReturn os::set_native_priority(Thread* thread, int newpri) {
    L2322:   if (!UseThreadPriorities || ThreadPriorityPolicy == 0) return OS_OK;
    L2323:
    L2324: #ifdef __OpenBSD__
    L2325:   // OpenBSD pthread_setprio starves low priority threads
    L2326:   return OS_OK;
    L2327: #elif defined(__FreeBSD__)
    L2328:   int ret = pthread_setprio(thread->osthread()->pthread_id(), newpri);
// Note the __FreeBSD__ branch here is broken; it is missing the return sequence.
             if (ret != 0) {
               do_set_native_prio_warning();
               return OS_ERR;
             }
             return OS_OK;
    L2329: #elif defined(__APPLE__) || defined(__NetBSD__)
    L2330:   struct sched_param sp;
    L2331:   int policy;
    L2332:
    L2333:   if (pthread_getschedparam(thread->osthread()->pthread_id(), &policy, &sp) != 0) {
               do_set_native_prio_warning();
    L2334:     return OS_ERR;
    L2335:   }
    L2336:
    L2337:   sp.sched_priority = newpri;
    L2338:   if (pthread_setschedparam(thread->osthread()->pthread_id(), policy, &sp) != 0) {
               do_set_native_prio_warning();
    L2339:     return OS_ERR;
    L2340:   }
    L2341:
    L2342:   return OS_OK;
    L2343: #else
    L2344:   int ret = setpriority(PRIO_PROCESS, thread->osthread()->thread_id(), newpri);
             if (ret != 0) {
               do_set_native_prio_warning();
               return OS_ERR;
             }
             return OS_OK;  // replace L2345 with this line.
    L2345:   return (ret == 0) ? OS_OK : OS_ERR;
    L2346: #endif
    L2347: }


src/hotspot/os/linux/os_linux.cpp
    L4080: // CAP_SYS_NICE capabilities
        Typo: "capabilities" -> "capability." (note the added period)

    The same comment about prio_init() applies here:

    L4103-L4111:
        The "(ThreadPriorityPolicy == 1) {" if-block needs to be deleted.

    static void do_set_native_prio_warning() {
        static bool has_warned = false;
        if (!has_warned) {
          warning("-XX:ThreadPriorityPolicy requires root privilege or the CAP_SYS_NICE capability.");
          has_warned = true;
        }
    }

    But the changes to os::set_native_priority() are much simpler:

    L4118: OSReturn os::set_native_priority(Thread* thread, int newpri) {
    L4119:   if (!UseThreadPriorities || ThreadPriorityPolicy == 0) return OS_OK;
    L4120:
    L4121:   int ret = setpriority(PRIO_PROCESS, thread->osthread()->thread_id(), newpri);
             if (ret != 0) {
               do_set_native_prio_warning();
               return OS_ERR;
             }
             return OS_OK;
// replace L4122 with this line. ??? L4122: ? return (ret == 0) ? OS_OK : OS_ERR; ??? L4123: } In both os/bsd/os_bsd.cpp and os/linux/os_linux.cpp, the os::get_native_priority() code allows for the possibility of getting an error condition for getpriority(). I don't think we need a do_get_native_prio_warning() function here since the only threads we should be querying belong to the Java process so they should not fail the policy check. Dan > > > Best Regards , Matthias From david.holmes at oracle.com Wed Jan 2 22:35:34 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 3 Jan 2019 08:35:34 +1000 Subject: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd In-Reply-To: <8f7ff454-ac87-505a-fe9c-15b4d4113093@oracle.com> References: <8f7ff454-ac87-505a-fe9c-15b4d4113093@oracle.com> Message-ID: <7f38de34-459f-27da-95ce-5528abf58f95@oracle.com> Hi Dan, On 3/01/2019 2:50 am, Daniel D. Daugherty wrote: > On 1/2/19 4:11 AM, Baesken, Matthias wrote: >> Hello , please? review the following patch . >> >> Currently, when ThreadPriorityPolicy is set to 1 (so called >> "Aggressive mode"), on linux and bsd(+Mac) a root-user-check >> (geteuid() != 0)) is done. >> See for example the coding in jdk/src/hotspot/os/linux/os_linux.cpp >> int prio_init(). >> >> However the root-user-check has a few drawbacks: >> - it blocks the capabilities feature available on current Linux >> distros (CAP_SYS_NICE capability) that can be used to allow setting >> lower niceness also for non-root >> - setting a higher "niceness" (lower priority) is not possible on >> Linux for non-root because of the geteuid check >> >> We? had a discussion about this in "ThreadPriorityPolicy settings for >> non-root users"? ,? with this suggestion : >> >> https://mail.openjdk.java.net/pipermail/hotspot-dev/2018-December/035986.html >> >> >> .... 
>> >>> Just drop the root check for ThreadPriorityPolicy=1 and let the >>> underlying system >>> permissions control success or failure. >> I? did? the change? in this? webrev : >> >> >> Bug/webrev : >> >> https://bugs.openjdk.java.net/browse/JDK-8215962 >> >> http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.0/ > > General > ? - Please update Copyright years to 2019 before pushing. > > src/hotspot/share/runtime/globals.hpp > ??? L2045: ????????? "??? Linux/BSD/Mac this policy requires root > privilege or??????? "\ > ??????? Typo: "Mac" should be "macOS". > > ??? L2046: ????????? "??? extended > capabilites.")????????????????????????????????????? \ > ??????? Typo: "capabilites" -> "capabilities." > > ??????? Please consider: "an extended capability." since it is normal > ??????? for only a single capability to be required for a specific > ??????? policy override. > > src/hotspot/os/bsd/os_bsd.cpp > ??? L2260: // CAP_SYS_NICE capabilities > ??????? Typo: "capabilities" -> "capability." (note the added period) > > ??? L2306: ? if (ThreadPriorityPolicy == 1) { > ??? L2307: ??? // root and threads with capability CAP_SYS_NICE can > raise the thread priority > ??? L2308: ??? // however testing the CAP_SYS_NICE capability would > require libcap.so > ??? L2309: ??? if (geteuid() != 0) { > ??? L2310: ????? if (!FLAG_IS_DEFAULT(ThreadPriorityPolicy)) { > ??? L2311: ??????? warning("-XX:ThreadPriorityPolicy requires root > privilege or CAP_SYS_NICE capability on Bsd"); > ??? L2312: ????? } > ??? L2313: ??? } > ??? L2314: ? } > ??????? Sorry this whole block is the wrong thing to do. It makes the > ??????? assumption that it "knows" the underlying security policy and > ??????? tries to provide a "helpful" message in anticipation that the > ??????? underlying security policy will reject the operation. > > ??????? The "(ThreadPriorityPolicy == 1) {" if-block needs to be deleted. > > ??? If you want to output a helpful warning, then you need to do it in > ??? 
the code that will actually get a policy failure: > > ??? static void do_set_native_prio_warning() { > ??????? static bool has_warned = false; > ??????? if (!has_warned) { > ????????? warning("-XX:ThreadPriorityPolicy requires root privilege or > the CAP_SYS_NICE capability."); > ????????? has_warned = true; > ??????? } > ??? } Sorry I disagree. The existing code checks for policy and whether root and issues a warning then resets policy. The new code does exactly the same thing except it doesn't reset the policy. The placement of the warning was fine before so it is fine now. Yes it could go on use (and add a only-warn-once hack) but why force such a disruptive change that has no benefit? Cheers, David > ??? L2321: OSReturn os::set_native_priority(Thread* thread, int newpri) { > ??? L2322: ? if (!UseThreadPriorities || ThreadPriorityPolicy == 0) > return OS_OK; > ??? L2323: > ??? L2324: #ifdef __OpenBSD__ > ??? L2325: ? // OpenBSD pthread_setprio starves low priority threads > ??? L2326: ? return OS_OK; > ??? L2327: #elif defined(__FreeBSD__) > ??? L2328: ? int ret = > pthread_setprio(thread->osthread()->pthread_id(), newpri); > // Note the __FreeBSD__ branch here is broken; it is missing the return > sequence. > ? ? ???????? if (ret != 0) { > ?? ? ????????? do_set_native_prio_warning(); > ?? ? ????????? return OS_ERR; > ? ? ???????? } > ?? ? ??????? return OS_OK; > ??? L2329: #elif defined(__APPLE__) || defined(__NetBSD__) > ??? L2330: ? struct sched_param sp; > ??? L2331: ? int policy; > ??? L2332: > ??? L2333: ? if > (pthread_getschedparam(thread->osthread()->pthread_id(), &policy, &sp) > != 0) { > ???????? ? ??? do_set_native_prio_warning(); > ??? L2334: ??? return OS_ERR; > ??? L2335: ? } > ??? L2336: > ??? L2337: ? sp.sched_priority = newpri; > ??? L2338: ? if > (pthread_setschedparam(thread->osthread()->pthread_id(), policy, &sp) != > 0) { > ??????????? ?? do_set_native_prio_warning(); > ??? L2339: ??? return OS_ERR; > ??? L2340: ? } > ??? L2341: > ??? 
L2342: ? return OS_OK; > ??? L2343: #else > ??? L2344: ? int ret = setpriority(PRIO_PROCESS, > thread->osthread()->thread_id(), newpri); > ??? ? ?????? if (ret != 0) { > ??? ? ???????? do_set_native_prio_warning(); > ?????????? ? ? return OS_ERR; > ????????? ?? } > ???????? ? ? return OS_OK;? // replace L2345 with this line. > ??? L2345: ? return (ret == 0) ? OS_OK : OS_ERR; > ??? L2346: #endif > ??? L2347: } > > > src/hotspot/os/linux/os_linux.cpp > ??? L4080: // CAP_SYS_NICE capabilities > ??????? Typo: "capabilities" -> "capability." (note the added period) > > ??? The same comment about prio_init() applies here: > > ??? L4103-L4111: > ??????? The "(ThreadPriorityPolicy == 1) {" if-block needs to be deleted. > > ??? static void do_set_native_prio_warning() { > ??????? static bool has_warned = false; > ??????? if (!has_warned) { > ????????? warning("-XX:ThreadPriorityPolicy requires root privilege or > the CAP_SYS_NICE capability."); > ????????? has_warned = true; > ??????? } > ??? } > > ??? But the changes to os::set_native_priority() are much simpler: > > ??? L4118: OSReturn os::set_native_priority(Thread* thread, int newpri) { > ??? L4119: ? if (!UseThreadPriorities || ThreadPriorityPolicy == 0) > return OS_OK; > ??? L4120: > ??? L4121: ? int ret = setpriority(PRIO_PROCESS, > thread->osthread()->thread_id(), newpri); > ???????????? if (ret != 0) { > ?????????????? do_set_native_prio_warning(); > ?????????????? return OS_ERR; > ???????????? } > ???????????? return OS_OK;? // replace L4122 with this line. > ??? L4122: ? return (ret == 0) ? OS_OK : OS_ERR; > ??? L4123: } > > In both os/bsd/os_bsd.cpp and os/linux/os_linux.cpp, the > os::get_native_priority() code allows for the possibility > of getting an error condition for getpriority(). I don't > think we need a do_get_native_prio_warning() function here > since the only threads we should be querying belong to the > Java process so they should not fail the policy check. > > Dan > > >> >> >> Best Regards ,? 
Matthias > From daniel.daugherty at oracle.com Wed Jan 2 23:36:18 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 2 Jan 2019 18:36:18 -0500 Subject: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd In-Reply-To: <7f38de34-459f-27da-95ce-5528abf58f95@oracle.com> References: <8f7ff454-ac87-505a-fe9c-15b4d4113093@oracle.com> <7f38de34-459f-27da-95ce-5528abf58f95@oracle.com> Message-ID: <2a31b3ca-85fc-29cb-413d-94dd98775a74@oracle.com> On 1/2/19 5:35 PM, David Holmes wrote: > Hi Dan, > > On 3/01/2019 2:50 am, Daniel D. Daugherty wrote: >> On 1/2/19 4:11 AM, Baesken, Matthias wrote: >>> Hello , please? review the following patch . >>> >>> Currently, when ThreadPriorityPolicy is set to 1 (so called >>> "Aggressive mode"), on linux and bsd(+Mac) a root-user-check >>> (geteuid() != 0)) is done. >>> See for example the coding in jdk/src/hotspot/os/linux/os_linux.cpp >>> int prio_init(). >>> >>> However the root-user-check has a few drawbacks: >>> - it blocks the capabilities feature available on current Linux >>> distros (CAP_SYS_NICE capability) that can be used to allow setting >>> lower niceness also for non-root >>> - setting a higher "niceness" (lower priority) is not possible on >>> Linux for non-root because of the geteuid check >>> >>> We? had a discussion about this in "ThreadPriorityPolicy settings >>> for non-root users"? ,? with this suggestion : >>> >>> https://mail.openjdk.java.net/pipermail/hotspot-dev/2018-December/035986.html >>> >>> >>> .... >>> >>>> Just drop the root check for ThreadPriorityPolicy=1 and let the >>>> underlying system >>>> permissions control success or failure. >>> I? did? the change? in this? webrev : >>> >>> >>> Bug/webrev : >>> >>> https://bugs.openjdk.java.net/browse/JDK-8215962 >>> >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.0/ >> >> General >> ?? - Please update Copyright years to 2019 before pushing. >> >> src/hotspot/share/runtime/globals.hpp >> ???? 
L2045: ????????? "??? Linux/BSD/Mac this policy requires root >> privilege or??????? "\ >> ???????? Typo: "Mac" should be "macOS". >> >> ???? L2046: ????????? "??? extended >> capabilites.")????????????????????????????????????? \ >> ???????? Typo: "capabilites" -> "capabilities." >> >> ???????? Please consider: "an extended capability." since it is normal >> ???????? for only a single capability to be required for a specific >> ???????? policy override. >> >> src/hotspot/os/bsd/os_bsd.cpp >> ???? L2260: // CAP_SYS_NICE capabilities >> ???????? Typo: "capabilities" -> "capability." (note the added period) >> >> ???? L2306: ? if (ThreadPriorityPolicy == 1) { >> ???? L2307: ??? // root and threads with capability CAP_SYS_NICE can >> raise the thread priority >> ???? L2308: ??? // however testing the CAP_SYS_NICE capability would >> require libcap.so >> ???? L2309: ??? if (geteuid() != 0) { >> ???? L2310: ????? if (!FLAG_IS_DEFAULT(ThreadPriorityPolicy)) { >> ???? L2311: ??????? warning("-XX:ThreadPriorityPolicy requires root >> privilege or CAP_SYS_NICE capability on Bsd"); >> ???? L2312: ????? } >> ???? L2313: ??? } >> ???? L2314: ? } >> ???????? Sorry this whole block is the wrong thing to do. It makes the >> ???????? assumption that it "knows" the underlying security policy and >> ???????? tries to provide a "helpful" message in anticipation that the >> ???????? underlying security policy will reject the operation. >> >> ???????? The "(ThreadPriorityPolicy == 1) {" if-block needs to be >> deleted. >> >> ???? If you want to output a helpful warning, then you need to do it in >> ???? the code that will actually get a policy failure: >> >> ???? static void do_set_native_prio_warning() { >> ???????? static bool has_warned = false; >> ???????? if (!has_warned) { >> ?????????? warning("-XX:ThreadPriorityPolicy requires root privilege >> or the CAP_SYS_NICE capability."); >> ?????????? has_warned = true; >> ???????? } >> ???? } > > Sorry I disagree. 
> The existing code checks for policy and whether root
> and issues a warning then resets policy.

You have to be careful with the word 'policy' here. You are actually
talking about the 'ThreadPriorityPolicy' option here and not the
security policy associated with the setpriority() call. One thing
the application code cannot do here is reset the underlying
security policy.

> The new code does exactly the same thing except it doesn't reset the
> policy. The placement of the warning was fine before so it is fine now.

Actually I disagree that the placement of the warning was fine before.
As I said, in my original review comment:

>         Sorry this whole block is the wrong thing to do. It makes the
>         assumption that it "knows" the underlying security policy and
>         tries to provide a "helpful" message in anticipation that the
>         underlying security policy will reject the operation.

The application code should not make the assumption that it knows
the underlying security policy of the setpriority() call. That is a
basic principle of Trusted Systems design. The best you can do is try
the operation and if it fails, then try to issue a possibly helpful
message based on the error that is returned.

There are some Trusted Systems folks that don't believe in trying
to interpret errno values either. Why? Because you can only code the
errno values that you know today that are security policy related.
You can't know if some system down the road will add a new errno
value that's security related...

My proposed do_set_native_prio_warning() function should actually
take an errno parameter and it should only issue the warning if the
errno is EACCES or EPERM.

Another problem with the new code is that:

  warning("-XX:ThreadPriorityPolicy requires root privilege or the CAP_SYS_NICE capability.");

will be issued when the user != root and the thread has the
CAP_SYS_NICE capability so we'll be issuing a warning even though
the setpriority() call should succeed.
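[Illustration] A minimal sketch of the errno-aware, warn-once variant described above, under stated assumptions. The warning() function here is a hypothetical stand-in that prints to stderr, not HotSpot's own; the bool return value, the warnings_issued counter and the handle_set_priority_result() helper are added only so the behaviour can be observed, and are not part of the proposal in this thread. The has_warned flag is a plain static, so this is not thread-safe as written.

```cpp
#include <cerrno>
#include <cstdio>

// Hypothetical stand-in for HotSpot's warning(); counts calls for illustration.
static int warnings_issued = 0;

static void warning(const char* msg) {
  ++warnings_issued;
  std::fprintf(stderr, "Warning: %s\n", msg);
}

// Warn at most once, and only when the failure looks like a permission
// problem (EACCES or EPERM). Returns true if a warning was printed.
static bool do_set_native_prio_warning(int err) {
  static bool has_warned = false;
  if (has_warned || (err != EACCES && err != EPERM)) {
    return false;
  }
  warning("-XX:ThreadPriorityPolicy requires root privilege or the "
          "CAP_SYS_NICE capability.");
  has_warned = true;
  return true;
}

// Shape of a call site: 'ret' stands for the return value of the priority
// call (for example setpriority()) and 'err' for the errno it left behind.
// Returns true on success, false on failure.
static bool handle_set_priority_result(int ret, int err) {
  if (ret != 0) {
    do_set_native_prio_warning(err);
    return false;
  }
  return true;
}
```

With this shape nothing is printed unless the call has actually failed with a permission-related errno, so a non-root process that holds CAP_SYS_NICE would not see a false warning.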
I don't think a false warning is acceptable. By moving the warning to where setpriority() has failed, we no longer would have the problem of a false warning. > Yes it could go on use (and add a only-warn-once hack) but why force > such a disruptive change that has no benefit? As I've pointed out above, I do think it has benefit and it meets a Trusted System design principle. As for the mechanism to only-warn-once, I'm sorry you consider it a hack. I consider it to be a useful way to avoid swamping the warning output with the same message. I know HotSpot does the same thing in other places so if it is a hack, then it is in good company. :-) Dan > > Cheers, > David > >> ???? L2321: OSReturn os::set_native_priority(Thread* thread, int >> newpri) { >> ???? L2322: ? if (!UseThreadPriorities || ThreadPriorityPolicy == 0) >> return OS_OK; >> ???? L2323: >> ???? L2324: #ifdef __OpenBSD__ >> ???? L2325: ? // OpenBSD pthread_setprio starves low priority threads >> ???? L2326: ? return OS_OK; >> ???? L2327: #elif defined(__FreeBSD__) >> ???? L2328: ? int ret = >> pthread_setprio(thread->osthread()->pthread_id(), newpri); >> // Note the __FreeBSD__ branch here is broken; it is missing the >> return sequence. >> ?? ? ???????? if (ret != 0) { >> ??? ? ????????? do_set_native_prio_warning(); >> ??? ? ????????? return OS_ERR; >> ?? ? ???????? } >> ??? ? ??????? return OS_OK; >> ???? L2329: #elif defined(__APPLE__) || defined(__NetBSD__) >> ???? L2330: ? struct sched_param sp; >> ???? L2331: ? int policy; >> ???? L2332: >> ???? L2333: ? if >> (pthread_getschedparam(thread->osthread()->pthread_id(), &policy, >> &sp) != 0) { >> ????????? ? ??? do_set_native_prio_warning(); >> ???? L2334: ??? return OS_ERR; >> ???? L2335: ? } >> ???? L2336: >> ???? L2337: ? sp.sched_priority = newpri; >> ???? L2338: ? if >> (pthread_setschedparam(thread->osthread()->pthread_id(), policy, &sp) >> != 0) { >> ???????????? ?? do_set_native_prio_warning(); >> ???? L2339: ??? return OS_ERR; >> ???? L2340: ? 
} >> ???? L2341: >> ???? L2342: ? return OS_OK; >> ???? L2343: #else >> ???? L2344: ? int ret = setpriority(PRIO_PROCESS, >> thread->osthread()->thread_id(), newpri); >> ???? ? ?????? if (ret != 0) { >> ???? ? ???????? do_set_native_prio_warning(); >> ??????????? ? ? return OS_ERR; >> ?????????? ?? } >> ????????? ? ? return OS_OK;? // replace L2345 with this line. >> ???? L2345: ? return (ret == 0) ? OS_OK : OS_ERR; >> ???? L2346: #endif >> ???? L2347: } >> >> >> src/hotspot/os/linux/os_linux.cpp >> ???? L4080: // CAP_SYS_NICE capabilities >> ???????? Typo: "capabilities" -> "capability." (note the added period) >> >> ???? The same comment about prio_init() applies here: >> >> ???? L4103-L4111: >> ???????? The "(ThreadPriorityPolicy == 1) {" if-block needs to be >> deleted. >> >> ???? static void do_set_native_prio_warning() { >> ???????? static bool has_warned = false; >> ???????? if (!has_warned) { >> ?????????? warning("-XX:ThreadPriorityPolicy requires root privilege >> or the CAP_SYS_NICE capability."); >> ?????????? has_warned = true; >> ???????? } >> ???? } >> >> ???? But the changes to os::set_native_priority() are much simpler: >> >> ???? L4118: OSReturn os::set_native_priority(Thread* thread, int >> newpri) { >> ???? L4119: ? if (!UseThreadPriorities || ThreadPriorityPolicy == 0) >> return OS_OK; >> ???? L4120: >> ???? L4121: ? int ret = setpriority(PRIO_PROCESS, >> thread->osthread()->thread_id(), newpri); >> ????????????? if (ret != 0) { >> ??????????????? do_set_native_prio_warning(); >> ??????????????? return OS_ERR; >> ????????????? } >> ????????????? return OS_OK;? // replace L4122 with this line. >> ???? L4122: ? return (ret == 0) ? OS_OK : OS_ERR; >> ???? L4123: } >> >> In both os/bsd/os_bsd.cpp and os/linux/os_linux.cpp, the >> os::get_native_priority() code allows for the possibility >> of getting an error condition for getpriority(). 
I don't >> think we need a do_get_native_prio_warning() function here >> since the only threads we should be querying belong to the >> Java process so they should not fail the policy check. >> >> Dan >> >> >>> >>> >>> Best Regards ,? Matthias >> From coleen.phillimore at oracle.com Thu Jan 3 02:16:59 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 2 Jan 2019 21:16:59 -0500 Subject: RFR (tedious) 8216022: Use #pragma once Message-ID: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> Summary: change include guards to #pragma once, except in generated header files. Tested with mach5 for linux-x64{-debug}, solaris-sparc, macosx-x64, windows-x64, built aarch64 with cross compiler, and zero. Ran tier1 and 2 tests. The webrev is huge but there are only 3 lines changed in each header file.? So click on the patch. I'll update the copyright headers with a script with the commit. Also, will do this after the shenandoah copyright headers are fixed. Adrian: I included you to check your platforms. Happy New Year! Coleen From coleen.phillimore at oracle.com Thu Jan 3 02:31:40 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 2 Jan 2019 21:31:40 -0500 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> Message-ID: Here is the webrev and bug link. open webrev at http://cr.openjdk.java.net/~coleenp/8216022.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8216022 On 1/2/19 9:16 PM, coleen.phillimore at oracle.com wrote: > Summary: change include guards to #pragma once, except in generated > header files. > > Tested with mach5 for linux-x64{-debug}, solaris-sparc, macosx-x64, > windows-x64, built aarch64 with cross compiler, and zero. > > Ran tier1 and 2 tests. > > The webrev is huge but there are only 3 lines changed in each header > file.? So click on the patch. 
>
> I'll update the copyright headers with a script with the commit. Also,
> will do this after the shenandoah copyright headers are fixed.
>
> Adrian: I included you to check your platforms.
>
> Happy New Year!
> Coleen

From kim.barrett at oracle.com Thu Jan 3 07:48:15 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 3 Jan 2019 02:48:15 -0500
Subject: RFR (tedious) 8216022: Use #pragma once
In-Reply-To:
References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com>
Message-ID: <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com>

> On Jan 2, 2019, at 9:31 PM, coleen.phillimore at oracle.com wrote:
>
>
> Here is the webrev and bug link.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8216022.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8216022
>
> On 1/2/19 9:16 PM, coleen.phillimore at oracle.com wrote:
>> Summary: change include guards to #pragma once, except in generated header files.
>>
>> Tested with mach5 for linux-x64{-debug}, solaris-sparc, macosx-x64, windows-x64, built aarch64 with cross compiler, and zero.
>>
>> Ran tier1 and 2 tests.
>>
>> The webrev is huge but there are only 3 lines changed in each header file. So click on the patch.
>>
>> I'll update the copyright headers with a script with the commit. Also, will do this after the shenandoah copyright headers are fixed.
>>
>> Adrian: I included you to check your platforms.
>>
>> Happy New Year!
>> Coleen

I think we shouldn't make this change without considering the impact
of the following bug:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58770
GCC very slow compiling with #pragma once

According to comment 3, using "#pragma once" introduces N^2 behavior
on the number of included files, because the duplicate check uses a
list rather than a hashtable.

At the very least, a performance comparison should be made to find out
what the impact of that bug is.

A couple of comments regarding the problems with existing #include
guards, and continuing to use #include guards instead of "#pragma
once":

(1) I wonder if a round trip using the following could fix the
#include guards:

https://github.com/cgmb/guardonce
Utilities for converting from C/C++ include guards to #pragma once and
back again.

(2) For maintaining #include guards, clang has -Wheader-guard, which
warns about mismatches between the #ifndef name and the #define. Of
course, that doesn't solve all the issues with #include guards.

And now for a more general discussion of "#pragma once".

The question of whether to use "#pragma once" comes up pretty often in
various open source projects and Q&A sites. So far, I haven't found
any large cross-platform projects that provide headers to clients that
have decided to go that way.

But most of HotSpot is self-contained, which limits exposure to some
of the issues below. An exception to that are the C headers providing
interfaces to the VM, e.g. the files in src/hotspot/share/include.
This suggests that these files should perhaps be excluded from the
change, since they get used in whatever build environment a client
uses. It also suggests the #include guard names for these files need
careful namespace consideration, which clearly didn't happen with
cds.h.

For many compilers there isn't a good performance argument for using
"#pragma once". gcc (for a long time), clang (always?), VS2015+ all do
the #include guard optimization. (I think Solaris Studio might still
not? And I have no idea about XLC++.)

So the primary question seems to be the reduced clutter and avoidance
of mistakes in #include guards, vs the possibility of cases where
"#pragma once" doesn't work properly.
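
[Illustration] For readers less familiar with the two forms being compared, here is a small sketch. The header name example.hpp, the guard macro SHARE_EXAMPLE_HPP and the function example_answer() are hypothetical and do not come from the webrev. The guarded header body is pasted twice into one translation unit to imitate a double #include; the second copy is skipped by the preprocessor, which is exactly the job the guard does.

```cpp
// Contents of a hypothetical example.hpp, written with a classic include
// guard. The name tested by #ifndef and the name defined by #define must
// match exactly; if they differ, the guard silently stops protecting the
// header. That mismatch is what clang's -Wheader-guard warns about.
#ifndef SHARE_EXAMPLE_HPP
#define SHARE_EXAMPLE_HPP

inline int example_answer() { return 42; }

#endif // SHARE_EXAMPLE_HPP

// Imitate a second #include of the same header in this translation unit.
// SHARE_EXAMPLE_HPP is already defined, so everything up to the #endif is
// skipped and example_answer() is not defined a second time.
#ifndef SHARE_EXAMPLE_HPP
#define SHARE_EXAMPLE_HPP

inline int example_answer() { return 42; }

#endif // SHARE_EXAMPLE_HPP

// The "#pragma once" form of the same header would be just:
//
//   #pragma once
//
//   inline int example_answer() { return 42; }
//
// It has no macro name to get wrong, but it relies on the compiler deciding
// whether two paths refer to the same file.
```

The guard form relies only on the preprocessor's macro table; the "#pragma once" form relies on the compiler's notion of file identity, which is where the file system concerns raised in this thread come in.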

Here's a list of some of the discussions I found:

https://lists.boost.org/Archives/boost/2018/11/244423.php
https://lists.qt-project.org/pipermail/development/2018-October/067452.html
https://lists.qt-project.org/pipermail/development/2018-January/063932.html
https://stackoverflow.com/questions/1143936/pragma-once-vs-include-guards
https://www.reddit.com/r/cpp/comments/4cjjwe/come_on_guys_put_pragma_once_in_the_standard/d1j04te/

The main argument against "#pragma once" (besides being non-standard,
so possibly not sufficiently portable, though we think all platforms
supported by HotSpot have this feature) is that it is "unreliable".
Unfortunately, details are hard to come by.

I've seen claims that combining "#pragma once" with precompiled
headers can cause problems, though the fact that Visual Studio has
long supported both and they are commonly used together argues to the
contrary. But perhaps there are additional factors needed for problems
to arise, and those don't happen on Windows?

My impression is that having sources spread across different file
systems might be a source of problems, possibly in conjunction with
other factors. Before you say "multiple file systems" is not a
possible configuration for JDK builds, consider an out-of-tree build
with the source and build directory on different file systems (and
remember that our generated sources are in the build directory). I've
seen suggestions that network file systems can also mess things up,
though I didn't find details.

I think an additional factor that might be relevant is the typical
(but not specified by the standard) behavior of #include "..." first
searching with respect to the current directory. I think this is at
least potentially a concern for JDK builds on (perhaps odd) file
system configurations.

Some examples are discussed in the following messages. Having a
bind-mount involved can mess things up, for example. I don't know if
that's a realistic scenario for building the JDK or HotSpot.
https://lists.qt-project.org/pipermail/development/2018-October/067467.html
https://lists.qt-project.org/pipermail/development/2018-October/067471.html

So there seems to be some risk with this change that it will result in
build failures or bad builds in someone's build environment, but it is
hard to characterize what a problematic build environment looks like,
so hard to know how "reasonable" or "sane" such a build environment
might be.

Local testing is obviously inadequate for this change. Even running
it through the Oracle build and test system doesn't seem sufficient to
me. Having it checked by the various known build farms (SAP, Debian,
Red Hat, maybe others) seems called for. It's good that the RFR
specifically called out Debian to be checked.

From david.holmes at oracle.com  Thu Jan  3 07:53:29 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 3 Jan 2019 17:53:29 +1000
Subject: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd
In-Reply-To: <2a31b3ca-85fc-29cb-413d-94dd98775a74@oracle.com>
References: <8f7ff454-ac87-505a-fe9c-15b4d4113093@oracle.com>
 <7f38de34-459f-27da-95ce-5528abf58f95@oracle.com>
 <2a31b3ca-85fc-29cb-413d-94dd98775a74@oracle.com>
Message-ID: <44c628fa-81ce-ac22-da00-89dcc31ab314@oracle.com>

On 3/01/2019 9:36 am, Daniel D. Daugherty wrote:
> On 1/2/19 5:35 PM, David Holmes wrote:
>> Hi Dan,
>>
>> On 3/01/2019 2:50 am, Daniel D. Daugherty wrote:
>>> On 1/2/19 4:11 AM, Baesken, Matthias wrote:
>>>> Hello, please review the following patch.
>>>>
>>>> Currently, when ThreadPriorityPolicy is set to 1 (so called
>>>> "Aggressive mode"), on linux and bsd(+Mac) a root-user-check
>>>> (geteuid() != 0)) is done.
>>>> See for example the coding in jdk/src/hotspot/os/linux/os_linux.cpp
>>>> int prio_init().
>>>>
>>>> However the root-user-check has a few drawbacks:
>>>> - it blocks the capabilities feature available on current Linux
>>>> distros (CAP_SYS_NICE capability) that can be used to allow setting
>>>> lower niceness also for non-root
>>>> - setting a higher "niceness" (lower priority) is not possible on
>>>> Linux for non-root because of the geteuid check
>>>>
>>>> We had a discussion about this in "ThreadPriorityPolicy settings
>>>> for non-root users", with this suggestion:
>>>>
>>>> https://mail.openjdk.java.net/pipermail/hotspot-dev/2018-December/035986.html
>>>>
>>>>
>>>> ....
>>>>
>>>>> Just drop the root check for ThreadPriorityPolicy=1 and let the
>>>>> underlying system
>>>>> permissions control success or failure.
>>>> I did the change in this webrev:
>>>>
>>>>
>>>> Bug/webrev :
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8215962
>>>>
>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.0/
>>>
>>> General
>>>    - Please update Copyright years to 2019 before pushing.
>>>
>>> src/hotspot/share/runtime/globals.hpp
>>>     L2045:           "    Linux/BSD/Mac this policy requires root privilege or        "\
>>>         Typo: "Mac" should be "macOS".
>>>
>>>     L2046:           "    extended capabilites.")                                      \
>>>         Typo: "capabilites" -> "capabilities."
>>>
>>>         Please consider: "an extended capability." since it is normal
>>>         for only a single capability to be required for a specific
>>>         policy override.
>>>
>>> src/hotspot/os/bsd/os_bsd.cpp
>>>     L2260: // CAP_SYS_NICE capabilities
>>>         Typo: "capabilities" -> "capability." (note the added period)
>>>
>>>     L2306:   if (ThreadPriorityPolicy == 1) {
>>>     L2307:     // root and threads with capability CAP_SYS_NICE can raise the thread priority
>>>     L2308:     // however testing the CAP_SYS_NICE capability would require libcap.so
>>>     L2309:     if (geteuid() != 0) {
>>>     L2310:       if (!FLAG_IS_DEFAULT(ThreadPriorityPolicy)) {
>>>     L2311:         warning("-XX:ThreadPriorityPolicy requires root privilege or CAP_SYS_NICE capability on Bsd");
>>>     L2312:       }
>>>     L2313:     }
>>>     L2314:   }
>>>         Sorry this whole block is the wrong thing to do. It makes the
>>>         assumption that it "knows" the underlying security policy and
>>>         tries to provide a "helpful" message in anticipation that the
>>>         underlying security policy will reject the operation.
>>>
>>>         The "(ThreadPriorityPolicy == 1) {" if-block needs to be deleted.
>>>
>>>     If you want to output a helpful warning, then you need to do it in
>>>     the code that will actually get a policy failure:
>>>
>>>     static void do_set_native_prio_warning() {
>>>         static bool has_warned = false;
>>>         if (!has_warned) {
>>>           warning("-XX:ThreadPriorityPolicy requires root privilege or the CAP_SYS_NICE capability.");
>>>           has_warned = true;
>>>         }
>>>     }
>>
>> Sorry I disagree. The existing code checks for policy and whether root
>> and issues a warning then resets policy.
>
> You have to be careful with the word 'policy' here. You are actually
> talking about the 'ThreadPriorityPolicy' option here and not the
> security policy associated with the setpriority() call. One thing
> the application code cannot do here is reset the underlying
> security policy.

I'm only talking about ThreadPriorityPolicy.

>> The new code does exactly the same thing except it doesn't reset the
>> policy. The placement of the warning was fine before so it is fine now.
>
> Actually I disagree that the placement of the warning was fine before.

Fine - but it seemed a little unfair to force a re-working of the
overall approach taken by this code for many years just to effectively
delete one line that resets the ThreadPriorityPolicy value.

But see below ...

> As I said, in my original review comment:
>
>>         Sorry this whole block is the wrong thing to do. It makes the
>>         assumption that it "knows" the underlying security policy and
>>         tries to provide a "helpful" message in anticipation that the
>>         underlying security policy will reject the operation.
>
> The application code should not make the assumption that it knows the
> underlying security policy of the setpriority() call. That is a basic
> principle of Trusted Systems design. The best you can do is try the
> operation and if it fails, then try to issue a possibly helpful message
> based on the error that is returned.

Sure and if this was being designed now rather than just being tweaked
then a lot of things would be different.

> There are some Trusted Systems
> folks that don't believe in trying to interpret errno values either.
> Why? Because you can only code the errno values that you know today that
> are security policy related. You can't know if some system down the road
> will add a new errno value that's security related...

Not to get side-tracked but that's why you use standards that don't keep
adding new error values - there's a fundamental incompatibility if you
add new values that effectively overlap with the meaning of existing ones.

> My proposed do_set_native_prio_warning() function should actually take
> an errno parameter and it should only issue the warning if the errno is
> EACCES or EPERM.

Agreed.

> Another problem with the new code is that:
>
>   warning("-XX:ThreadPriorityPolicy requires root privilege or the
> CAP_SYS_NICE capability.");
>
> will be issued when the user != root and the thread has the
> CAP_SYS_NICE capability so we'll be issuing a warning even
> though the setpriority() call should succeed. I don't think
> a false warning is acceptable.

It's not nice but I don't think it warrants being completely unacceptable.
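
For illustration only, and not code from either webrev: a minimal sketch
of the errno-aware, warn-once helper discussed above, which warns only
after the operation has failed and only for EACCES or EPERM. The names
warn(), warnings_printed, maybe_warn_set_native_prio() and
try_set_native_priority() are hypothetical stand-ins, and the warning
text is just a placeholder.

```cpp
#include <cerrno>
#include <cstdio>
#include <sys/resource.h>

// Hypothetical stand-in for HotSpot's warning(); it counts calls so the
// warn-once behaviour can be observed.
int warnings_printed = 0;

void warn(const char* msg) {
  std::fprintf(stderr, "warning: %s\n", msg);
  ++warnings_printed;
}

// Warn at most once, and only when the failure looks like a permission
// problem (EACCES or EPERM). Returns true if a warning was issued.
// Like the has_warned flag in the review comment, this is not thread-safe.
bool maybe_warn_set_native_prio(int err) {
  static bool has_warned = false;
  if (has_warned || (err != EACCES && err != EPERM)) {
    return false;
  }
  warn("-XX:ThreadPriorityPolicy=1: the OS rejected a thread priority change");
  has_warned = true;
  return true;
}

// Hypothetical wrapper: try the operation first and look at errno only
// if it fails, so no warning is issued when the call succeeds.
bool try_set_native_priority(id_t tid, int newpri) {
  if (setpriority(PRIO_PROCESS, tid, newpri) != 0) {
    maybe_warn_set_native_prio(errno);
    return false;
  }
  return true;
}
```

With this shape a non-root process that holds CAP_SYS_NICE would see no
warning as long as setpriority() succeeds; whether a warning is the
right reporting mechanism at all is a separate question.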

But I'll also note that this is even more complex than outlined because
it's not just affected by being root or not, nor by CAP_SYS_NICE but
also by the setting of RLIMIT_RTPRIO (on Linux).

> By moving the warning to where setpriority() has failed, we no longer
> would have the problem of a false warning.
>
>> Yes it could go on use (and add a only-warn-once hack) but why force
>> such a disruptive change that has no benefit?
>
> As I've pointed out above, I do think it has benefit and it meets
> a Trusted System design principle. As for the mechanism to
> only-warn-once, I'm sorry you consider it a hack. I consider it
> to be a useful way to avoid swamping the warning output with the
> same message. I know HotSpot does the same thing in other places
> so if it is a hack, then it is in good company. :-)

Yes but it's typically done when there isn't a single place where we
otherwise issue the warning. Ideally you detect capabilities upfront (ie
when seeing ThreadPriorityPolicy has been set) and issue a warning then
if warranted. But as noted we don't have a way to detect the appropriate
capabilities without adding even more code.

That said a single warning may not be appropriate here anyway as whether
or not the change to priority fails is not just a function of the
"permissions" but also whether the priority is being raised or lowered.
Maybe this shouldn't be a warning at all, but materialize as a
Java-level exception?

See more below ...

>>>     L2321: OSReturn os::set_native_priority(Thread* thread, int newpri) {
>>>     L2322:   if (!UseThreadPriorities || ThreadPriorityPolicy == 0) return OS_OK;
>>>     L2323:
>>>     L2324: #ifdef __OpenBSD__
>>>     L2325:   // OpenBSD pthread_setprio starves low priority threads
>>>     L2326:   return OS_OK;
>>>     L2327: #elif defined(__FreeBSD__)
>>>     L2328:   int ret = pthread_setprio(thread->osthread()->pthread_id(), newpri);
>>> // Note the __FreeBSD__ branch here is broken; it is missing the
>>> return sequence.

Hmmm - okay this code is even more broken than I thought. I did warn
Matthias that there is a good chance these code paths may never have
been used - seems this one hasn't even been built with a decent compiler
(that would complain about the missing return).

This does make me question how the calling code for this will actually
handle the OS_ERR return? Will we see a nice Java exception thrown from
Thread.start? Or will some other part of the VM code abort when seeing
the error? (That would render the warning somewhat moot.)

More below ...

>>>               if (ret != 0) {
>>>                 do_set_native_prio_warning();
>>>                 return OS_ERR;
>>>               }
>>>               return OS_OK;
>>>     L2329: #elif defined(__APPLE__) || defined(__NetBSD__)
>>>     L2330:   struct sched_param sp;
>>>     L2331:   int policy;
>>>     L2332:
>>>     L2333:   if (pthread_getschedparam(thread->osthread()->pthread_id(), &policy, &sp) != 0) {
>>>                 do_set_native_prio_warning();
>>>     L2334:     return OS_ERR;
>>>     L2335:   }
>>>     L2336:
>>>     L2337:   sp.sched_priority = newpri;
>>>     L2338:   if (pthread_setschedparam(thread->osthread()->pthread_id(), policy, &sp) != 0) {
>>>                 do_set_native_prio_warning();
>>>     L2339:     return OS_ERR;
>>>     L2340:   }
>>>     L2341:
>>>     L2342:   return OS_OK;
>>>     L2343: #else
>>>     L2344:   int ret = setpriority(PRIO_PROCESS, thread->osthread()->thread_id(), newpri);
>>>               if (ret != 0) {
>>>                 do_set_native_prio_warning();
>>>                 return OS_ERR;
>>>               }
>>>               return OS_OK;  // replace L2345 with this line.
>>>     L2345:   return (ret == 0) ? OS_OK : OS_ERR;
>>>     L2346: #endif
>>>     L2347: }
>>>
>>>
>>> src/hotspot/os/linux/os_linux.cpp
>>>     L4080: // CAP_SYS_NICE capabilities
>>>         Typo: "capabilities" -> "capability." (note the added period)
>>>
>>>     The same comment about prio_init() applies here:
>>>
>>>     L4103-L4111:
>>>         The "(ThreadPriorityPolicy == 1) {" if-block needs to be deleted.
>>>
>>>     static void do_set_native_prio_warning() {
>>>         static bool has_warned = false;
>>>         if (!has_warned) {
>>>           warning("-XX:ThreadPriorityPolicy requires root privilege or the CAP_SYS_NICE capability.");

The warning text doesn't cover all the possibilities and I don't think
it should. Something more generic like:

"-XX:ThreadPriorityPolicy=1 is affected by underlying system permissions
and may trigger errors if priority is changed in ways that are not allowed"

or as I said maybe this shouldn't be a warning at all ...

Cheers,
David
------

>>>           has_warned = true;
>>>         }
>>>     }
>>>
>>>     But the changes to os::set_native_priority() are much simpler:
>>>
>>>     L4118: OSReturn os::set_native_priority(Thread* thread, int newpri) {
>>>     L4119:   if (!UseThreadPriorities || ThreadPriorityPolicy == 0) return OS_OK;
>>>     L4120:
>>>     L4121:   int ret = setpriority(PRIO_PROCESS, thread->osthread()->thread_id(), newpri);
>>>               if (ret != 0) {
>>>                 do_set_native_prio_warning();
>>>                 return OS_ERR;
>>>               }
>>>               return OS_OK;  // replace L4122 with this line.
>>>     L4122:   return (ret == 0) ? OS_OK : OS_ERR;
>>>     L4123: }
>>>
>>> In both os/bsd/os_bsd.cpp and os/linux/os_linux.cpp, the
>>> os::get_native_priority() code allows for the possibility
>>> of getting an error condition for getpriority(). I don't
>>> think we need a do_get_native_prio_warning() function here
>>> since the only threads we should be querying belong to the
>>> Java process so they should not fail the policy check.
>>>
>>> Dan
>>>
>>>
>>>>
>>>>
>>>> Best Regards, Matthias
>>>
>

From aph at redhat.com  Thu Jan  3 10:51:52 2019
From: aph at redhat.com (Andrew Haley)
Date: Thu, 3 Jan 2019 10:51:52 +0000
Subject: RFR (tedious) 8216022: Use #pragma once
In-Reply-To: <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com>
References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com>
 <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com>
Message-ID: <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com>

On 1/3/19 7:48 AM, Kim Barrett wrote:
> I think we shouldn't make this change without considering the impact
> of the following bug:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58770
> GCC very slow compiling with #pragma once
>
> According to comment 3, using "#pragma once" introduces N^2 behavior
> on the number of included files, because the duplicate check uses a
> list rather than a hashtable.
>
> At the very least, a performance comparison should be made to find out
> what the impact of that bug is.

Thank you for a very detailed analysis. I agree with you that this
shouldn't be changed, at least for now.

I've been following the discussion in the GCC lists and others for
years, and the overall feeling I get is that #pragma once isn't quite
what we need for header file inclusion. Instead, GCC and other
compilers have a very high performance implementation of include
guards.

Nevertheless, I am still a GCC maintainer and I'll fix the bug in
#pragma once if it's really going to be useful. But based on some of
the known issues with #pragma once, I don't believe that it will be.
#pragma once is a nice idea, but I think that's all it is.

--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd.
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From matthias.baesken at sap.com  Thu Jan  3 11:13:55 2019
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Thu, 3 Jan 2019 11:13:55 +0000
Subject: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd
In-Reply-To: <44c628fa-81ce-ac22-da00-89dcc31ab314@oracle.com>
References: <8f7ff454-ac87-505a-fe9c-15b4d4113093@oracle.com>
 <7f38de34-459f-27da-95ce-5528abf58f95@oracle.com>
 <2a31b3ca-85fc-29cb-413d-94dd98775a74@oracle.com>
 <44c628fa-81ce-ac22-da00-89dcc31ab314@oracle.com>
Message-ID:

Hello David and Dan, here is a second webrev:

http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.1/

- adjusted copyright years + fixed some typos
- added the missing return for FreeBSD (pointed out by Dan)
- removed the warning message completely

Best regards, Matthias

From david.holmes at oracle.com  Thu Jan  3 13:02:21 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 3 Jan 2019 23:02:21 +1000
Subject: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd
In-Reply-To:
References: <8f7ff454-ac87-505a-fe9c-15b4d4113093@oracle.com>
 <7f38de34-459f-27da-95ce-5528abf58f95@oracle.com>
 <2a31b3ca-85fc-29cb-413d-94dd98775a74@oracle.com>
 <44c628fa-81ce-ac22-da00-89dcc31ab314@oracle.com>
Message-ID:

Hi Matthias,

On 3/01/2019 9:13 pm, Baesken, Matthias wrote:
> Hello David and Dan , here is a second webrev :
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.1/
>
> - adjusted copyright years + fixed some typos
> - added the missing return for FreeBSD (pointed out by Dan)
> - removed the warning message completely

I still want to know how the OS_ERR gets handled by all the higher level
code. How will this failure at runtime get reported back to application
code?

! // It is only used when ThreadPriorityPolicy=1 and requires root privilege or
! // CAP_SYS_NICE capability.

As I stated this is not a complete statement as on Linux at least you
also have to account for RLIMIT_RTPRIO.

Thanks,
David
-----

> Best regards, Matthias
>
>
>> -----Original Message-----
>> From: David Holmes
>> Sent: Donnerstag, 3.
Januar 2019 08:53 >> To: daniel.daugherty at oracle.com; Baesken, Matthias >> ; 'hotspot-dev at openjdk.java.net' > dev at openjdk.java.net> >> Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root >> users on linux/bsd >> >> On 3/01/2019 9:36 am, Daniel D. Daugherty wrote: >>> On 1/2/19 5:35 PM, David Holmes wrote: >>>> Hi Dan, >>>> >>>> On 3/01/2019 2:50 am, Daniel D. Daugherty wrote: >>>>> On 1/2/19 4:11 AM, Baesken, Matthias wrote: >>>>>> Hello , please? review the following patch . >>>>>> >>>>>> Currently, when ThreadPriorityPolicy is set to 1 (so called >>>>>> "Aggressive mode"), on linux and bsd(+Mac) a root-user-check >>>>>> (geteuid() != 0)) is done. >>>>>> See for example the coding in jdk/src/hotspot/os/linux/os_linux.cpp >>>>>> int prio_init(). >>>>>> >>>>>> However the root-user-check has a few drawbacks: >>>>>> - it blocks the capabilities feature available on current Linux >>>>>> distros (CAP_SYS_NICE capability) that can be used to allow setting >>>>>> lower niceness also for non-root >>>>>> - setting a higher "niceness" (lower priority) is not possible on >>>>>> Linux for non-root because of the geteuid check >>>>>> >>>>>> We? had a discussion about this in "ThreadPriorityPolicy settings >>>>>> for non-root users"? ,? with this suggestion : >>>>>> >>>>>> https://mail.openjdk.java.net/pipermail/hotspot-dev/2018- >> December/035986.html >>>>>> >>>>>> >>>>>> .... >>>>>> >>>>>>> Just drop the root check for ThreadPriorityPolicy=1 and let the >>>>>>> underlying system >>>>>>> permissions control success or failure. >>>>>> I? did? the change? in this? webrev : >>>>>> >>>>>> >>>>>> Bug/webrev : >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8215962 >>>>>> >>>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.0/ >>>>> >>>>> General >>>>> ?? - Please update Copyright years to 2019 before pushing. >>>>> >>>>> src/hotspot/share/runtime/globals.hpp >>>>> ???? L2045: ????????? "??? 
Linux/BSD/Mac this policy requires root >>>>> privilege or??????? "\ >>>>> ???????? Typo: "Mac" should be "macOS". >>>>> >>>>> ???? L2046: ????????? "??? extended >>>>> capabilites.")????????????????????????????????????? \ >>>>> ???????? Typo: "capabilites" -> "capabilities." >>>>> >>>>> ???????? Please consider: "an extended capability." since it is normal >>>>> ???????? for only a single capability to be required for a specific >>>>> ???????? policy override. >>>>> >>>>> src/hotspot/os/bsd/os_bsd.cpp >>>>> ???? L2260: // CAP_SYS_NICE capabilities >>>>> ???????? Typo: "capabilities" -> "capability." (note the added period) >>>>> >>>>> ???? L2306: ? if (ThreadPriorityPolicy == 1) { >>>>> ???? L2307: ??? // root and threads with capability CAP_SYS_NICE can >>>>> raise the thread priority >>>>> ???? L2308: ??? // however testing the CAP_SYS_NICE capability would >>>>> require libcap.so >>>>> ???? L2309: ??? if (geteuid() != 0) { >>>>> ???? L2310: ????? if (!FLAG_IS_DEFAULT(ThreadPriorityPolicy)) { >>>>> ???? L2311: ??????? warning("-XX:ThreadPriorityPolicy requires root >>>>> privilege or CAP_SYS_NICE capability on Bsd"); >>>>> ???? L2312: ????? } >>>>> ???? L2313: ??? } >>>>> ???? L2314: ? } >>>>> ???????? Sorry this whole block is the wrong thing to do. It makes the >>>>> ???????? assumption that it "knows" the underlying security policy and >>>>> ???????? tries to provide a "helpful" message in anticipation that the >>>>> ???????? underlying security policy will reject the operation. >>>>> >>>>> ???????? The "(ThreadPriorityPolicy == 1) {" if-block needs to be >>>>> deleted. >>>>> >>>>> ???? If you want to output a helpful warning, then you need to do it in >>>>> ???? the code that will actually get a policy failure: >>>>> >>>>> ???? static void do_set_native_prio_warning() { >>>>> ???????? static bool has_warned = false; >>>>> ???????? if (!has_warned) { >>>>> ?????????? 
warning("-XX:ThreadPriorityPolicy requires root privilege >>>>> or the CAP_SYS_NICE capability."); >>>>> ?????????? has_warned = true; >>>>> ???????? } >>>>> ???? } >>>> >>>> Sorry I disagree. The existing code checks for policy and whether root >>>> and issues a warning then resets policy. >>> >>> You have to be careful with the word 'policy' here. You are actually >>> talking about the 'ThreadPriorityPolicy' option here and not the >>> security policy associated with the setpriority() call. One thing >>> the application code cannot do here is reset the underlying >>> security policy. >> >> I'm only talking about ThreadPriorityPolicy. >> >>>> The new code does exactly the same thing except it doesn't reset the >>>> policy. The placement of the warning was fine before so it is fine now. >>> >>> Actually I disagree that the placement of the warning was fine before. >> >> Fine - but it seemed a little unfair to force a re-working of the >> overall approach taken by this code for many years just to effectively >> delete one line that resets the ThreadPriorityPolicy value. >> >> But see below ... >> >> >>> As I said, in my original review comment: >>> >>>> ???????? Sorry this whole block is the wrong thing to do. It makes the >>>> ???????? assumption that it "knows" the underlying security policy and >>>> ???????? tries to provide a "helpful" message in anticipation that the >>>> ???????? underlying security policy will reject the operation. >>> >>> The application code should not make the assumption that it knows the >>> underlying security policy of the setpriority() call. That is a basic >>> principle of Trusted Systems design. The best you can do is try the >>> operation and if it fails, then try to issue a possibly helpful message >>> based on the error that is returned. >> >> Sure and if this was being designed now rather than just being tweaked >> then a lot of things would be different. 
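The try-first approach argued for above can be sketched as follows. This is a minimal illustration, not the HotSpot code: `warn_once_on_priority_failure` and `try_set_priority` are hypothetical names, treating EACCES and EPERM as the permission-related errno values is an assumption of the sketch, and the warn-once flag is deliberately not thread-safe.

```cpp
#include <cerrno>
#include <cstdio>
#include <sys/resource.h>

// Hypothetical helper (not HotSpot code): warn at most once, and only for
// errno values that this sketch assumes indicate a permission problem.
// Returns true if a warning was printed. The static flag is not thread-safe.
static bool warn_once_on_priority_failure(int err) {
  static bool has_warned = false;
  if (has_warned || (err != EACCES && err != EPERM)) {
    return false;
  }
  std::fprintf(stderr,
               "warning: the OS rejected a priority change (errno=%d)\n", err);
  has_warned = true;
  return true;
}

// Hypothetical wrapper: attempt the operation first and report only after
// the operating system has actually refused it. who == 0 selects the caller.
static bool try_set_priority(int newpri) {
  if (setpriority(PRIO_PROCESS, 0, newpri) != 0) {
    warn_once_on_priority_failure(errno);
    return false;
  }
  return true;
}
```

With this shape no warning is printed unless the system has refused the request, so a non-root process that holds CAP_SYS_NICE would not see a false warning.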
>> >>> There are some Trusted Systems >>> folks that don't believe in trying to interpret errno values either. >>> Why? Because you can only code the errno values that you know today >> that >>> are security policy related. You can't know if some system down the road >>> will add a new errno value that's security related... >> >> Not to get side-tracked but that's why you use standards that don't keep >> adding new error values - there's a fundamental incompatibility if you >> add new values that effectively overlap with the meaning of existing ones. >> >>> My proposed do_set_native_prio_warning() function should actually take >>> an errno parameter and it should only issue the warning if the errno is >>> EACCES or EPERM. >> >> Agreed. >> >>> Another problem with the new code is that: >>> >>> ? warning("-XX:ThreadPriorityPolicy requires root privilege or the >>> CAP_SYS_NICE capability."); >>> >>> will be issued when the user != root and the thread has the >>> CAP_SYS_NICE capability so we'll be issuing a warning even >>> though the setpriority() call should succeed. I don't think >>> a false warning is acceptable. >> >> It's not nice but I don't think it warrants being completely unacceptable. >> >> But I'll also note that this is even more complex than outlined because >> it's not just affected by being root or not, nor by CAP_SYS_NICE but >> also by the setting of RLIMIT_RTPRIO (on Linux). >> >>> By moving the warning to where setpriority() has failed, we no longer >>> would have the problem of a false warning. >>> >>>> Yes it could go on use (and add a only-warn-once hack) but why force >>>> such a disruptive change that has no benefit? >>> >>> As I've pointed out above, I do think it has benefit and it meets >>> a Trusted System design principle. As for the mechanism to >>> only-warn-once, I'm sorry you consider it a hack. I consider it >>> to be a useful way to avoid swamping the warning output with the >>> same message. 
I know HotSpot does the same thing in other places >>> so if it is a hack, then it is in good company. :-) >> >> Yes but it's typically done when there isn't a single place where we >> otherwise issue the warning. Ideally you detect capabilities upfront (ie >> when seeing ThreadPriorityPolicy has been set) and issue a warning then >> if warranted. But as noted we don't have a way to detect the appropriate >> capabilities without adding even more code. >> >> That said a single warning may not be appropriate here anyway as whether >> or not the change to priority fails is not just a function of the >> "permissions" but also whether the priority is being raised or lowered. >> Maybe this shouldn't be a warning at all, but materialize as a >> Java-level exception? >> >> See more below ... >> >>>>> ???? L2321: OSReturn os::set_native_priority(Thread* thread, int >>>>> newpri) { >>>>> ???? L2322: ? if (!UseThreadPriorities || ThreadPriorityPolicy == 0) >>>>> return OS_OK; >>>>> ???? L2323: >>>>> ???? L2324: #ifdef __OpenBSD__ >>>>> ???? L2325: ? // OpenBSD pthread_setprio starves low priority threads >>>>> ???? L2326: ? return OS_OK; >>>>> ???? L2327: #elif defined(__FreeBSD__) >>>>> ???? L2328: ? int ret = >>>>> pthread_setprio(thread->osthread()->pthread_id(), newpri); >>>>> // Note the __FreeBSD__ branch here is broken; it is missing the >>>>> return sequence. >> >> Hmmm - okay this code is even more broken than I thought. I did warn >> Matthias that there is a good change these code paths may never have >> been used - seems this one hasn't even been built with a decent compiler >> (that would complain about the missing return). >> >> This does make me question how the calling code for this will actually >> handle the OS_ERR return? Will we see a nice Java exception thrown from >> Thread.start? Or will some other part of the VM code abort when seeing >> the error? (That would render the warning somewhat moot.) >> >> More below ... >> >>>>> ?? ? ???????? 
if (ret != 0) { >>>>> ??? ? ????????? do_set_native_prio_warning(); >>>>> ??? ? ????????? return OS_ERR; >>>>> ?? ? ???????? } >>>>> ??? ? ??????? return OS_OK; >>>>> ???? L2329: #elif defined(__APPLE__) || defined(__NetBSD__) >>>>> ???? L2330: ? struct sched_param sp; >>>>> ???? L2331: ? int policy; >>>>> ???? L2332: >>>>> ???? L2333: ? if >>>>> (pthread_getschedparam(thread->osthread()->pthread_id(), &policy, >>>>> &sp) != 0) { >>>>> ????????? ? ??? do_set_native_prio_warning(); >>>>> ???? L2334: ??? return OS_ERR; >>>>> ???? L2335: ? } >>>>> ???? L2336: >>>>> ???? L2337: ? sp.sched_priority = newpri; >>>>> ???? L2338: ? if >>>>> (pthread_setschedparam(thread->osthread()->pthread_id(), policy, >> &sp) >>>>> != 0) { >>>>> ???????????? ?? do_set_native_prio_warning(); >>>>> ???? L2339: ??? return OS_ERR; >>>>> ???? L2340: ? } >>>>> ???? L2341: >>>>> ???? L2342: ? return OS_OK; >>>>> ???? L2343: #else >>>>> ???? L2344: ? int ret = setpriority(PRIO_PROCESS, >>>>> thread->osthread()->thread_id(), newpri); >>>>> ???? ? ?????? if (ret != 0) { >>>>> ???? ? ???????? do_set_native_prio_warning(); >>>>> ??????????? ? ? return OS_ERR; >>>>> ?????????? ?? } >>>>> ????????? ? ? return OS_OK;? // replace L2345 with this line. >>>>> ???? L2345: ? return (ret == 0) ? OS_OK : OS_ERR; >>>>> ???? L2346: #endif >>>>> ???? L2347: } >>>>> >>>>> >>>>> src/hotspot/os/linux/os_linux.cpp >>>>> ???? L4080: // CAP_SYS_NICE capabilities >>>>> ???????? Typo: "capabilities" -> "capability." (note the added period) >>>>> >>>>> ???? The same comment about prio_init() applies here: >>>>> >>>>> ???? L4103-L4111: >>>>> ???????? The "(ThreadPriorityPolicy == 1) {" if-block needs to be >>>>> deleted. >>>>> >>>>> ???? static void do_set_native_prio_warning() { >>>>> ???????? static bool has_warned = false; >>>>> ???????? if (!has_warned) { >>>>> ?????????? 
warning("-XX:ThreadPriorityPolicy requires root privilege >>>>> or the CAP_SYS_NICE capability."); >> >> The warning text doesn't cover all the possibilities and I don't think >> it should. Something more generic like: >> >> "-XX:ThreadPriorityPolicy=1 is affected by underlying system permissions >> and may trigger errors if priority is changed in ways that are not allowed" >> >> or as I said maybe this shouldn't be a warning at all ... >> >> Cheers, >> David >> ------ >> >>>>> ?????????? has_warned = true; >>>>> ???????? } >>>>> ???? } >>>>> >>>>> ???? But the changes to os::set_native_priority() are much simpler: >>>>> >>>>> ???? L4118: OSReturn os::set_native_priority(Thread* thread, int >>>>> newpri) { >>>>> ???? L4119: ? if (!UseThreadPriorities || ThreadPriorityPolicy == 0) >>>>> return OS_OK; >>>>> ???? L4120: >>>>> ???? L4121: ? int ret = setpriority(PRIO_PROCESS, >>>>> thread->osthread()->thread_id(), newpri); >>>>> ????????????? if (ret != 0) { >>>>> ??????????????? do_set_native_prio_warning(); >>>>> ??????????????? return OS_ERR; >>>>> ????????????? } >>>>> ????????????? return OS_OK;? // replace L4122 with this line. >>>>> ???? L4122: ? return (ret == 0) ? OS_OK : OS_ERR; >>>>> ???? L4123: } >>>>> >>>>> In both os/bsd/os_bsd.cpp and os/linux/os_linux.cpp, the >>>>> os::get_native_priority() code allows for the possibility >>>>> of getting an error condition for getpriority(). I don't >>>>> think we need a do_get_native_prio_warning() function here >>>>> since the only threads we should be querying belong to the >>>>> Java process so they should not fail the policy check. >>>>> >>>>> Dan >>>>> >>>>> >>>>>> >>>>>> >>>>>> Best Regards ,? 
Matthias >>>>> >>> From goetz.lindenmaier at sap.com Thu Jan 3 13:41:14 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 3 Jan 2019 13:41:14 +0000 Subject: RFR [XS]: 8215961: jdk/jfr/event/os/TestCPUInformation.java fails on AArch64 In-Reply-To: References: Message-ID: <3c433e428a7f4e9aa4ed8fdf2643d64f@sap.com> Hi Matthias, the change looks good to me. But looking at the code, I saw that s390 says "zArch" there. We use the string "s390" throughout the code to name the platform, so I think this should say "s390". In documentation, we use "z/Architecture", as well as in some version messages. So this would also be an option. Could you fix this too, please? And adapt the test? @Lutz, what do you think? Best regards, Goetz. > -----Original Message----- > From: hotspot-dev On Behalf Of > Baesken, Matthias > Sent: Freitag, 28. Dezember 2018 14:36 > To: 'hotspot-dev at openjdk.java.net' > Subject: RFR [XS]: 8215961: jdk/jfr/event/os/TestCPUInformation.java fails on > AArch64 > > Hello, please review this small fix . > > At the moment, the test jdk/jfr/event/os/TestCPUInformation.java fails > on AArch64 with the following error : > > > java.lang.RuntimeException: Value not in (Intel, AMD, Unknown x86, SPARC, > ARM, PPC, PowerPC, AArch64, zArch), field='description', > value='0x50:0x0:0x000:1, simd' > > > Reason is that the jdk.CPUInformation event misses a known CPU > identifier value in the description, see the description part of it : > > Event: jdk.CPUInformation { > .... > description = "0x50:0x0:0x000:1, simd" > sockets = 8 > .... > } > > > The patch adds the CPU identifier info to the _cpu_desc string where it is > taken from . > Please compare also with the ppc - implementation where the info (PPC) is > already added . 
> > vm_version_ext_ppc.cpp > > 50 snprintf(_cpu_desc, CPU_DETAILED_DESC_BUF_SIZE, "PPC %s", > features_string()); > > > > Bug/webrev : > > https://bugs.openjdk.java.net/browse/JDK-8215961 > > http://cr.openjdk.java.net/~mbaesken/webrevs/8215961.0/ > > > Thanks, Matthias From lutz.schmidt at sap.com Thu Jan 3 14:47:20 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Thu, 3 Jan 2019 14:47:20 +0000 Subject: RFR [XS]: 8215961: jdk/jfr/event/os/TestCPUInformation.java fails on AArch64 In-Reply-To: <3c433e428a7f4e9aa4ed8fdf2643d64f@sap.com> References: <3c433e428a7f4e9aa4ed8fdf2643d64f@sap.com> Message-ID: Hi, I would suggest to replace "zArch" with "s390" to use the same term everywhere. There is reason for some hope this change will avoid confusion in the future. Regards, Lutz ?On 03.01.19, 14:41, "Lindenmaier, Goetz" wrote: Hi Matthias, the change looks good to me. But looking at the code, I saw that s390 says "zArch" there. We use the string "s390" throughout the code to name the platform, so I think this should say "s390". In documentation, we use "z/Architecture", as well as in some version messages. So this would also be an option. Could you fix this too, please? And adapt the test? @Lutz, what do you think? Best regards, Goetz. > -----Original Message----- > From: hotspot-dev On Behalf Of > Baesken, Matthias > Sent: Freitag, 28. Dezember 2018 14:36 > To: 'hotspot-dev at openjdk.java.net' > Subject: RFR [XS]: 8215961: jdk/jfr/event/os/TestCPUInformation.java fails on > AArch64 > > Hello, please review this small fix . 
> > At the moment, the test jdk/jfr/event/os/TestCPUInformation.java fails > on AArch64 with the following error : > > > java.lang.RuntimeException: Value not in (Intel, AMD, Unknown x86, SPARC, > ARM, PPC, PowerPC, AArch64, zArch), field='description', > value='0x50:0x0:0x000:1, simd' > > > Reason is that the jdk.CPUInformation event misses a known CPU > identifier value in the description, see the description part of it : > > Event: jdk.CPUInformation { > .... > description = "0x50:0x0:0x000:1, simd" > sockets = 8 > .... > } > > > The patch adds the CPU identifier info to the _cpu_desc string where it is > taken from . > Please compare also with the ppc - implementation where the info (PPC) is > already added . > > vm_version_ext_ppc.cpp > > 50 snprintf(_cpu_desc, CPU_DETAILED_DESC_BUF_SIZE, "PPC %s", > features_string()); > > > > Bug/webrev : > > https://bugs.openjdk.java.net/browse/JDK-8215961 > > http://cr.openjdk.java.net/~mbaesken/webrevs/8215961.0/ > > > Thanks, Matthias From matthias.baesken at sap.com Thu Jan 3 14:56:33 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Thu, 3 Jan 2019 14:56:33 +0000 Subject: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd In-Reply-To: References: <8f7ff454-ac87-505a-fe9c-15b4d4113093@oracle.com> <7f38de34-459f-27da-95ce-5528abf58f95@oracle.com> <2a31b3ca-85fc-29cb-413d-94dd98775a74@oracle.com> <44c628fa-81ce-ac22-da00-89dcc31ab314@oracle.com> Message-ID: > I still want to know how the OS_ERR gets handled by all the higher level > code. How will this failure at runtime get reported back to application code? Hi David , the "best practice" currently is to just ignore the return code of os::set_native_priority , it is called and we hope for the best , for example : jdk/src/hotspot/share/runtime/vmThread.cpp 299 int prio = (VMThreadPriority == -1) 300 ? 
os::java_to_os_priority[NearMaxPriority] 301 : VMThreadPriority; 302 // Note that I cannot call os::set_priority because it expects Java 303 // priorities and I am *explicitly* using OS priorities so that it's 304 // possible to set the VM thread priority higher than any Java thread. 305 os::set_native_priority( this, prio ); jdk/src/hotspot/share/compiler/compileBroker.cpp 783 int native_prio = CompilerThreadPriority; 784 if (native_prio == -1) { 785 if (UseCriticalCompilerThreadPriority) { 786 native_prio = os::java_to_os_priority[CriticalPriority]; 787 } else { 788 native_prio = os::java_to_os_priority[NearMaxPriority]; 789 } 790 } 791 os::set_native_priority(thread, native_prio); A difference is jdk/src/hotspot/share/runtime/os.cpp 217 OSReturn os::set_priority(Thread* thread, ThreadPriority p) { 218 debug_only(Thread::check_for_dangling_thread_pointer(thread);) 219 220 if (p >= MinPriority && p <= MaxPriority) { 221 int priority = java_to_os_priority[p]; 222 return set_native_priority(thread, priority); where the return code of set_native_priority() is returned (however, it is then usually not handled by the callers of os::set_priority). > > As I stated this is not a complete statement as on Linux at least you > also have to account for RLIMIT_RTPRIO. > If you want me to, I can of course add a short statement about this. Best regards, Matthias > -----Original Message----- > From: David Holmes > Sent: Donnerstag, 3.
Januar 2019 14:02 > To: Baesken, Matthias ; > daniel.daugherty at oracle.com; 'hotspot-dev at openjdk.java.net' dev at openjdk.java.net> > Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root > users on linux/bsd > > Hi Matthias, > > On 3/01/2019 9:13 pm, Baesken, Matthias wrote: > > Hello David and Dan , here is a second webrev : > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.1/ > > > > - adjusted copyright years + fixed some typos > > - added the missing return for FreeBSD (pointed out by Dan) > > - removed the warning message completely > > I still want to know how the OS_ERR gets handled by all the higher level > code. How will this failure at runtime get reported back to application > code? > > ! // It is only used when ThreadPriorityPolicy=1 and requires root > privilege or > ! // CAP_SYS_NICE capability. > > As I stated this is not a complete statement as on Linux at least you > also have to account for RLIMIT_RTPRIO. > > Thanks, > David > ----- > From coleen.phillimore at oracle.com Thu Jan 3 14:55:11 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 3 Jan 2019 09:55:11 -0500 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> Message-ID: <1178e8db-3ab5-d3dc-9321-827a5fbd777b@oracle.com> On 1/3/19 2:48 AM, Kim Barrett wrote: >> On Jan 2, 2019, at 9:31 PM, coleen.phillimore at oracle.com wrote: >> >> >> Here is the webrev and bug link. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8216022.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8216022 >> >> On 1/2/19 9:16 PM, coleen.phillimore at oracle.com wrote: >>> Summary: change include guards to #pragma once, except in generated header files. 
>>> >>> Tested with mach5 for linux-x64{-debug}, solaris-sparc, macosx-x64, windows-x64, built aarch64 with cross compiler, and zero. >>> >>> Ran tier1 and 2 tests. >>> >>> The webrev is huge but there are only 3 lines changed in each header file. So click on the patch. >>> >>> I'll update the copyright headers with a script with the commit. Also, will do this after the shenandoah copyright headers are fixed. >>> >>> Adrian: I included you to check your platforms. >>> >>> Happy New Year! >>> Coleen > I think we shouldn't make this change without considering the impact > of the following bug: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58770 > GCC very slow compiling with #pragma once > > According to comment 3, using "#pragma once" introduces N^2 behavior > on the number of included files, because the duplicate check uses a > list rather than a hashtable. > > At the very least, a performance comparison should be made to find out > what the impact of that bug is. I did a performance test of the build with and without this change and it took the same time to build hotspot on my machine (fastdebug) with and without. Both without precompiled headers. without: STARTED:Thu Jan 3 08:52:45 EST 2019 DONE:Thu Jan 3 08:57:20 EST 2019 with pragma once: STARTED:Thu Jan 3 08:58:44 EST 2019 DONE:Thu Jan 3 09:03:19 EST 2019 It may be that more than 1700 include files might cause a slower build. We've done a good job fixing our includes to not include everything. > > A couple of comments regarding the problems with existing #include > guards, and continuing to use #include guards instead of "#pragma once": > > (1) I wonder if a round trip using the following could fix the > #include guards: > https://github.com/cgmb/guardonce > Utilities for converting from C/C++ include guards to #pragma once and > back again. > > (2) For maintaining #include guards, clang has -Wheader-guard, which > warns about mismatches between the #ifndef name and the #define.
Of > course, that doesn't solve all the issues with #include guards. Right. It doesn't solve the ongoing maintenance problem of always having to edit the guards if you change a file's location. I was thinking this isn't worth doing, but maybe it is: if we don't use pragma once, it would be nice if the file names matched at least at one point in time. > > And now for a more general discussion of "#pragma once". > > The question of whether to use "#pragma once" comes up pretty often in > various open source projects and Q&A sites. So far, I haven't found > any large cross-platform projects that provide headers to clients that > have decided to go that way. But most of HotSpot is self-contained, > which limits exposure to some of the issues below. Yes, the #pragma once issues in the general discussion weren't that convincing to me. Maybe a general project in the general case can't use pragma once, but that didn't seem to apply to us and our build. Maybe someone from our build system can comment, but I don't think we have symbolic links or hard links in our build that would mess this up, or other issues. > > An exception to that are the C headers providing interfaces to the VM, > e.g. the files in src/hotspot/share/include. This suggests that > perhaps these files should perhaps be excluded from the change, since > they get used in whatever build environment a client uses. It also > suggests the #include guard names for these files need careful > namespace consideration, which clearly didn't happen with cds.h. I can revert these. I was on the fence about these files. > > For many compilers there isn't a good performance argument for using > "#pragma once". gcc (for a long time), clang (always?), VS2015+ all > do the #include guard optimization. (I think Solaris Studio might > still not? And I have no idea about XLC++.)
> > So the primary question seems to be the reduced clutter and avoidance > of mistakes in #include guards, vs the possibility of cases where > "#pragma once" doesn't work properly. > > Here's a list of some of the discussions I found: > > https://lists.boost.org/Archives/boost/2018/11/244423.php > https://lists.qt-project.org/pipermail/development/2018-October/067452.html > https://lists.qt-project.org/pipermail/development/2018-January/063932.html > https://stackoverflow.com/questions/1143936/pragma-once-vs-include-guards > https://www.reddit.com/r/cpp/comments/4cjjwe/come_on_guys_put_pragma_once_in_the_standard/d1j04te/ > > The main argument against "#pragma once" (besides being non-standard, > so possibly not sufficiently portable, though we think all platforms > supported by HotSpot have this feature) is that it is "unreliable". > Unfortunately, details are hard to come by. Yeah, the details seemed vague and I was unconvinced that they were applicable.? The reward seems greater than the risk. > > I've seen claims that combining "#pragma once" with precompiled > headers can cause problems, though the fact that Visual Studio has > long supported both and they are commonly used together argues > contrary. But perhaps there are additional factors needed for > problems to arise, and those don't happen on Windows? > > My impression is that having sources spread across different file > systems might be a source of problems, possibly in conjunction with > other factors. Before you say "multiple file systems" is not a > possible configuration for JDK builds, consider an out-of-tree build > with the source and build directory on different file systems (and > remember that our generated sources are in the build directory). > > I've seen suggestions that network file systems can also mess things > up, though I didn't find details. > > I think an additional factor that might be relevant is the typical > (but not specified by the standard) behavior of #include "..." 
first > searching with respect to the current directory. I think this is at > least potentially a concern for JDK builds on (perhaps odd) file > system configurations. > > Some examples are discussed in the following messages. Having a > bind-mount involved can mess things up, for example. I don't know if > that's a realistic scenario for building the JDK or HotSpot. > https://lists.qt-project.org/pipermail/development/2018-October/067467.html > https://lists.qt-project.org/pipermail/development/2018-October/067471.html > > So there seems to be some risk with this change that it will result in > build failures or bad builds in someone's build environment, but it is > hard to characterize what a problematic build environment looks like, > so hard to know how "reasonable" or "sane" such a build environment > might be. Local testing is obviously inadequate for this change. > Even running it through the Oracle build and test system doesn't seem > sufficient to me. Having it checked by the various known build farms > (SAP, Debian, Red Hat, maybe others) seems called for. It's good that > the RFR specifically called out Debian to be checked. > Right. I did copy Adrian because I think he has a Debian build. I'm not sure what to look for as a "bad build". I would hope it would be catastrophic. How do we evaluate any build changes? Thank you for this detailed reply. Can you put this in the bug report, so it doesn't get lost (and is more easily searchable)? Thanks, Coleen From matthias.baesken at sap.com Thu Jan 3 15:39:18 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Thu, 3 Jan 2019 15:39:18 +0000 Subject: RFR [XS]: 8215961: jdk/jfr/event/os/TestCPUInformation.java fails on AArch64 In-Reply-To: References: <3c433e428a7f4e9aa4ed8fdf2643d64f@sap.com> Message-ID: Hello, here is the second webrev: http://cr.openjdk.java.net/~mbaesken/webrevs/8215961.1/ I adjusted s390 as well.
Best regards, Matthias > -----Original Message----- > From: Schmidt, Lutz > Sent: Donnerstag, 3. Januar 2019 15:47 > To: Lindenmaier, Goetz ; Baesken, Matthias > ; 'hotspot-dev at openjdk.java.net' dev at openjdk.java.net> > Subject: Re: RFR [XS]: 8215961: jdk/jfr/event/os/TestCPUInformation.java > fails on AArch64 > > Hi, > I would suggest to replace "zArch" with "s390" to use the same term > everywhere. > There is reason for some hope this change will avoid confusion in the future. > Regards, > Lutz > > ?On 03.01.19, 14:41, "Lindenmaier, Goetz" > wrote: > > Hi Matthias, > > the change looks good to me. > > But looking at the code, I saw that s390 says "zArch" there. > We use the string "s390" throughout the code to name the platform, > so I think this should say "s390". In documentation, we use > "z/Architecture", > as well as in some version messages. So this would also be an option. > Could you fix this too, please? And adapt the test? > @Lutz, what do you think? > > Best regards, > Goetz. > > > -----Original Message----- > > From: hotspot-dev On > Behalf Of > > Baesken, Matthias > > Sent: Freitag, 28. Dezember 2018 14:36 > > To: 'hotspot-dev at openjdk.java.net' > > Subject: RFR [XS]: 8215961: jdk/jfr/event/os/TestCPUInformation.java > fails on > > AArch64 > > > > Hello, please review this small fix . > > > > At the moment, the test jdk/jfr/event/os/TestCPUInformation.java > fails > > on AArch64 with the following error : > > > > > > java.lang.RuntimeException: Value not in (Intel, AMD, Unknown x86, > SPARC, > > ARM, PPC, PowerPC, AArch64, zArch), field='description', > > value='0x50:0x0:0x000:1, simd' > > > > > > Reason is that the jdk.CPUInformation event misses a known CPU > > identifier value in the description, see the description part of it : > > > > Event: jdk.CPUInformation { > > .... > > description = "0x50:0x0:0x000:1, simd" > > sockets = 8 > > .... 
> > } > > > > > > The patch adds the CPU identifier info to the _cpu_desc string where > it is > > taken from . > > Please compare also with the ppc - implementation where the info (PPC) > is > > already added . > > > > vm_version_ext_ppc.cpp > > > > 50 snprintf(_cpu_desc, CPU_DETAILED_DESC_BUF_SIZE, "PPC %s", > > features_string()); > > > > > > > > Bug/webrev : > > > > https://bugs.openjdk.java.net/browse/JDK-8215961 > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8215961.0/ > > > > > > Thanks, Matthias > > From erik.osterlund at oracle.com Thu Jan 3 15:50:41 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 3 Jan 2019 16:50:41 +0100 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> Message-ID: <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> Hi, I think the value of moving to #pragma once instead of using include guards is that it is easy to mess up include guards. Admittedly, at least I tend to get them wrong quite often, and every time I get annoyed that we don't use #pragma once. Especially in platform specific files, that I can't even build to see if I messed up. And getting includes wrong could be potentially dangerous, apart from being a waste of time. I think it's a waste of time for reviewers to go through and point out errors in include guards, and similarly it is a waste of time for contributors to stare at their include guards until their eyes start bleeding before reviews to make sure they didn't mess up, and upload new webrevs when they mess them up. Might be exaggerating a bit... but still. 
So basically, I for one, would really appreciate if we could start using #pragma once, and I have been dreaming about the bright #pragma once future for years, and was hoping this would be the right time to make the move. Will shed a few tears if we decide not to do it. Thanks, /Erik On 2019-01-03 11:51, Andrew Haley wrote: > On 1/3/19 7:48 AM, Kim Barrett wrote: > >> I think we shouldn't make this change without considering the impact >> of the following bug: >> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58770 >> GCC very slow compiling with #pragma once >> >> According to comment 3, using "#pragma once" introduces N^2 behavior >> on the number of included files, because the duplicate check uses a >> list rather than a hashtable. >> >> At the very least, a performance comparison should be made to find out >> what the impact of that bug is. > > Thank you for a very detailed analysis. I agree with you that this > shouldn't be changed, at least for now. I've been following the > discussion in the GCC lists and others for years, and the overall > feeling I get is that #pragma once isn't quite what we need for header > file inclusion. Instead, GCC and other compilers have a very high > performance implementation of include guards. > > Nevertheless, I am still a GCC maintainer and I'll fix the bug in > #pragma once if it's really going to be useful. But based on some of > the known issues with #pragma once, I don't believe that it will > be. #pragma once is a nice idea, but I think that's all it is. 
> From kim.barrett at oracle.com Thu Jan 3 16:09:24 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 3 Jan 2019 11:09:24 -0500 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: <1178e8db-3ab5-d3dc-9321-827a5fbd777b@oracle.com> References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <1178e8db-3ab5-d3dc-9321-827a5fbd777b@oracle.com> Message-ID: > On Jan 3, 2019, at 9:55 AM, coleen.phillimore at oracle.com wrote: > On 1/3/19 2:48 AM, Kim Barrett wrote: >>>> >> I think we shouldn't make this change without considering the impact >> of the following bug: >> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58770 >> GCC very slow compiling with #pragma once >> >> According to comment 3, using "#pragma once" introduces N^2 behavior >> on the number of included files, because the duplicate check uses a >> list rather than a hashtable. >> >> At the very least, a performance comparison should be made to find out >> what the impact of that bug is. > > I did a performance test of the build with and without this change and it took the same time to build hotspot on my machine (fastdebug) with and without. Both without precompiled headers. > > without: > STARTED:Thu Jan 3 08:52:45 EST 2019 > DONE:Thu Jan 3 08:57:20 EST 2019 > with pragma once: > STARTED:Thu Jan 3 08:58:44 EST 2019 > DONE:Thu Jan 3 09:03:19 EST 2019 > > > It may be that more than 1700 include files might cause a slower build. We've done a good job fixing our includes to not include everything. I think the precompiled headers case needs to be checked. The recent heavy trimming of precompiled.hpp probably helps a lot, but it would be unfortunate to lose all and then some of the speedup obtained from that trimming. 
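For reference, the two mechanisms compared in this thread look roughly like the sketch below. The header name and guard macro are hypothetical, and both "inclusions" are written inline in one file purely so the example is self-contained. It illustrates the include-guard idiom only and says nothing about compile-time cost.

```cpp
// First "inclusion" of a hypothetical header example.hpp, using a classic
// include guard. The macro name has to be unique and kept in sync by hand.
#ifndef SHARE_UTILITIES_EXAMPLE_HPP
#define SHARE_UTILITIES_EXAMPLE_HPP

inline int example_value() { return 1; }

#endif // SHARE_UTILITIES_EXAMPLE_HPP

// Second "inclusion" of the same header: the guard macro is already defined,
// so the preprocessor skips the body and no redefinition occurs.
#ifndef SHARE_UTILITIES_EXAMPLE_HPP
#define SHARE_UTILITIES_EXAMPLE_HPP

inline int example_value() { return 2; }  // never compiled

#endif // SHARE_UTILITIES_EXAMPLE_HPP

// With the non-standard but widely supported alternative, the three guard
// lines would be replaced by a single first line in the header file:
//   #pragma once
```

A mistyped or stale guard macro is the kind of error the guard form allows and the pragma form avoids, which is the maintenance trade-off being weighed against the portability and performance concerns raised above.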
From aph at redhat.com Thu Jan 3 16:28:00 2019 From: aph at redhat.com (Andrew Haley) Date: Thu, 3 Jan 2019 16:28:00 +0000 Subject: RFR [XS]: 8215961: jdk/jfr/event/os/TestCPUInformation.java fails on AArch64 In-Reply-To: References: <3c433e428a7f4e9aa4ed8fdf2643d64f@sap.com> Message-ID: <60a4e47c-4ebf-3656-e9b0-5da6665147e4@redhat.com> On 1/3/19 3:39 PM, Baesken, Matthias wrote: > Hello, here is the second webrev : > > http://cr.openjdk.java.net/~mbaesken/webrevs/8215961.1/ > > I adjusted s390 as well . OK, thanks. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From goetz.lindenmaier at sap.com Thu Jan 3 16:45:21 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 3 Jan 2019 16:45:21 +0000 Subject: RFR [XS]: 8215961: jdk/jfr/event/os/TestCPUInformation.java fails on AArch64 In-Reply-To: References: <3c433e428a7f4e9aa4ed8fdf2643d64f@sap.com> , Message-ID: Looks good, thanks! Best Götz > On 03.01.2019 at 16:39, Baesken, Matthias wrote: > > Hello, here is the second webrev : > > http://cr.openjdk.java.net/~mbaesken/webrevs/8215961.1/ > > I adjusted s390 as well . > > Best regards, Matthias > > > >> -----Original Message----- >> From: Schmidt, Lutz >> Sent: Thursday, 3 January 2019 15:47 >> To: Lindenmaier, Goetz ; Baesken, Matthias >> ; 'hotspot-dev at openjdk.java.net' > dev at openjdk.java.net> >> Subject: Re: RFR [XS]: 8215961: jdk/jfr/event/os/TestCPUInformation.java >> fails on AArch64 >> >> Hi, >> I would suggest to replace "zArch" with "s390" to use the same term >> everywhere. >> There is reason for some hope this change will avoid confusion in the future. >> Regards, >> Lutz >> >> On 03.01.19, 14:41, "Lindenmaier, Goetz" >> wrote: >> >> Hi Matthias, >> >> the change looks good to me. >> >> But looking at the code, I saw that s390 says "zArch" there. >> We use the string "s390" throughout the code to name the platform, >> so I think this should say "s390".
In documentation, we use >> "z/Architecture", >> as well as in some version messages. So this would also be an option. >> Could you fix this too, please? And adapt the test? >> @Lutz, what do you think? >> >> Best regards, >> Goetz. >> >>> -----Original Message----- >>> From: hotspot-dev On >> Behalf Of >>> Baesken, Matthias >>> Sent: Friday, 28 December 2018 14:36 >>> To: 'hotspot-dev at openjdk.java.net' >>> Subject: RFR [XS]: 8215961: jdk/jfr/event/os/TestCPUInformation.java >> fails on >>> AArch64 >>> >>> Hello, please review this small fix . >>> >>> At the moment, the test jdk/jfr/event/os/TestCPUInformation.java >> fails >>> on AArch64 with the following error : >>> >>> >>> java.lang.RuntimeException: Value not in (Intel, AMD, Unknown x86, >> SPARC, >>> ARM, PPC, PowerPC, AArch64, zArch), field='description', >>> value='0x50:0x0:0x000:1, simd' >>> >>> >>> Reason is that the jdk.CPUInformation event misses a known CPU >>> identifier value in the description, see the description part of it : >>> >>> Event: jdk.CPUInformation { >>> .... >>> description = "0x50:0x0:0x000:1, simd" >>> sockets = 8 >>> .... >>> } >>> >>> >>> The patch adds the CPU identifier info to the _cpu_desc string where >> it is >>> taken from . >>> Please compare also with the ppc - implementation where the info (PPC) >> is >>> already added .
>>> >>> vm_version_ext_ppc.cpp >>> >>> 50 snprintf(_cpu_desc, CPU_DETAILED_DESC_BUF_SIZE, "PPC %s", >>> features_string()); >>> >>> >>> >>> Bug/webrev : >>> >>> https://bugs.openjdk.java.net/browse/JDK-8215961 >>> >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8215961.0/ >>> >>> >>> Thanks, Matthias >> >> > From kim.barrett at oracle.com Thu Jan 3 17:44:38 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 3 Jan 2019 12:44:38 -0500 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: <1178e8db-3ab5-d3dc-9321-827a5fbd777b@oracle.com> References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <1178e8db-3ab5-d3dc-9321-827a5fbd777b@oracle.com> Message-ID: <6E70E4E3-6141-4C3A-AA8B-E40C2ADD79D5@oracle.com> > On Jan 3, 2019, at 9:55 AM, coleen.phillimore at oracle.com wrote: > Thank you for this detailed reply. Can you put this in the bug report, so it doesn't get lost (and is more easily searchable)? Done. From coleen.phillimore at oracle.com Thu Jan 3 18:24:58 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 3 Jan 2019 13:24:58 -0500 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <1178e8db-3ab5-d3dc-9321-827a5fbd777b@oracle.com> Message-ID: <8515aa38-440f-9114-fd21-d35db8bd81ee@oracle.com> On 1/3/19 11:09 AM, Kim Barrett wrote: >> On Jan 3, 2019, at 9:55 AM, coleen.phillimore at oracle.com wrote: >> On 1/3/19 2:48 AM, Kim Barrett wrote: >>> I think we shouldn't make this change without considering the impact >>> of the following bug: >>> >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58770 >>> GCC very slow compiling with #pragma once >>> >>> According to comment 3, using "#pragma once" introduces N^2 behavior >>> on the number of included files, because the duplicate check uses a >>> list rather than a hashtable. 
>>> >>> At the very least, a performance comparison should be made to find out >>> what the impact of that bug is. >> I did a performance test of the build with and without this change and it took the same time to build hotspot on my machine (fastdebug) with and without. Both without precompiled headers. >> >> without: >> STARTED:Thu Jan 3 08:52:45 EST 2019 >> DONE:Thu Jan 3 08:57:20 EST 2019 >> with pragma once: >> STARTED:Thu Jan 3 08:58:44 EST 2019 >> DONE:Thu Jan 3 09:03:19 EST 2019 >> >> >> It may be that more than 1700 include files might cause a slower build. We've done a good job fixing our includes to not include everything. > I think the precompiled headers case needs to be checked. The recent heavy trimming > of precompiled.hpp probably helps a lot, but it would be unfortunate to lose all and then > some of the speedup obtained from that trimming. > configure + make hotspot fastdebug with precompiled headers on linux-x64. before: STARTED:Thu Jan 3 12:39:34 EST 2019 DONE:Thu Jan 3 12:43:18 EST 2019 with pragma once: STARTED:Thu Jan 3 12:44:22 EST 2019 DONE:Thu Jan 3 12:48:02 EST 2019 Coleen From aph at redhat.com Thu Jan 3 18:39:01 2019 From: aph at redhat.com (Andrew Haley) Date: Thu, 3 Jan 2019 18:39:01 +0000 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> Message-ID: <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> On 1/3/19 3:50 PM, Erik Österlund wrote: > I think the value of moving to #pragma once instead of using include > guards is that it is easy to mess up include guards. Admittedly, at > least I tend to get them wrong quite often, and every time I get > annoyed that we don't use #pragma once > ... Might be exaggerating a bit... but still. Maybe.
:-) But seriously, how often do you actually have to create a new header file, or even edit the part of a header file that touched the include guards? The reason this bug hasn't got much love at GCC is that over there nobody (including, until today, me) could ever see the point of include guards. It seems to me that they are a rather fragile way of fixing a nonexistent problem. But hey, if it's really important to you, go ahead. This issue isn't something I'm prepared to go to the barricades for. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From erik.osterlund at oracle.com Thu Jan 3 19:26:12 2019 From: erik.osterlund at oracle.com (Erik Österlund) Date: Thu, 3 Jan 2019 20:26:12 +0100 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> Message-ID: <4ce579d4-9bab-1a9d-124c-7d20c4fc54d0@oracle.com> On 2019-01-03 19:39, Andrew Haley wrote: > On 1/3/19 3:50 PM, Erik Österlund wrote: > >> I think the value of moving to #pragma once instead of using include >> guards is that it is easy to mess up include guards. Admittedly, at >> least I tend to get them wrong quite often, and every time I get >> annoyed that we don't use #pragma once > > >> ... Might be exaggerating a bit... but still. > > Maybe. :-) > > But seriously, how often do you actually have to create a new header > file, or even edit the part of a header file that touched the include > guards? The reason this bug hasn't got much love at GCC is that over > there nobody (including, until today, me) could ever see the point of > include guards.
It seems to me that they are a rather fragile way of > fixing a nonexistent problem. It happens quite a lot actually. For example, it looks like there are currently 126 barrier set files, of which 67 are header files. I had something to do with most of them. I will never forget the include guard horrors when *BarrierSetAssembler was renamed. Sometimes I think the include guard horrors make us grow files too large, because of that extra overhead of creating a new file. Same goes for renaming or moving files that obviously have the wrong name or are in the wrong directory, like memory/heap.hpp, that should arguably be code/codeHeap.hpp. Not to talk about when the "vm" directory was dropped. We are still living in the aftermath of that, and include guards now inconsistently sometimes have the VM_ prefix and sometimes not depending on how old the file is and how the author of new files since then have thought about it. Did they want to blend in with the old (wrong) prefix, or use the correct prefix? Or when the gc_interface/gc_implementation directories were removed, making all GC files move. Ouch. > But hey, if it's really important to you, go ahead. This issue isn't > something I'm prepared to go to the barricades for. > Okay, thanks Andrew! 
/Erik From coleen.phillimore at oracle.com Thu Jan 3 21:20:48 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 3 Jan 2019 16:20:48 -0500 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: <4ce579d4-9bab-1a9d-124c-7d20c4fc54d0@oracle.com> References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <4ce579d4-9bab-1a9d-124c-7d20c4fc54d0@oracle.com> Message-ID: <90f1af5d-718e-40df-6fd1-8902586f9c56@oracle.com> On 1/3/19 2:26 PM, Erik Österlund wrote: > > > On 2019-01-03 19:39, Andrew Haley wrote: >> On 1/3/19 3:50 PM, Erik Österlund wrote: >> >>> I think the value of moving to #pragma once instead of using include >>> guards is that it is easy to mess up include guards. Admittedly, at >>> least I tend to get them wrong quite often, and every time I get >>> annoyed that we don't use #pragma once >> >> >>> ... Might be exaggerating a bit... but still. >> >> Maybe. :-) >> >> But seriously, how often do you actually have to create a new header >> file, or even edit the part of a header file that touched the include >> guards? The reason this bug hasn't got much love at GCC is that over >> there nobody (including, until today, me) could ever see the point of >> include guards. It seems to me that they are a rather fragile way of >> fixing a nonexistent problem. > > It happens quite a lot actually. For example, it looks like there are > currently 126 barrier set files, of which 67 are header files. I had > something to do with most of them. I will never forget the include > guard horrors when *BarrierSetAssembler was renamed. > > Sometimes I think the include guard horrors make us grow files too > large, because of that extra overhead of creating a new file.
Same > goes for renaming or moving files that obviously have the wrong name > or are in the wrong directory, like memory/heap.hpp, that should > arguably be code/codeHeap.hpp. > > Not to talk about when the "vm" directory was dropped. We are still > living in the aftermath of that, and include guards now inconsistently > sometimes have the VM_ prefix and sometimes not depending on how old > the file is and how the author of new files since then have thought > about it. Did they want to blend in with the old (wrong) prefix, or > use the correct prefix? Or when the gc_interface/gc_implementation > directories were removed, making all GC files move. Ouch. I have to admit that Erik was an instigator of this, but I've been cringing lately at the extra "_VM" in the include guards, and people asking me whether to include it or not in their new files. Maybe? At any rate, I've wanted to fix them all and this seems a lot better. I might experiment with the tool pointed out to change them all back to the correct file name if this change is too discomforting. Plus, there has also been discussion about whether the coding standard should say that the end guard should have: #endif // SAME_NAME_AS_BEGIN or #endif // include guard or as Erik slipped up and wrote: #endif // pragma once *And* as Thomas Stuefe likes: #ifndef FILE_NAME_ with the trailing underscore. We can cut off this discussion by having pragma once. There have been many new header files added lately for jdk11 and 12, and I don't think we're quite done, so we expect more. Refactoring a huge header file like thread.hpp might be really nice to do. So it would be nice to have some smaller header files added. Thanks Erik and Andrew. Coleen > >> But hey, if it's really important to you, go ahead. This issue isn't >> something I'm prepared to go to the barricades for. >> > > Okay, thanks Andrew!
> > /Erik From kim.barrett at oracle.com Thu Jan 3 21:30:43 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 3 Jan 2019 16:30:43 -0500 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: <4ce579d4-9bab-1a9d-124c-7d20c4fc54d0@oracle.com> References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <4ce579d4-9bab-1a9d-124c-7d20c4fc54d0@oracle.com> Message-ID: > On Jan 3, 2019, at 2:26 PM, Erik Österlund wrote: > > It happens quite a lot actually. For example, it looks like there are currently 126 barrier set files, of which 67 are header files. I had something to do with most of them. I will never forget the include guard horrors when *BarrierSetAssembler was renamed. > > Sometimes I think the include guard horrors make us grow files too large, because of that extra overhead of creating a new file. Same goes for renaming or moving files that obviously have the wrong name or are in the wrong directory, like memory/heap.hpp, that should arguably be code/codeHeap.hpp. > > Not to talk about when the "vm" directory was dropped. We are still living in the aftermath of that, and include guards now inconsistently sometimes have the VM_ prefix and sometimes not depending on how old the file is and how the author of new files since then have thought about it. Did they want to blend in with the old (wrong) prefix, or use the correct prefix? Or when the gc_interface/gc_implementation directories were removed, making all GC files move. Ouch. I mentioned it in my earlier reply, but it may have been lost in the clutter. This looks pretty useful for such scenarios: https://github.com/cgmb/guardonce Utilities for converting from C/C++ include guards to #pragma once and back again.
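[Archive note] To make the renaming cost discussed above concrete, here is a small sketch of a path-derived guard name. The mapping rule (upper-case letters and digits, everything else becomes '_') is an assumption chosen for illustration; it is neither HotSpot's exact convention nor how guardonce works, and guard_name_for is a hypothetical helper.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: derive an include-guard macro from a header's path by
// upper-casing letters and digits and mapping every other character to '_'.
std::string guard_name_for(const std::string& path) {
  std::string guard;
  guard.reserve(path.size());
  for (unsigned char c : path) {
    guard.push_back(std::isalnum(c) ? static_cast<char>(std::toupper(c)) : '_');
  }
  return guard;
}
```

Under this assumed rule, moving memory/heap.hpp to code/codeHeap.hpp changes the guard from MEMORY_HEAP_HPP to CODE_CODEHEAP_HPP, so every move or rename also means editing the #ifndef, the #define and the trailing #endif comment. A header that uses #pragma once has nothing to update.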
From david.holmes at oracle.com Thu Jan 3 21:49:02 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 4 Jan 2019 07:49:02 +1000 Subject: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd In-Reply-To: References: <8f7ff454-ac87-505a-fe9c-15b4d4113093@oracle.com> <7f38de34-459f-27da-95ce-5528abf58f95@oracle.com> <2a31b3ca-85fc-29cb-413d-94dd98775a74@oracle.com> <44c628fa-81ce-ac22-da00-89dcc31ab314@oracle.com> Message-ID: <3a68eebc-3b3e-2911-e6e4-b7913f9a773a@oracle.com> Hi Matthias, On 4/01/2019 12:56 am, Baesken, Matthias wrote: > >> I still want to know how the OS_ERR gets handled by all the higher level >> code. How will this failure at runtime get reported back to application code? > > Hi David , the "best practice" currently is to just ignore the return code of os::set_native_priority , it is called and we hope for the best , Silent failure is not good. In that case I think it is appropriate to issue a warning whenever ThreadPriorityPolicy=1 is set, though compatibility dictates no warning in the case that you are root. Which brings us back to square one and the original patch with a different warning message: warning("-XX:ThreadPriorityPolicy=1 requires system level permissions to be applied. If these permissions do not exist, changes to priority will be silently ignored."); which then takes us back to Dan's comments that the warning should be at time of use. But to that I maintain that because use may or may not fail depending on both the available permissions and the requested priority value, that it is better to have a single generic warning in the existing place. > for example : > > jdk/src/hotspot/share/runtime/vmThread.cpp > > 299 int prio = (VMThreadPriority == -1) > 300 ? 
os::java_to_os_priority[NearMaxPriority] > 301 : VMThreadPriority; > 302 // Note that I cannot call os::set_priority because it expects Java > 303 // priorities and I am *explicitly* using OS priorities so that it's > 304 // possible to set the VM thread priority higher than any Java thread. > 305 os::set_native_priority( this, prio ); > > > jdk/src/hotspot/share/compiler/compileBroker.cpp > > 783 int native_prio = CompilerThreadPriority; > 784 if (native_prio == -1) { > 785 if (UseCriticalCompilerThreadPriority) { > 786 native_prio = os::java_to_os_priority[CriticalPriority]; > 787 } else { > 788 native_prio = os::java_to_os_priority[NearMaxPriority]; > 789 } > 790 } > 791 os::set_native_priority(thread, native_prio); > > > A difference is > > jdk/src/hotspot/share/runtime/os.cpp > > 217 OSReturn os::set_priority(Thread* thread, ThreadPriority p) { > 218 debug_only(Thread::check_for_dangling_thread_pointer(thread);) > 219 > 220 if (p >= MinPriority && p <= MaxPriority) { > 221 int priority = java_to_os_priority[p]; > 222 return set_native_priority(thread, priority); > > Where the return code of set_native_priority() is returned (however then it is later usually not handled by the callers of os::set_priority). > >> >> As I stated this is not a complete statement as on Linux at least you >> also have to account for RLIMIT_RTPRIO. >> > > If you want me to, I can of course add a short statement about this . Quite the opposite, I'd rather see a generic statement about permissions than try to cover all the possible situations. Thanks, David > > Best regards, Matthias > > >> -----Original Message----- >> From: David Holmes >> Sent: Thursday, 3 January 2019 14:02
>> To: Baesken, Matthias ; >> daniel.daugherty at oracle.com; 'hotspot-dev at openjdk.java.net' > dev at openjdk.java.net> >> Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root >> users on linux/bsd >> >> Hi Matthias, >> >> On 3/01/2019 9:13 pm, Baesken, Matthias wrote: >>> Hello David and Dan , here is a second webrev : >>> >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.1/ >>> >>> - adjusted copyright years + fixed some typos >>> - added the missing return for FreeBSD (pointed out by Dan) >>> - removed the warning message completely >> >> I still want to know how the OS_ERR gets handled by all the higher level >> code. How will this failure at runtime get reported back to application >> code? >> >> ! // It is only used when ThreadPriorityPolicy=1 and requires root >> privilege or >> ! // CAP_SYS_NICE capability. >> >> As I stated this is not a complete statement as on Linux at least you >> also have to account for RLIMIT_RTPRIO. >> >> Thanks, >> David >> ----- >> > From harold.seigel at oracle.com Thu Jan 3 21:53:16 2019 From: harold.seigel at oracle.com (Harold David Seigel) Date: Thu, 3 Jan 2019 16:53:16 -0500 Subject: RFR 8216010: Change callers of build_u2_from() to call Bytes::get_Java_u2() instead Message-ID: <1999803a-1184-1f62-649a-47afbab05723@oracle.com> Hi, Please review this small fix for JDK-8216010. The fix removes function build_u2_from() and changes its callers to call function Bytes::get_Java_u2(). Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8216010/webrev/index.html JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8216010 The fix was regression tested by running Mach5 tiers 1 and 2 tests and builds on Linux-x64, Windows, and Mac OS X, running Mach5 tiers 3 - 5 on Linux-x64, and by running JCK-12 Lang and VM tests on Linux-x64. Thanks, Harold From daniel.daugherty at oracle.com Thu Jan 3 22:21:45 2019 From: daniel.daugherty at oracle.com (Daniel D.
Daugherty) Date: Thu, 3 Jan 2019 17:21:45 -0500 Subject: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd In-Reply-To: <3a68eebc-3b3e-2911-e6e4-b7913f9a773a@oracle.com> References: <8f7ff454-ac87-505a-fe9c-15b4d4113093@oracle.com> <7f38de34-459f-27da-95ce-5528abf58f95@oracle.com> <2a31b3ca-85fc-29cb-413d-94dd98775a74@oracle.com> <44c628fa-81ce-ac22-da00-89dcc31ab314@oracle.com> <3a68eebc-3b3e-2911-e6e4-b7913f9a773a@oracle.com> Message-ID: <9b721b57-9b5b-c995-b4ed-668cf644c181@oracle.com> On 1/3/19 4:49 PM, David Holmes wrote: > Hi Matthias, > > On 4/01/2019 12:56 am, Baesken, Matthias wrote: >> >>> I still want to know how the OS_ERR gets handled by all the higher >>> level >>> code. How will this failure at runtime get reported back to >>> application code? >> >> Hi David , the "best practice" currently is to just ignore the >> return code of os::set_native_priority , it is called and we hope >> for the best , > > Silent failure is not good. Agreed. > In that case I think it is appropriate to issue a warning whenever > ThreadPriorityPolicy=1 is set, though compatibility dictates no > warning in the case that you are root. Agreed that you should not get the warning if you are root. Reluctantly agree to always issue the warning if -XX:ThreadPriorityPolicy=1 is specified and user != root. > Which brings us back to square one and the original patch with a > different warning message: > > warning("-XX:ThreadPriorityPolicy=1 requires system level permissions > to be applied. If these permissions do not exist, changes to priority > will be silently ignored."); Perhaps: warning("-XX:ThreadPriorityPolicy=1 may require system level permission, e.g., being the 'root' user. If the necessary permission is not possessed, changes to priority will be silently ignored."); I changed: - "requires" -> "may require" because you don't always need a special permission to do the operation.
- "system level permissions to be applied" -> "system level permission" so switch to singular permission, dropped "to be applied" - added ", e.g., being the 'root' user" - "If these permissions do not exist" -> "If the necessary permission is not possessed" so switch to singular permission, switch from "do not exist" to "is not possessed" > which then takes us back to Dan's comments that the warning should be > at time of use. But to that I maintain that because use may or may not > fail depending on both the available permissions and the requested > priority value, that it is better to have a single generic warning in > the existing place. This is where David and I disagree. I do not think we should issue a warning unless the operation failed and David prefers the generic warning in one place. I will reluctantly agree to always issue the warning if -XX:ThreadPriorityPolicy=1 is specified and user != root. Dan > >> for example : >> >> jdk/src/hotspot/share/runtime/vmThread.cpp >> >> 299 int prio = (VMThreadPriority == -1) >> 300 ? os::java_to_os_priority[NearMaxPriority] >> 301 : VMThreadPriority; >> 302 // Note that I cannot call os::set_priority because it expects Java >> 303 // priorities and I am *explicitly* using OS priorities so that >> it's >> 304 // possible to set the VM thread priority higher than any Java >> thread. >> 305 os::set_native_priority( this, prio ); >> >> >> jdk/src/hotspot/share/compiler/compileBroker.cpp >> >> 783 int native_prio = CompilerThreadPriority; >> 784 if (native_prio == -1) { >> 785 if (UseCriticalCompilerThreadPriority) { >> 786 native_prio = os::java_to_os_priority[CriticalPriority]; >> 787 } else { >> 788 native_prio = os::java_to_os_priority[NearMaxPriority]; >> 789 } >> 790 } >> 791
os::set_native_priority(thread, native_prio); >> >> >> A difference is >> >> jdk/src/hotspot/share/runtime/os.cpp >> >> 217 OSReturn os::set_priority(Thread* thread, ThreadPriority p) { >> 218 debug_only(Thread::check_for_dangling_thread_pointer(thread);) >> 219 >> 220 if (p >= MinPriority && p <= MaxPriority) { >> 221 int priority = java_to_os_priority[p]; >> 222 return set_native_priority(thread, priority); >> >> Where the return code of set_native_priority() is returned >> (however then it is later usually not handled by the callers of >> os::set_priority). >> >>> >>> As I stated this is not a complete statement as on Linux at least you >>> also have to account for RLIMIT_RTPRIO. >>> >> >> If you want me to, I can of course add a short statement about >> this . > > Quite the opposite, I'd rather see a generic statement about > permissions than try to cover all the possible situations. > > Thanks, > David > >> >> Best regards, Matthias >> >> >>> -----Original Message----- >>> From: David Holmes >>> Sent: Thursday, 3 January 2019 14:02 >>> To: Baesken, Matthias ; >>> daniel.daugherty at oracle.com; 'hotspot-dev at openjdk.java.net' >> dev at openjdk.java.net> >>> Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for >>> non-root >>> users on linux/bsd >>> >>> Hi Matthias, >>> >>> On 3/01/2019 9:13 pm, Baesken, Matthias wrote: >>>> Hello David and Dan , here is a second webrev : >>>> >>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.1/ >>>> >>>> - adjusted copyright years + fixed some typos >>>> - added the missing return for FreeBSD (pointed out by Dan) >>>> - removed the warning message completely >>> >>> I still want to know how the OS_ERR gets handled by all the higher >>> level >>> code. How will this failure at runtime get reported back to application >>> code? >>> >>> ! // It is only used when ThreadPriorityPolicy=1 and requires root >>> privilege or >>> ! // CAP_SYS_NICE capability.
>>> >>> As I stated this is not a complete statement as on Linux at least you >>> also have to account for RLIMIT_RTPRIO. >>> >>> Thanks, >>> David >>> ----- >>> >> From david.holmes at oracle.com Thu Jan 3 22:51:56 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 4 Jan 2019 08:51:56 +1000 Subject: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd In-Reply-To: <9b721b57-9b5b-c995-b4ed-668cf644c181@oracle.com> References: <8f7ff454-ac87-505a-fe9c-15b4d4113093@oracle.com> <7f38de34-459f-27da-95ce-5528abf58f95@oracle.com> <2a31b3ca-85fc-29cb-413d-94dd98775a74@oracle.com> <44c628fa-81ce-ac22-da00-89dcc31ab314@oracle.com> <3a68eebc-3b3e-2911-e6e4-b7913f9a773a@oracle.com> <9b721b57-9b5b-c995-b4ed-668cf644c181@oracle.com> Message-ID: Hi Dan, Thanks for your reluctant agreement. I like your updated wording for the warning. David On 4/01/2019 8:21 am, Daniel D. Daugherty wrote: > On 1/3/19 4:49 PM, David Holmes wrote: >> Hi Matthias, >> >> On 4/01/2019 12:56 am, Baesken, Matthias wrote: >>> >>>> I still want to know how the OS_ERR gets handled by all the higher >>>> level >>>> code. How will this failure at runtime get reported back to >>>> application code? >>> >>> Hi David , the "best practice" currently is to just ignore the >>> return code of os::set_native_priority , it is called and we hope >>> for the best , >> >> Silent failure is not good. > > Agreed. > > >> In that case I think it is appropriate to issue a warning whenever >> ThreadPriorityPolicy=1 is set, though compatibility dictates no >> warning in the case that you are root. > > Agreed that you should not get the warning if you are root. > Reluctantly agree to always issue the warning if > -XX:ThreadPriorityPolicy=1 is specified and user != root. > > >> Which brings us back to square one and the original patch with a >> different warning message: >> >> warning("-XX:ThreadPriorityPolicy=1 requires system level permissions >> to be applied.
If these permissions do not exist, changes to priority >> will be silently ignored."); > > Perhaps: > > warning("-XX:ThreadPriorityPolicy=1 may require system level permission, > e.g., being the 'root' user. If the necessary permission is not > possessed, changes to priority will be silently ignored."); > > I changed: > > - "requires" -> "may require" because you don't always need > a special permission to do the operation. > - "system level permissions to be applied" -> "system level permission" > so switch to singular permission, dropped "to be applied" > - added ", e.g., being the 'root' user" > - "If these permissions do not exist" -> "If the necessary permission is > not possessed" > so switch to singular permission, switch from "do not exist" to > "is not possessed" > > >> which then takes us back to Dan's comments that the warning should be >> at time of use. But to that I maintain that because use may or may not >> fail depending on both the available permissions and the requested >> priority value, that it is better to have a single generic warning in >> the existing place. > > This is where David and I disagree. I do not think we should issue a > warning unless the operation failed and David prefers the generic > warning in one place. > > I will reluctantly agree to always issue the warning if > -XX:ThreadPriorityPolicy=1 is specified and user != root. > > Dan > >> >>> for example : >>> >>> jdk/src/hotspot/share/runtime/vmThread.cpp >>> >>> 299 int prio = (VMThreadPriority == -1) >>> 300 ? os::java_to_os_priority[NearMaxPriority] >>> 301 : VMThreadPriority; >>> 302 // Note that I cannot call os::set_priority because it expects Java >>> 303 // priorities and I am *explicitly* using OS priorities so that >>> it's >>> 304 // possible to set the VM thread priority higher than any Java >>> thread. >>> 305 os::set_native_priority( this, prio ); >>> >>> >>> jdk/src/hotspot/share/compiler/compileBroker.cpp >>> >>> 783
int native_prio = CompilerThreadPriority; >>> 784 if (native_prio == -1) { >>> 785 if (UseCriticalCompilerThreadPriority) { >>> 786 native_prio = os::java_to_os_priority[CriticalPriority]; >>> 787 } else { >>> 788 native_prio = os::java_to_os_priority[NearMaxPriority]; >>> 789 } >>> 790 } >>> 791 os::set_native_priority(thread, native_prio); >>> >>> >>> A difference is >>> >>> jdk/src/hotspot/share/runtime/os.cpp >>> >>> 217 OSReturn os::set_priority(Thread* thread, ThreadPriority p) { >>> 218 debug_only(Thread::check_for_dangling_thread_pointer(thread);) >>> 219 >>> 220 if (p >= MinPriority && p <= MaxPriority) { >>> 221 int priority = java_to_os_priority[p]; >>> 222 return set_native_priority(thread, priority); >>> >>> Where the return code of set_native_priority() is returned >>> (however then it is later usually not handled by the callers of >>> os::set_priority). >>> >>>> >>>> As I stated this is not a complete statement as on Linux at least you >>>> also have to account for RLIMIT_RTPRIO. >>>> >>> >>> If you want me to, I can of course add a short statement about >>> this . >> >> Quite the opposite, I'd rather see a generic statement about >> permissions than try to cover all the possible situations. >> >> Thanks, >> David >> >>> >>> Best regards, Matthias >>> >>> >>>> -----Original Message----- >>>> From: David Holmes >>>> Sent: Thursday, 3 January 2019 14:02 >>>> To: Baesken, Matthias ; >>>> daniel.daugherty at oracle.com; 'hotspot-dev at openjdk.java.net' >>> dev at openjdk.java.net> >>>> Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for >>>> non-root >>>> users on linux/bsd >>>> >>>> Hi Matthias, >>>> >>>> On 3/01/2019 9:13 pm, Baesken, Matthias wrote: >>>>> Hello David and Dan , here is a second webrev : >>>>> >>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.1/ >>>>> >>>>> - adjusted copyright years + fixed some typos >>>>> - added
the missing return for FreeBSD (pointed out by Dan) >>>>> - removed the warning message completely >>>> >>>> I still want to know how the OS_ERR gets handled by all the higher >>>> level >>>> code. How will this failure at runtime get reported back to application >>>> code? >>>> >>>> ! // It is only used when ThreadPriorityPolicy=1 and requires root >>>> privilege or >>>> ! // CAP_SYS_NICE capability. >>>> >>>> As I stated this is not a complete statement as on Linux at least you >>>> also have to account for RLIMIT_RTPRIO. >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>> > From coleen.phillimore at oracle.com Fri Jan 4 03:20:22 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 3 Jan 2019 22:20:22 -0500 Subject: RFR 8215731: Move forward class definitions out of globalDefinitions.hpp Message-ID: <546b2f19-a166-df67-c05e-6b5d5c44f844@oracle.com> Summary: redistribute the forward declarations to the header files that need them. Tested with mach5 tier1 and 2, also tested aarch64 with cross compiler and zero. open webrev at http://cr.openjdk.java.net/~coleenp/8215731.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8215731 Thanks, Coleen From kim.barrett at oracle.com Fri Jan 4 03:47:01 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 3 Jan 2019 22:47:01 -0500 Subject: RFR 8216010: Change callers of build_u2_from() to call Bytes::get_Java_u2() instead In-Reply-To: <1999803a-1184-1f62-649a-47afbab05723@oracle.com> References: <1999803a-1184-1f62-649a-47afbab05723@oracle.com> Message-ID: <6C3801E8-7B79-4429-90DF-F4D5D134AF29@oracle.com> > On Jan 3, 2019, at 4:53 PM, Harold David Seigel wrote: > > Hi, > > Please review this small fix for JDK-8216010. The fix removes function build_u2_from() and changes its callers to call function Bytes::get_Java_u2(). 
> > Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8216010/webrev/index.html > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8216010 > > The fix was regression tested by running Mach5 tiers 1 and 2 tests and builds on Linux-x64, Windows, and Mac OS X, running Mach5 tiers 3 - 5 on Linux-x64, and by running JCK-12 Lang and VM tests on Linux-x64. > > Thanks, Harold Looks good. From jiangli.zhou at oracle.com Fri Jan 4 04:23:53 2019 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Thu, 3 Jan 2019 20:23:53 -0800 Subject: RFR 8216010: Change callers of build_u2_from() to call Bytes::get_Java_u2() instead In-Reply-To: <6C3801E8-7B79-4429-90DF-F4D5D134AF29@oracle.com> References: <1999803a-1184-1f62-649a-47afbab05723@oracle.com> <6C3801E8-7B79-4429-90DF-F4D5D134AF29@oracle.com> Message-ID: +1 Thanks, Jiangli On 1/3/19 7:47 PM, Kim Barrett wrote: >> On Jan 3, 2019, at 4:53 PM, Harold David Seigel wrote: >> >> Hi, >> >> Please review this small fix for JDK-8216010. The fix removes function build_u2_from() and changes its callers to call function Bytes::get_Java_u2(). >> >> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8216010/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8216010 >> >> The fix was regression tested by running Mach5 tiers 1 and 2 tests and builds on Linux-x64, Windows, and Mac OS X, running Mach5 tiers 3 - 5 on Linux-x64, and by running JCK-12 Lang and VM tests on Linux-x64. >> >> Thanks, Harold > Looks good. > From david.holmes at oracle.com Fri Jan 4 05:08:20 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 4 Jan 2019 15:08:20 +1000 Subject: RFR 8215731: Move forward class definitions out of globalDefinitions.hpp In-Reply-To: <546b2f19-a166-df67-c05e-6b5d5c44f844@oracle.com> References: <546b2f19-a166-df67-c05e-6b5d5c44f844@oracle.com> Message-ID: Hi Coleen, Seems fine. 
Thanks, David On 4/01/2019 1:20 pm, coleen.phillimore at oracle.com wrote: > Summary: redistribute the forward declarations to the header files that > need them. > > Tested with mach5 tier1 and 2, also tested aarch64 with cross compiler > and zero. > > open webrev at http://cr.openjdk.java.net/~coleenp/8215731.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8215731 > > Thanks, > Coleen > From matthias.baesken at sap.com Fri Jan 4 07:57:19 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Fri, 4 Jan 2019 07:57:19 +0000 Subject: RFR (tedious) 8216022: Use #pragma once Message-ID: Hello Coleen, on Solaris Sparc with Oracle Studio 12u4 Oct2017 version we get " line 25: Error: Unrecognized #pragma once " Which Oracle Studio version do you recommend to use ? (good news is that xlc 12.1 / AIX works ) Best regards, Matthias > > Message: 2 > Date: Wed, 2 Jan 2019 21:16:59 -0500 > From: coleen.phillimore at oracle.com > To: hotspot-dev developers , John Paul > Adrian Glaubitz > Subject: RFR (tedious) 8216022: Use #pragma once > Message-ID: <9250036e-8696-6103-6c3f-513fa11ffebd at oracle.com> > Content-Type: text/plain; charset=utf-8; format=flowed > > Summary: change include guards to #pragma once, except in generated > header files. > > Tested with mach5 for linux-x64{-debug}, solaris-sparc, macosx-x64, > windows-x64, built aarch64 with cross compiler, and zero. > > Ran tier1 and 2 tests. > > The webrev is huge but there are only 3 lines changed in each header > file.? So click on the patch. > > I'll update the copyright headers with a script with the commit. Also, > will do this after the shenandoah copyright headers are fixed. > > Adrian: I included you to check your platforms. > > Happy New Year! > Coleen > > > ------------------------------ .... > > Here is the webrev and bug link. 
> > open webrev at http://cr.openjdk.java.net/~coleenp/8216022.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8216022 > From erik.osterlund at oracle.com Fri Jan 4 09:38:01 2019 From: erik.osterlund at oracle.com (Erik Österlund) Date: Fri, 4 Jan 2019 10:38:01 +0100 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: References: Message-ID: Hi Matthias, The #pragma once support was added in Oracle Studio 12u5, which is what I have been patiently waiting for. Having said that, I would recommend using 12u6. Thanks for confirming that xlc 12.1 / AIX works. /Erik On 2019-01-04 08:57, Baesken, Matthias wrote: > Hello Coleen, on Solaris Sparc with Oracle Studio 12u4 Oct2017 version we get > > " line 25: Error: Unrecognized #pragma once" > > Which Oracle Studio version do you recommend to use? > > > (good news is that xlc 12.1 / AIX works ) > > > Best regards, Matthias > > >> Message: 2 >> Date: Wed, 2 Jan 2019 21:16:59 -0500 >> From: coleen.phillimore at oracle.com >> To: hotspot-dev developers , John Paul >> Adrian Glaubitz >> Subject: RFR (tedious) 8216022: Use #pragma once >> Message-ID: <9250036e-8696-6103-6c3f-513fa11ffebd at oracle.com> >> Content-Type: text/plain; charset=utf-8; format=flowed >> >> Summary: change include guards to #pragma once, except in generated >> header files. >> >> Tested with mach5 for linux-x64{-debug}, solaris-sparc, macosx-x64, >> windows-x64, built aarch64 with cross compiler, and zero. >> >> Ran tier1 and 2 tests. >> >> The webrev is huge but there are only 3 lines changed in each header >> file. So click on the patch. >> >> I'll update the copyright headers with a script with the commit. Also, >> will do this after the shenandoah copyright headers are fixed. >> >> Adrian: I included you to check your platforms. >> >> Happy New Year! >> Coleen >> >> >> ------------------------------ > .... > >> Here is the webrev and bug link. 
>> >> open webrev at http://cr.openjdk.java.net/~coleenp/8216022.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8216022 >> From fweimer at redhat.com Fri Jan 4 11:07:48 2019 From: fweimer at redhat.com (Florian Weimer) Date: Fri, 04 Jan 2019 12:07:48 +0100 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> (Andrew Haley's message of "Thu, 3 Jan 2019 18:39:01 +0000") References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> Message-ID: <8736q8smt7.fsf@oldenburg2.str.redhat.com> * Andrew Haley: > But seriously, how often do you actually have to create a new header > file, or even edit the part of a header file that touched the include > guards? The reason this bug hasn't got much love at GCC is that over > there nobody (including, until today, me) could ever see the point of > include guards. It seems to me that they are a rather fragile way of > fixing a nonexistent problem. The guards occasionally cause bugs because they are not globally unique or not spelled correctly in both places in the header file. For Hotspot, that's probably not a big problem because it's a single code base, and issues can be fixed easily enough if they arise. Thanks, Florian From coleen.phillimore at oracle.com Fri Jan 4 12:32:58 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 4 Jan 2019 07:32:58 -0500 Subject: RFR 8215731: Move forward class definitions out of globalDefinitions.hpp In-Reply-To: References: <546b2f19-a166-df67-c05e-6b5d5c44f844@oracle.com> Message-ID: <7639f4ca-7d54-53dd-fa8c-7ff872de81a6@oracle.com> Thanks, David. Coleen On 1/4/19 12:08 AM, David Holmes wrote: > Hi Coleen, > > Seems fine. 
> > Thanks, > David > > On 4/01/2019 1:20 pm, coleen.phillimore at oracle.com wrote: >> Summary: redistribute the forward declarations to the header files >> that need them. >> >> Tested with mach5 tier1 and 2, also tested aarch64 with cross >> compiler and zero. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8215731.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8215731 >> >> Thanks, >> Coleen >> From coleen.phillimore at oracle.com Fri Jan 4 12:45:46 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 4 Jan 2019 07:45:46 -0500 Subject: RFR 8216010: Change callers of build_u2_from() to call Bytes::get_Java_u2() instead In-Reply-To: References: <1999803a-1184-1f62-649a-47afbab05723@oracle.com> <6C3801E8-7B79-4429-90DF-F4D5D134AF29@oracle.com> Message-ID: Looks nice! Coleen On 1/3/19 11:23 PM, Jiangli Zhou wrote: > +1 > > Thanks, > > Jiangli > > > On 1/3/19 7:47 PM, Kim Barrett wrote: >>> On Jan 3, 2019, at 4:53 PM, Harold David Seigel >>> wrote: >>> >>> Hi, >>> >>> Please review this small fix for JDK-8216010.? The fix removes >>> function build_u2_from() and changes its callers to call function >>> Bytes::get_Java_u2(). >>> >>> Open Webrev: >>> http://cr.openjdk.java.net/~hseigel/bug_8216010/webrev/index.html >>> >>> JBS Bug:? https://bugs.openjdk.java.net/browse/JDK-8216010 >>> >>> The fix was regression tested by running Mach5 tiers 1 and 2 tests >>> and builds on Linux-x64, Windows, and Mac OS X, running Mach5 tiers >>> 3 - 5 on Linux-x64, and by running JCK-12 Lang and VM tests on >>> Linux-x64. >>> >>> Thanks, Harold >> Looks good. 
>> > From harold.seigel at oracle.com Fri Jan 4 12:45:01 2019 From: harold.seigel at oracle.com (Harold David Seigel) Date: Fri, 4 Jan 2019 07:45:01 -0500 Subject: RFR 8216010: Change callers of build_u2_from() to call Bytes::get_Java_u2() instead In-Reply-To: <6C3801E8-7B79-4429-90DF-F4D5D134AF29@oracle.com> References: <1999803a-1184-1f62-649a-47afbab05723@oracle.com> <6C3801E8-7B79-4429-90DF-F4D5D134AF29@oracle.com> Message-ID: <28ec5553-bce0-b320-99e1-d1941d4d23fa@oracle.com> Thanks Kim! Harold On 1/3/2019 10:47 PM, Kim Barrett wrote: >> On Jan 3, 2019, at 4:53 PM, Harold David Seigel wrote: >> >> Hi, >> >> Please review this small fix for JDK-8216010. The fix removes function build_u2_from() and changes its callers to call function Bytes::get_Java_u2(). >> >> Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8216010/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8216010 >> >> The fix was regression tested by running Mach5 tiers 1 and 2 tests and builds on Linux-x64, Windows, and Mac OS X, running Mach5 tiers 3 - 5 on Linux-x64, and by running JCK-12 Lang and VM tests on Linux-x64. >> >> Thanks, Harold > Looks good. > From harold.seigel at oracle.com Fri Jan 4 12:45:16 2019 From: harold.seigel at oracle.com (Harold David Seigel) Date: Fri, 4 Jan 2019 07:45:16 -0500 Subject: RFR 8216010: Change callers of build_u2_from() to call Bytes::get_Java_u2() instead In-Reply-To: References: <1999803a-1184-1f62-649a-47afbab05723@oracle.com> <6C3801E8-7B79-4429-90DF-F4D5D134AF29@oracle.com> Message-ID: Thanks Jiangli! Harold On 1/3/2019 11:23 PM, Jiangli Zhou wrote: > +1 > > Thanks, > > Jiangli > > > On 1/3/19 7:47 PM, Kim Barrett wrote: >>> On Jan 3, 2019, at 4:53 PM, Harold David Seigel >>> wrote: >>> >>> Hi, >>> >>> Please review this small fix for JDK-8216010.? The fix removes >>> function build_u2_from() and changes its callers to call function >>> Bytes::get_Java_u2(). 
>>> >>> Open Webrev: >>> http://cr.openjdk.java.net/~hseigel/bug_8216010/webrev/index.html >>> >>> JBS Bug:? https://bugs.openjdk.java.net/browse/JDK-8216010 >>> >>> The fix was regression tested by running Mach5 tiers 1 and 2 tests >>> and builds on Linux-x64, Windows, and Mac OS X, running Mach5 tiers >>> 3 - 5 on Linux-x64, and by running JCK-12 Lang and VM tests on >>> Linux-x64. >>> >>> Thanks, Harold >> Looks good. >> > From harold.seigel at oracle.com Fri Jan 4 12:49:15 2019 From: harold.seigel at oracle.com (Harold David Seigel) Date: Fri, 4 Jan 2019 07:49:15 -0500 Subject: RFR 8216010: Change callers of build_u2_from() to call Bytes::get_Java_u2() instead In-Reply-To: References: <1999803a-1184-1f62-649a-47afbab05723@oracle.com> <6C3801E8-7B79-4429-90DF-F4D5D134AF29@oracle.com> Message-ID: <81a75094-620b-8823-525a-2877e21a18cf@oracle.com> Thanks Coleen! Harold On 1/4/2019 7:45 AM, coleen.phillimore at oracle.com wrote: > Looks nice! > Coleen > > On 1/3/19 11:23 PM, Jiangli Zhou wrote: >> +1 >> >> Thanks, >> >> Jiangli >> >> >> On 1/3/19 7:47 PM, Kim Barrett wrote: >>>> On Jan 3, 2019, at 4:53 PM, Harold David Seigel >>>> wrote: >>>> >>>> Hi, >>>> >>>> Please review this small fix for JDK-8216010.? The fix removes >>>> function build_u2_from() and changes its callers to call function >>>> Bytes::get_Java_u2(). >>>> >>>> Open Webrev: >>>> http://cr.openjdk.java.net/~hseigel/bug_8216010/webrev/index.html >>>> >>>> JBS Bug:? https://bugs.openjdk.java.net/browse/JDK-8216010 >>>> >>>> The fix was regression tested by running Mach5 tiers 1 and 2 tests >>>> and builds on Linux-x64, Windows, and Mac OS X, running Mach5 tiers >>>> 3 - 5 on Linux-x64, and by running JCK-12 Lang and VM tests on >>>> Linux-x64. >>>> >>>> Thanks, Harold >>> Looks good. 
>>> >> > From aph at redhat.com Fri Jan 4 13:16:25 2019 From: aph at redhat.com (Andrew Haley) Date: Fri, 4 Jan 2019 13:16:25 +0000 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> Message-ID: On 1/3/19 6:39 PM, Andrew Haley wrote: > On 1/3/19 3:50 PM, Erik ?sterlund wrote: > >> I think the value of moving to #pragma once instead of using include >> guards is that it is easy to mess up include guards. Admittedly, at >> least I tend to get them wrong quite often, and every time I get >> annoyed that we don't use #pragma once > > >> ... Might be exaggerating a bit... but still. > > Maybe. :-) > > But seriously, how often do you actually have to create a new header > file, or even edit the part of a header file that touched the > include guards? The reason this bug hasn't got much love at GCC is > that over there nobody (including, until today, me) could ever see > the point of include guards. It seems to me that they are a rather > fragile way of fixing a nonexistent problem. Sorry, this should have been that pragma once is a rather fragile way of fixing a nonexistent problem. > But hey, if it's really important to you, go ahead. This issue isn't > something I'm prepared to go to the barricades for. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. 
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Fri Jan 4 13:19:12 2019 From: aph at redhat.com (Andrew Haley) Date: Fri, 4 Jan 2019 13:19:12 +0000 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: <8736q8smt7.fsf@oldenburg2.str.redhat.com> References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <8736q8smt7.fsf@oldenburg2.str.redhat.com> Message-ID: <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com> On 1/4/19 11:07 AM, Florian Weimer wrote: > The guards occasionally cause bugs because they are not globally unique > or not spelled correctly in both places in the header file. Yeah, I know. That's the usual non-performance-related pro-#pragma once argument. After GCC and a bunch of other compilers implemented optimized include guards I thought that #pragma once would go away, but evidently not. :-) -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. 
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From coleen.phillimore at oracle.com Fri Jan 4 14:17:52 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 4 Jan 2019 09:17:52 -0500 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com> References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <8736q8smt7.fsf@oldenburg2.str.redhat.com> <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com> Message-ID: <3b327133-ea1e-d8e3-fc14-3cc4c7275c02@oracle.com> On 1/4/19 8:19 AM, Andrew Haley wrote: > On 1/4/19 11:07 AM, Florian Weimer wrote: >> The guards occasionally cause bugs because they are not globally unique >> or not spelled correctly in both places in the header file. > Yeah, I know. That's the usual non-performance-related pro-#pragma > once argument. After GCC and a bunch of other compilers implemented > optimized include guards I thought that #pragma once would go away, > but evidently not. :-) > It's not really a performance consideration, since with and without pragma once, the performance of building hotspot is the same. It's the annoyance of maintaining these guard strings, and the one-more-thing-to-remember if you move a file. Thanks, Coleen From daniel.daugherty at oracle.com Fri Jan 4 14:19:10 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 4 Jan 2019 09:19:10 -0500 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: References: Message-ID: Since Solaris-X64 is currently stuck at Oracle Studio 12u4 due to the lack of a devkit, this change will break the ability to build and test on Solaris-X64. 
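For comparison, here is a minimal sketch of the classic include guard, the form that a compiler without #pragma once support still accepts. The guard name and the Widget struct are made up for illustration and are not taken from the webrev; the header body is pasted twice to mimic a double #include of the same file.

```cpp
// Sketch only: SHARE_EXAMPLE_WIDGET_HPP and Widget are hypothetical names.

// First "inclusion": the guard macro is not yet defined, so the body is
// compiled and the macro is defined.
#ifndef SHARE_EXAMPLE_WIDGET_HPP
#define SHARE_EXAMPLE_WIDGET_HPP
struct Widget { int id; };
inline int widget_id(const Widget& w) { return w.id; }
#endif // SHARE_EXAMPLE_WIDGET_HPP

// Second "inclusion" of the same text: the macro is already defined, so the
// preprocessor skips the body and Widget is not redefined.
#ifndef SHARE_EXAMPLE_WIDGET_HPP
#define SHARE_EXAMPLE_WIDGET_HPP
struct Widget { int id; };
inline int widget_id(const Widget& w) { return w.id; }
#endif // SHARE_EXAMPLE_WIDGET_HPP
```

With #pragma once the three guard lines collapse into one, but only on compilers that recognize the pragma.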
Dan On 1/4/19 4:38 AM, Erik Österlund wrote: > Hi Matthias, > > The #pragma once support was added in Oracle Studio 12u5, which is > what I have been patiently waiting for. Having said that, I would > recommend using 12u6. > > Thanks for confirming that xlc 12.1 / AIX works. > > /Erik > > On 2019-01-04 08:57, Baesken, Matthias wrote: >> Hello Coleen, on Solaris Sparc with Oracle Studio 12u4 >> Oct2017 version we get >> >> " line 25: Error: Unrecognized #pragma once" >> >> Which Oracle Studio version do you recommend to use? >> >> >> (good news is that xlc 12.1 / AIX works ) >> >> >> Best regards, Matthias >> >> >>> Message: 2 >>> Date: Wed, 2 Jan 2019 21:16:59 -0500 >>> From: coleen.phillimore at oracle.com >>> To: hotspot-dev developers , John Paul >>> Adrian Glaubitz >>> Subject: RFR (tedious) 8216022: Use #pragma once >>> Message-ID: <9250036e-8696-6103-6c3f-513fa11ffebd at oracle.com> >>> Content-Type: text/plain; charset=utf-8; format=flowed >>> >>> Summary: change include guards to #pragma once, except in generated >>> header files. >>> >>> Tested with mach5 for linux-x64{-debug}, solaris-sparc, macosx-x64, >>> windows-x64, built aarch64 with cross compiler, and zero. >>> >>> Ran tier1 and 2 tests. >>> >>> The webrev is huge but there are only 3 lines changed in each header >>> file. So click on the patch. >>> >>> I'll update the copyright headers with a script with the commit. Also, >>> will do this after the shenandoah copyright headers are fixed. >>> >>> Adrian: I included you to check your platforms. >>> >>> Happy New Year! >>> Coleen >>> >>> >>> ------------------------------ >> .... >> >>> Here is the webrev and bug link. 
>>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8216022.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8216022 >>> > From matthias.baesken at sap.com Fri Jan 4 14:59:29 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Fri, 4 Jan 2019 14:59:29 +0000 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: References: Message-ID: Hello, probably we need an update to jdk/doc/building.html as well because this still documents 12.4 ... Oracle Solaris Studio The minimum accepted version of the Solaris Studio compilers is 5.13 (corresponding to Solaris Studio 12.4). Older versions will not be accepted by configure. The Solaris Studio installation should contain at least these packages: ... .... Best regards, Matthias > -----Original Message----- > From: Daniel D. Daugherty > Sent: Friday, 4 January 2019 15:19 > To: Erik Österlund ; Baesken, Matthias > ; hotspot-dev at openjdk.java.net > Subject: Re: RFR (tedious) 8216022: Use #pragma once > > Since Solaris-X64 is currently stuck at Oracle Studio 12u4 > due to the lack of a devkit, this change will break the > ability to build and test on Solaris-X64. > > Dan > > > On 1/4/19 4:38 AM, Erik Österlund wrote: > > Hi Matthias, > > > > The #pragma once support was added in Oracle Studio 12u5, which is > > what I have been patiently waiting for. Having said that, I would > > recommend using 12u6. > > > > Thanks for confirming that xlc 12.1 / AIX works. > > > > /Erik > > > > On 2019-01-04 08:57, Baesken, Matthias wrote: > >> Hello Coleen, on Solaris Sparc with Oracle Studio 12u4 > >> Oct2017 version we get > >> > >> " line 25: Error: Unrecognized #pragma once" > >> > >> Which Oracle Studio version do you recommend to use? 
> >> > >> > >> (good news is that xlc 12.1 / AIX 
works ) > >> > >> > >> Best regards, Matthias > >> > >> > >>> Message: 2 > >>> Date: Wed, 2 Jan 2019 21:16:59 -0500 > >>> From: coleen.phillimore at oracle.com > >>> To: hotspot-dev developers , John > Paul > >>> Adrian Glaubitz > >>> Subject: RFR (tedious) 8216022: Use #pragma once > >>> Message-ID: <9250036e-8696-6103-6c3f-513fa11ffebd at oracle.com> > >>> Content-Type: text/plain; charset=utf-8; format=flowed > >>> > >>> Summary: change include guards to #pragma once, except in generated > >>> header files. > >>> > >>> Tested with mach5 for linux-x64{-debug}, solaris-sparc, macosx-x64, > >>> windows-x64, built aarch64 with cross compiler, and zero. > >>> > >>> Ran tier1 and 2 tests. > >>> > >>> The webrev is huge but there are only 3 lines changed in each header > >>> file. So click on the patch. > >>> > >>> I'll update the copyright headers with a script with the commit. Also, > >>> will do this after the shenandoah copyright headers are fixed. > >>> > >>> Adrian: I included you to check your platforms. > >>> > >>> Happy New Year! > >>> Coleen > >>> > >>> > >>> ------------------------------ > >> .... > >> > >>> Here is the webrev and bug link. 
> >>> > >>> open webrev at > http://cr.openjdk.java.net/~coleenp/8216022.01/webrev > >>> bug link https://bugs.openjdk.java.net/browse/JDK-8216022 > >>> > > From david.lloyd at redhat.com Fri Jan 4 14:59:38 2019 From: david.lloyd at redhat.com (David Lloyd) Date: Fri, 4 Jan 2019 08:59:38 -0600 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com> References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <8736q8smt7.fsf@oldenburg2.str.redhat.com> <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com> Message-ID: On Fri, Jan 4, 2019 at 7:20 AM Andrew Haley wrote: > On 1/4/19 11:07 AM, Florian Weimer wrote: > > The guards occasionally cause bugs because they are not globally unique > > or not spelled correctly in both places in the header file. > > Yeah, I know. That's the usual non-performance-related pro-#pragma > once argument. After GCC and a bunch of other compilers implemented > optimized include guards I thought that #pragma once would go away, > but evidently not. :-) Not a reviewer, but... The whole thing is starting to seem like a really bad idea to me. I've been asking around and I can't find anyone who thinks this kind of change is a good idea; while this clearly isn't a scientific poll, it does not on the other hand raise my confidence any. It was pointed out to me that the GCC documentation has been recommending against this practice perhaps as early as 3.4. It boils down to this: #pragma once isn't standardized; there's simply no guarantee it will be supported on a given compiler. 
This problem seems *far* more significant to me than the risk of forgetting to update a macro name here or there (something that can be caught on code review), on the occasion that a header file is renamed or relocated such that it requires a change (and how often is this likely to happen anyway?). In addition, it was pointed out to me that if, for some reason, a header file ends up in more than one location on the include path, #pragma once will (probably, as it's not standardized) allow it to be included twice, which #ifdef guards avoid. This is perhaps not a real concern in this particular code base though. In the end though this is an example of the kind of change that I for one would never allow in one of my projects: it's large, potentially impacts portability, and yet in the end it's not really necessary, being really just a style issue when it comes right down to it. Include guards are standard and portable. '#pragma once' is not. -- - DML From matthias.baesken at sap.com Fri Jan 4 15:38:08 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Fri, 4 Jan 2019 15:38:08 +0000 Subject: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd In-Reply-To: References: <8f7ff454-ac87-505a-fe9c-15b4d4113093@oracle.com> <7f38de34-459f-27da-95ce-5528abf58f95@oracle.com> <2a31b3ca-85fc-29cb-413d-94dd98775a74@oracle.com> <44c628fa-81ce-ac22-da00-89dcc31ab314@oracle.com> <3a68eebc-3b3e-2911-e6e4-b7913f9a773a@oracle.com> <9b721b57-9b5b-c995-b4ed-668cf644c181@oracle.com> Message-ID: Hi David/Dan , here is my new webrev, the warning (for non-root) is back and uses the improved wording : http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.2/ Best Regards , Matthias > -----Original Message----- > From: David Holmes > Sent: Donnerstag, 3. 
Januar 2019 23:52 > To: daniel.daugherty at oracle.com; Baesken, Matthias > ; 'hotspot-dev at openjdk.java.net' dev at openjdk.java.net> > Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root > users on linux/bsd > > Hi Dan, > > Thanks for your reluctant agreement. I like your updated wording for the > warning. > > David > > On 4/01/2019 8:21 am, Daniel D. Daugherty wrote: > > On 1/3/19 4:49 PM, David Holmes wrote: > >> Hi Matthias, > >> > >> On 4/01/2019 12:56 am, Baesken, Matthias wrote: > >>> > >>>> I still want to know how the OS_ERR gets handled by all the higher > >>>> level > >>>> code. How will this failure at runtime get reported back to > >>>> application? code? > >>> > >>> Hi David ,?? the "best practice"? currently is? to just ignore the > >>> return code? of? os::set_native_priority? , it is called and we hope > >>> for the best? , > >> > >> Silent failure is not good. > > > > Agreed. > > > > > >> In that case I think it is appropriate to issue a warning whenever > >> ThreadPriorityPolicy=1 is set, though compatibility dictates no > >> warning in the case that you are root. > > > > Agreed that you should not get the warning if you are root. > > Reluctantly agree to always issue the warning if > > -XX:ThreadPriorityPolicy=1 is specified and user != root. > > > > > >> Which brings us back to square one and the original patch with a > >> different warning message: > >> > >> warning("-XX:ThreadPriorityPolicy=1 requires system level permissions > >> to be applied. If these permissions do not exist, changes to priority > >> will be silently ignored."); > > > > Perhaps: > > > > warning("-XX:ThreadPriorityPolicy=1 may require system level permission, > > e.g., being the 'root' user. If the necessary permission is not > > possessed, changes to priority will be silently ignored."); > > > > I changed: > > > > - "requires" -> "may require" because you don't always need > > ? a special permission to do the operation. 
> > - "system level permissions to be applied" -> "system level permission" > > ? so switch to singular permission, dropped "to be applied" > > - added ", e.g., being the 'root' user" > > - "If these permissions do not exist" -> "If the necessary permission is > > not possessed" > > ? so switch to singular permission, switch from "do not exist" to > > ? "is not possessed" > > > > > >> which then takes us back to Dan's comments that the warning should be > >> at time of use. But to that I maintain that because use may or may not > >> fail depending on both the available permissions and the requested > >> priority value, that it is better to have a single generic warning in > >> the existing place. > > > > This is where David and I disagree. I do not think we should issue a > > warning unless the operation failed and David prefers the generic > > warning in one place. > > > > I will reluctantly agree to always issue the warning if > > -XX:ThreadPriorityPolicy=1 is specified and user != root. > > > > Dan > > > >> > >>> ? for example : > >>> > >>> jdk/src/hotspot/share/runtime/vmThread.cpp > >>> > >>> 299? int prio = (VMThreadPriority == -1) > >>> 300??? ? os::java_to_os_priority[NearMaxPriority] > >>> 301??? : VMThreadPriority; > >>> 302? // Note that I cannot call os::set_priority because it expects Java > >>> 303? // priorities and I am *explicitly* using OS priorities so that > >>> it's > >>> 304? // possible to set the VM thread priority higher than any Java > >>> thread. > >>> 305? os::set_native_priority( this, prio ); > >>> > >>> > >>> jdk/src/hotspot/share/compiler/compileBroker.cpp > >>> > >>> 783????? int native_prio = CompilerThreadPriority; > >>> 784????? if (native_prio == -1) { > >>> 785??????? if (UseCriticalCompilerThreadPriority) { > >>> 786????????? native_prio = os::java_to_os_priority[CriticalPriority]; > >>> 787??????? } else { > >>> 788????????? native_prio = os::java_to_os_priority[NearMaxPriority]; > >>> 789??????? } > >>> 790????? 
} > >>> 791????? os::set_native_priority(thread, native_prio); > >>> > >>> > >>> A difference is > >>> > >>> jdk/src/hotspot/share/runtime/os.cpp > >>> > >>> 217OSReturn os::set_priority(Thread* thread, ThreadPriority p) { > >>> 218 debug_only(Thread::check_for_dangling_thread_pointer(thread);) > >>> 219 > >>> 220? if (p >= MinPriority && p <= MaxPriority) { > >>> 221??? int priority = java_to_os_priority[p]; > >>> 222??? return set_native_priority(thread, priority); > >>> > >>> Where the return? code? of? set_native_priority()?? is returned > >>> (however then it is later usually? not handled by the callers of > >>> os::set_priority. > >>> > >>>> > >>>> As I stated this is not a complete statement as on Linux at least you > >>>> also have to account for RLIMIT_RTPRIO. > >>>> > >>> > >>> If you want me to do, I can for course add a short statement about > >>> this . > >> > >> Quite the opposite, I'd rather see a generic statement about > >> permissions than try to cover all the possible situations. > >> > >> Thanks, > >> David > >> > >>> > >>> Best regards, Matthias > >>> > >>> > >>>> -----Original Message----- > >>>> From: David Holmes > >>>> Sent: Donnerstag, 3. Januar 2019 14:02 > >>>> To: Baesken, Matthias ; > >>>> daniel.daugherty at oracle.com; 'hotspot-dev at openjdk.java.net' > >>>> dev at openjdk.java.net> > >>>> Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for > >>>> non-root > >>>> users on linux/bsd > >>>> > >>>> Hi Matthias, > >>>> > >>>> On 3/01/2019 9:13 pm, Baesken, Matthias wrote: > >>>>> Hello David and Dan ,? here is a second webrev : > >>>>> > >>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.1/ > >>>>> > >>>>> - adjusted copyright years + fixed some typos > >>>>> - added? the missing return for FreeBSD? (pointed out by Dan) > >>>>> - removed? the? warning message? completely > >>>> > >>>> I still want to know how the OS_ERR gets handled by all the higher > >>>> level > >>>> code. 
How will this failure at runtime get reported back to application > >>>> code? > >>>> > >>>> ! // It is only used when ThreadPriorityPolicy=1 and requires root > >>>> privilege or > >>>> ! // CAP_SYS_NICE capability. > >>>> > >>>> As I stated this is not a complete statement as on Linux at least you > >>>> also have to account for RLIMIT_RTPRIO. > >>>> > >>>> Thanks, > >>>> David > >>>> ----- > >>>> > >>> > > From coleen.phillimore at oracle.com Fri Jan 4 15:36:41 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 4 Jan 2019 10:36:41 -0500 Subject: RFR (also tedious) 8216167: Update include guards to reflect correct directories Message-ID: <1284b762-6bba-4154-2b69-4433a1ccbb1e@oracle.com> Summary: Use a script and some manual fixup to fix directory names in include guards. Makes include guards match the current directory rooted at src/hotspot (removes VM_ in most cases). This should be low risk. Tested with mach5 tier1 and tier2. https://bugs.openjdk.java.net/browse/JDK-8216167 http://cr.openjdk.java.net/~coleenp/8216167.01.diff I didn't generate a webrev because of space concerns on cr.openjdk.java.net, and nobody should click on it anyway. The script is posted in the bug. Will update and check copyright headers with hg commit.
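As a rough illustration of the naming rule described in the summary (this is not the script posted in the bug, which is not reproduced here), a guard name can be derived from a header's path relative to src/hotspot by upper-casing it and turning every separator into an underscore. The helper name `guard_for` and the sample paths are assumptions made for this sketch.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of HotSpot or of the posted script):
// derive an include-guard macro name from a header path given relative
// to src/hotspot, e.g. "share/runtime/os.hpp".
// Letters are upper-cased; every non-alphanumeric character becomes '_'.
std::string guard_for(const std::string& relative_path) {
  std::string guard;
  guard.reserve(relative_path.size());
  for (unsigned char c : relative_path) {
    guard.push_back(std::isalnum(c) ? static_cast<char>(std::toupper(c)) : '_');
  }
  return guard;
}
```

Under this rule, `guard_for("share/runtime/os.hpp")` gives `SHARE_RUNTIME_OS_HPP`, i.e. a name without the `VM_` component that the summary says is removed in most cases.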
Thanks, Coleen From erik.osterlund at oracle.com Fri Jan 4 15:59:58 2019 From: erik.osterlund at oracle.com (Erik Österlund) Date: Fri, 4 Jan 2019 16:59:58 +0100 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <8736q8smt7.fsf@oldenburg2.str.redhat.com> <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com> Message-ID: <575efd28-6644-ee8c-846d-3787f41ab9ce@oracle.com> Hi David, On 2019-01-04 15:59, David Lloyd wrote: > On Fri, Jan 4, 2019 at 7:20 AM Andrew Haley wrote: >> On 1/4/19 11:07 AM, Florian Weimer wrote: >>> The guards occasionally cause bugs because they are not globally unique >>> or not spelled correctly in both places in the header file. >> Yeah, I know. That's the usual non-performance-related pro-#pragma >> once argument. After GCC and a bunch of other compilers implemented >> optimized include guards I thought that #pragma once would go away, >> but evidently not. :-) > Not a reviewer, but... > > The whole thing is starting to seem like a really bad idea to me. > I've been asking around and I can't find anyone who thinks this kind > of change is a good idea; while this clearly isn't a scientific poll, > it does not on the other hand raise my confidence any. It was pointed > out to me that the GCC documentation has been recommending against > this practice perhaps as early as 3.4. > > It boils down to this: #pragma once isn't standardized; there's simply > no guarantee it will be supported on a given compiler.
This problem > seems *far* more significant to me than the risk of forgetting to > update a macro name here or there (something that can be caught on > code review), on the occasion that a header file is renamed or > relocated such that it requires a change (and how often is this likely > to happen anyway?). Hotspot relies on a whole bunch of implementation defined compiler features that are not standardized, that all of our compilers support to work. I think that seems okay as long as all compilers are covered that we build with. If we were to insist on using only completely standardized features, we would have decades of work to get there, if we could do it at all. > In addition, it was pointed out to me that if, for some reason, a > header file ends up in more than one location on the include path, > #pragma once will (probably, as it's not standardized) allow it to be > included twice, which #ifdef guards avoid. This is perhaps not a real > concern in this particular code base though. That sounds like a bug. Just because something is implementation defined, doesn't make it okay or expected to not work. Do you know how to reproduce this, and on which platform/compiler/version? Obviously, if this was an issue in our code base, one would quickly notice it doesn't build. > In the end though this is an example of the kind of change that I for > one would never allow in one of my projects: it's large, potentially > impacts portability, and yet in the end it's not really necessary, > being really just a style issue when it comes right down to it. > Include guards are standard and portable. '#pragma once' is not. But it's not just style. It's getting rid of pointless day-to-day manual (error prone) boilerplate work for something that should be automated (adding a new file, moving an existing file, renaming an existing file). 
And as I said in my reply to Andrew, I for example had to poke around at quite a lot of barrier set files (>100 barrier set files were added for the GC barrier interface), and rename them, and it was very tedious, and resulted in some pointless screwups in some include guards. I would not be insisting if I did not perceive these issues as real, or if it were just a matter of style. So it seems to me that the problems I run into on a daily basis, which are very real to me, are not real problems to you, while the (to me hypothetical) issue of future portability of HotSpot to platforms whose compilers lack a feature that has been around in GCC since 3.4 is a real problem (which BTW, according to the #pragma once compatibility matrix on Wikipedia, would be the "Cray C and C++" compiler as of today). Strictly speaking, it wasn't "necessary" to move away from include DB either, and that was also a large change. Yet I'm glad we did. Sometimes we gotta change things to make our lives better. Thanks, /Erik From daniel.daugherty at oracle.com Fri Jan 4 16:04:05 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 4 Jan 2019 11:04:05 -0500 Subject: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd In-Reply-To: References: <8f7ff454-ac87-505a-fe9c-15b4d4113093@oracle.com> <7f38de34-459f-27da-95ce-5528abf58f95@oracle.com> <2a31b3ca-85fc-29cb-413d-94dd98775a74@oracle.com> <44c628fa-81ce-ac22-da00-89dcc31ab314@oracle.com> <3a68eebc-3b3e-2911-e6e4-b7913f9a773a@oracle.com> <9b721b57-9b5b-c995-b4ed-668cf644c181@oracle.com> Message-ID: <1addfd7a-77fd-ce6e-36a4-4dfc06382e76@oracle.com> On 1/4/19 10:38 AM, Baesken, Matthias wrote: > Hi David/Dan , here is my new webrev, the warning (for non-root) is back and uses the improved wording : > > http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.2/ src/hotspot/os/bsd/os_bsd.cpp L2260: // (e.g. root privilege or CAP_SYS_NICE capability).
nit: please add a comma after 'e.g.'. old L2310: if (!FLAG_IS_DEFAULT(ThreadPriorityPolicy)) { Why did you drop this check around the warning? The warning should only be issued if ThreadPriorityPolicy is not the default value, i.e., it was set on the cmd line to a non-default value. If some distro chooses to change the default of ThreadPriorityPolicy from 0 to 1, then this warning will always happen. src/hotspot/os/linux/os_linux.cpp L4080: // (e.g. root privilege or CAP_SYS_NICE capability). nit: please add a comma after 'e.g.'. old L4107: if (!FLAG_IS_DEFAULT(ThreadPriorityPolicy)) { Why did you drop this check around the warning? src/hotspot/share/runtime/globals.hpp No comments. Dan > > Best Regards , Matthias > > > >> -----Original Message----- >> From: David Holmes >> Sent: Thursday, 3 January 2019 23:52 >> To: daniel.daugherty at oracle.com; Baesken, Matthias >> ; 'hotspot-dev at openjdk.java.net' > dev at openjdk.java.net> >> Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root >> users on linux/bsd >> >> Hi Dan, >> >> Thanks for your reluctant agreement. I like your updated wording for the >> warning. >> >> David >> >> On 4/01/2019 8:21 am, Daniel D. Daugherty wrote: >>> On 1/3/19 4:49 PM, David Holmes wrote: >>>> Hi Matthias, >>>> >>>> On 4/01/2019 12:56 am, Baesken, Matthias wrote: >>>>>> I still want to know how the OS_ERR gets handled by all the higher >>>>>> level >>>>>> code. How will this failure at runtime get reported back to >>>>>> application code? >>>>> Hi David, the "best practice" currently is to just ignore the >>>>> return code of os::set_native_priority, it is called and we hope >>>>> for the best, >>>> Silent failure is not good. >>> Agreed.
>>> >>> >>>> In that case I think it is appropriate to issue a warning whenever >>>> ThreadPriorityPolicy=1 is set, though compatibility dictates no >>>> warning in the case that you are root. >>> Agreed that you should not get the warning if you are root. >>> Reluctantly agree to always issue the warning if >>> -XX:ThreadPriorityPolicy=1 is specified and user != root. >>> >>> >>>> Which brings us back to square one and the original patch with a >>>> different warning message: >>>> >>>> warning("-XX:ThreadPriorityPolicy=1 requires system level permissions >>>> to be applied. If these permissions do not exist, changes to priority >>>> will be silently ignored."); >>> Perhaps: >>> >>> warning("-XX:ThreadPriorityPolicy=1 may require system level permission, >>> e.g., being the 'root' user. If the necessary permission is not >>> possessed, changes to priority will be silently ignored."); >>> >>> I changed: >>> >>> - "requires" -> "may require" because you don't always need >>> ? a special permission to do the operation. >>> - "system level permissions to be applied" -> "system level permission" >>> ? so switch to singular permission, dropped "to be applied" >>> - added ", e.g., being the 'root' user" >>> - "If these permissions do not exist" -> "If the necessary permission is >>> not possessed" >>> ? so switch to singular permission, switch from "do not exist" to >>> ? "is not possessed" >>> >>> >>>> which then takes us back to Dan's comments that the warning should be >>>> at time of use. But to that I maintain that because use may or may not >>>> fail depending on both the available permissions and the requested >>>> priority value, that it is better to have a single generic warning in >>>> the existing place. >>> This is where David and I disagree. I do not think we should issue a >>> warning unless the operation failed and David prefers the generic >>> warning in one place. 
>>> >>> I will reluctantly agree to always issue the warning if >>> -XX:ThreadPriorityPolicy=1 is specified and user != root. >>> >>> Dan >>> >>>>> ? for example : >>>>> >>>>> jdk/src/hotspot/share/runtime/vmThread.cpp >>>>> >>>>> 299? int prio = (VMThreadPriority == -1) >>>>> 300??? ? os::java_to_os_priority[NearMaxPriority] >>>>> 301??? : VMThreadPriority; >>>>> 302? // Note that I cannot call os::set_priority because it expects Java >>>>> 303? // priorities and I am *explicitly* using OS priorities so that >>>>> it's >>>>> 304? // possible to set the VM thread priority higher than any Java >>>>> thread. >>>>> 305? os::set_native_priority( this, prio ); >>>>> >>>>> >>>>> jdk/src/hotspot/share/compiler/compileBroker.cpp >>>>> >>>>> 783????? int native_prio = CompilerThreadPriority; >>>>> 784????? if (native_prio == -1) { >>>>> 785??????? if (UseCriticalCompilerThreadPriority) { >>>>> 786????????? native_prio = os::java_to_os_priority[CriticalPriority]; >>>>> 787??????? } else { >>>>> 788????????? native_prio = os::java_to_os_priority[NearMaxPriority]; >>>>> 789??????? } >>>>> 790????? } >>>>> 791????? os::set_native_priority(thread, native_prio); >>>>> >>>>> >>>>> A difference is >>>>> >>>>> jdk/src/hotspot/share/runtime/os.cpp >>>>> >>>>> 217OSReturn os::set_priority(Thread* thread, ThreadPriority p) { >>>>> 218 debug_only(Thread::check_for_dangling_thread_pointer(thread);) >>>>> 219 >>>>> 220? if (p >= MinPriority && p <= MaxPriority) { >>>>> 221??? int priority = java_to_os_priority[p]; >>>>> 222??? return set_native_priority(thread, priority); >>>>> >>>>> Where the return? code? of? set_native_priority()?? is returned >>>>> (however then it is later usually? not handled by the callers of >>>>> os::set_priority. >>>>> >>>>>> As I stated this is not a complete statement as on Linux at least you >>>>>> also have to account for RLIMIT_RTPRIO. >>>>>> >>>>> If you want me to do, I can for course add a short statement about >>>>> this . 
>>>> Quite the opposite, I'd rather see a generic statement about >>>> permissions than try to cover all the possible situations. >>>> >>>> Thanks, >>>> David >>>> >>>>> Best regards, Matthias >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: David Holmes >>>>>> Sent: Donnerstag, 3. Januar 2019 14:02 >>>>>> To: Baesken, Matthias ; >>>>>> daniel.daugherty at oracle.com; 'hotspot-dev at openjdk.java.net' >> >>>>> dev at openjdk.java.net> >>>>>> Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for >>>>>> non-root >>>>>> users on linux/bsd >>>>>> >>>>>> Hi Matthias, >>>>>> >>>>>> On 3/01/2019 9:13 pm, Baesken, Matthias wrote: >>>>>>> Hello David and Dan ,? here is a second webrev : >>>>>>> >>>>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.1/ >>>>>>> >>>>>>> - adjusted copyright years + fixed some typos >>>>>>> - added? the missing return for FreeBSD? (pointed out by Dan) >>>>>>> - removed? the? warning message? completely >>>>>> I still want to know how the OS_ERR gets handled by all the higher >>>>>> level >>>>>> code. How will this failure at runtime get reported back to application >>>>>> code? >>>>>> >>>>>> ! // It is only used when ThreadPriorityPolicy=1 and requires root >>>>>> privilege or >>>>>> ! // CAP_SYS_NICE capability. >>>>>> >>>>>> As I stated this is not a complete statement as on Linux at least you >>>>>> also have to account for RLIMIT_RTPRIO. 
>>>>>> >>>>>> Thanks, >>>>>> David >>>>>> ----- >>>>>> From david.lloyd at redhat.com Fri Jan 4 16:29:33 2019 From: david.lloyd at redhat.com (David Lloyd) Date: Fri, 4 Jan 2019 10:29:33 -0600 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: <575efd28-6644-ee8c-846d-3787f41ab9ce@oracle.com> References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <8736q8smt7.fsf@oldenburg2.str.redhat.com> <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com> <575efd28-6644-ee8c-846d-3787f41ab9ce@oracle.com> Message-ID: On Fri, Jan 4, 2019 at 10:00 AM Erik Österlund wrote: > Hotspot relies on a whole bunch of implementation defined compiler > features that are not standardized, that all of our compilers support to > work. I think that seems okay as long as all compilers are covered that > we build with. If we were to insist on using only completely > standardized features, we would have decades of work to get there, if we > could do it at all. This is a weak logical argument: perhaps it's not 100% standards compliant, sure, but that doesn't mean that the idea of complying with the standard should be thrown away, or that standards compliance in a given situation is not a valid consideration. > That sounds like a bug. Just because something is implementation > defined, doesn't make it okay or expected to not work. Do you know how > to reproduce this, and on which platform/compiler/version? Obviously, if > this was an issue in our code base, one would quickly notice it doesn't > build. My understanding of the problem is that it's just a question of creating a header file, copying it to two locations (or having multiple apparent locations due to symlinking or similar), and then including it at both locations.
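The duplicated-header scenario can be imitated in a single translation unit by pasting the same guarded header body twice, which is roughly what the preprocessor sees when two copies of a file are included. This is only a sketch: the struct `Example`, the guard macro name, and the function `example_value` are made up for illustration, and it makes no claim about how any particular compiler treats `#pragma once`.

```cpp
// First "copy" of the header: the guard macro is not yet defined,
// so the body is compiled and the macro becomes defined.
#ifndef SHARE_RUNTIME_EXAMPLE_HPP
#define SHARE_RUNTIME_EXAMPLE_HPP
struct Example {
  int value;
};
#endif // SHARE_RUNTIME_EXAMPLE_HPP

// Second "copy" (same contents, as if found at another path): the macro
// is already defined, so the preprocessor skips the body and struct
// Example is not defined a second time.
#ifndef SHARE_RUNTIME_EXAMPLE_HPP
#define SHARE_RUNTIME_EXAMPLE_HPP
struct Example {
  int value;
};
#endif // SHARE_RUNTIME_EXAMPLE_HPP

// Uses the single surviving definition of Example.
int example_value() {
  Example e{42};
  return e.value;
}
```

If both guards are deleted, the second struct definition becomes a redefinition error, which is the failure mode at issue for any mechanism that keys on file identity rather than on a macro name.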
The include guard can prevent this by virtue of having the same macro name in both header files, whereas #pragma once does not prevent this as it generally seems to use the file path as the identity of the file. An alternative implementation might hash and compare the file contents, but I strongly doubt we'd see such an implementation as it would be detrimental to performance (and also weak against small changes to the file). Another possibility I haven't tried is to put the same file on the -I path *directly* more than once, and then simply #include it twice in a row. A reasonable interpretation of "#pragma once" is that it includes the given *file* or *full file path* one time, as opposed to including the given *file name* one time. "Sounds like a bug" is the tricky phrase here, because it's not standardized, so who is to say what the "correct" behavior is? Mostly it comes down to a given person's subjective notion of "common sense"; if everyone had a truly confluent notion of common sense, then we'd only have one compiler implementation, and wouldn't need standards at all. Of course, such "logic" doesn't hold up to the real world for one nanosecond. -- - DML From matthias.baesken at sap.com Fri Jan 4 17:04:34 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Fri, 4 Jan 2019 17:04:34 +0000 Subject: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd In-Reply-To: <1addfd7a-77fd-ce6e-36a4-4dfc06382e76@oracle.com> References: <8f7ff454-ac87-505a-fe9c-15b4d4113093@oracle.com> <7f38de34-459f-27da-95ce-5528abf58f95@oracle.com> <2a31b3ca-85fc-29cb-413d-94dd98775a74@oracle.com> <44c628fa-81ce-ac22-da00-89dcc31ab314@oracle.com> <3a68eebc-3b3e-2911-e6e4-b7913f9a773a@oracle.com> <9b721b57-9b5b-c995-b4ed-668cf644c181@oracle.com> <1addfd7a-77fd-ce6e-36a4-4dfc06382e76@oracle.com> Message-ID: New webrev : http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.3/ - comma added - FLAG_IS_DEFAULT(...) 
check is back - I thought that the check is no longer needed but maybe I was wrong Best regards, Matthias > -----Original Message----- > From: Daniel D. Daugherty > Sent: Freitag, 4. Januar 2019 17:04 > To: Baesken, Matthias ; David Holmes > ; 'hotspot-dev at openjdk.java.net' dev at openjdk.java.net> > Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root > users on linux/bsd > > On 1/4/19 10:38 AM, Baesken, Matthias wrote: > > Hi David/Dan , here is my new webrev, the warning (for non-root) is > back and uses the improved wording : > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.2/ > > src/hotspot/os/bsd/os_bsd.cpp > ??? L2260: // (e.g. root privilege or CAP_SYS_NICE capability). > ??????? nit: please add a comma after 'e.g.'. > > ??? old L2310: ????? if (!FLAG_IS_DEFAULT(ThreadPriorityPolicy)) { > ??????? Why did you drop this check around the warning? The > ??????? warning should only be issued if ThreadPriorityPolicy > ??????? is not the default value, i.e., it was set on the cmd > ??????? line to a non-default value. If some distro chooses to > ??????? change the default of ThreadPriorityPolicy from 0 to 1, > ??????? then this warning will always happen. > > src/hotspot/os/linux/os_linux.cpp > ??? L4080: // (e.g. root privilege or CAP_SYS_NICE capability). > ??????? nit: please add a comma after 'e.g.'. > > ??? old L4107: ????? if (!FLAG_IS_DEFAULT(ThreadPriorityPolicy)) { > ??????? Why did you drop this check around the warning? > > src/hotspot/share/runtime/globals.hpp > ??? No comments. > > Dan > > > > > Best Regards , Matthias > > > > > > > >> -----Original Message----- > >> From: David Holmes > >> Sent: Donnerstag, 3. 
Januar 2019 23:52 > >> To: daniel.daugherty at oracle.com; Baesken, Matthias > >> ; 'hotspot-dev at openjdk.java.net' > >> dev at openjdk.java.net> > >> Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non- > root > >> users on linux/bsd > >> > >> Hi Dan, > >> > >> Thanks for your reluctant agreement. I like your updated wording for the > >> warning. > >> > >> David > >> > >> On 4/01/2019 8:21 am, Daniel D. Daugherty wrote: > >>> On 1/3/19 4:49 PM, David Holmes wrote: > >>>> Hi Matthias, > >>>> > >>>> On 4/01/2019 12:56 am, Baesken, Matthias wrote: > >>>>>> I still want to know how the OS_ERR gets handled by all the higher > >>>>>> level > >>>>>> code. How will this failure at runtime get reported back to > >>>>>> application? code? > >>>>> Hi David ,?? the "best practice"? currently is? to just ignore the > >>>>> return code? of? os::set_native_priority? , it is called and we hope > >>>>> for the best? , > >>>> Silent failure is not good. > >>> Agreed. > >>> > >>> > >>>> In that case I think it is appropriate to issue a warning whenever > >>>> ThreadPriorityPolicy=1 is set, though compatibility dictates no > >>>> warning in the case that you are root. > >>> Agreed that you should not get the warning if you are root. > >>> Reluctantly agree to always issue the warning if > >>> -XX:ThreadPriorityPolicy=1 is specified and user != root. > >>> > >>> > >>>> Which brings us back to square one and the original patch with a > >>>> different warning message: > >>>> > >>>> warning("-XX:ThreadPriorityPolicy=1 requires system level permissions > >>>> to be applied. If these permissions do not exist, changes to priority > >>>> will be silently ignored."); > >>> Perhaps: > >>> > >>> warning("-XX:ThreadPriorityPolicy=1 may require system level > permission, > >>> e.g., being the 'root' user. 
If the necessary permission is not > >>> possessed, changes to priority will be silently ignored."); > >>> > >>> I changed: > >>> > >>> - "requires" -> "may require" because you don't always need > >>> ? a special permission to do the operation. > >>> - "system level permissions to be applied" -> "system level permission" > >>> ? so switch to singular permission, dropped "to be applied" > >>> - added ", e.g., being the 'root' user" > >>> - "If these permissions do not exist" -> "If the necessary permission is > >>> not possessed" > >>> ? so switch to singular permission, switch from "do not exist" to > >>> ? "is not possessed" > >>> > >>> > >>>> which then takes us back to Dan's comments that the warning should > be > >>>> at time of use. But to that I maintain that because use may or may not > >>>> fail depending on both the available permissions and the requested > >>>> priority value, that it is better to have a single generic warning in > >>>> the existing place. > >>> This is where David and I disagree. I do not think we should issue a > >>> warning unless the operation failed and David prefers the generic > >>> warning in one place. > >>> > >>> I will reluctantly agree to always issue the warning if > >>> -XX:ThreadPriorityPolicy=1 is specified and user != root. > >>> > >>> Dan > >>> > >>>>> ? for example : > >>>>> > >>>>> jdk/src/hotspot/share/runtime/vmThread.cpp > >>>>> > >>>>> 299? int prio = (VMThreadPriority == -1) > >>>>> 300??? ? os::java_to_os_priority[NearMaxPriority] > >>>>> 301??? : VMThreadPriority; > >>>>> 302? // Note that I cannot call os::set_priority because it expects Java > >>>>> 303? // priorities and I am *explicitly* using OS priorities so that > >>>>> it's > >>>>> 304? // possible to set the VM thread priority higher than any Java > >>>>> thread. > >>>>> 305? os::set_native_priority( this, prio ); > >>>>> > >>>>> > >>>>> jdk/src/hotspot/share/compiler/compileBroker.cpp > >>>>> > >>>>> 783????? 
int native_prio = CompilerThreadPriority; > >>>>> 784????? if (native_prio == -1) { > >>>>> 785??????? if (UseCriticalCompilerThreadPriority) { > >>>>> 786????????? native_prio = os::java_to_os_priority[CriticalPriority]; > >>>>> 787??????? } else { > >>>>> 788????????? native_prio = os::java_to_os_priority[NearMaxPriority]; > >>>>> 789??????? } > >>>>> 790????? } > >>>>> 791????? os::set_native_priority(thread, native_prio); > >>>>> > >>>>> > >>>>> A difference is > >>>>> > >>>>> jdk/src/hotspot/share/runtime/os.cpp > >>>>> > >>>>> 217OSReturn os::set_priority(Thread* thread, ThreadPriority p) { > >>>>> 218 > debug_only(Thread::check_for_dangling_thread_pointer(thread);) > >>>>> 219 > >>>>> 220? if (p >= MinPriority && p <= MaxPriority) { > >>>>> 221??? int priority = java_to_os_priority[p]; > >>>>> 222??? return set_native_priority(thread, priority); > >>>>> > >>>>> Where the return? code? of? set_native_priority()?? is returned > >>>>> (however then it is later usually? not handled by the callers of > >>>>> os::set_priority. > >>>>> > >>>>>> As I stated this is not a complete statement as on Linux at least you > >>>>>> also have to account for RLIMIT_RTPRIO. > >>>>>> > >>>>> If you want me to do, I can for course add a short statement about > >>>>> this . > >>>> Quite the opposite, I'd rather see a generic statement about > >>>> permissions than try to cover all the possible situations. > >>>> > >>>> Thanks, > >>>> David > >>>> > >>>>> Best regards, Matthias > >>>>> > >>>>> > >>>>>> -----Original Message----- > >>>>>> From: David Holmes > >>>>>> Sent: Donnerstag, 3. 
Januar 2019 14:02 > >>>>>> To: Baesken, Matthias ; > >>>>>> daniel.daugherty at oracle.com; 'hotspot-dev at openjdk.java.net' > >> >>>>>> dev at openjdk.java.net> > >>>>>> Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for > >>>>>> non-root > >>>>>> users on linux/bsd > >>>>>> > >>>>>> Hi Matthias, > >>>>>> > >>>>>> On 3/01/2019 9:13 pm, Baesken, Matthias wrote: > >>>>>>> Hello David and Dan ,? here is a second webrev : > >>>>>>> > >>>>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.1/ > >>>>>>> > >>>>>>> - adjusted copyright years + fixed some typos > >>>>>>> - added? the missing return for FreeBSD? (pointed out by Dan) > >>>>>>> - removed? the? warning message? completely > >>>>>> I still want to know how the OS_ERR gets handled by all the higher > >>>>>> level > >>>>>> code. How will this failure at runtime get reported back to application > >>>>>> code? > >>>>>> > >>>>>> ! // It is only used when ThreadPriorityPolicy=1 and requires root > >>>>>> privilege or > >>>>>> ! // CAP_SYS_NICE capability. > >>>>>> > >>>>>> As I stated this is not a complete statement as on Linux at least you > >>>>>> also have to account for RLIMIT_RTPRIO. > >>>>>> > >>>>>> Thanks, > >>>>>> David > >>>>>> ----- > >>>>>> From daniel.daugherty at oracle.com Fri Jan 4 17:13:00 2019 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 4 Jan 2019 12:13:00 -0500 Subject: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd In-Reply-To: References: <8f7ff454-ac87-505a-fe9c-15b4d4113093@oracle.com> <7f38de34-459f-27da-95ce-5528abf58f95@oracle.com> <2a31b3ca-85fc-29cb-413d-94dd98775a74@oracle.com> <44c628fa-81ce-ac22-da00-89dcc31ab314@oracle.com> <3a68eebc-3b3e-2911-e6e4-b7913f9a773a@oracle.com> <9b721b57-9b5b-c995-b4ed-668cf644c181@oracle.com> <1addfd7a-77fd-ce6e-36a4-4dfc06382e76@oracle.com> Message-ID: <6e4737b0-3521-8358-31af-3de5d4adda00@oracle.com> On 1/4/19 12:04 PM, Baesken, Matthias wrote: > New webrev : > > http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.3/ src/hotspot/os/bsd/os_bsd.cpp No comments. src/hotspot/os/linux/os_linux.cpp No comments. src/hotspot/share/runtime/globals.hpp No comments. Thumbs up. Dan > > - comma added > - FLAG_IS_DEFAULT(...) check is back - I thought that the check is no longer needed but maybe I was wrong > > > Best regards, Matthias > > >> -----Original Message----- >> From: Daniel D. Daugherty >> Sent: Friday, 4 January 2019 17:04 >> To: Baesken, Matthias ; David Holmes >> ; 'hotspot-dev at openjdk.java.net' > dev at openjdk.java.net> >> Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root >> users on linux/bsd >> >> On 1/4/19 10:38 AM, Baesken, Matthias wrote: >>> Hi David/Dan , here is my new webrev, the warning (for non-root) is >> back and uses the improved wording : >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.2/ >> src/hotspot/os/bsd/os_bsd.cpp >> L2260: // (e.g. root privilege or CAP_SYS_NICE capability). >> nit: please add a comma after 'e.g.'. >> >> old L2310: if (!FLAG_IS_DEFAULT(ThreadPriorityPolicy)) { >> Why did you drop this check around the warning? The >> warning should only be issued if ThreadPriorityPolicy >> is not the default value, i.e., it was set on the cmd >>
line to a non-default value. If some distro chooses to >> ??????? change the default of ThreadPriorityPolicy from 0 to 1, >> ??????? then this warning will always happen. >> >> src/hotspot/os/linux/os_linux.cpp >> ??? L4080: // (e.g. root privilege or CAP_SYS_NICE capability). >> ??????? nit: please add a comma after 'e.g.'. >> >> ??? old L4107: ????? if (!FLAG_IS_DEFAULT(ThreadPriorityPolicy)) { >> ??????? Why did you drop this check around the warning? >> >> src/hotspot/share/runtime/globals.hpp >> ??? No comments. >> >> Dan >> >>> Best Regards , Matthias >>> >>> >>> >>>> -----Original Message----- >>>> From: David Holmes >>>> Sent: Donnerstag, 3. Januar 2019 23:52 >>>> To: daniel.daugherty at oracle.com; Baesken, Matthias >>>> ; 'hotspot-dev at openjdk.java.net' >> >>> dev at openjdk.java.net> >>>> Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non- >> root >>>> users on linux/bsd >>>> >>>> Hi Dan, >>>> >>>> Thanks for your reluctant agreement. I like your updated wording for the >>>> warning. >>>> >>>> David >>>> >>>> On 4/01/2019 8:21 am, Daniel D. Daugherty wrote: >>>>> On 1/3/19 4:49 PM, David Holmes wrote: >>>>>> Hi Matthias, >>>>>> >>>>>> On 4/01/2019 12:56 am, Baesken, Matthias wrote: >>>>>>>> I still want to know how the OS_ERR gets handled by all the higher >>>>>>>> level >>>>>>>> code. How will this failure at runtime get reported back to >>>>>>>> application? code? >>>>>>> Hi David ,?? the "best practice"? currently is? to just ignore the >>>>>>> return code? of? os::set_native_priority? , it is called and we hope >>>>>>> for the best? , >>>>>> Silent failure is not good. >>>>> Agreed. >>>>> >>>>> >>>>>> In that case I think it is appropriate to issue a warning whenever >>>>>> ThreadPriorityPolicy=1 is set, though compatibility dictates no >>>>>> warning in the case that you are root. >>>>> Agreed that you should not get the warning if you are root. 
>>>>> Reluctantly agree to always issue the warning if >>>>> -XX:ThreadPriorityPolicy=1 is specified and user != root. >>>>> >>>>> >>>>>> Which brings us back to square one and the original patch with a >>>>>> different warning message: >>>>>> >>>>>> warning("-XX:ThreadPriorityPolicy=1 requires system level permissions >>>>>> to be applied. If these permissions do not exist, changes to priority >>>>>> will be silently ignored."); >>>>> Perhaps: >>>>> >>>>> warning("-XX:ThreadPriorityPolicy=1 may require system level >> permission, >>>>> e.g., being the 'root' user. If the necessary permission is not >>>>> possessed, changes to priority will be silently ignored."); >>>>> >>>>> I changed: >>>>> >>>>> - "requires" -> "may require" because you don't always need >>>>> ? a special permission to do the operation. >>>>> - "system level permissions to be applied" -> "system level permission" >>>>> ? so switch to singular permission, dropped "to be applied" >>>>> - added ", e.g., being the 'root' user" >>>>> - "If these permissions do not exist" -> "If the necessary permission is >>>>> not possessed" >>>>> ? so switch to singular permission, switch from "do not exist" to >>>>> ? "is not possessed" >>>>> >>>>> >>>>>> which then takes us back to Dan's comments that the warning should >> be >>>>>> at time of use. But to that I maintain that because use may or may not >>>>>> fail depending on both the available permissions and the requested >>>>>> priority value, that it is better to have a single generic warning in >>>>>> the existing place. >>>>> This is where David and I disagree. I do not think we should issue a >>>>> warning unless the operation failed and David prefers the generic >>>>> warning in one place. >>>>> >>>>> I will reluctantly agree to always issue the warning if >>>>> -XX:ThreadPriorityPolicy=1 is specified and user != root. >>>>> >>>>> Dan >>>>> >>>>>>> ? for example : >>>>>>> >>>>>>> jdk/src/hotspot/share/runtime/vmThread.cpp >>>>>>> >>>>>>> 299? 
int prio = (VMThreadPriority == -1)
>>>>>>> 300    ? os::java_to_os_priority[NearMaxPriority]
>>>>>>> 301    : VMThreadPriority;
>>>>>>> 302  // Note that I cannot call os::set_priority because it expects Java
>>>>>>> 303  // priorities and I am *explicitly* using OS priorities so that it's
>>>>>>> 304  // possible to set the VM thread priority higher than any Java thread.
>>>>>>> 305  os::set_native_priority( this, prio );
>>>>>>>
>>>>>>>
>>>>>>> jdk/src/hotspot/share/compiler/compileBroker.cpp
>>>>>>>
>>>>>>> 783      int native_prio = CompilerThreadPriority;
>>>>>>> 784      if (native_prio == -1) {
>>>>>>> 785        if (UseCriticalCompilerThreadPriority) {
>>>>>>> 786          native_prio = os::java_to_os_priority[CriticalPriority];
>>>>>>> 787        } else {
>>>>>>> 788          native_prio = os::java_to_os_priority[NearMaxPriority];
>>>>>>> 789        }
>>>>>>> 790      }
>>>>>>> 791      os::set_native_priority(thread, native_prio);
>>>>>>>
>>>>>>>
>>>>>>> A difference is
>>>>>>>
>>>>>>> jdk/src/hotspot/share/runtime/os.cpp
>>>>>>>
>>>>>>> 217 OSReturn os::set_priority(Thread* thread, ThreadPriority p) {
>>>>>>> 218   debug_only(Thread::check_for_dangling_thread_pointer(thread);)
>>>>>>> 219
>>>>>>> 220   if (p >= MinPriority && p <= MaxPriority) {
>>>>>>> 221     int priority = java_to_os_priority[p];
>>>>>>> 222     return set_native_priority(thread, priority);
>>>>>>>
>>>>>>> where the return code of set_native_priority() is returned
>>>>>>> (however, it is then usually not handled by the callers of
>>>>>>> os::set_priority).
>>>>>>>
>>>>>>>> As I stated this is not a complete statement as on Linux at least you
>>>>>>>> also have to account for RLIMIT_RTPRIO.
>>>>>>>>
>>>>>>> If you want me to, I can of course add a short statement about
>>>>>>> this.
>>>>>> Quite the opposite, I'd rather see a generic statement about
>>>>>> permissions than try to cover all the possible situations.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>> Best regards, Matthias
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: David Holmes
>>>>>>>> Sent: Thursday, 3 January 2019 14:02
>>>>>>>> To: Baesken, Matthias ;
>>>>>>>> daniel.daugherty at oracle.com; 'hotspot-dev at openjdk.java.net'
>>>> >>>>>>> dev at openjdk.java.net>
>>>>>>>> Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for
>>>>>>>> non-root
>>>>>>>> users on linux/bsd
>>>>>>>>
>>>>>>>> Hi Matthias,
>>>>>>>>
>>>>>>>> On 3/01/2019 9:13 pm, Baesken, Matthias wrote:
>>>>>>>>> Hello David and Dan, here is a second webrev:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.1/
>>>>>>>>>
>>>>>>>>> - adjusted copyright years + fixed some typos
>>>>>>>>> - added the missing return for FreeBSD (pointed out by Dan)
>>>>>>>>> - removed the warning message completely
>>>>>>>> I still want to know how the OS_ERR gets handled by all the higher
>>>>>>>> level
>>>>>>>> code. How will this failure at runtime get reported back to application
>>>>>>>> code?
>>>>>>>>
>>>>>>>> ! // It is only used when ThreadPriorityPolicy=1 and requires root
>>>>>>>> privilege or
>>>>>>>> ! // CAP_SYS_NICE capability.
>>>>>>>>
>>>>>>>> As I stated this is not a complete statement as on Linux at least you
>>>>>>>> also have to account for RLIMIT_RTPRIO.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>

From thomas.stuefe at gmail.com Fri Jan 4 17:24:28 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Fri, 4 Jan 2019 18:24:28 +0100
Subject: RFR (tedious) 8216022: Use #pragma once
In-Reply-To:
References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <8736q8smt7.fsf@oldenburg2.str.redhat.com> <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com>
Message-ID:

On Fri, Jan 4, 2019 at 4:01 PM David Lloyd wrote:
> In the end though this is an example of the kind of change that I for
> one would never allow in one of my projects: it's large, potentially
> impacts portability, and yet in the end it's not really necessary,
> being really just a style issue when it comes right down to it.
> Include guards are standard and portable. '#pragma once' is not.
>
> --
> - DML

FWIW, I agree with David on this. Include guards are a simple mechanism, while the potential troubles surrounding #pragma once worry me. Include guard errors are easy to find and fix. But the potential issues with pragma once sound difficult to analyze and almost impossible to fix if the compiler turns out to be the culprit.

Note that at SAP people tend to build out-of-tree and often across file system borders, with sources on a shared file system and a local output directory. So yes, that is a common usage scenario.

That said, I can understand Erik's pain when creating/changing so many includes. But how common is this scenario? Changing so many include files happens usually in the course of major rewrites which I would hope do not occur so often that we need to optimize our workflow for them.
After all, these changes also bring other disruption: file history gets broken, it is more difficult to compare code across JDK versions, etc.

Bottom line: I would prefer keeping include guards and maybe adding a tool to generate include guards automatically.

Thanks, Thomas

From erik.osterlund at oracle.com Fri Jan 4 17:58:09 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Fri, 4 Jan 2019 18:58:09 +0100
Subject: RFR (tedious) 8216022: Use #pragma once
In-Reply-To:
References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <8736q8smt7.fsf@oldenburg2.str.redhat.com> <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com> <575efd28-6644-ee8c-846d-3787f41ab9ce@oracle.com>
Message-ID: <0c202966-321a-b20b-c074-dbcb5a72fcc1@oracle.com>

Hi David,

On 2019-01-04 17:29, David Lloyd wrote:
> On Fri, Jan 4, 2019 at 10:00 AM Erik Österlund
> wrote:
>> Hotspot relies on a whole bunch of implementation defined compiler
>> features that are not standardized, that all of our compilers support to
>> work. I think that seems okay as long as all compilers are covered that
>> we build with. If we were to insist on using only completely
>> standardized features, we would have decades of work to get there, if we
>> could do it at all.
>
> This is a weak logical argument: perhaps it's not 100% standards
> compliant, sure, but that doesn't mean that the idea of complying to
> the standard should be thrown away, or that standards compliance in a
> given situation is not a valid consideration.

Sure. But as mentioned, it's only the Cray C/C++ compiler as of today that doesn't support #pragma once.
And if a new platform in the future comes along that doesn't support #pragma once, that we want to support, we can easily switch back with a simple script, like the one Kim pointed out. > >> That sounds like a bug. Just because something is implementation >> defined, doesn't make it okay or expected to not work. Do you know how >> to reproduce this, and on which platform/compiler/version? Obviously, if >> this was an issue in our code base, one would quickly notice it doesn't >> build. > > My understanding of the problem is that it's just a question of > creating a header file, copying it to two locations (or having > multiple apparent locations due to symlinking or similar), and then > including it at both locations. The include guard can prevent this by > virtue of having the same macro name in both header files, whereas > #pragma once does not prevent this as it generally seems to use the > file path as the identity of the file. An alternative implementation > might hash and compare the file contents, but I strongly doubt we'd > see such an implementation as it would be detrimental to performance > (and also weak against small changes to the file). > > Another possibility I haven't tried is to put the same file on the -I > path *directly* more than once, and then simply #include it twice in a > row. A reasonable interpretation of "#pragma once" is that it > includes the given *file* or *full file path* one time, as opposed to > including the given *file name* one time. "Sounds like a bug" is the > tricky phrase here, because it's not standardized, so who is to say > what the "correct" behavior is? Mostly it comes down to a given > person's subjective notion of "common sense"; if everyone had a truly > confluent notion of common sense, then we'd only have one compiler > implementation, and wouldn't need standards at all. Of course, such > "logic" doesn't hold up to the real world for one nanosecond. > I tried all of these described scenarios locally. 
They all work fine with my GCC compiler, with different include paths and resolving symlinks. So I'm not sure what compiler/version has the bug you are talking about. Do you know? Naturally, if you have two copies of a file, then of course that won't compile the way it would with include guards. But trying to maintain two exact copies of a file, and relying on include guards to always mask away one of the two copies, sounds like a *really* bad, and potentially dangerous idea, and something we should avoid at any cost. So providing a compiler error in such a scenario sounds much safer than allowing that kind of setup. As for the argument that implementation dependent features may do whatever they please and hence should be avoided... that's not very helpful. For example, we couldn't even use reinterpret_cast without relying on implementation dependent features. So if we want to ride on our high horses and not use implementation dependent compiler features, we would have to rewrite HotSpot as well. In this case, if the path resolution was different in a strange compiler causing a file to be included twice, then we would simply not be able to build, and then decide if a conversion back is the step forward, or fixing the bug in that C++ compiler. It would never subtly be able to build but have different behaviour. So I'm not sure what the worry is here. Thanks, /Erik From lois.foltan at oracle.com Fri Jan 4 18:00:03 2019 From: lois.foltan at oracle.com (Lois Foltan) Date: Fri, 4 Jan 2019 13:00:03 -0500 Subject: RFR 8215731: Move forward class definitions out of globalDefinitions.hpp In-Reply-To: References: <546b2f19-a166-df67-c05e-6b5d5c44f844@oracle.com> Message-ID: <1b0067c3-3d40-b998-5743-39c1f569df15@oracle.com> +1. Lois On 1/4/2019 12:08 AM, David Holmes wrote: > Hi Coleen, > > Seems fine. > > Thanks, > David > > On 4/01/2019 1:20 pm, coleen.phillimore at oracle.com wrote: >> Summary: redistribute the forward declarations to the header files >> that need them. 
>>
>> Tested with mach5 tier1 and 2, also tested aarch64 with cross
>> compiler and zero.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8215731.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8215731
>>
>> Thanks,
>> Coleen
>>

From lois.foltan at oracle.com Fri Jan 4 18:03:27 2019
From: lois.foltan at oracle.com (Lois Foltan)
Date: Fri, 4 Jan 2019 13:03:27 -0500
Subject: RFR (also tedious) 8216167: Update include guards to reflect correct directories
In-Reply-To: <1284b762-6bba-4154-2b69-4433a1ccbb1e@oracle.com>
References: <1284b762-6bba-4154-2b69-4433a1ccbb1e@oracle.com>
Message-ID: <1badee2a-3153-3674-f56f-d8dde91843ed@oracle.com>

Looks good.
Lois

On 1/4/2019 10:36 AM, coleen.phillimore at oracle.com wrote:
> Summary: Use script and some manual fixup to fix directory names in
> include guards.
>
> Makes include guards match the current directory rooted at src/hotspot
> (removes VM_ in most cases).
>
> This should be low risk. Tested with mach5 tier1 and tier2.
>
> https://bugs.openjdk.java.net/browse/JDK-8216167
> http://cr.openjdk.java.net/~coleenp/8216167.01.diff
>
> I didn't generate a webrev as a space concern for cr.openjdk.java.net
> and nobody should click on it. Script is posted in bug. Will update
> and check copyright headers with hg commit.
>
> Thanks,
> Coleen

From sgehwolf at redhat.com Fri Jan 4 18:09:44 2019
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Fri, 04 Jan 2019 19:09:44 +0100
Subject: [Containers] Reasoning for cpu shares limits
Message-ID: <3b7cef4912bcc0e14e64df95227f5d02d3f0fe62.camel@redhat.com>

Hi,

Having come across this cloud foundry issue[1], I wonder why the cgroup cpu shares' value is being used in the JVM as a heuristic for available processors.

From the man page from docker-run:

---------------------------------------------------------
--cpu-shares=0
CPU shares (relative weight)

By default, all containers get the same proportion of CPU cycles.
This proportion can be modified by changing the container's CPU share weighting relative to the weighting of all other running containers. To modify the proportion from the default of 1024, use the --cpu-shares flag to set the weighting to 2 or higher. The proportion will only apply when CPU-intensive processes are running. When tasks in one container are idle, other containers can use the left-over CPU time. The actual amount of CPU time will vary depending on the number of containers running on the system. For example, consider three containers, one has a cpu-share of 1024 and two others have a cpu-share setting of 512. When processes in all three containers attempt to use 100% of CPU, the first container would receive 50% of the total CPU time. If you add a fourth container with a cpu-share of 1024, the first container only gets 33% of the CPU. The remaining containers receive 16.5%, 16.5% and 33% of the CPU. On a multi-core system, the shares of CPU time are distributed over all CPU cores. Even if a container is limited to less than 100% of CPU time, it can use 100% of each individual CPU core. For example, consider a system with more than three cores. If you start one container {C0} with -c=512 running one process, and another container {C1} with -c=1024 running two processes, this can result in the following division of CPU shares: PID container CPU CPU share 100 {C0} 0 100% of CPU0 101 {C1} 1 100% of CPU1 102 {C1} 2 100% of CPU2 --------------------------------------------------------- So the cpu shares value (unlike --cpu-quota) is a relative weight. 
For example, these three cpu-shares settings are equivalent (C1-C4 are containers; '-c' is a shortcut for '--cpu-shares'):

A[i]
-------------
C1 => -c=122
C2 => -c=122
C3 => -c=61
C4 => -c=61

B[ii]
-------------
C1 => -c=1026
C2 => -c=1026
C3 => -c=513
C4 => -c=513

C[iii]
-------------
C1 => -c=2048
C2 => -c=2048
C3 => -c=1024
C4 => -c=1024

For A the container CPU heuristics will determine for the JVM to use 1 CPU for C1-C4. For B and C, the container CPU heuristics will determine for the JVM to use 2 CPUs for C1 and C2 and 1 CPU for C3 and C4, which seems rather inconsistent and arbitrary. The reason this is happening is that 1024 seems to have gotten a questionable meaning in [2]. I wonder why?

The JVM cannot reasonably determine from the relative weight of --cpu-shares' value how many CPUs it should use. As it's a relative weight, that's something for the container runtime to take into account. It appears to me that the container detection code should probably fall back to the host CPU value and only take CPU quotas into account.

Am I missing something obvious here? All I could find was this in JDK-8146115:
"""
If cpu_shares has been setup for the container, the number_of_cpus()
will be calculated based on cpu_shares()/1024. 1024 is the default and
standard unit for calculating relative cpu
"""

"1024 is the default and standard unit for calculating relative cpu" seems a wrong assumption to me. Thoughts?
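To make the arithmetic behind scenarios A, B and C concrete, here is a minimal Java sketch of a shares-to-CPU mapping. It is not the HotSpot code: the class and method names are hypothetical, the host size of 8 CPUs is illustrative, and rounding up is an assumption chosen because it reproduces the counts reported above from the "cpu_shares()/1024" rule.

```java
public class CpuSharesHeuristic {
    // Assumed unit: 1024 shares count as one CPU, per the JDK-8146115 text quoted above.
    static final int PER_CPU_SHARES = 1024;

    // Hypothetical helper (not a HotSpot function): maps a cpu-shares value to a CPU count.
    // Rounds up (assumption), then clamps the result to the range [1, hostCpus].
    public static int cpusFromShares(int shares, int hostCpus) {
        int count = (shares + PER_CPU_SHARES - 1) / PER_CPU_SHARES; // integer ceil(shares / 1024)
        return Math.max(1, Math.min(count, hostCpus));
    }

    public static void main(String[] args) {
        int hostCpus = 8; // illustrative host size, not taken from the logs
        String[] names = { "A", "B", "C" };
        // Each pair is { shares of C1 and C2, shares of C3 and C4 }.
        int[][] scenarios = { { 122, 61 }, { 1026, 513 }, { 2048, 1024 } };
        for (int i = 0; i < scenarios.length; i++) {
            System.out.println(names[i]
                + ": C1/C2 -> " + cpusFromShares(scenarios[i][0], hostCpus) + " CPU(s)"
                + ", C3/C4 -> " + cpusFromShares(scenarios[i][1], hostCpus) + " CPU(s)");
        }
    }
}
```

All three scenarios carry the same 2:2:1:1 weighting, yet under this sketch A yields 1 CPU for every container while B and C yield 2 CPUs for C1/C2 and 1 CPU for C3/C4, which is the inconsistency described above.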
Thanks,
Severin

[1] https://github.com/cloudfoundry/java-buildpack/issues/650#issuecomment-441777166
[2] http://hg.openjdk.java.net/jdk/jdk/rev/7f22774a5f42#l4.43
[i]* http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c122.out.log
     http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c61.out.log
[ii]* http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c1026.out.log
      http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c513.out.log
[iii]* http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c2048.out.log
       http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c1024.out.log

* Files produced with:

$ for i in 1026 513 2048 1024 122 61; do sudo docker run -ti -c=$i --rm fedora28-jdks:v1 /jdk-head/bin/java -showversion -Xlog:os+container=trace RuntimeProc > container-resources-cpu/c${i}.out.log; done
$ sudo docker run -ti --rm fedora28-jdks:v1 cat RuntimeProc.java
public class RuntimeProc {
    public static void main(String[] args) {
        int availProc = Runtime.getRuntime().availableProcessors();
        System.out.println(">>> Available processors: " + availProc + " <<<<");
    }
}

From coleen.phillimore at oracle.com Fri Jan 4 18:35:47 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 4 Jan 2019 13:35:47 -0500
Subject: RFR 8215731: Move forward class definitions out of globalDefinitions.hpp
In-Reply-To: <1b0067c3-3d40-b998-5743-39c1f569df15@oracle.com>
References: <546b2f19-a166-df67-c05e-6b5d5c44f844@oracle.com> <1b0067c3-3d40-b998-5743-39c1f569df15@oracle.com>
Message-ID: <69770ca1-2fca-9034-a4c2-b369357ce6fe@oracle.com>

Thank you Lois!
Coleen

On 1/4/19 1:00 PM, Lois Foltan wrote:
> +1.
> Lois
>
> On 1/4/2019 12:08 AM, David Holmes wrote:
>> Hi Coleen,
>>
>> Seems fine.
>>
>> Thanks,
>> David
>>
>> On 4/01/2019 1:20 pm, coleen.phillimore at oracle.com wrote:
>>> Summary: redistribute the forward declarations to the header files
>>> that need them.
>>>
>>> Tested with mach5 tier1 and 2, also tested aarch64 with cross
>>> compiler and zero.
>>>
>>> open webrev at http://cr.openjdk.java.net/~coleenp/8215731.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8215731
>>>
>>> Thanks,
>>> Coleen
>>>
>

From coleen.phillimore at oracle.com Fri Jan 4 18:37:57 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 4 Jan 2019 13:37:57 -0500
Subject: RFR (also tedious) 8216167: Update include guards to reflect correct directories
In-Reply-To: <1badee2a-3153-3674-f56f-d8dde91843ed@oracle.com>
References: <1284b762-6bba-4154-2b69-4433a1ccbb1e@oracle.com> <1badee2a-3153-3674-f56f-d8dde91843ed@oracle.com>
Message-ID: <5a2e3110-a69b-d1fe-a693-9ae9dbda4428@oracle.com>

Thank you Lois! This is an alternative to the #pragma once, depending on how the discussion concludes.

Coleen

On 1/4/19 1:03 PM, Lois Foltan wrote:
> Looks good.
> Lois
>
> On 1/4/2019 10:36 AM, coleen.phillimore at oracle.com wrote:
>> Summary: Use script and some manual fixup to fix directory names in
>> include guards.
>>
>> Makes include guards match the current directory rooted at
>> src/hotspot (removes VM_ in most cases).
>>
>> This should be low risk. Tested with mach5 tier1 and tier2.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8216167
>> http://cr.openjdk.java.net/~coleenp/8216167.01.diff
>>
>> I didn't generate a webrev as a space concern for cr.openjdk.java.net
>> and nobody should click on it. Script is posted in bug. Will update
>> and check copyright headers with hg commit.
>> >> Thanks, >> Coleen > From fweimer at redhat.com Fri Jan 4 18:59:32 2019 From: fweimer at redhat.com (Florian Weimer) Date: Fri, 04 Jan 2019 19:59:32 +0100 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: (David Lloyd's message of "Fri, 4 Jan 2019 08:59:38 -0600") References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <8736q8smt7.fsf@oldenburg2.str.redhat.com> <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com> Message-ID: <874laont9n.fsf@oldenburg2.str.redhat.com> * David Lloyd: > The whole thing is starting to seem like a really bad idea to me. > I've been asking around and I can't find anyone who thinks this kind > of change is a good idea; while this clearly isn't a scientific poll, > it does not on the other hand raise my confidence any. It was pointed > out to me that the GCC documentation has been recommending against > this practice perhaps as early as 3.4. I've always found this a bit disingenuous because any bug in #pragma once would be a bug in #import for the Objective-C frontend and would need fixing in libcpp anyway. For some reason, it's a very emotional topic. Since #pragma once documents the intent in a clearer fashion than include guards, migrating off it again should be very easy. On the hand, the renaming problem could also be fixed by just using random identifiers as the include guard that are not derived from the header file name. 
8-)

Thanks,
Florian

From kim.barrett at oracle.com Fri Jan 4 21:52:15 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 4 Jan 2019 16:52:15 -0500
Subject: RFR (also tedious) 8216167: Update include guards to reflect correct directories
In-Reply-To: <1284b762-6bba-4154-2b69-4433a1ccbb1e@oracle.com>
References: <1284b762-6bba-4154-2b69-4433a1ccbb1e@oracle.com>
Message-ID: <696310B2-19C7-45BE-A52E-C121D7CBB3D2@oracle.com>

> On Jan 4, 2019, at 10:36 AM, coleen.phillimore at oracle.com wrote:
>
> Summary: Use script and some manual fixup to fix directory names in include guards.
>
> Makes include guards match the current directory rooted at src/hotspot (removes VM_ in most cases).
>
> This should be low risk. Tested with mach5 tier1 and tier2.
>
> https://bugs.openjdk.java.net/browse/JDK-8216167
> http://cr.openjdk.java.net/~coleenp/8216167.01.diff
>
> I didn't generate a webrev as a space concern for cr.openjdk.java.net and nobody should click on it. Script is posted in bug. Will update and check copyright headers with hg commit.
>
> Thanks,
> Coleen

There are incorrect changes in src/hotspot/cpu/arm/globalDefinitions_arm.hpp.
The script is not being careful to *only* modify #include guards.

I didn't look for other similar problems.

(This is an example of why I suggested rolling your own script might not actually be easier than using the guardonce utilities.)

From david.holmes at oracle.com Fri Jan 4 22:27:15 2019
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 5 Jan 2019 08:27:15 +1000
Subject: [Containers] Reasoning for cpu shares limits
In-Reply-To: <3b7cef4912bcc0e14e64df95227f5d02d3f0fe62.camel@redhat.com>
References: <3b7cef4912bcc0e14e64df95227f5d02d3f0fe62.camel@redhat.com>
Message-ID:

Hi Severin,

On 5/01/2019 4:09 am, Severin Gehwolf wrote:
> Hi,
>
> Having come across this cloud foundry issue[1], I wonder why the cgroup
> cpu shares' value is being used in the JVM as a heuristic for available
> processors.
See also: https://bugs.openjdk.java.net/browse/JDK-8197589 There's quite a bit of history on this, and it may be spread across a number of bugs and review threads. Hopefully Bob can provide a neat summary :) Cheers, David > From the man page from docker-run: > > --------------------------------------------------------- > --cpu-shares=0 > CPU shares (relative weight) > > By default, all containers get the same proportion of CPU cycles. This proportion can be modified by changing the container's CPU share weighting relative to the weighting of all other running > containers. > > To modify the proportion from the default of 1024, use the --cpu-shares flag to set the weighting to 2 or higher. > > The proportion will only apply when CPU-intensive processes are running. When tasks in one container are idle, other containers can use the left-over CPU time. The actual amount of CPU time will > vary depending on the number of containers running on the system. > > For example, consider three containers, one has a cpu-share of 1024 and two others have a cpu-share setting of 512. When processes in all three containers attempt to use 100% of CPU, the first > container would receive 50% of the total CPU time. If you add a fourth container with a cpu-share of 1024, the first container only gets 33% of the CPU. The remaining containers receive 16.5%, 16.5% > and 33% of the CPU. > > On a multi-core system, the shares of CPU time are distributed over all CPU cores. Even if a container is limited to less than 100% of CPU time, it can use 100% of each individual CPU core. > > For example, consider a system with more than three cores. 
If you start one container {C0} with -c=512 running one process, and another container {C1} with -c=1024 running two processes, this can > result in the following division of CPU shares: > > PID container CPU CPU share > 100 {C0} 0 100% of CPU0 > 101 {C1} 1 100% of CPU1 > 102 {C1} 2 100% of CPU2 > > --------------------------------------------------------- > > So the cpu shares value (unlike --cpu-quota) is a relative weight. > > For example, those three cpu-shares settings are equivalent (C1-C4 are > containers; '-c' is a short-cut for '--cpu-shares'): > > A[i] > ------------- > C1 => -c=122 > C2 => -c=122 > C3 => -c=61 > C4 => -c=61 > > B[ii] > ------------- > C1 => -c=1026 > C2 => -c=1026 > C3 => -c=513 > C4 => -c=513 > > C[iii] > ------------- > C1 => -c=2048 > C2 => -c=2048 > C3 => -c=1024 > C4 => -c=1024 > > For A the container CPU heuristics will determine for the JVM to use 1 > CPU for C1-C4. For B and C, the container CPU heuristics will determine > for the JVM to use 2 CPUs for C1 and C2 and 1 CPU for C3 and C4 which > seems rather inconsistent and arbitrary. The reason this is happening > is that 1024 seems to have gotten a questionable meaning in [2]. I > wonder why? > > The JVM cannot reasonably determine from the relative weight of --cpu- > shares' value how many CPUs it should use. As it's a relative weight > that's something for the container runtime to take into account. It > appears to me that the container detection code should probably fall > back to the host CPU value and only take CPU quotas into account. > > Am I missing something obvious here? All I could find was this in JDK- > 8146115: > """ > If cpu_shares has been setup for the container, the number_of_cpus() > will be calculated based on cpu_shares()/1024. 1024 is the default and > standard unit for calculating relative cpu > """ > > "1024 is the default and standard unit for calculating relative cpu" > seems a wrong assumption to me. Thoughts? 
> > Thanks, > Severin > > [1] https://github.com/cloudfoundry/java-buildpack/issues/650#issuecomment-441777166 > [2] http://hg.openjdk.java.net/jdk/jdk/rev/7f22774a5f42#l4.43 > [i]* http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c122.out.log > http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c61.out.log > [ii]* http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c1026.out.log > http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c513.out.log > [iii]* http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c2048.out.log > http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c1024.out.log > > * Files produced with: > > $ for i in 1026 513 2048 1024 122 61; do sudo docker run -ti -c=$i --rm fedora28-jdks:v1 /jdk-head/bin/java -showversion -Xlog:os+container=trace RuntimeProc > container-resources-cpu/c${i}.out.log; done > $ sudo docker run -ti --rm fedora28-jdks:v1 cat RuntimeProc.java > public class RuntimeProc { > public static void main(String[] args) { > int availProc = Runtime.getRuntime().availableProcessors(); > System.out.println(">>> Available processors: " + availProc + " <<<<"); > } > } > > > From bob.vandette at oracle.com Fri Jan 4 22:34:59 2019 From: bob.vandette at oracle.com (Bob Vandette) Date: Fri, 4 Jan 2019 17:34:59 -0500 Subject: [Containers] Reasoning for cpu shares limits In-Reply-To: <3b7cef4912bcc0e14e64df95227f5d02d3f0fe62.camel@redhat.com> References: <3b7cef4912bcc0e14e64df95227f5d02d3f0fe62.camel@redhat.com> Message-ID: <2FA9BDBB-6DBA-41F2-BAD8-0CA08B606481@oracle.com> Hi Severin, There has been much debate on the best algorithm for selecting the number of CPUs that is reported by the Java Runtime when running in containers. Although the value for cpu-shares can be set to any of the values that you mention, we decided to follow the convention set by Kubernetes and other container orchestration products that use 1024 as the unit for cpu shares. 
Ignoring the cpu shares in this case is not what users of this popular technology want.

https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-cpu

- The spec.containers[].resources.requests.cpu is converted to its core value, which is potentially fractional, and multiplied by 1024. The greater of this number or 2 is used as the value of the --cpu-shares flag in the docker run command.

- The spec.containers[].resources.limits.cpu is converted to its millicore value and multiplied by 100. The resulting value is the total amount of CPU time that a container can use every 100ms. A container cannot use more than its share of CPU time during this interval.

There are a few options that can be used if our default behavior doesn't work for you.

1. Use quotas in addition to or instead of shares.
2. Specify -XX:ActiveProcessorCount=value

Bob.

> On Jan 4, 2019, at 1:09 PM, Severin Gehwolf wrote:
>
> Hi,
>
> Having come across this cloud foundry issue[1], I wonder why the cgroup
> cpu shares' value is being used in the JVM as a heuristic for available
> processors.
>
> From the man page from docker-run:
>
> ---------------------------------------------------------
> --cpu-shares=0
> CPU shares (relative weight)
>
> By default, all containers get the same proportion of CPU cycles. This proportion can be modified by changing the container's CPU share weighting relative to the weighting of all other running
> containers.
>
> To modify the proportion from the default of 1024, use the --cpu-shares flag to set the weighting to 2 or higher.
>
> The proportion will only apply when CPU-intensive processes are running. When tasks in one container are idle, other containers can use the left-over CPU time. The actual amount of CPU time will
> vary depending on the number of containers running on the system.
>
> For example, consider three containers, one has a cpu-share of 1024 and two others have a cpu-share setting of 512.
When processes in all three containers attempt to use 100% of CPU, the first > container would receive 50% of the total CPU time. If you add a fourth container with a cpu-share of 1024, the first container only gets 33% of the CPU. The remaining containers receive 16.5%, 16.5% > and 33% of the CPU. > > On a multi-core system, the shares of CPU time are distributed over all CPU cores. Even if a container is limited to less than 100% of CPU time, it can use 100% of each individual CPU core. > > For example, consider a system with more than three cores. If you start one container {C0} with -c=512 running one process, and another container {C1} with -c=1024 running two processes, this can > result in the following division of CPU shares: > > PID container CPU CPU share > 100 {C0} 0 100% of CPU0 > 101 {C1} 1 100% of CPU1 > 102 {C1} 2 100% of CPU2 > > --------------------------------------------------------- > > So the cpu shares value (unlike --cpu-quota) is a relative weight. > > For example, those three cpu-shares settings are equivalent (C1-C4 are > containers; '-c' is a short-cut for '--cpu-shares'): > > A[i] > ------------- > C1 => -c=122 > C2 => -c=122 > C3 => -c=61 > C4 => -c=61 > > B[ii] > ------------- > C1 => -c=1026 > C2 => -c=1026 > C3 => -c=513 > C4 => -c=513 > > C[iii] > ------------- > C1 => -c=2048 > C2 => -c=2048 > C3 => -c=1024 > C4 => -c=1024 > > For A the container CPU heuristics will determine for the JVM to use 1 > CPU for C1-C4. For B and C, the container CPU heuristics will determine > for the JVM to use 2 CPUs for C1 and C2 and 1 CPU for C3 and C4 which > seems rather inconsistent and arbitrary. The reason this is happening > is that 1024 seems to have gotten a questionable meaning in [2]. I > wonder why? > > The JVM cannot reasonably determine from the relative weight of --cpu- > shares' value how many CPUs it should use. As it's a relative weight > that's something for the container runtime to take into account. 
It > appears to me that the container detection code should probably fall > back to the host CPU value and only take CPU quotas into account. > > Am I missing something obvious here? All I could find was this in JDK- > 8146115: > """ > If cpu_shares has been setup for the container, the number_of_cpus() > will be calculated based on cpu_shares()/1024. 1024 is the default and > standard unit for calculating relative cpu > """ > > "1024 is the default and standard unit for calculating relative cpu" > seems a wrong assumption to me. Thoughts? > > Thanks, > Severin > > [1] https://github.com/cloudfoundry/java-buildpack/issues/650#issuecomment-441777166 > [2] http://hg.openjdk.java.net/jdk/jdk/rev/7f22774a5f42#l4.43 > [i]* http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c122.out.log > http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c61.out.log > [ii]* http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c1026.out.log > http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c513.out.log > [iii]* http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c2048.out.log > http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c1024.out.log > > * Files produced with: > > $ for i in 1026 513 2048 1024 122 61; do sudo docker run -ti -c=$i --rm fedora28-jdks:v1 /jdk-head/bin/java -showversion -Xlog:os+container=trace RuntimeProc > container-resources-cpu/c${i}.out.log; done > $ sudo docker run -ti --rm fedora28-jdks:v1 cat RuntimeProc.java > public class RuntimeProc { > public static void main(String[] args) { > int availProc = Runtime.getRuntime().availableProcessors(); > System.out.println(">>> Available processors: " + availProc + " <<<<"); > } > } > > > From david.holmes at oracle.com Fri Jan 4 22:36:41 2019 From: david.holmes at oracle.com (David Holmes) Date: Sat, 5 Jan 2019 08:36:41 +1000 Subject: RFR [XS]: 8215961: jdk/jfr/event/os/TestCPUInformation.java fails on AArch64 In-Reply-To: 
<24077e4c-7106-82d4-b67e-2653cc037198@oracle.com>
References: <24077e4c-7106-82d4-b67e-2653cc037198@oracle.com>
Message-ID:

Seems my email got lost over New Year break :(

David

On 29/12/2018 7:29 am, David Holmes wrote:
> Hi Matthias,
>
> On 28/12/2018 11:35 pm, Baesken, Matthias wrote:
>> Hello, please review this small fix.
>>
>> At the moment, the test jdk/jfr/event/os/TestCPUInformation.java
>> fails on AArch64 with the following error:
>>
>>
>> java.lang.RuntimeException: Value not in (Intel, AMD, Unknown x86,
>> SPARC, ARM, PPC, PowerPC, AArch64, zArch), field='description',
>> value='0x50:0x0:0x000:1, simd'
>>
>>
>> Reason is that the jdk.CPUInformation event misses a known CPU
>> identifier value in the description, see the description part of
>> it:
>>
>> Event: jdk.CPUInformation {
>>     ....
>>    description = "0x50:0x0:0x000:1, simd"
>>    sockets = 8
>>    ....
>> }
>>
>>
>> The patch adds the CPU identifier info to the _cpu_desc string
>> where it is taken from.
>> Please compare also with the ppc implementation where the info
>> (PPC) is already added.
>>
>> vm_version_ext_ppc.cpp
>>
>> snprintf(_cpu_desc, CPU_DETAILED_DESC_BUF_SIZE, "PPC %s",
>> features_string());
>>
>>
>>
>> Bug/webrev :
>>
>> https://bugs.openjdk.java.net/browse/JDK-8215961
>>
>> http://cr.openjdk.java.net/~mbaesken/webrevs/8215961.0/
>
> Seems a reasonable approach.
>
> I would expect other ARM systems to also be affected by this.
>
> Thanks,
> David
>
>>
>> Thanks, Matthias
>>

From coleen.phillimore at oracle.com Fri Jan 4 23:19:15 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 4 Jan 2019 18:19:15 -0500
Subject: RFR (also tedious) 8216167: Update include guards to reflect correct directories
In-Reply-To: <696310B2-19C7-45BE-A52E-C121D7CBB3D2@oracle.com>
References: <1284b762-6bba-4154-2b69-4433a1ccbb1e@oracle.com> <696310B2-19C7-45BE-A52E-C121D7CBB3D2@oracle.com>
Message-ID: <5572e32f-ab2f-f02d-533a-535896612e74@oracle.com>

On 1/4/19 4:52 PM, Kim Barrett wrote:
>> On Jan 4, 2019, at 10:36 AM, coleen.phillimore at oracle.com wrote:
>>
>> Summary: Use script and some manual fixup to fix directory names in include guards.
>>
>> Makes include guards match the current directory rooted at src/hotspot (removes VM_ in most cases).
>>
>> This should be low risk. Tested with mach5 tier1 and tier2.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8216167
>> http://cr.openjdk.java.net/~coleenp/8216167.01.diff
>>
>> I didn't generate a webrev as a space concern for cr.openjdk.java.net and nobody should click on it. Script is posted in bug. Will update and check copyright headers with hg commit.
>>
>> Thanks,
>> Coleen
> There are incorrect changes in src/hotspot/cpu/arm/globalDefinitions_arm.hpp

Thank you for finding this.

> The script is not being careful to *only* modify #include guards. I didn't look for other similar problems.
> (This is an example of why I suggested rolling your own script might not actually be easier than using
> the guardonce utilities.)
>

I looked through again and didn't see any other problems. I'm not planning on productizing my script, which admittedly is too simple for the general job. But I still found it useful and entertaining (to me) for this particular task.
http://cr.openjdk.java.net/~coleenp/8216167.01.diff.02

Thanks,
Coleen

From aph at redhat.com Sat Jan 5 11:31:27 2019
From: aph at redhat.com (Andrew Haley)
Date: Sat, 5 Jan 2019 11:31:27 +0000
Subject: RFR (tedious) 8216022: Use #pragma once
In-Reply-To: <575efd28-6644-ee8c-846d-3787f41ab9ce@oracle.com>
References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <8736q8smt7.fsf@oldenburg2.str.redhat.com> <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com> <575efd28-6644-ee8c-846d-3787f41ab9ce@oracle.com>
Message-ID: <75687bd1-6fc2-51c4-557a-ffaa3f582499@redhat.com>

On 1/4/19 3:59 PM, Erik Österlund wrote:
> On 2019-01-04 15:59, David Lloyd wrote:
>> In addition, it was pointed out to me that if, for some reason, a
>> header file ends up in more than one location on the include path,
>> #pragma once will (probably, as it's not standardized) allow it to
>> be included twice, which #ifdef guards avoid. This is perhaps not
>> a real concern in this particular code base though.
>
> That sounds like a bug. Just because something is implementation
> defined, doesn't make it okay or expected to not work. Do you know
> how to reproduce this, and on which platform/compiler/version?
> Obviously, if this was an issue in our code base, one would quickly
> notice it doesn't build.

It's not a bug, exactly. It's that the question of "is this the same
file?" is extremely difficult to answer definitively. Not all
filesystems give you a reliable way to answer that question. Sure, you
can kludge around the problem with modification times and maybe even a
collision-free hash, but getting it really correct is not going to be
efficient, and not even possible until that question is rigorously
defined. There's a good reason why #pragma once still isn't standard.

--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd.
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From erik.osterlund at oracle.com Sat Jan 5 15:03:28 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Sat, 5 Jan 2019 16:03:28 +0100
Subject: RFR (tedious) 8216022: Use #pragma once
In-Reply-To: <75687bd1-6fc2-51c4-557a-ffaa3f582499@redhat.com>
References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <8736q8smt7.fsf@oldenburg2.str.redhat.com> <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com> <575efd28-6644-ee8c-846d-3787f41ab9ce@oracle.com> <75687bd1-6fc2-51c4-557a-ffaa3f582499@redhat.com>
Message-ID:

Hi Andrew,

On 2019-01-05 12:31, Andrew Haley wrote:
> On 1/4/19 3:59 PM, Erik Österlund wrote:
>
>> On 2019-01-04 15:59, David Lloyd wrote:
>>> In addition, it was pointed out to me that if, for some reason, a
>>> header file ends up in more than one location on the include path,
>>> #pragma once will (probably, as it's not standardized) allow it to
>>> be included twice, which #ifdef guards avoid. This is perhaps not
>>> a real concern in this particular code base though.
>>
>> That sounds like a bug. Just because something is implementation
>> defined, doesn't make it okay or expected to not work. Do you know
>> how to reproduce this, and on which platform/compiler/version?
>> Obviously, if this was an issue in our code base, one would quickly
>> notice it doesn't build.
>
> It's not a bug, exactly. It's that the question of "is this the same
> file?" is extremely difficult to answer definitively. Not all
> filesystems give you a reliable way to answer that question.
Sure, you > can kludge around the problem with modification times and maybe even a > collision-free hash, but getting it really correct is not going to be > efficient, and not even possible until that question is rigorously > defined. There's a good reason why #pragma once still isn't standard. > Perhaps. Yet Java has to support Path.toRealPath(), which resolves symlinks. And similarly, realpath() has been part of the POSIX standard since 2008. And even C++17 defines std::filesystem::canonical() as part of the standard. So it seems to me that any system that can't build HotSpot because of the inadequacy of the underlying system to tell symlinked files apart, will also not be able to support said standardized APIs either. Unless again someone shoots himself/herself in the foot intentionally and actually keeps 2 copies of the same file around, and includes both. We should *never* do that, and I would love to get a compiler error if anyone tried to do that. I have asked this before, but does anyone actually know of a compiler/os/filesystem combo that has a #pragma once implementation that gets confused about symbolic links or different include paths, or is this all a hypothetical problem, for a hypothetical compiler+os+filesystem combo that can probably never support e.g. C++17? Perhaps a simple test could be written for this that fails reliably on such systems, so we don't get any surprises. I would rather test this to see if it is a problem or not, instead of having a long hypothetical argument about it, based on what somebody told somebody, with guesses about how relevant compilers may or may not handle this differently and may or may not have different interpretations about whether files should be canonicalized or not. Also, the principle of optimizing for the normal common case yet allowing the odd uncommon case has value. 
If there exists some hypothetical scenario that could in theory be produced, yet never happens and arguably will never happen in HotSpot (and would immediately fail to build if it occurred), then I don't see why that should block us from making our lives easier. Sure, maybe somebody eventually wants to do some recursive self-include fixed-point iteration magic like Hans Boehm's libatomic. You probably don't want to do that. But you still can do that in those particular files. That doesn't have to constrain the rest of the code base, and dictate the rules for all other files. Especially when there are zero such exceptional cases present. This was the reasoning behind #import in Objective-C as well. There were hypothetical scenarios where you would need #include instead of #import. It sure never happened to me or anyone I know or heard of in reality. But if you really had to, for whatever reason, you could. But that shouldn't stop you from living an easier life in the 99.99% scenario. And remember, converting back can be easily done with a script at any time, should we run into trouble at any point in the future, and hit the wall, because we really need this to build with the Cray C/C++ compiler or whatever. 
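As a rough illustration of the "simple test" idea floated above, the following C++17 sketch checks whether a header and a symlink to it are reported as the same file by std::filesystem::canonical() and std::filesystem::equivalent(). It only exercises the filesystem library, so it says nothing about how any particular preprocessor implements #pragma once, and it does not cover the bind-mount style setups that are harder to detect. The directory and file names are made up for the example, and it assumes the platform allows creating symlinks in the temporary directory.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>

namespace fs = std::filesystem;

// True when both paths resolve to the same canonical (symlink-free) path.
bool same_canonical_path(const fs::path& a, const fs::path& b) {
  return fs::canonical(a) == fs::canonical(b);
}

// True when the filesystem reports both paths as the same file entity.
bool same_file_entity(const fs::path& a, const fs::path& b) {
  return fs::equivalent(a, b);
}

// Hypothetical probe: creates a header plus a symlink to it in a scratch
// directory and checks that both lookups treat them as one file.
bool symlinked_header_is_recognized() {
  const fs::path dir = fs::temp_directory_path() / "pragma_once_probe";
  fs::remove_all(dir);
  fs::create_directories(dir);

  const fs::path header = dir / "probe.hpp";
  {
    std::ofstream out(header);
    out << "#pragma once\n";
  }

  const fs::path link = dir / "probe_link.hpp";
  fs::create_symlink(header, link);

  const bool ok = same_canonical_path(header, link) &&
                  same_file_entity(header, link);
  fs::remove_all(dir);  // clean up the scratch directory
  return ok;
}

// Fails loudly (in builds without NDEBUG) if the probe does not hold here.
void run_probe() {
  assert(symlinked_header_is_recognized());
}
```

A probe like this could run as part of a configure step; a real check of #pragma once itself would still need to compile a small translation unit with the compiler in question.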
Thanks,
/Erik

From aph at redhat.com Sat Jan 5 17:26:24 2019
From: aph at redhat.com (Andrew Haley)
Date: Sat, 5 Jan 2019 17:26:24 +0000
Subject: RFR (tedious) 8216022: Use #pragma once
In-Reply-To:
References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <8736q8smt7.fsf@oldenburg2.str.redhat.com> <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com> <575efd28-6644-ee8c-846d-3787f41ab9ce@oracle.com> <75687bd1-6fc2-51c4-557a-ffaa3f582499@redhat.com>
Message-ID:

On 1/5/19 3:03 PM, Erik Österlund wrote:
> I have asked this before, but does anyone actually know of a
> compiler/os/filesystem combo that has a #pragma once implementation that
> gets confused about symbolic links or different include paths, or is
> this all a hypothetical problem, for a hypothetical
> compiler+os+filesystem combo that can probably never support e.g. C++17?

When we were discussing the possibility of using compiler intrinsics
for atomic operations you used the following hypothetical reason to
oppose the idea:

> 2) Even if you could and the compiler happens to generate that - we
> can not rely on it because there is no contract to the compiler what
> fence instructions it elects to use. The only contract the compiler
> needs to abide to is how atomic C++ operations interact with other
> C++ operations. And we do not want the underlying fencing to
> silently change when performing compiler upgrades.

I could surely have replied to this with a question along the lines of
"Does anyone know of a compiler/processor combo that actually changed
its atomic operations in a semantically incompatible way?", i.e. used
exactly the same reasoning as you're using above.

It's not practically possible for a compiler silently to change
atomics in the way that you suggested here because it would break
binary compatibility, but you insisted that because it's
*hypothetically* possible we shouldn't use such atomic intrinsics.

Perhaps we should beware of relying on compiler properties that aren't
standardized and so might silently change when performing compiler
upgrades, even when such changes are extremely unlikely. I don't think
so, but I admit it is at least an argument.

I suggest to you that these two cases are very similar: they're both
about relying on compiler behaviour that is not standardized.

--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd.
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From erik.osterlund at oracle.com Sat Jan 5 19:24:04 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Sat, 5 Jan 2019 20:24:04 +0100
Subject: RFR (tedious) 8216022: Use #pragma once
In-Reply-To:
References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <8736q8smt7.fsf@oldenburg2.str.redhat.com> <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com> <575efd28-6644-ee8c-846d-3787f41ab9ce@oracle.com> <75687bd1-6fc2-51c4-557a-ffaa3f582499@redhat.com>
Message-ID: <2c73cea1-74ae-8c98-6efc-9855aa444605@oracle.com>

Hi Andrew,

On 2019-01-05 18:26, Andrew Haley wrote:
> On 1/5/19 3:03 PM, Erik Österlund wrote:
>
>> I have asked this before, but does anyone actually know of a
>> compiler/os/filesystem combo that has a #pragma once implementation that
>> gets confused about symbolic links or different include paths, or is
>> this all a hypothetical problem, for a hypothetical
>> compiler+os+filesystem combo that can probably never support e.g. C++17?
>
> When we were discussing the possibility of using compiler intrinsics
> for atomic operations you used the following hypothetical reason to
> oppose the idea:
>
>> 2) Even if you could and the compiler happens to generate that - we
>> can not rely on it because there is no contract to the compiler what
>> fence instructions it elects to use. The only contract the compiler
>> needs to abide to is how atomic C++ operations interact with other
>> C++ operations. And we do not want the underlying fencing to
>> silently change when performing compiler upgrades.
>
> I could surely have replied to this with a question along the lines of
> "Does anyone know of a compiler/processor combo that actually changed
> its atomic operations in a semantically incompatible way?", i.e. used
> exactly the same reasoning as you're using above.
>
> It's not practically possible for a compiler silently to change
> atomics in the way that you suggested here because it would break
> binary compatibility, but you insisted that because it's
> *hypothetically* possible we shouldn't use such atomic intrinsics.
>
> Perhaps we should beware of relying on compiler properties that aren't
> standardized and so might silently change when performing compiler
> upgrades, even when such changes are extremely unlikely. I don't think
> so, but I admit it is at least an argument.
>
> I suggest to you that these two cases are very similar: they're both
> about relying on compiler behaviour that is not standardized.
>

I see that is a similarity indeed. But there are important differences. The main difference is that the compiler-internal ABI for atomics on ARMv7 and PPC (which was my particular concern in that conversation), a) does have incompatible bindings that are allowed by the standard described in papers with proposed bindings (as I pointed out then), b) would be really dangerous if it subtly changed because it could go undetected for a long time before anyone noticed strange crashes because of it.
We essentially rely on the generated machine code to have an exact machine code binding that is compatible. And for what it's worth, I am okay with changing the x64 Atomic/OrderAccess implementation to use compiler intrinsics. Because there is essentially no risk due to the nature of the ISA. However, hypothetical differences in whether symbolic references are followed or not for #pragma once would lead to HotSpot either building or not, depending on whether it relies on that or not (pretty sure it doesn't), and never cause bugs to silently infect the binary. Conversely, not using #pragma once and relying on all files getting the manually typed include guards right, seems more dangerous to me. So the atomics reliance comes with a risk, the #pragma once reliance does not - it removes a risk. If we truly stop relying on compiler features that are implementation defined, when there are no risks involved, we would end up crippled and get nothing done. Thanks, /Erik From david.holmes at oracle.com Sat Jan 5 21:22:32 2019 From: david.holmes at oracle.com (David Holmes) Date: Sun, 6 Jan 2019 07:22:32 +1000 Subject: RFR (S) 8216188: Remove expired flags in JDK 13 Message-ID: webrev: http://cr.openjdk.java.net/~dholmes/8216188/webrev/ bug: https://bugs.openjdk.java.net/browse/JDK-8216188 This removes all the expired flags from the special_jvm_flags table, and updates a test. Calvin: this includes IgnoreUnverifiableClassesDuringDump but I can just merge with your change in 8213002 if you push first. Or you can let me do it and close 8213002 as a dup. 
Testing: tiers 1-3

Thanks,
David

From david.holmes at oracle.com Sat Jan 5 21:31:27 2019
From: david.holmes at oracle.com (David Holmes)
Date: Sun, 6 Jan 2019 07:31:27 +1000
Subject: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd
In-Reply-To:
References: <8f7ff454-ac87-505a-fe9c-15b4d4113093@oracle.com> <7f38de34-459f-27da-95ce-5528abf58f95@oracle.com> <2a31b3ca-85fc-29cb-413d-94dd98775a74@oracle.com> <44c628fa-81ce-ac22-da00-89dcc31ab314@oracle.com> <3a68eebc-3b3e-2911-e6e4-b7913f9a773a@oracle.com> <9b721b57-9b5b-c995-b4ed-668cf644c181@oracle.com> <1addfd7a-77fd-ce6e-36a4-4dfc06382e76@oracle.com>
Message-ID: <59d1dfcd-f007-06b1-0a6c-c22c8b3935fb@oracle.com>

Looks good to me.

Thanks,
David

On 5/01/2019 3:04 am, Baesken, Matthias wrote:
> New webrev :
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.3/
>
> - comma added
> - FLAG_IS_DEFAULT(...) check is back - I thought that the check is no longer needed but maybe I was wrong
>
>
> Best regards, Matthias
>
>
>> -----Original Message-----
>> From: Daniel D. Daugherty
>> Sent: Friday, 4 January 2019 17:04
>> To: Baesken, Matthias; David Holmes; 'hotspot-dev at openjdk.java.net'
>> Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd
>>
>> On 1/4/19 10:38 AM, Baesken, Matthias wrote:
>>> Hi David/Dan, here is my new webrev, the warning (for non-root) is back and uses the improved wording:
>>>
>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.2/
>>
>> src/hotspot/os/bsd/os_bsd.cpp
>>     L2260: // (e.g. root privilege or CAP_SYS_NICE capability).
>>         nit: please add a comma after 'e.g.'.
>>
>>     old L2310:       if (!FLAG_IS_DEFAULT(ThreadPriorityPolicy)) {
>>         Why did you drop this check around the warning? The
>>         warning should only be issued if ThreadPriorityPolicy
>>         is not the default value, i.e., it was set on the cmd
>>         line to a non-default value. If some distro chooses to
>>         change the default of ThreadPriorityPolicy from 0 to 1,
>>         then this warning will always happen.
>>
>> src/hotspot/os/linux/os_linux.cpp
>>     L4080: // (e.g. root privilege or CAP_SYS_NICE capability).
>>         nit: please add a comma after 'e.g.'.
>>
>>     old L4107:       if (!FLAG_IS_DEFAULT(ThreadPriorityPolicy)) {
>>         Why did you drop this check around the warning?
>>
>> src/hotspot/share/runtime/globals.hpp
>>     No comments.
>>
>> Dan
>>
>>>
>>> Best Regards, Matthias
>>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: David Holmes
>>>> Sent: Thursday, 3 January 2019 23:52
>>>> To: daniel.daugherty at oracle.com; Baesken, Matthias; 'hotspot-dev at openjdk.java.net'
>>>> Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd
>>>>
>>>> Hi Dan,
>>>>
>>>> Thanks for your reluctant agreement. I like your updated wording for the
>>>> warning.
>>>>
>>>> David
>>>>
>>>> On 4/01/2019 8:21 am, Daniel D. Daugherty wrote:
>>>>> On 1/3/19 4:49 PM, David Holmes wrote:
>>>>>> Hi Matthias,
>>>>>>
>>>>>> On 4/01/2019 12:56 am, Baesken, Matthias wrote:
>>>>>>>> I still want to know how the OS_ERR gets handled by all the higher level
>>>>>>>> code. How will this failure at runtime get reported back to
>>>>>>>> application code?
>>>>>>> Hi David, the "best practice" currently is to just ignore the
>>>>>>> return code of os::set_native_priority, it is called and we hope
>>>>>>> for the best,
>>>>>> Silent failure is not good.
>>>>> Agreed.
>>>>>
>>>>>
>>>>>> In that case I think it is appropriate to issue a warning whenever
>>>>>> ThreadPriorityPolicy=1 is set, though compatibility dictates no
>>>>>> warning in the case that you are root.
>>>>> Agreed that you should not get the warning if you are root.
>>>>> Reluctantly agree to always issue the warning if
>>>>> -XX:ThreadPriorityPolicy=1 is specified and user != root.
>>>>>
>>>>>
>>>>>> Which brings us back to square one and the original patch with a
>>>>>> different warning message:
>>>>>>
>>>>>> warning("-XX:ThreadPriorityPolicy=1 requires system level permissions
>>>>>> to be applied. If these permissions do not exist, changes to priority
>>>>>> will be silently ignored.");
>>>>> Perhaps:
>>>>>
>>>>> warning("-XX:ThreadPriorityPolicy=1 may require system level permission,
>>>>> e.g., being the 'root' user. If the necessary permission is not
>>>>> possessed, changes to priority will be silently ignored.");
>>>>>
>>>>> I changed:
>>>>>
>>>>> - "requires" -> "may require" because you don't always need
>>>>>   a special permission to do the operation.
>>>>> - "system level permissions to be applied" -> "system level permission"
>>>>>   so switch to singular permission, dropped "to be applied"
>>>>> - added ", e.g., being the 'root' user"
>>>>> - "If these permissions do not exist" -> "If the necessary permission is
>>>>>   not possessed"
>>>>>   so switch to singular permission, switch from "do not exist" to
>>>>>   "is not possessed"
>>>>>
>>>>>
>>>>>> which then takes us back to Dan's comments that the warning should be
>>>>>> at time of use. But to that I maintain that because use may or may not
>>>>>> fail depending on both the available permissions and the requested
>>>>>> priority value, that it is better to have a single generic warning in
>>>>>> the existing place.
>>>>> This is where David and I disagree. I do not think we should issue a
>>>>> warning unless the operation failed and David prefers the generic
>>>>> warning in one place.
>>>>>
>>>>> I will reluctantly agree to always issue the warning if
>>>>> -XX:ThreadPriorityPolicy=1 is specified and user != root.
>>>>>
>>>>> Dan
>>>>>
>>>>>>> for example:
>>>>>>>
>>>>>>> jdk/src/hotspot/share/runtime/vmThread.cpp
>>>>>>>
>>>>>>> int prio = (VMThreadPriority == -1)
>>>>>>>   ? os::java_to_os_priority[NearMaxPriority]
>>>>>>>   : VMThreadPriority;
>>>>>>> // Note that I cannot call os::set_priority because it expects Java
>>>>>>> // priorities and I am *explicitly* using OS priorities so that it's
>>>>>>> // possible to set the VM thread priority higher than any Java thread.
>>>>>>> os::set_native_priority( this, prio );
>>>>>>>
>>>>>>>
>>>>>>> jdk/src/hotspot/share/compiler/compileBroker.cpp
>>>>>>>
>>>>>>> int native_prio = CompilerThreadPriority;
>>>>>>> if (native_prio == -1) {
>>>>>>>   if (UseCriticalCompilerThreadPriority) {
>>>>>>>     native_prio = os::java_to_os_priority[CriticalPriority];
>>>>>>>   } else {
>>>>>>>     native_prio = os::java_to_os_priority[NearMaxPriority];
>>>>>>>   }
>>>>>>> }
>>>>>>> os::set_native_priority(thread, native_prio);
>>>>>>>
>>>>>>>
>>>>>>> A difference is
>>>>>>>
>>>>>>> jdk/src/hotspot/share/runtime/os.cpp
>>>>>>>
>>>>>>> OSReturn os::set_priority(Thread* thread, ThreadPriority p) {
>>>>>>>   debug_only(Thread::check_for_dangling_thread_pointer(thread);)
>>>>>>>
>>>>>>>   if (p >= MinPriority && p <= MaxPriority) {
>>>>>>>     int priority = java_to_os_priority[p];
>>>>>>>     return set_native_priority(thread, priority);
>>>>>>>
>>>>>>> Where the return code of set_native_priority() is returned
>>>>>>> (however then it is later usually not handled by the callers of
>>>>>>> os::set_priority).
>>>>>>>
>>>>>>>> As I stated this is not a complete statement as on Linux at least you
>>>>>>>> also have to account for RLIMIT_RTPRIO.
>>>>>>>>
>>>>>>> If you want me to, I can of course add a short statement about
>>>>>>> this.
>>>>>> Quite the opposite, I'd rather see a generic statement about
>>>>>> permissions than try to cover all the possible situations.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>> Best regards, Matthias
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: David Holmes
>>>>>>>> Sent: Thursday, 3 January 2019 14:02
>>>>>>>> To: Baesken, Matthias; daniel.daugherty at oracle.com; 'hotspot-dev at openjdk.java.net'
>>>>>>>> Subject: Re: RFR : 8215962: Support ThreadPriorityPolicy mode 1 for non-root users on linux/bsd
>>>>>>>>
>>>>>>>> Hi Matthias,
>>>>>>>>
>>>>>>>> On 3/01/2019 9:13 pm, Baesken, Matthias wrote:
>>>>>>>>> Hello David and Dan, here is a second webrev:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8215962.1/
>>>>>>>>>
>>>>>>>>> - adjusted copyright years + fixed some typos
>>>>>>>>> - added the missing return for FreeBSD (pointed out by Dan)
>>>>>>>>> - removed the warning message completely
>>>>>>>> I still want to know how the OS_ERR gets handled by all the higher level
>>>>>>>> code. How will this failure at runtime get reported back to application
>>>>>>>> code?
>>>>>>>>
>>>>>>>> ! // It is only used when ThreadPriorityPolicy=1 and requires root privilege or
>>>>>>>> ! // CAP_SYS_NICE capability.
>>>>>>>>
>>>>>>>> As I stated this is not a complete statement as on Linux at least you
>>>>>>>> also have to account for RLIMIT_RTPRIO.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>

From kim.barrett at oracle.com Sun Jan 6 06:47:28 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Sun, 6 Jan 2019 01:47:28 -0500
Subject: RFR (S) 8216188: Remove expired flags in JDK 13
In-Reply-To:
References:
Message-ID: <95F24F7E-7AFD-468F-89EC-0A7F2D5A95F1@oracle.com>

> On Jan 5, 2019, at 4:22 PM, David Holmes wrote:
>
> webrev: http://cr.openjdk.java.net/~dholmes/8216188/webrev/
> bug: https://bugs.openjdk.java.net/browse/JDK-8216188
>
> This removes all the expired flags from the special_jvm_flags table, and updates a test.
> > Calvin: this includes IgnoreUnverifiableClassesDuringDump but I can just merge with your change in 8213002 if you push first. Or you can let me do it and close 8213002 as a dup. > > Testing: tiers 1-3 > > Thanks, > David Looks good. From david.holmes at oracle.com Sun Jan 6 12:22:16 2019 From: david.holmes at oracle.com (David Holmes) Date: Sun, 6 Jan 2019 22:22:16 +1000 Subject: RFR (S) 8216188: Remove expired flags in JDK 13 In-Reply-To: <95F24F7E-7AFD-468F-89EC-0A7F2D5A95F1@oracle.com> References: <95F24F7E-7AFD-468F-89EC-0A7F2D5A95F1@oracle.com> Message-ID: <05800dae-2288-dba9-82d6-95e416c24a69@oracle.com> Thanks Kim! David On 6/01/2019 4:47 pm, Kim Barrett wrote: >> On Jan 5, 2019, at 4:22 PM, David Holmes wrote: >> >> webrev: http://cr.openjdk.java.net/~dholmes/8216188/webrev/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8216188 >> >> This removes all the expired flags from the special_jvm_flags table, and updates a test. >> >> Calvin: this includes IgnoreUnverifiableClassesDuringDump but I can just merge with your change in 8213002 if you push first. Or you can let me do it and close 8213002 as a dup. >> >> Testing: tiers 1-3 >> >> Thanks, >> David > > Looks good. 
>

From kim.barrett at oracle.com Sun Jan 6 18:44:05 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Sun, 6 Jan 2019 13:44:05 -0500
Subject: RFR (tedious) 8216022: Use #pragma once
In-Reply-To:
References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <8736q8smt7.fsf@oldenburg2.str.redhat.com> <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com> <575efd28-6644-ee8c-846d-3787f41ab9ce@oracle.com> <75687bd1-6fc2-51c4-557a-ffaa3f582499@redhat.com>
Message-ID: <06E408E5-9AD0-4B3D-9D2D-9D7F1CCC1FA3@oracle.com>

> On Jan 5, 2019, at 10:03 AM, Erik Österlund wrote:
>
> Hi Andrew,
>
> On 2019-01-05 12:31, Andrew Haley wrote:
>> On 1/4/19 3:59 PM, Erik Österlund wrote:
>>> On 2019-01-04 15:59, David Lloyd wrote:
>>>> In addition, it was pointed out to me that if, for some reason, a
>>>> header file ends up in more than one location on the include path,
>>>> #pragma once will (probably, as it's not standardized) allow it to
>>>> be included twice, which #ifdef guards avoid. This is perhaps not
>>>> a real concern in this particular code base though.
>>>
>>> That sounds like a bug. Just because something is implementation
>>> defined, doesn't make it okay or expected to not work. Do you know
>>> how to reproduce this, and on which platform/compiler/version?
>>> Obviously, if this was an issue in our code base, one would quickly
>>> notice it doesn't build.
>> It's not a bug, exactly. It's that the question of "is this the same
>> file?" is extremely difficult to answer definitively. Not all
>> filesystems give you a reliable way to answer that question.
Sure, you >> can kludge around the problem with modification times and maybe even a >> collision-free hash, but getting it really correct is not going to be >> efficient, and not even possible until that question is rigorously >> defined. There's a good reason why #pragma once still isn't standard. > > Perhaps. Yet Java has to support Path.toRealPath(), which resolves symlinks. And similarly, realpath() has been part of the POSIX standard since 2008. And even C++17 defines std::filesystem::canonical() as part of the standard. So it seems to me that any system that can't build HotSpot because of the inadequacy of the underlying system to tell symlinked files apart, will also not be able to support said standardized APIs either. Unless again someone shoots himself/herself in the foot intentionally and actually keeps 2 copies of the same file around, and includes both. We should *never* do that, and I would love to get a compiler error if anyone tried to do that. > > I have asked this before, but does anyone actually know of a compiler/os/filesystem combo that has a #pragma once implementation that gets confused about symbolic links or different include paths, or is this all a hypothetical problem, for a hypothetical compiler+os+filesystem combo that can probably never support e.g. C++17? Perhaps a simple test could be written for this that fails reliably on such systems, so we don't get any surprises. I would rather test this to see if it is a problem or not, instead of having a long hypothetical argument about it, based on what somebody told somebody, with guesses about how relevant compilers may or may not handle this differently and may or may not have different interpretations about whether files should be canonicalized or not. Symlinks might have been a problem in the past; I don't think they have been for a long time. I never mentioned them in my initial analysis. The cases where it's hard (perhaps impossible?) 
to determine that two files are the same that I've heard of involve different mount points that can get you to the same place. For example, there were these references from my earlier message: ---------- Some examples are discussed in the following messages. Having a bind-mount involved can mess things up, for example. I don't know if that's a realistic scenario for building the JDK or HotSpot. https://lists.qt-project.org/pipermail/development/2018-October/067467.html https://lists.qt-project.org/pipermail/development/2018-October/067471.html ---------- The second of those describes cases where realpath doesn't transform two different paths to the same location to the same canonical path. It's not even clear what a "canonical" path would be for some cases like this. This is the sort of thing that I assume has given the committee pause when considering standardization of such a feature. Whether such a setup is at all plausible when building HotSpot is a rather different question. I suspect the answer is no, or at least one would have to work hard to produce a problem. But I long ago stopped assuming that users wouldn't do things that seemed strange to me. The way that include paths can get one in trouble has to do with the "usual" behavior of '#include "..."' starting the search with respect to the current directory. (I've seen discussions in some projects suggesting that one should always use <...> syntax for include, to never bypass the configured include path.) I think the problematic cases related to this cannot happen for HotSpot, so long as there aren't "junk" files with inopportune names in the source tree. But there are ways for such to cause problems regardless, so I don't consider that a problem for this discussion. I don't presently have a strong opinion on the matter. But I do want it to be considered on its real merits, and not seemingly obsolete issues like problems with symlinks. 
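
[Editorial sketch, not part of the original thread.] The "is this the same file?" question discussed above can be tried out with the Java APIs Erik mentions. The following is a minimal sketch, assuming a local temporary directory on an ordinary filesystem; the class name SameFileCheck, the helper sameRealPath, and the file name foo.hpp are hypothetical and chosen only for illustration. It covers only the easy case of two textual spellings of one path, and says nothing about the bind-mount scenarios Kim refers to.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class SameFileCheck {
    /** Hypothetical helper: true if both paths canonicalize to the same real path. */
    static boolean sameRealPath(Path a, Path b) throws IOException {
        // toRealPath() resolves "..", "." and (by default) symbolic links.
        return a.toRealPath().equals(b.toRealPath());
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("pragma-once-demo");
        Path sub = Files.createDirectory(dir.resolve("include"));
        Path header = Files.createFile(dir.resolve("foo.hpp"));
        // A second spelling of the same file that goes through "include/..".
        Path alias = sub.resolve("..").resolve("foo.hpp");

        // Path.equals compares the spelling only, without touching the filesystem.
        System.out.println("textually equal: " + header.equals(alias));
        // Canonicalization and the filesystem's own identity check.
        System.out.println("same real path:  " + sameRealPath(header, alias));
        System.out.println("isSameFile:      " + Files.isSameFile(header, alias));
    }
}
```

A compiler implementing #pragma once has to make a comparable decision for every include; whether it compares canonical paths, file identity, or file contents is implementation-specific and is not something this sketch can show.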
From calvin.cheung at oracle.com Mon Jan 7 00:43:43 2019 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Sun, 06 Jan 2019 16:43:43 -0800 Subject: RFR (S) 8216188: Remove expired flags in JDK 13 In-Reply-To: References: Message-ID: <5C32A0BF.5000109@oracle.com> Hi David, Changes look good. I will close 8213002 as a dup. thanks, Calvin On 1/5/19, 1:22 PM, David Holmes wrote: > webrev: http://cr.openjdk.java.net/~dholmes/8216188/webrev/ > bug: https://bugs.openjdk.java.net/browse/JDK-8216188 > > This removes all the expired flags from the special_jvm_flags table, > and updates a test. > > Calvin: this includes IgnoreUnverifiableClassesDuringDump but I can > just merge with your change in 8213002 if you push first. Or you can > let me do it and close 8213002 as a dup. > > Testing: tiers 1-3 > > Thanks, > David From david.holmes at oracle.com Mon Jan 7 00:49:34 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 7 Jan 2019 10:49:34 +1000 Subject: RFR (S) 8216188: Remove expired flags in JDK 13 In-Reply-To: <5C32A0BF.5000109@oracle.com> References: <5C32A0BF.5000109@oracle.com> Message-ID: <12b06b2a-59f7-6465-e165-5c16d35b32cf@oracle.com> On 7/01/2019 10:43 am, Calvin Cheung wrote: > Hi David, > > Changes look good. > > I will close 8213002 as a dup. Thanks Calvin! David > thanks, > Calvin > > On 1/5/19, 1:22 PM, David Holmes wrote: >> webrev: http://cr.openjdk.java.net/~dholmes/8216188/webrev/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8216188 >> >> This removes all the expired flags from the special_jvm_flags table, >> and updates a test. >> >> Calvin: this includes IgnoreUnverifiableClassesDuringDump but I can >> just merge with your change in 8213002 if you push first. Or you can >> let me do it and close 8213002 as a dup. 
>>
>> Testing: tiers 1-3
>>
>> Thanks,
>> David

From per.liden at oracle.com Mon Jan 7 08:40:04 2019
From: per.liden at oracle.com (Per Liden)
Date: Mon, 7 Jan 2019 09:40:04 +0100
Subject: RFR (tedious) 8216022: Use #pragma once
In-Reply-To:
References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com>
Message-ID: <3727d1da-256c-54b7-2d9b-f819ef08cfa4@oracle.com>

Hi Coleen,

On 1/3/19 3:31 AM, coleen.phillimore at oracle.com wrote:
>
> Here is the webrev and bug link.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8216022.01/webrev

Looks like your script is now leaving an extra empty line at the end of
all files, which wasn't there before. For example:

--- old/src/hotspot/share/gc/z/zAddress.hpp 2019-01-02 16:41:04.209075410 -0500
+++ new/src/hotspot/share/gc/z/zAddress.hpp 2019-01-02 16:41:03.957075419 -0500
@@ -64,4 +63,3 @@
   static void flip_to_remapped();
 };
 
-#endif // SHARE_GC_Z_ZADDRESS_HPP

Should be:

--- old/src/hotspot/share/gc/z/zAddress.hpp 2019-01-02 16:41:04.209075410 -0500
+++ new/src/hotspot/share/gc/z/zAddress.hpp 2019-01-02 16:41:03.957075419 -0500
@@ -64,4 +63,3 @@
   static void flip_to_remapped();
 };
-
-#endif // SHARE_GC_Z_ZADDRESS_HPP

Could you please fix that?

Thanks!
Per

> bug link https://bugs.openjdk.java.net/browse/JDK-8216022
>
> On 1/2/19 9:16 PM, coleen.phillimore at oracle.com wrote:
>> Summary: change include guards to #pragma once, except in generated
>> header files.
>>
>> Tested with mach5 for linux-x64{-debug}, solaris-sparc, macosx-x64,
>> windows-x64, built aarch64 with cross compiler, and zero.
>>
>> Ran tier1 and 2 tests.
>>
>> The webrev is huge but there are only 3 lines changed in each header
>> file. So click on the patch.
>>
>> I'll update the copyright headers with a script with the commit. Also,
>> will do this after the shenandoah copyright headers are fixed.
>>
>> Adrian: I included you to check your platforms.
>>
>> Happy New Year!
>> Coleen > From Nick.Gasson at arm.com Mon Jan 7 09:34:47 2019 From: Nick.Gasson at arm.com (Nick Gasson (Arm Technology China)) Date: Mon, 7 Jan 2019 09:34:47 +0000 Subject: [aarch64-port-dev ] RFR: AArch64: jtreg test vmTestbase/nsk/jvmti/PopFrame/popframe005 segfaults In-Reply-To: References: <492d6889-d5de-be2e-3acb-67b2b2ca4907@arm.com> Message-ID: <12d2c478-1e90-bcf6-ce9f-61413103e3e3@arm.com> Hello, On 05/01/2019 12:09, Yangfei (Felix) wrote: > > Pushed as: http://hg.openjdk.java.net/jdk/jdk/rev/22baf8054a40 > Thanks Andrew and Felix! I had a quick look at the other backends: ARM32 and PPC also use a fixed register to store the dispatch table pointer. ARM32 reloads it in _remove_activation_preserving_args_entry but PPC, as far as I can tell, doesn't. I don't have access to any hardware to test on, but it looks like it's missing a line like the following after the call to restore_interpreter_state (c.f. _rethrow_exception_entry): // Compiled code destroys templateTableBase, reload. 
  __ load_const_optimized(R25_templateTableBase, (address)Interpreter::dispatch_table((TosState)0), R11_scratch1);

Nick

From aph at redhat.com Mon Jan 7 09:39:32 2019
From: aph at redhat.com (Andrew Haley)
Date: Mon, 7 Jan 2019 09:39:32 +0000
Subject: RFR (tedious) 8216022: Use #pragma once
In-Reply-To: <2c73cea1-74ae-8c98-6efc-9855aa444605@oracle.com>
References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com>
 <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com>
 <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com>
 <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com>
 <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com>
 <8736q8smt7.fsf@oldenburg2.str.redhat.com>
 <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com>
 <575efd28-6644-ee8c-846d-3787f41ab9ce@oracle.com>
 <75687bd1-6fc2-51c4-557a-ffaa3f582499@redhat.com>
 <2c73cea1-74ae-8c98-6efc-9855aa444605@oracle.com>
Message-ID: <34accc10-660a-3b96-dcd4-a72fe9e8b127@redhat.com>

On 1/5/19 7:24 PM, Erik Österlund wrote:
> I see that is a similarity indeed. But there are important differences.
>
> The main difference is that compiler internal ABI for atomics on ARMv7
> and PPC (which was my particular concern in that conversation), a) do
> have incompatible bindings that are allowed by the standard described in
> papers with proposed bindings (as I pointed out then),
>
> b) would be really dangerous if it subtly changed because it could
> go undetected for a long time before anyone noticed strange crashes
> because of it.

There are two possibilities: either it'd happen by accident or
deliberately. By accident is just a code generation bug, no different
from any other, and of course we're always at risk from those.
Deliberately would require a lot of discussion because it'd break
binary compatibility. So I don't believe it.

But that doesn't matter, I'm satisfied: purely hypothetical but
implausible arguments about what compilers might do with less than
fully standardized features are off the table.
> We essentially rely on the generated machine code to have an exact > machine code binding that is compatible. And for what it's worth, I > am okay with changing the x64 Atomic/OrderAccess implementation to > use compiler intrinsics. Because there is essentially no risk due to > the nature of the ISA. > However, hypothetical differences in whether symbolic references are > followed or not for #pragma once would lead to HotSpot either > building or not, depending on whether it relies on that or not > (pretty sure it doesn't), and never cause bugs to silently infect > the binary. Conversely, not using #pragma once and relying on all > files getting the manually typed include guards right, seems more > dangerous to me. > > So the atomics reliance comes with a risk, the #pragma once reliance > does not - it removes a risk. OK, so the argument is not hypothetical at all, but purely practical. > If we truly stop relying on compiler features that are > implementation defined, when there are no risks involved, we would > end up crippled and get nothing done. Yes. We should use the compiler to help us as much as possible, reduce our code complexity, and reduce our maintenance costs. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From erik.osterlund at oracle.com Mon Jan 7 09:51:43 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Mon, 7 Jan 2019 10:51:43 +0100 Subject: RFR: 8215889: assert(!_unloading) failed: This oop is not available to unloading class loader data with ZGC Message-ID: Hi, There are SpeculativeTrapData entries in the extra data space of MDOs that are currently not being checked for stale Method* entries due to concurrent class unloading. 
The fix involves lazily cleaning SpeculativeTrapData entries during ciMethodData::load_extra_data(), which unpacks the extra data from the source MDO to the ci copy of the MDO, that the compiler subsequently uses as reference during the ongoing compilation, and needs to have live metadata only. A new ciMethodData::prepare_metadata() method is added to ci MDO mirrors that lazily cleans the extra data space and pre-caches the ciEnv with all the metadata it encounters. When creating ciMethod handles, the Compile_lock might be taken, which strictly requires safepoint checking. Therefore, prepare_metadata() loops until it can pre-cache all live metadata without any cache misses, because that implies the subsequent code copying the MDO can not safepoint while extracting the extra data from the MDO, which is a requirement as 1) a safepoint may invalidate the metadata again, 2) both the cleaning (from the concurrent GC thread) and extraction (from the compiler thread) must be done under the mdo->extra_data_lock(). Bug: https://bugs.openjdk.java.net/browse/JDK-8215889 Webrev: http://cr.openjdk.java.net/~eosterlund/8215889/webrev.00/ Testing: hs-tier1-6, and a bunch of local testing, including 24 hours kitchensink in fastdebug. Thanks, /Erik From leo.korinth at oracle.com Mon Jan 7 09:56:28 2019 From: leo.korinth at oracle.com (Leo Korinth) Date: Mon, 7 Jan 2019 10:56:28 +0100 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> Message-ID: <3cfafa65-6e29-f62d-38ea-622b28bca4ba@oracle.com> I like this change. It is _safer_. It creates cleaner, shorter headers. It eases creation, renaming and moving of header files. It makes refactoring easier and more importantly eases review. There is only one style of it so no need to force people to use the "correct" style. Let us make use of #pragma once and thus helping it on its way to become part of the standard! 

Thanks,
Leo

On 03/01/2019 03:31, coleen.phillimore at oracle.com wrote:
>
> Here is the webrev and bug link.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8216022.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8216022
>
> On 1/2/19 9:16 PM, coleen.phillimore at oracle.com wrote:
>> Summary: change include guards to #pragma once, except in generated
>> header files.
>>
>> Tested with mach5 for linux-x64{-debug}, solaris-sparc, macosx-x64,
>> windows-x64, built aarch64 with cross compiler, and zero.
>>
>> Ran tier1 and 2 tests.
>>
>> The webrev is huge but there are only 3 lines changed in each header
>> file. So click on the patch.
>>
>> I'll update the copyright headers with a script with the commit. Also,
>> will do this after the shenandoah copyright headers are fixed.
>>
>> Adrian: I included you to check your platforms.
>>
>> Happy New Year!
>> Coleen
>

From sgehwolf at redhat.com Mon Jan 7 10:31:36 2019
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Mon, 07 Jan 2019 11:31:36 +0100
Subject: [Containers] Reasoning for cpu shares limits
In-Reply-To: <2FA9BDBB-6DBA-41F2-BAD8-0CA08B606481@oracle.com>
References: <3b7cef4912bcc0e14e64df95227f5d02d3f0fe62.camel@redhat.com>
 <2FA9BDBB-6DBA-41F2-BAD8-0CA08B606481@oracle.com>
Message-ID: <4cf4e53e2f1edf92df6da06182ee51327204905f.camel@redhat.com>

Hi Bob,

Thanks for your response!

On Fri, 2019-01-04 at 17:34 -0500, Bob Vandette wrote:
> Hi Severin,
>
> There has been much debate on the best algorithm for selecting the number of CPUs that is
> reported by the Java Runtime when running in containers.

I can imagine. I'm wondering whether all aspects have been properly
considered, though.

> Although the value for cpu-shares can be set to any of the values that you mention, we decided to
> follow the convention set by Kubernetes and other container orchestration products that use 1024 as
> the unit for cpu shares.
> Ignoring the cpu shares in this case is not what users of this popular technology
> want.

Why not? A '--cpu-shares=X' setting does not imply JVM internal CPU
limits AFAIK. Consider 3 JVM containers running on a node/host vs. 2
JVM containers running on a node/host with the *same* --cpu-shares
setting.

Effectively, after JDK-8197589, cpu shares value being ignored by the
JVM is what's happening. That's what I'm seeing for JVM containers on
k8s anyway.

>
> https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-cpu
>
> - The spec.containers[].resources.requests.cpu is converted to its core value, which is potentially fractional, and multiplied by 1024. The greater of this number or 2 is used as the value of the --cpu-shares flag in the docker run command.
> - The spec.containers[].resources.limits.cpu is converted to its millicore value and multiplied by 100. The resulting value is the total amount of CPU time that a container can use every 100ms. A container cannot use more than its share of CPU time during this interval.
>
> There are a few options that can be used if our default behavior doesn't work for you.
>
> 1. Use quotas in addition to or instead of shares.
> 2. Specify -XX:ActiveProcessorCount=value

OK. So it's modelled after how Kubernetes does things. What I'm
questioning is whether the spec.containers[].resources.requests.cpu
setting of Kubernetes should have any bearing on the number of CPUs the
*JVM* thinks are available to it, though. It's still just a relative
weight a JVM-based container would get. What if k8s decides to use a
different magic number? Should this be hard-coded in the JVM? Should
this be used in the JVM at all?

Taking the Kubernetes case, it'll usually set CPU shares *and* CPU
quota. The latter very likely being the higher value as k8s models
spec.containers[].resources.requests.cpu as a sort of minimal CPU value
and spec.containers[].resources.limits.cpu as a maximum, hard limit.
In that respect, having CPU shares' value modelled by the k8s case
*within the JVM* seems arbitrary as it won't be used anyway. Quotas
take precedence. Perhaps that's why JDK-8197589 was done after
JDK-8146115?

I'd argue that:

A) Modelling this after the k8s case and enforcing a CPU limit (within
   the JVM) based on a relative weight is still wrong. The common case
   for k8s is both settings, shares and quota, being present. After
   JDK-8197589, there is even a preference to use quota over CPU
   shares. I'd argue the PreferContainerQuotaForCPUCount JVM switch
   wouldn't be needed if CPU shares didn't have any effect on the
   internal JVM settings to begin with.
B) For no good reason, it breaks other frameworks which don't use this
   convention. Cloudfoundry is a case in point.
C) This needs to be at least documented in code as to why that decision
   has been made. Specifically "#define PER_CPU_SHARES 1024" in
   src/hotspot/os/linux/osContainer_linux.cpp.

As to the possible work-arounds:

"Use quotas in addition to or instead of shares": I'd argue that's not
an option for most (all?) use-cases. CPU quotas are stable, not relying
on other containers running on a node/host. CPU shares, on the other
hand, are just a relative weight and largely depend on the number of
other containers running on the same node/host. That's something
external to the JVM, so it can't possibly know which value it should
use. CPU quotas, IMO, make sense to have a bearing on the JVM's internal
settings as those settings are documented by the CFS bandwidth control
doc (see examples):
https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt

"ActiveProcessorCount": It's nice to have, but it needs user
intervention. For one, needing to know that this switch exists. For
two, coming up with a reasonable way to set this value. Anyway, it's a
nice stop-gap solution if things don't work as intended. We should keep
it.

Is there any chance using CPU shares for internal JVM purposes could be
reconsidered?
The argument that k8s uses 1024 as a scale factor isn't very compelling. It'll still pass it to docker via --cpu-shares, which is a relative weight. I'd be happy to help and improve this situation. Thoughts? Thanks, Severin > Bob. > > > On Jan 4, 2019, at 1:09 PM, Severin Gehwolf wrote: > > > > Hi, > > > > Having come across this cloud foundry issue[1], I wonder why the cgroup > > cpu shares' value is being used in the JVM as a heuristic for available > > processors. > > > > From the man page from docker-run: > > > > --------------------------------------------------------- > > --cpu-shares=0 > > CPU shares (relative weight) > > > > By default, all containers get the same proportion of CPU cycles. This proportion can be modified by changing the container's CPU share weighting relative to the weighting of all other running > > containers. > > > > To modify the proportion from the default of 1024, use the --cpu-shares flag to set the weighting to 2 or higher. > > > > The proportion will only apply when CPU-intensive processes are running. When tasks in one container are idle, other containers can use the left-over CPU time. The actual amount of CPU time will > > vary depending on the number of containers running on the system. > > > > For example, consider three containers, one has a cpu-share of 1024 and two others have a cpu-share setting of 512. When processes in all three containers attempt to use 100% of CPU, the first > > container would receive 50% of the total CPU time. If you add a fourth container with a cpu-share of 1024, the first container only gets 33% of the CPU. The remaining containers receive 16.5%, 16.5% > > and 33% of the CPU. > > > > On a multi-core system, the shares of CPU time are distributed over all CPU cores. Even if a container is limited to less than 100% of CPU time, it can use 100% of each individual CPU core. > > > > For example, consider a system with more than three cores. 
If you start one container {C0} with -c=512 running one process, and another container {C1} with -c=1024 running two processes, this can > > result in the following division of CPU shares: > > > > PID container CPU CPU share > > 100 {C0} 0 100% of CPU0 > > 101 {C1} 1 100% of CPU1 > > 102 {C1} 2 100% of CPU2 > > > > --------------------------------------------------------- > > > > So the cpu shares value (unlike --cpu-quota) is a relative weight. > > > > For example, those three cpu-shares settings are equivalent (C1-C4 are > > containers; '-c' is a short-cut for '--cpu-shares'): > > > > A[i] > > ------------- > > C1 => -c=122 > > C2 => -c=122 > > C3 => -c=61 > > C4 => -c=61 > > > > B[ii] > > ------------- > > C1 => -c=1026 > > C2 => -c=1026 > > C3 => -c=513 > > C4 => -c=513 > > > > C[iii] > > ------------- > > C1 => -c=2048 > > C2 => -c=2048 > > C3 => -c=1024 > > C4 => -c=1024 > > > > For A the container CPU heuristics will determine for the JVM to use 1 > > CPU for C1-C4. For B and C, the container CPU heuristics will determine > > for the JVM to use 2 CPUs for C1 and C2 and 1 CPU for C3 and C4 which > > seems rather inconsistent and arbitrary. The reason this is happening > > is that 1024 seems to have gotten a questionable meaning in [2]. I > > wonder why? > > > > The JVM cannot reasonably determine from the relative weight of --cpu- > > shares' value how many CPUs it should use. As it's a relative weight > > that's something for the container runtime to take into account. It > > appears to me that the container detection code should probably fall > > back to the host CPU value and only take CPU quotas into account. > > > > Am I missing something obvious here? All I could find was this in JDK- > > 8146115: > > """ > > If cpu_shares has been setup for the container, the number_of_cpus() > > will be calculated based on cpu_shares()/1024. 
1024 is the default and > > standard unit for calculating relative cpu > > """ > > > > "1024 is the default and standard unit for calculating relative cpu" > > seems a wrong assumption to me. Thoughts? > > > > Thanks, > > Severin > > > > [1] https://github.com/cloudfoundry/java-buildpack/issues/650#issuecomment-441777166 > > [2] http://hg.openjdk.java.net/jdk/jdk/rev/7f22774a5f42#l4.43 > > [i]* http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c122.out.log > > http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c61.out.log > > [ii]* http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c1026.out.log > > http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c513.out.log > > [iii]* http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c2048.out.log > > http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c1024.out.log > > > > * Files produced with: > > > > $ for i in 1026 513 2048 1024 122 61; do sudo docker run -ti -c=$i --rm fedora28-jdks:v1 /jdk-head/bin/java -showversion -Xlog:os+container=trace RuntimeProc > container-resources-cpu/c${i}.out.log; done > > $ sudo docker run -ti --rm fedora28-jdks:v1 cat RuntimeProc.java > > public class RuntimeProc { > > public static void main(String[] args) { > > int availProc = Runtime.getRuntime().availableProcessors(); > > System.out.println(">>> Available processors: " + availProc + " <<<<"); > > } > > } > > > > > > > > From jesper.wilhelmsson at oracle.com Mon Jan 7 11:49:10 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Mon, 7 Jan 2019 12:49:10 +0100 Subject: RFR(xs): JDK-8216266 - ProblemList PeelingZeroTripCount.java Message-ID: <5C24E804-2E2B-41EE-9559-21340F223DC0@oracle.com> Hi, The test PeelingZeroTripCount.java has been failing in JDK 12 CI since it was introduced last Thursday. Please review the patch below to add it to the problem list. I consider this a trivial change. 
Thanks, /Jesper diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt --- a/test/hotspot/jtreg/ProblemList.txt +++ b/test/hotspot/jtreg/ProblemList.txt @@ -61,6 +61,8 @@ compiler/runtime/Test8168712.java 8211769,8211771 generic-ppc64,generic-ppc64le,linux-s390x +compiler/loopopts/PeelingZeroTripCount.java 8216135 generic-all + ############################################################################# # :hotspot_gc From tobias.hartmann at oracle.com Mon Jan 7 11:52:22 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 7 Jan 2019 12:52:22 +0100 Subject: RFR(xs): JDK-8216266 - ProblemList PeelingZeroTripCount.java In-Reply-To: <5C24E804-2E2B-41EE-9559-21340F223DC0@oracle.com> References: <5C24E804-2E2B-41EE-9559-21340F223DC0@oracle.com> Message-ID: <2b9918e5-73f6-8ca4-355d-df29fba42463@oracle.com> Hi Jesper, Reviewed. Thanks, Tobias On 07.01.19 12:49, jesper.wilhelmsson at oracle.com wrote: > Hi, > > The test PeelingZeroTripCount.java has been failing in JDK 12 CI since it was introduced last Thursday. Please review the patch below to add it to the problem list. > I consider this a trivial change. 
> > Thanks, > /Jesper > > > diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt > --- a/test/hotspot/jtreg/ProblemList.txt > +++ b/test/hotspot/jtreg/ProblemList.txt > @@ -61,6 +61,8 @@ > > compiler/runtime/Test8168712.java 8211769,8211771 generic-ppc64,generic-ppc64le,linux-s390x > > +compiler/loopopts/PeelingZeroTripCount.java 8216135 generic-all > + > ############################################################################# > > # :hotspot_gc > From rwestrel at redhat.com Mon Jan 7 11:52:20 2019 From: rwestrel at redhat.com (Roland Westrelin) Date: Mon, 07 Jan 2019 12:52:20 +0100 Subject: RFR(xs): JDK-8216266 - ProblemList PeelingZeroTripCount.java In-Reply-To: <5C24E804-2E2B-41EE-9559-21340F223DC0@oracle.com> References: <5C24E804-2E2B-41EE-9559-21340F223DC0@oracle.com> Message-ID: <87muoc1y8b.fsf@redhat.com> > The test PeelingZeroTripCount.java has been failing in JDK 12 CI since it was introduced last Thursday. Please review the patch below to add it to the problem list. > I consider this a trivial change. That looks ok to me. Roland. From jesper.wilhelmsson at oracle.com Mon Jan 7 12:01:41 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Mon, 7 Jan 2019 13:01:41 +0100 Subject: RFR(xs): JDK-8216266 - ProblemList PeelingZeroTripCount.java In-Reply-To: <2b9918e5-73f6-8ca4-355d-df29fba42463@oracle.com> References: <5C24E804-2E2B-41EE-9559-21340F223DC0@oracle.com> <2b9918e5-73f6-8ca4-355d-df29fba42463@oracle.com> Message-ID: Thanks Tobias! /Jesper > On 7 Jan 2019, at 12:52, Tobias Hartmann wrote: > > Hi Jesper, > > Reviewed. > > Thanks, > Tobias > > On 07.01.19 12:49, jesper.wilhelmsson at oracle.com wrote: >> Hi, >> >> The test PeelingZeroTripCount.java has been failing in JDK 12 CI since it was introduced last Thursday. Please review the patch below to add it to the problem list. >> I consider this a trivial change. 
>> >> Thanks, >> /Jesper >> >> >> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt >> --- a/test/hotspot/jtreg/ProblemList.txt >> +++ b/test/hotspot/jtreg/ProblemList.txt >> @@ -61,6 +61,8 @@ >> >> compiler/runtime/Test8168712.java 8211769,8211771 generic-ppc64,generic-ppc64le,linux-s390x >> >> +compiler/loopopts/PeelingZeroTripCount.java 8216135 generic-all >> + >> ############################################################################# >> >> # :hotspot_gc >> From jesper.wilhelmsson at oracle.com Mon Jan 7 12:01:58 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Mon, 7 Jan 2019 13:01:58 +0100 Subject: RFR(xs): JDK-8216266 - ProblemList PeelingZeroTripCount.java In-Reply-To: <87muoc1y8b.fsf@redhat.com> References: <5C24E804-2E2B-41EE-9559-21340F223DC0@oracle.com> <87muoc1y8b.fsf@redhat.com> Message-ID: <81058C21-6299-4312-BCE0-B2A87829D0CF@oracle.com> Thanks Roland! /Jesper > On 7 Jan 2019, at 12:52, Roland Westrelin wrote: > > >> The test PeelingZeroTripCount.java has been failing in JDK 12 CI since it was introduced last Thursday. Please review the patch below to add it to the problem list. >> I consider this a trivial change. > > That looks ok to me. > > Roland. From goetz.lindenmaier at sap.com Mon Jan 7 13:07:25 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Mon, 7 Jan 2019 13:07:25 +0000 Subject: RFR(M): 8216265: [testbug] Introduce Platform.sharedLibraryPathVariableName() and adapt all tests. Message-ID: <9349eed214ce46ee81868840c0dbd54d@sap.com> Hi, Different operating systems use different names for the environment variable that contains the search paths for native libraries. This path is used in a row of tests. A switch over all OSes is needed to find out the proper variable name in each test using it. 
This change introduces a central function Platform.sharedLibraryPathVariableName() that returns "LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH", "PATH" or "LIBPATH" depending on the current OS. This change also adapts all usages of these variables in the tests to call this function. Because of the change to KDC.java I had to add @library /test/lib to much more tests than where I had to do the underlying change. The change also replaces local checking for path separators by File.pathSeparator in jdk/com/sun/jdi/PrivateTransportTest.java. The change depends on "8215975: [testbug] Adapt nsk tests to the PPC, S390 and AIX platforms." which will be moved from jdk12 to jdk soon. Please review: http://cr.openjdk.java.net/~goetz/wr19/8216265-PathVar/01/ Best regards, Goetz. From jesper.wilhelmsson at oracle.com Mon Jan 7 13:33:45 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Mon, 7 Jan 2019 14:33:45 +0100 Subject: Results: New hotspot Group Member: Chris Plummer Message-ID: Voting for Chris Plummer [1] is closed. Yes: 16 Veto: 0 Abstain: 0 According to the Bylaws definition of Lazy Consensus, this is sufficient to approve the nomination. Thanks, /Jesper [1] https://mail.openjdk.java.net/pipermail/hotspot-dev/2018-December/035815.html From bob.vandette at oracle.com Mon Jan 7 15:24:18 2019 From: bob.vandette at oracle.com (Bob Vandette) Date: Mon, 7 Jan 2019 10:24:18 -0500 Subject: [Containers] Reasoning for cpu shares limits In-Reply-To: <4cf4e53e2f1edf92df6da06182ee51327204905f.camel@redhat.com> References: <3b7cef4912bcc0e14e64df95227f5d02d3f0fe62.camel@redhat.com> <2FA9BDBB-6DBA-41F2-BAD8-0CA08B606481@oracle.com> <4cf4e53e2f1edf92df6da06182ee51327204905f.camel@redhat.com> Message-ID: <55CDA207-A732-4085-8448-60DDC46B291E@oracle.com> > On Jan 7, 2019, at 5:31 AM, Severin Gehwolf wrote: > > Hi Bob, > > Thanks for your response! 
> > On Fri, 2019-01-04 at 17:34 -0500, Bob Vandette wrote: >> Hi Severin, >> >> There has been much debate on the best algorithm for selecting the number of CPUs that is >> reported by the Java Runtime when running in containers. > > I can imagine. I'm wondering whether all aspects have been properly > considered, though. > Given that there is no perfect answer here, I've tried to come up with a solution that at least supports the most popular Cloud use cases. Cgroups pre-dated Kubernetes which used whatever facilities were available to help in its host resource allocation. I looked through the source for Mesos and found that it uses the same 1024/CPU share convention. Oracle cloud and AWS both use this convention. If it were not for this popular convention, I would have had no choice but to ignore the share value since, as you state, there is no way to know the relative value one container has over another. k8s however does use cpu requests (cpu-shares) in order to ensure that a Pod does not exceed the number of available resources. >> Although the value for cpu-shares can be set to any of the values that you mention, we decided to >> follow the convention set by Kubernetes and other container orchestration products that use 1024 as >> the unit for cpu shares. Ignoring the cpu shares in this case is not what users of this popular technology >> want. > > Why not? A '--cpu-shares=X' setting does not imply JVM internal CPU > limits AFAIK. Consider 3 JVM containers running on a node/host vs. 2 > JVM containers running on a node/host with the *same* --cpu-shares > setting. If under k8s you specify a cpu request of 2 for each, the system designer intentionally wanted to have an impact on the process running in the container. Doing nothing would cause the VM to configure its thread pools to use more than was intended resulting in thrashing under a heavily loaded system. > > Effectively, after JDK-8197589, cpu shares value being ignored by the > JVM is what's happening.
That's what I'm seeing for JVM containers on > k8s anyway. cpu-shares are only ignored if there is no cpu-quota set. I have no way of knowing if it is common to have cpu requests without cpu limits but it is possible. Here's more detail on what cpu requests and limits mean to k8s. Pod scheduling is based on requests. A Pod is scheduled to run on a Node only if the Node has enough CPU resources available to satisfy the Pod CPU request. https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/#specify-a-cpu-request-that-is-too-big-for-your-nodes > >> >> https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-cpu >> >> - The spec.containers[].resources.requests.cpu is converted to its core value, which is potentially fractional, and multiplied by 1024. The greater of this number or 2 is used as the value of the --cpu-shares flag in the docker run command. >> - The spec.containers[].resources.limits.cpu is converted to its millicore value and multiplied by 100. The resulting value is the total amount of CPU time that a container can use every 100ms. A container cannot use more than its share of CPU time during this interval. >> >> There are a few options that can be used if our default behavior doesn't work for you. >> >> 1. Use quotas in addition to or instead of shares. >> 2. Specify -XX:ActiveProcessorCount=value > > OK. So it's modelled after how Kubernetes does things. What I'm > questioning is whether the spec.containers[].resources.requests.cpu > setting of Kubernetes should have any bearing on the number of CPUs the > *JVM* thinks are available to it, though. It's still just a relative > weight a JVM-based container would get. What if k8s decides to use a > different magic number? Should this be hard-coded in the JVM? Should > this be used in the JVM at all? > > Taking the Kubernetes case, it'll usually set CPU shares *and* CPU > quota.
The latter very likely being the higher value as k8s models > spec.containers[].resources.requests.cpu as a sort of minimal CPU value > and spec.containers[].resources.limits.cpu as a maximum, hard limit. In > that respect, having CPU shares' value modelled by the k8s case *within > the JVM* seems arbitrary as it won't be used anyway. Quotas take > precedence. Perhaps that's why JDK-8197589 was done after JDK-8146115? > > I'd argue that: > > A) Modelling this after the k8s case and enforcing a CPU limit > (within the JVM) based on a relative weight is still wrong. The > common case for k8s is both settings, shares and quota, being > present. After JDK-8197589, there is even a preference to use > quota over CPU shares. I'd argue PreferContainerQuotaForCPUCount > JVM switch wouldn't be needed if CPU shares wouldn't have any > effect on the internal JVM settings to begin with. > B) It breaks other frameworks which don't use this convention for no > good reason. Cloudfoundry is a case in point. > C) This needs to be at least documented in code as to why that decision > has been made. Specifically "#define PER_CPU_SHARES 1024" in > src/hotspot/os/linux/osContainer_linux.cpp. I agree that PER_CPU_SHARES should have a comment documenting its meaning and origin. > > As to the possible work-arounds: > > "Use quotas in addition to or instead of shares": > > I'd argue that's not an option for most (all?) use-cases. CPU quotas > are stable, not relying on other containers running on a node/host. CPU > shares, on the other hand, are just a relative weight and largely > depend on the number of other containers running on the same node/host. > That's something external to the JVM, so it can't possibly know which > value it should use. 
CPU quotas, IMO, make sense to have a bearing on > the JVM's internal settings as those settings are documented by the CFS > bandwidth control doc (see examples): > https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt > > "ActiveProcessorCount": > > It's nice to have, but it needs user intervention. For one, needing to > know that this switch exists. For two, coming up with a reasonable way > to set this value. Anyway, it's a nice stop-gap solution if things > don't work as intended. We should keep it. > > Is there any chance using CPU shares for internal JVM purposes could be > reconsidered? The argument that k8s uses 1024 as a scale factor isn't > very compelling. It'll still pass it to docker via --cpu-shares, which > is a relative weight. > > I'd be happy to help and improve this situation. Thoughts? I'd like to get some feedback from the docker, k8s and CloudFoundry community before changing this algorithm once again. Churning it is almost as bad as the current situation since developers may be adapting to the new behavior. Bob. > > Thanks, > Severin > >> Bob. >> >>> On Jan 4, 2019, at 1:09 PM, Severin Gehwolf wrote: >>> >>> Hi, >>> >>> Having come across this cloud foundry issue[1], I wonder why the cgroup >>> cpu shares' value is being used in the JVM as a heuristic for available >>> processors. >>> >>> From the man page from docker-run: >>> >>> --------------------------------------------------------- >>> --cpu-shares=0 >>> CPU shares (relative weight) >>> >>> By default, all containers get the same proportion of CPU cycles. This proportion can be modified by changing the container's CPU share weighting relative to the weighting of all other running >>> containers. >>> >>> To modify the proportion from the default of 1024, use the --cpu-shares flag to set the weighting to 2 or higher. >>> >>> The proportion will only apply when CPU-intensive processes are running.
When tasks in one container are idle, other containers can use the left-over CPU time. The actual amount of CPU time will >>> vary depending on the number of containers running on the system. >>> >>> For example, consider three containers, one has a cpu-share of 1024 and two others have a cpu-share setting of 512. When processes in all three containers attempt to use 100% of CPU, the first >>> container would receive 50% of the total CPU time. If you add a fourth container with a cpu-share of 1024, the first container only gets 33% of the CPU. The remaining containers receive 16.5%, 16.5% >>> and 33% of the CPU. >>> >>> On a multi-core system, the shares of CPU time are distributed over all CPU cores. Even if a container is limited to less than 100% of CPU time, it can use 100% of each individual CPU core. >>> >>> For example, consider a system with more than three cores. If you start one container {C0} with -c=512 running one process, and another container {C1} with -c=1024 running two processes, this can >>> result in the following division of CPU shares: >>> >>> PID container CPU CPU share >>> 100 {C0} 0 100% of CPU0 >>> 101 {C1} 1 100% of CPU1 >>> 102 {C1} 2 100% of CPU2 >>> >>> --------------------------------------------------------- >>> >>> So the cpu shares value (unlike --cpu-quota) is a relative weight. >>> >>> For example, those three cpu-shares settings are equivalent (C1-C4 are >>> containers; '-c' is a short-cut for '--cpu-shares'): >>> >>> A[i] >>> ------------- >>> C1 => -c=122 >>> C2 => -c=122 >>> C3 => -c=61 >>> C4 => -c=61 >>> >>> B[ii] >>> ------------- >>> C1 => -c=1026 >>> C2 => -c=1026 >>> C3 => -c=513 >>> C4 => -c=513 >>> >>> C[iii] >>> ------------- >>> C1 => -c=2048 >>> C2 => -c=2048 >>> C3 => -c=1024 >>> C4 => -c=1024 >>> >>> For A the container CPU heuristics will determine for the JVM to use 1 >>> CPU for C1-C4. 
For B and C, the container CPU heuristics will determine >>> for the JVM to use 2 CPUs for C1 and C2 and 1 CPU for C3 and C4 which >>> seems rather inconsistent and arbitrary. The reason this is happening >>> is that 1024 seems to have gotten a questionable meaning in [2]. I >>> wonder why? >>> >>> The JVM cannot reasonably determine from the relative weight of --cpu- >>> shares' value how many CPUs it should use. As it's a relative weight >>> that's something for the container runtime to take into account. It >>> appears to me that the container detection code should probably fall >>> back to the host CPU value and only take CPU quotas into account. >>> >>> Am I missing something obvious here? All I could find was this in JDK- >>> 8146115: >>> """ >>> If cpu_shares has been setup for the container, the number_of_cpus() >>> will be calculated based on cpu_shares()/1024. 1024 is the default and >>> standard unit for calculating relative cpu >>> """ >>> >>> "1024 is the default and standard unit for calculating relative cpu" >>> seems a wrong assumption to me. Thoughts? 
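As a worked example of the arithmetic in the quoted message, here is a rough sketch of a shares-based heuristic. It assumes the share value is divided by 1024 and rounded up, then capped at the host CPU count; the rounding is inferred from the results reported for cases A, B and C, the host size of 8 is made up, and the class and method names are hypothetical. It is not the HotSpot implementation.

```java
// Hypothetical sketch of the shares-based heuristic discussed above; not HotSpot code.
final class CpuSharesSketch {

    // The unit named in the thread ("#define PER_CPU_SHARES 1024").
    static final int PER_CPU_SHARES = 1024;

    // Assumed rule: ceil(shares / 1024), at least 1, at most the host CPU count.
    static int cpusFromShares(int shares, int hostCpus) {
        int byShares = (shares + PER_CPU_SHARES - 1) / PER_CPU_SHARES;
        return Math.max(1, Math.min(byShares, hostCpus));
    }

    public static void main(String[] args) {
        int hostCpus = 8; // assumed host size, for illustration only
        for (int shares : new int[] {122, 61, 1026, 513, 2048, 1024}) {
            System.out.println("-c=" + shares + " -> "
                    + cpusFromShares(shares, hostCpus) + " CPU(s)");
        }
    }
}
```

Under these assumptions the three configurations have identical relative weights (2:2:1:1), yet only B and C yield 2 CPUs for C1 and C2, which is the inconsistency raised in the message.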
>>> >>> Thanks, >>> Severin >>> >>> [1] https://github.com/cloudfoundry/java-buildpack/issues/650#issuecomment-441777166 >>> [2] http://hg.openjdk.java.net/jdk/jdk/rev/7f22774a5f42#l4.43 >>> [i]* http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c122.out.log >>> http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c61.out.log >>> [ii]* http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c1026.out.log >>> http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c513.out.log >>> [iii]* http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c2048.out.log >>> http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c1024.out.log >>> >>> * Files produced with: >>> >>> $ for i in 1026 513 2048 1024 122 61; do sudo docker run -ti -c=$i --rm fedora28-jdks:v1 /jdk-head/bin/java -showversion -Xlog:os+container=trace RuntimeProc > container-resources-cpu/c${i}.out.log; done >>> $ sudo docker run -ti --rm fedora28-jdks:v1 cat RuntimeProc.java >>> public class RuntimeProc { >>> public static void main(String[] args) { >>> int availProc = Runtime.getRuntime().availableProcessors(); >>> System.out.println(">>> Available processors: " + availProc + " <<<<"); >>> } >>> } >>> >>> >>> >> >> > From coleen.phillimore at oracle.com Mon Jan 7 16:16:57 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 7 Jan 2019 11:16:57 -0500 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: <3727d1da-256c-54b7-2d9b-f819ef08cfa4@oracle.com> References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <3727d1da-256c-54b7-2d9b-f819ef08cfa4@oracle.com> Message-ID: On 1/7/19 3:40 AM, Per Liden wrote: > Hi Coleen, > > On 1/3/19 3:31 AM, coleen.phillimore at oracle.com wrote: >> >> Here is the webrev and bug link. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8216022.01/webrev > > Looks like your script is now leaving an extra empty line at the end > of all files, which wasn't there before. 
For example: > > --- old/src/hotspot/share/gc/z/zAddress.hpp 2019-01-02 > 16:41:04.209075410 -0500 > +++ new/src/hotspot/share/gc/z/zAddress.hpp 2019-01-02 > 16:41:03.957075419 -0500 > @@ -64,4 +63,3 @@ >    static void flip_to_remapped(); >  }; > > -#endif // SHARE_GC_Z_ZADDRESS_HPP > > Should be: > > --- old/src/hotspot/share/gc/z/zAddress.hpp 2019-01-02 > 16:41:04.209075410 -0500 > +++ new/src/hotspot/share/gc/z/zAddress.hpp 2019-01-02 > 16:41:03.957075419 -0500 > @@ -64,4 +63,3 @@ >    static void flip_to_remapped(); >  }; > - > -#endif // SHARE_GC_Z_ZADDRESS_HPP > > > Could you please fix that? Hi, I fixed the trailing blank lines (in some cases several lines) and had to hand patch files that ended with line continuation '\' from some macro. I found these in some globals files like g1_globals.hpp and one ci file, that I can't find anymore. http://cr.openjdk.java.net/~coleenp/8216022.diffs.03 Thanks, Coleen > > Thanks! > Per > >> bug link https://bugs.openjdk.java.net/browse/JDK-8216022 >> >> On 1/2/19 9:16 PM, coleen.phillimore at oracle.com wrote: >>> Summary: change include guards to #pragma once, except in generated >>> header files. >>> >>> Tested with mach5 for linux-x64{-debug}, solaris-sparc, macosx-x64, >>> windows-x64, built aarch64 with cross compiler, and zero. >>> >>> Ran tier1 and 2 tests. >>> >>> The webrev is huge but there are only 3 lines changed in each header >>> file. So click on the patch. >>> >>> I'll update the copyright headers with a script with the commit. >>> Also, will do this after the shenandoah copyright headers are fixed. >>> >>> Adrian: I included you to check your platforms. >>> >>> Happy New Year!
>>> Coleen >> From erik.osterlund at oracle.com Mon Jan 7 17:52:56 2019 From: erik.osterlund at oracle.com (Erik Österlund) Date: Mon, 7 Jan 2019 18:52:56 +0100 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <8736q8smt7.fsf@oldenburg2.str.redhat.com> <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com> Message-ID: <7d3c7187-8c15-f129-cc2a-8ba1fd8589cc@oracle.com> Hi Thomas, On 2019-01-04 18:24, Thomas Stüfe wrote: > On Fri, Jan 4, 2019 at 4:01 PM David Lloyd wrote: > >> In the end though this is an example of the kind of change that I for >> one would never allow in one of my projects: it's large, potentially >> impacts portability, and yet in the end it's not really necessary, >> being really just a style issue when it comes right down to it. >> Include guards are standard and portable. '#pragma once' is not. >> >> -- >> - DML > > FWIW, I agree with David on this. Include guards are a simple > mechanism, while the potential troubles surrounding #pragma once worry > me. Include guard errors are easy to find and fix. But the potential > issues with pragma once sound difficult to analyze and almost > impossible to fix if the compiler turns out to be the culprit. So the conversation moved on a bit since you sent this email. Sorry for the late reply. Are you still worried about this? In that case, what are your worries? > Note that at SAP people tend to build out-of-tree and often across > file system borders, with sources on a shared file system and a local > output directory. So yes, that is a common usage scenario. Doesn't sound like that would matter, as long as you don't have multiple include paths from different file systems pointing at the same header files.
> That said, I can understand Erik's pain when creating/changing so many > includes. But how common is this scenario? Changing so many include > files happens usually in the course of major rewrites which I would > hope do not occur so often that we need to optimize our workflow for > them. After all, these changes also bring other disruption: file > history gets broken, it is more difficult to compare code across JDK > versions etc. Okay. > Bottomline I would prefer keeping include guards and maybe add a tool > to generate include guards automatically. The external tool for fixing incorrect include guards does not solve the problem IMO, unless everyone is forced to use it. Here is why: 1) As a reviewer, you would still have to try to spot errors in include guards, as everyone will not be using the tool, or forget to use the tool. 2) If somebody else has a slightly different include guard to you, and you apply the tool for your change, it will cause a seemingly unrelated change to include guards of random files in your change. That means that you would have to look through your webrev for spurious include guard changes to files that did not belong to you. And reviewer would have to look out for that too. 3) Sometimes we don't want #pragma once our include guards, for example when we include stuff straight into some shared class (rather than having includes at the top of the header file). We would have to teach the external tool to understand that some headers should and some should not have guards. With #pragma once that is not a problem. It also seems to me that the external tool would be problematic iff using #pragma once is problematic, for similar reasons. If the external tool can trivially determine the file identity and generate an include guard in hotspot based on that, then I don't see how #pragma once could mess it up either. They both automatically determine file identities. Please let me know if you still feel uneasy about this change. 
Perhaps you could take the patch for a spin in your setup and see if you have any trouble? Thanks, /Erik > Thanks, Thomas > From sgehwolf at redhat.com Mon Jan 7 19:08:13 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Mon, 07 Jan 2019 20:08:13 +0100 Subject: [Containers] Reasoning for cpu shares limits In-Reply-To: <55CDA207-A732-4085-8448-60DDC46B291E@oracle.com> References: <3b7cef4912bcc0e14e64df95227f5d02d3f0fe62.camel@redhat.com> <2FA9BDBB-6DBA-41F2-BAD8-0CA08B606481@oracle.com> <4cf4e53e2f1edf92df6da06182ee51327204905f.camel@redhat.com> <55CDA207-A732-4085-8448-60DDC46B291E@oracle.com> Message-ID: On Mon, 2019-01-07 at 10:24 -0500, Bob Vandette wrote: > > Effectively, after JDK-8197589, cpu shares value being ignored by the > > JVM is what's happening. That's what I'm seeing for JVM containers on > > k8s anyway. > > cpu-shares are only ignored if there is no cpu-quota set. You mean cpu-shares are only ignored if there is a cpu-quota set too, right? > I have no way of knowing if it > is common to have cpu requests without cpu limits but it is possible. Fair enough. > Here's more detail on what cpu requests and limits mean to k8s. > > Pod scheduling is based on requests. A Pod is scheduled to run on a Node only if the Node has > enough CPU resources available to satisfy the Pod CPU request. > > https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/#specify-a-cpu-request-that-is-too-big-for-your-nodes Yes, thanks. > > > > > > > > https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-cpu > > > > > > - The spec.containers[].resources.requests.cpu is converted to its core value, which is potentially fractional, and multiplied by 1024. The greater of this number or 2 is used as the value of the --cpu-shares flag in the docker run command. > > > - The spec.containers[].resources.limits.cpu is converted to its millicore value and multiplied by 100.
The resulting value is the total amount of CPU time that a container can use every 100ms. A container cannot use more than its share of CPU time during this interval. > > > > > > There are a few options that can be used if our default behavior doesn't work for you. > > > > > > 1. Use quotas in addition to or instead of shares. > > > 2. Specify -XX:ActiveProcessorCount=value > > > > OK. So it's modelled after how Kubernetes does things. What I'm > > questioning is whether the spec.containers[].resources.requests.cpu > > setting of Kubernetes should have any bearing on the number of CPUs the > > *JVM* thinks are available to it, though. It's still just a relative > > weight a JVM-based container would get. What if k8s decides to use a > > different magic number? Should this be hard-coded in the JVM? Should > > this be used in the JVM at all? > > > > Taking the Kubernetes case, it'll usually set CPU shares *and* CPU > > quota. The latter very likely being the higher value as k8s models > > spec.containers[].resources.requests.cpu as a sort of minimal CPU value > > and spec.containers[].resources.limits.cpu as a maximum, hard limit. In > > that respect, having CPU shares' value modelled by the k8s case *within > > the JVM* seems arbitrary as it won't be used anyway. Quotas take > > precedence. Perhaps that's why JDK-8197589 was done after JDK-8146115? > > > > I'd argue that: > > > > A) Modelling this after the k8s case and enforcing a CPU limit > > (within the JVM) based on a relative weight is still wrong. The > > common case for k8s is both settings, shares and quota, being > > present. After JDK-8197589, there is even a preference to use > > quota over CPU shares. I'd argue PreferContainerQuotaForCPUCount > > JVM switch wouldn't be needed if CPU shares wouldn't have any > > effect on the internal JVM settings to begin with. > > B) It breaks other frameworks which don't use this convention for no > > good reason. Cloudfoundry is a case in point.
> > C) This needs to be at least documented in code as to why that decision > > has been made. Specifically "#define PER_CPU_SHARES 1024" in > > src/hotspot/os/linux/osContainer_linux.cpp. > > I agree that PER_CPU_SHARES should have a comment documenting its > meaning and origin. Sounds good. > I'd like to get some feedback from the docker, k8s and CloudFoundry > community before changing this algorithm once again. Churning it is > almost as bad as the current situation since developers may be adapting > to the new behavior. This seems reasonable. I'll see what I can do. Thanks, Severin From coleen.phillimore at oracle.com Mon Jan 7 19:50:01 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 7 Jan 2019 14:50:01 -0500 Subject: RFR (S) 8215575: C2 crash: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded Message-ID: Summary: Set InstanceKlass::loaded before adding classes to the subklass list, which can be read concurrently by the compiler. Thanks to Erik for the diagnosis and suggested fix. See bug comments for more details. Tested with hs-tier1-3, 6 and 8. open webrev at http://cr.openjdk.java.net/~coleenp/8215575.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8215575 Thanks, Coleen From shade at redhat.com Mon Jan 7 22:20:47 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 7 Jan 2019 23:20:47 +0100 Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name Message-ID: RFE: https://bugs.openjdk.java.net/browse/JDK-8216302 Fix: http://cr.openjdk.java.net/~shade/8216302/webrev.01/ There is already Class.name field that is used as the cache for Class.getName(). We can use that from inside the VM code when building the stack traces, to avoid converting Symbol*->String all the time. We have to take care that both Java (JVM_GetClassName) and internal paths yield the same cached value, so some code rearrangement and tests are in order.
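A rough illustration of the caching idea in plain Java, not VM code and not the JDK's actual Class implementation; every name below is hypothetical. The point is only that the conversion runs once and every later lookup returns the same cached String, whichever path asks for it.

```java
// Hypothetical illustration only; not the JDK's Class implementation.
final class CachedNameSketch {
    private final String internalName; // e.g. "java/lang/String"
    private String name;               // cache, filled on first use
    private int conversions;           // how often the slow path ran

    CachedNameSketch(String internalName) {
        this.internalName = internalName;
    }

    // Returns the cached name, converting from the internal form only once.
    String getName() {
        String n = name;
        if (n == null) {
            conversions++;
            n = internalName.replace('/', '.');
            name = n;
        }
        return n;
    }

    int conversions() {
        return conversions;
    }

    public static void main(String[] args) {
        CachedNameSketch c = new CachedNameSketch("java/lang/String");
        System.out.println(c.getName());
        System.out.println(c.getName() == c.getName()); // same cached instance
        System.out.println("conversions: " + c.conversions());
    }
}
```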
This alleviates a part of stack trace performance degradation that happened 8->11, see the umbrella issue for test and discussion: https://bugs.openjdk.java.net/browse/JDK-8151751

Linux x86_64 release:

# 8u191
StackTraceBench.test     1  avgt  15    10.851 ± 0.075  us/op
StackTraceBench.test    10  avgt  15    15.325 ± 0.089  us/op
StackTraceBench.test   100  avgt  15    59.717 ± 0.449  us/op
StackTraceBench.test  1000  avgt  15   529.020 ± 3.654  us/op

# jdk/jdk, baseline
StackTraceBench.test     1  avgt  15    23.835 ± 0.188  us/op
StackTraceBench.test    10  avgt  15    33.204 ± 0.191  us/op
StackTraceBench.test   100  avgt  15   125.195 ± 0.694  us/op
StackTraceBench.test  1000  avgt  15  1051.047 ± 9.779  us/op

# jdk/jdk, patched
StackTraceBench.test     1  avgt  15    14.450 ± 0.136  us/op
StackTraceBench.test    10  avgt  15    20.182 ± 0.088  us/op
StackTraceBench.test   100  avgt  15    77.107 ± 0.632  us/op
StackTraceBench.test  1000  avgt  15   647.128 ± 6.159  us/op

Testing: Linux x86_64 fastdebug {new test, hotspot tier1}, jdk-submit (running) Thanks, -Aleksey From david.holmes at oracle.com Tue Jan 8 01:49:52 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 8 Jan 2019 11:49:52 +1000 Subject: RFR (S) 8215575: C2 crash: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded In-Reply-To: References: Message-ID: <76a2edc1-7834-33d2-3876-7b52bd95bb37@oracle.com> Hi Coleen, On 8/01/2019 5:50 am, coleen.phillimore at oracle.com wrote: > Summary: Set InstanceKlass::loaded before adding classes to the subklass > list, which can be read concurrently by the compiler. I think you need a storestore barrier to ensure the new order is preserved. Cheers, David > Thanks to Erik for the diagnosis and suggested fix. See bug comments > for more details. > > Tested with hs-tier1-3, 6 and 8.
> > open webrev at http://cr.openjdk.java.net/~coleenp/8215575.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8215575 > > Thanks, > Coleen From david.holmes at oracle.com Tue Jan 8 02:15:34 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 8 Jan 2019 12:15:34 +1000 Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name In-Reply-To: References: Message-ID: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com> Hi Aleksey, On 8/01/2019 8:20 am, Aleksey Shipilev wrote: > RFE: > https://bugs.openjdk.java.net/browse/JDK-8216302 > > Fix: > http://cr.openjdk.java.net/~shade/8216302/webrev.01/ > > There is already Class.name field that is used as the cache for Class.getName(). We can use that > from inside the VM code when building the stack traces, to avoid converting Symbol*->String all the > time. We have to take care that both Java (JVM_GetClassName) and internal paths yield the same > cached value, so some code rearrangement and tests are in order. It seems somewhat awkward to me to have two different code paths for initializing the java.lang.Class name field. Can this be restructured a little more (change Class.getName()) so that the VM always initializes "name" and then JVM_GetClassName could just call java_lang_Class::name, rather than duplicating the logic? Specific comments: src/hotspot/share/classfile/javaClasses.cpp ! Klass* k = as_Klass(java_class); ! assert(k->is_klass(), "just checking"); ! name = k->external_name(); as_Klass already has the requisite assertions so there was no reason to change this part of the code. I see that jvm.cpp already contains the same redundant logic. Copyright years need updating. Thanks, David > This alleviates a part of stack trace performance degradation that happened 8->11, see the umbrella > issue for test and discussion: > https://bugs.openjdk.java.net/browse/JDK-8151751 > > Linux x86_64 release: > > # 8u191 > StackTraceBench.test 1 avgt 15 10.851 ±
0.075 us/op > StackTraceBench.test 10 avgt 15 15.325 ± 0.089 us/op > StackTraceBench.test 100 avgt 15 59.717 ± 0.449 us/op > StackTraceBench.test 1000 avgt 15 529.020 ± 3.654 us/op > > # jdk/jdk, baseline > StackTraceBench.test 1 avgt 15 23.835 ± 0.188 us/op > StackTraceBench.test 10 avgt 15 33.204 ± 0.191 us/op > StackTraceBench.test 100 avgt 15 125.195 ± 0.694 us/op > StackTraceBench.test 1000 avgt 15 1051.047 ± 9.779 us/op > > # jdk/jdk, patched > StackTraceBench.test 1 avgt 15 14.450 ± 0.136 us/op > StackTraceBench.test 10 avgt 15 20.182 ± 0.088 us/op > StackTraceBench.test 100 avgt 15 77.107 ± 0.632 us/op > StackTraceBench.test 1000 avgt 15 647.128 ± 6.159 us/op > > Testing: Linux x86_64 fastdebug {new test, hotspot tier1}, jdk-submit (running) > > Thanks, > -Aleksey > From shade at redhat.com Tue Jan 8 09:56:52 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 8 Jan 2019 10:56:52 +0100 Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name In-Reply-To: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com> References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com> Message-ID: <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com> On 1/8/19 3:15 AM, David Holmes wrote: > It seems somewhat awkward to me to have two different code paths for initializing the > java.lang.Class name field. Can this be restructured a little more (change Class.getName()) so that > the VM always initializes "name" and then JVM_GetClassName could just call java_lang_Class::name, > rather than duplicating the logic? Mmm. I am afraid to do this eagerly because of more memory footprint and potential bootstrapping issues. Also, I want to keep this open to implement a crazy footprint-reducing idea: nulling the fields like Class.name to conserve footprint at expense of additional call to reinstate the value afterwards. > Specific comments: > > src/hotspot/share/classfile/javaClasses.cpp > > !     Klass* k = as_Klass(java_class); > !
assert(k->is_klass(), "just checking"); > !     name = k->external_name(); > > as_Klass already has the requisite assertions so there was no reason to change this part of the > code. I see that jvm.cpp already contains the same redundant logic. Right. Ditched that change. > Copyright years need updating. Updated. Also made the test more up-to-the-point, with clearing the Class.name cache explicitly. New webrev: http://cr.openjdk.java.net/~shade/8216302/webrev.02/ Thanks, -Aleksey From robbin.ehn at oracle.com Tue Jan 8 10:21:03 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 8 Jan 2019 11:21:03 +0100 Subject: RFR(m): 8214271: Fast primitive to wake many threads In-Reply-To: <26ebf8fe-f96e-7c27-3a69-f0159ba8227d@oracle.com> References: <010211e3-93a6-80b9-678c-c84b08812e43@oracle.com> <70669453-e317-a30d-8d5a-e5b938b83c41@oracle.com> <4fb6cd22-cdd0-2419-c863-24b250ac0b16@oracle.com> <2a2679cc-b0e0-f8d0-7336-8666e1a42950@oracle.com> <01873a0f-a0fb-18b9-f7d4-98bb638e9b57@oracle.com> <26ebf8fe-f96e-7c27-3a69-f0159ba8227d@oracle.com> Message-ID: <15ec1409-39b5-3d58-47fa-c0bc3bb214ec@oracle.com> Hi Dan, On 12/21/18 11:21 PM, Daniel D. Daugherty wrote: >> Inc: >> http://cr.openjdk.java.net/~rehn/8214271/5/inc/webrev/ >> Full: >> http://cr.openjdk.java.net/~rehn/8214271/5/full/webrev/ > > It's been a while since I've reviewed this topic so I'm going with > the full webrev. Sorry for the long delay in getting back to this: Great! > src/hotspot/os/linux/waitBarrier_linux.cpp >     L35:     check_type(cond, "%s; error='%s' (errno=%s)", msg, > os::strerror(err),   \ >         nit - The '%s;' should probably be a '%s:' to match other mesg styles. > >     L41: static int futex(volatile int *addr, int futex_op, int op_arg) >     L42: { >         nit - the '{' on L42 should be at the end of L41 (with a space in front > of it). > >     L47:   assert(_futex_barrier == 0, "Already armed"); >         Please consider: >
assert(_futex_barrier == 0, "Should not be already armed: " >                  "_futex_barrier=%d", _futex_barrier); > >     L53:   assert(_futex_barrier != 0, "Not armed"); >         Please consider: >           assert(_futex_barrier != 0, "Should be armed/non-zero."); Fixed all above! > >     L58:   guarantee_with_errno(s > -1, "futex FUTEX_WAKE"); >         Please consider: >            guarantee_with_errno(s > -1, "futex FUTEX_WAKE failed: s=%d", s); > >     L72:     guarantee_with_errno((s == 0) || >     L73:                          (s == -1 && errno == EAGAIN) || >     L74:                          (s == -1 && errno == EINTR), >     L75:                          "futex FUTEX_WAIT"); >         Please consider: >           "futex FUTEX_WAIT failed: s=%d", s); This would require a macro change and 's' is of no interest since we know it's -1. So we have 3 options here: - Skip 's' since we know it's -1, as now in v6. - Hardcode the text to -1 since we know it's -1. - Fix macro to take one more parameter. > >     L76:     // Return value 0: woken up, but re-check in case of spurious wakeup >     L78:     // Error EAGAIN: we are already disarmed and so will pass the check >         Please add a period to the end of these two lines. > > src/hotspot/share/utilities/waitBarrier.hpp >     I like the new header comment. > >     L102:   // Returns implementation type. >         "type" or "description"? > >     L109-111: indent needs two more spaces. > > src/hotspot/share/utilities/waitBarrier_generic.hpp >     L31: // Except for the barrier tag itself, it uses two counters to keep the > semaphore >         Perhaps: > >           // In addition to the barrier tag, it uses two counters to keep the > semaphore > > src/hotspot/share/utilities/waitBarrier_generic.cpp >     L61:   // Loads of _barrier_threads/_waiters must not float above disarm > store. >     L62:   OrderAccess::fence(); >
Since you are using fence() here, you should also say that "and disarm > store >         must not float below." > >     L73:   OrderAccess::fence(); >         Missing your rationale comment for this fence(). > >     L79:     OrderAccess::fence(); >         I was expecting this one to be OrderAccess::loadload(). > >         Also, missing your rationale comment for this OrderAccess use. Fixed all above! > > test/hotspot/gtest/utilities/test_waitBarrier.cpp >     L59:       OrderAccess::storeload(); // Loads in WB must not float up. >         Not sure why this storeload() is here instead of being in wait(). >         Seems like wait() should do the "right thing". I chose to provide only a trailing fence for all operations, for consistency. In the safepoint patch we already have correct ordering from other operations in 3 out of 4 places, i.e. only one place needs an explicit leading fence. > > Thumbs up on the code! I don't need to see a new webrev if you choose > to fix the minor things I pointed out above. Including changes for David's comments: Full: http://cr.openjdk.java.net/~rehn/8214271/6/full/webrev/ Inc : http://cr.openjdk.java.net/~rehn/8214271/6/inc/webrev/ Thanks, Robbin > > Dan > > >> >> gtest passes thousands of loops locally and hundreds in mach5. >> >> Thanks, Robbin >> >>> >>> Thanks, >>> David >>> >>>>> >>>>> s/Implementation/Implementations/ >>>> >>>> Fixed >>>> >>>>> >>>>> The fourth line is no longer needed. >>>> >>>> Above is the reason I would like to keep the fourth line, since only if you >>>> call >>>> both disarm() and wake() you have that guarantee that waiter threads will >>>> return. >>>> >>>> Thanks, Robbin >>>> >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> >>>>>> Inc: >>>>>> http://cr.openjdk.java.net/~rehn/8214271/4/inc/webrev/ >>>>>> >>>>>> Full: >>>>>> http://cr.openjdk.java.net/~rehn/8214271/4/full/webrev/ >>>>>> >>>>>> /Robbin >>>>>> >>>>>>> >>>>>>> Otherwise this all looks good! 
>>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>> >>>>>>>> Full: >>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/3/full/webrev/ >>>>>>>> >>>>>>>> Thanks, Robbin >>>>>>>> >>>>>>>> On 11/23/18 5:55 PM, Robbin Ehn wrote: >>>>>>>>> Forgot RFR in subject. >>>>>>>>> >>>>>>>>> /Robbin >>>>>>>>> >>>>>>>>> On 2018-11-23 17:51, Robbin Ehn wrote: >>>>>>>>>> Hi all, please review. >>>>>>>>>> >>>>>>>>>> When a safepoint is ended we need a way to get back to 100% >>>>>>>>>> utilization as fast >>>>>>>>>> as possible. 100% utilization means no idle cpu in the system if there >>>>>>>>>> is a >>>>>>>>>> JavaThread that could be executed. The traditional ways to wake many, >>>>>>>>>> e.g. >>>>>>>>>> semaphore, pthread_cond, is not implemented with a single syscall >>>>>>>>>> instead they >>>>>>>>>> typical do one syscall per thread to wake. >>>>>>>>>> >>>>>>>>>> This change-set contains that primitive, the WaitBarrier, and a gtest >>>>>>>>>> for it. >>>>>>>>>> No actual users, which is in coming patches. >>>>>>>>>> >>>>>>>>>> The WaitBarrier solves by doing a cooperative semaphore posting, >>>>>>>>>> threads woken >>>>>>>>>> will also post. On Linux we can instead directly use a futex and with one >>>>>>>>>> syscall wake all. Depending on how many threads and cpus the >>>>>>>>>> performance vary, >>>>>>>>>> but a good utilization of the machine, just on the edge of saturated, >>>>>>>>>> the time to reach 100% utilization is around 3 times faster with the >>>>>>>>>> WaitBarrier (where futex is faster than semaphore). >>>>>>>>>> >>>>>>>>>> Webrev: >>>>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/webrev/ >>>>>>>>>> >>>>>>>>>> CR: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8214271 >>>>>>>>>> >>>>>>>>>> Passes 100 iterations of gtest on our platforms, both fastdebug and >>>>>>>>>> release. >>>>>>>>>> And have been stable when used in safepoints (t1-8) (coming patches). 
>>>>>>>>>> >>>>>>>>>> Thanks, Robbin > From robbin.ehn at oracle.com Tue Jan 8 10:42:53 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 8 Jan 2019 11:42:53 +0100 Subject: RFR(m): 8214271: Fast primitive to wake many threads In-Reply-To: <41f5252b-3eb9-9a9e-70e5-49f6d8f9d670@oracle.com> References: <010211e3-93a6-80b9-678c-c84b08812e43@oracle.com> <70669453-e317-a30d-8d5a-e5b938b83c41@oracle.com> <4fb6cd22-cdd0-2419-c863-24b250ac0b16@oracle.com> <2a2679cc-b0e0-f8d0-7336-8666e1a42950@oracle.com> <01873a0f-a0fb-18b9-f7d4-98bb638e9b57@oracle.com> <41f5252b-3eb9-9a9e-70e5-49f6d8f9d670@oracle.com> Message-ID: <755aaf5b-8a49-ef5a-65ce-18550547a91b@oracle.com> Hi David, On 1/2/19 12:35 AM, David Holmes wrote: >>> Further this sounds like a race that could lead to bugs if not used very >>> carefully ie. you can't assume between disarm() and wake() that all threads >>> are blocked. >> >> I didn't realize how subtle this is. I think your original comment that >> disarm/wake should be one operation was spot on. >> Investigating... thinking... testing... yes I think this will work, fixed! >> Sorry for not looking more into this before. > > I'm now curious how this will actually work in the context of the safepoint > changes? Since the code already handles this 'invariant' of threads not being blocked between disarm() and wake(), doing it in one operation just very slightly increases the chance that a thread will be blocked when we could actually handle it running, but reduces the chance of hitting a false-positive TLH poll. (With TLH we have a two-step un-synchronizing out of safepoints, where we must change the global safepoint state before changing the thread polling state.) (I have some thoughts on simplifying TLH/safepoint states.) > Nit: I would have kept disarm() rather than wake() as I like the arm/disarm > duality. Yes, me too. Not sure why I did the opposite, fixed! > >   void GenericWaitBarrier::wait(int barrier_tag) { >     
assert(barrier_tag != 0, "Trying to wait on disarmed value"); > +   if (barrier_tag == 0 && barrier_tag != _barrier_tag) { > +     OrderAccess::fence(); > +     return; > +   } > > I don't understand what the above is doing. A barrier_tag of 0 is a programming > error caught during testing in debug builds. You don't need to account for it > being 0 in product because this isn't something that can come in from an > external source - we have full code control here. And even if you want to be > this paranoid why would you need the fence? Fixed, but kept the fence, since we say we are providing a trailing fence. Otherwise I would like to add that exception to the description of wait(). Including Dan's comments: Full: http://cr.openjdk.java.net/~rehn/8214271/6/full/webrev/ Inc : http://cr.openjdk.java.net/~rehn/8214271/6/inc/webrev/ Thanks, Robbin > > Thanks, > David > ----- > >> Full: >> http://cr.openjdk.java.net/~rehn/8214271/5/full/webrev/ >> >> gtest passes thousands of loops locally and hundreds in mach5. >> >> Thanks, Robbin >> >>> >>> Thanks, >>> David >>> >>>>> >>>>> s/Implementation/Implementations/ >>>> >>>> Fixed >>>> >>>>> >>>>> The fourth line is no longer needed. >>>> >>>> Above is the reason I would like to keep the fourth line, since only if you >>>> call >>>> both disarm() and wake() you have that guarantee that waiter threads will >>>> return. >>>> >>>> Thanks, Robbin >>>> >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> >>>>>> Inc: >>>>>> http://cr.openjdk.java.net/~rehn/8214271/4/inc/webrev/ >>>>>> >>>>>> Full: >>>>>> http://cr.openjdk.java.net/~rehn/8214271/4/full/webrev/ >>>>>> >>>>>> /Robbin >>>>>> >>>>>>> >>>>>>> Otherwise this all looks good! >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>> >>>>>>>> Full: >>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/3/full/webrev/ >>>>>>>> >>>>>>>> Thanks, Robbin >>>>>>>> >>>>>>>> On 11/23/18 5:55 PM, Robbin Ehn wrote: >>>>>>>>> Forgot RFR in subject. 
>>>>>>>>> >>>>>>>>> /Robbin >>>>>>>>> >>>>>>>>> On 2018-11-23 17:51, Robbin Ehn wrote: >>>>>>>>>> Hi all, please review. >>>>>>>>>> >>>>>>>>>> When a safepoint is ended we need a way to get back to 100% >>>>>>>>>> utilization as fast >>>>>>>>>> as possible. 100% utilization means no idle cpu in the system if there >>>>>>>>>> is a >>>>>>>>>> JavaThread that could be executed. The traditional ways to wake many, >>>>>>>>>> e.g. >>>>>>>>>> semaphore, pthread_cond, is not implemented with a single syscall >>>>>>>>>> instead they >>>>>>>>>> typical do one syscall per thread to wake. >>>>>>>>>> >>>>>>>>>> This change-set contains that primitive, the WaitBarrier, and a gtest >>>>>>>>>> for it. >>>>>>>>>> No actual users, which is in coming patches. >>>>>>>>>> >>>>>>>>>> The WaitBarrier solves by doing a cooperative semaphore posting, >>>>>>>>>> threads woken >>>>>>>>>> will also post. On Linux we can instead directly use a futex and with one >>>>>>>>>> syscall wake all. Depending on how many threads and cpus the >>>>>>>>>> performance vary, >>>>>>>>>> but a good utilization of the machine, just on the edge of saturated, >>>>>>>>>> the time to reach 100% utilization is around 3 times faster with the >>>>>>>>>> WaitBarrier (where futex is faster than semaphore). >>>>>>>>>> >>>>>>>>>> Webrev: >>>>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/webrev/ >>>>>>>>>> >>>>>>>>>> CR: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8214271 >>>>>>>>>> >>>>>>>>>> Passes 100 iterations of gtest on our platforms, both fastdebug and >>>>>>>>>> release. >>>>>>>>>> And have been stable when used in safepoints (t1-8) (coming patches). 
>>>>>>>>>> >>>>>>>>>> Thanks, Robbin From david.holmes at oracle.com Tue Jan 8 11:27:55 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 8 Jan 2019 21:27:55 +1000 Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name In-Reply-To: <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com> References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com> <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com> Message-ID: <1366d544-baca-e9db-b59c-0daf2b054464@oracle.com> On 8/01/2019 7:56 pm, Aleksey Shipilev wrote: > On 1/8/19 3:15 AM, David Holmes wrote: >> It seems somewhat awkward to me to have two different code paths for initializing the >> java.lang.Class name field. Can this be restructured a little more (change Class.getName()) so that >> the VM always initializes "name" and then JVM_GetClassName could just call java_lang_Class::name, >> rather than duplicating the logic? > > Mmm. I am afraid to do this eagerly because of more memory footprint and potential bootstrapping > issues. I said nothing about doing this eagerly. All I'm suggesting is that instead of getName() doing: if (this.name == null) this.name = getName0(); // calls JVM_GetClassName return this.name; it just does: if (this.name == null) getName0(); // calls JVM_GetClassName return this.name; and JVM_GetClassName calls java_lang_Class::name() which sets "this.name" as a side-effect (as it does if called from the stacktrace code). > Also, I want to keep this open to implement a crazy footprint-reducing idea: nulling the > fields like Class.name to conserve footprint at expense of additional call to reinstate the value > afterwards. I don't see any impact on that plan. Of course you'll need to ensure the cached name is updated in a thread-safe manner. Cheers, David ----- >> Specific comments: >> >> src/hotspot/share/classfile/javaClasses.cpp >> >> !???? Klass* k = as_Klass(java_class); >> !???? assert(k->is_klass(), "just checking"); >> !???? 
name = k->external_name(); >> >> as_Klass already has the requisite assertions so there was no reason to change this part of the >> code. I see that jvm.cpp already contains the same redundant logic. > > Right. Ditched that change. > >> Copyright years need updating. > > Updated. > > Also made the test more up-to-the-point, with clearing the Class.name cache explicitly. > > New webrev: > http://cr.openjdk.java.net/~shade/8216302/webrev.02/ > > Thanks, > -Aleksey > From erik.osterlund at oracle.com Tue Jan 8 11:59:26 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Tue, 8 Jan 2019 12:59:26 +0100 Subject: RFR (S) 8215575: C2 crash: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded In-Reply-To: <76a2edc1-7834-33d2-3876-7b52bd95bb37@oracle.com> References: <76a2edc1-7834-33d2-3876-7b52bd95bb37@oracle.com> Message-ID: Hi David, The required synchronization is that the _subklass link is read/written with at least acquire/release semantics, correspondingly. And now they are. (when appending, the link gets written with a conservative CAS, and the link is loaded with load_acquire). Thanks, /Erik On 2019-01-08 02:49, David Holmes wrote: > Hi Coleen, > > On 8/01/2019 5:50 am, coleen.phillimore at oracle.com wrote: >> Summary: Set InstanceKlass::loaded before adding classes to the >> subklass list, which can be read concurrently by the compiler. > > I think you need a storestore barrier to ensure the new order is > preserved. > > Cheers, > David > >> Thanks to Erik for the diagnosis and suggested fix. See bug comments >> for more details. >> >> Tested with hs-tier1-3, 6 and 8. 
>> >> open webrev at http://cr.openjdk.java.net/~coleenp/8215575.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8215575 >> >> Thanks, >> Coleen From erik.osterlund at oracle.com Tue Jan 8 12:22:12 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Tue, 8 Jan 2019 13:22:12 +0100 Subject: RFR (S) 8215575: C2 crash: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded In-Reply-To: References: Message-ID: <4becf4ef-93e7-2195-0aed-ba16d16e7925@oracle.com> Hi Coleen, Looks good. /Erik On 2019-01-07 20:50, coleen.phillimore at oracle.com wrote: > Summary: Set InstanceKlass::loaded before adding classes to the > subklass list, which can be read concurrently by the compiler. > > Thanks to Erik for the diagnosis and suggested fix. See bug comments > for more details. > > Tested with hs-tier1-3, 6 and 8. > > open webrev at http://cr.openjdk.java.net/~coleenp/8215575.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8215575 > > Thanks, > Coleen From david.holmes at oracle.com Tue Jan 8 12:58:47 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 8 Jan 2019 22:58:47 +1000 Subject: RFR (S) 8215575: C2 crash: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded In-Reply-To: References: <76a2edc1-7834-33d2-3876-7b52bd95bb37@oracle.com> Message-ID: <73969a98-1e0d-f5fa-14b3-adf87ee3b933@oracle.com> On 8/01/2019 9:59 pm, Erik Österlund wrote: > Hi David, > > The required synchronization is that the _subklass link is read/written > with at least acquire/release semantics, correspondingly. And now they > are. (when appending, the link gets written with a conservative CAS, and > the link is loaded with load_acquire). Okay. I took a look inside append_to_sibling_list and see there is lots of ordering control in there. Aside: why do you need an Atomic::store in set_next_sibling ?? 
Thanks, David > Thanks, > /Erik > > On 2019-01-08 02:49, David Holmes wrote: >> Hi Coleen, >> >> On 8/01/2019 5:50 am, coleen.phillimore at oracle.com wrote: >>> Summary: Set InstanceKlass::loaded before adding classes to the >>> subklass list, which can be read concurrently by the compiler. >> >> I think you need a storestore barrier to ensure the new order is >> preserved. >> >> Cheers, >> David >> >>> Thanks to Erik for the diagnosis and suggested fix. See bug comments >>> for more details. >>> >>> Tested with hs-tier1-3, 6 and 8. >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8215575.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8215575 >>> >>> Thanks, >>> Coleen > From erik.osterlund at oracle.com Tue Jan 8 13:08:08 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Tue, 8 Jan 2019 14:08:08 +0100 Subject: RFR (S) 8215575: C2 crash: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded In-Reply-To: <73969a98-1e0d-f5fa-14b3-adf87ee3b933@oracle.com> References: <76a2edc1-7834-33d2-3876-7b52bd95bb37@oracle.com> <73969a98-1e0d-f5fa-14b3-adf87ee3b933@oracle.com> Message-ID: <9ca5c50b-48a2-5883-3108-a34724d1fada@oracle.com> Hi David, On 2019-01-08 13:58, David Holmes wrote: > On 8/01/2019 9:59 pm, Erik Österlund wrote: >> Hi David, >> >> The required synchronization is that the _subklass link is >> read/written with at least acquire/release semantics, >> correspondingly. And now they are. (when appending, the link gets >> written with a conservative CAS, and the link is loaded with >> load_acquire). > > Okay. I took a look inside append_to_sibling_list and see there is > lots of ordering control in there. > > Aside: why do you need an Atomic::store in set_next_sibling ?? Because it is read concurrently. Despite being read concurrently, they do not need load_acquire, because the entries read are strictly older than the entry you call it on, due to prepending in the list. 
So therefore, the acquire of the _subklass link protects both the _subklass and all _siblings it has. But I still want the atomicity to avoid e.g. word tearing. Now you won't get word tearing anyway because compilers are nice to us, but by annotating it as Atomic, we can eventually get better guarantees about that once we plug in Atomic to C++11 atomics. Thanks, /Erik > Thanks, > David > >> Thanks, >> /Erik >> >> On 2019-01-08 02:49, David Holmes wrote: >>> Hi Coleen, >>> >>> On 8/01/2019 5:50 am, coleen.phillimore at oracle.com wrote: >>>> Summary: Set InstanceKlass::loaded before adding classes to the >>>> subklass list, which can be read concurrently by the compiler. >>> >>> I think you need a storestore barrier to ensure the new order is >>> preserved. >>> >>> Cheers, >>> David >>> >>>> Thanks to Erik for the diagnosis and suggested fix.? See bug >>>> comments for more details. >>>> >>>> Tested with hs-tier1-3, 6 and 8. >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8215575.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8215575 >>>> >>>> Thanks, >>>> Coleen >> From david.holmes at oracle.com Tue Jan 8 13:15:55 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 8 Jan 2019 23:15:55 +1000 Subject: RFR (S) 8215575: C2 crash: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded In-Reply-To: <9ca5c50b-48a2-5883-3108-a34724d1fada@oracle.com> References: <76a2edc1-7834-33d2-3876-7b52bd95bb37@oracle.com> <73969a98-1e0d-f5fa-14b3-adf87ee3b933@oracle.com> <9ca5c50b-48a2-5883-3108-a34724d1fada@oracle.com> Message-ID: <48cc306d-e9b5-eae3-457b-114e8a9f4947@oracle.com> On 8/01/2019 11:08 pm, Erik ?sterlund wrote: > Hi David, > > On 2019-01-08 13:58, David Holmes wrote: >> On 8/01/2019 9:59 pm, Erik ?sterlund wrote: >>> Hi David, >>> >>> The required synchronization is that the _subklass link is >>> read/written with at least acquire/release semantics, >>> correspondingly. And now they are. 
(when appending, the link gets >>> written with a conservative CAS, and the link is loaded with >>> load_acquire). >> >> Okay. I took a look inside append_to_sibling_list and see there is >> lots of ordering control in there. >> >> Aside: why do you need an Atomic::store in set_next_sibling ?? > > Because it is read concurrently. Despite being read concurrently, they > do not need load_acquire, because the entries read are strictly older > than the entry you call it on, due to prepending in the list. So > therefore, the acquire of the _subklass link protects both the _subklass > and all _siblings it has. But I still want the atomicity to avoid e.g. > word tearing. Now you won't get word tearing anyway because compilers > are nice to us, but by annotating it as Atomic, we can eventually get > better guarantees about that once we plug in Atomic to C++11 atomics. Okay it's late here and I'm tired, but this seems excessively conservative. Atomic::store should only be needed for 64-bit non-pointer types (in case of 32-bit system) or an unaligned access that isn't guaranteed atomic by the platform. Otherwise we'd need Atomic::store and Atomic::load all over the place! David > Thanks, > /Erik > >> Thanks, >> David >> >>> Thanks, >>> /Erik >>> >>> On 2019-01-08 02:49, David Holmes wrote: >>>> Hi Coleen, >>>> >>>> On 8/01/2019 5:50 am, coleen.phillimore at oracle.com wrote: >>>>> Summary: Set InstanceKlass::loaded before adding classes to the >>>>> subklass list, which can be read concurrently by the compiler. >>>> >>>> I think you need a storestore barrier to ensure the new order is >>>> preserved. >>>> >>>> Cheers, >>>> David >>>> >>>>> Thanks to Erik for the diagnosis and suggested fix.? See bug >>>>> comments for more details. >>>>> >>>>> Tested with hs-tier1-3, 6 and 8. 
>>>>> >>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8215575.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8215575 >>>>> >>>>> Thanks, >>>>> Coleen >>> > From sgehwolf at redhat.com Tue Jan 8 13:22:53 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Tue, 08 Jan 2019 14:22:53 +0100 Subject: RFR (XS): 8216366: Add rationale to PER_CPU_SHARES define Message-ID: <55552298b5cb113a39be4d40d096cbf0bb3a3e36.camel@redhat.com> Hi, Could I get reviews for this comment-only change, please. It's meant to clarify as to why a magic number of 1024 is being used for PER_CPU_SHARES define for the JVM container support. Bug: https://bugs.openjdk.java.net/browse/JDK-8216366 webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8216366/webrev.01/ More info is in this thread: http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-January/036087.html Thanks, Severin From sgehwolf at redhat.com Tue Jan 8 13:25:50 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Tue, 08 Jan 2019 14:25:50 +0100 Subject: [Containers] Reasoning for cpu shares limits In-Reply-To: <55CDA207-A732-4085-8448-60DDC46B291E@oracle.com> References: <3b7cef4912bcc0e14e64df95227f5d02d3f0fe62.camel@redhat.com> <2FA9BDBB-6DBA-41F2-BAD8-0CA08B606481@oracle.com> <4cf4e53e2f1edf92df6da06182ee51327204905f.camel@redhat.com> <55CDA207-A732-4085-8448-60DDC46B291E@oracle.com> Message-ID: <5125f6312f0f4140d35a88fc23001ef420c1e42c.camel@redhat.com> On Mon, 2019-01-07 at 10:24 -0500, Bob Vandette wrote: > > C) This needs to be at least documented in code as to why that > > decision > > has been made. Specifically "#define PER_CPU_SHARES 1024" in > > src/hotspot/os/linux/osContainer_linux.cpp. > > I agree that PER_CPU_SHARES should have a comment documenting its > meaning and origin. 
See http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-January/036137.html Thanks, Severin From bob.vandette at oracle.com Tue Jan 8 13:31:42 2019 From: bob.vandette at oracle.com (Bob Vandette) Date: Tue, 8 Jan 2019 08:31:42 -0500 Subject: RFR (XS): 8216366: Add rationale to PER_CPU_SHARES define In-Reply-To: <55552298b5cb113a39be4d40d096cbf0bb3a3e36.camel@redhat.com> References: <55552298b5cb113a39be4d40d096cbf0bb3a3e36.camel@redhat.com> Message-ID: <7273109D-9CDE-4958-9BCE-60143B1EF605@oracle.com> I don't know how folks feel about embedding possibly bit-rotting URLs in the sources but it looks good to me. Thanks for doing this. Bob. > On Jan 8, 2019, at 8:22 AM, Severin Gehwolf wrote: > > Hi, > > Could I get reviews for this comment-only change, please. It's meant to > clarify as to why a magic number of 1024 is being used for > PER_CPU_SHARES define for the JVM container support. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8216366 > webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8216366/webrev.01/ > > More info is in this thread: > http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-January/036087.html > > Thanks, > Severin > From erik.osterlund at oracle.com Tue Jan 8 14:14:08 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Tue, 8 Jan 2019 15:14:08 +0100 Subject: RFR (S) 8215575: C2 crash: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded In-Reply-To: <48cc306d-e9b5-eae3-457b-114e8a9f4947@oracle.com> References: <76a2edc1-7834-33d2-3876-7b52bd95bb37@oracle.com> <73969a98-1e0d-f5fa-14b3-adf87ee3b933@oracle.com> <9ca5c50b-48a2-5883-3108-a34724d1fada@oracle.com> <48cc306d-e9b5-eae3-457b-114e8a9f4947@oracle.com> Message-ID: Hi David, On 2019-01-08 14:15, David Holmes wrote: > On 8/01/2019 11:08 pm, Erik Österlund wrote: >> Hi David, >> >> On 2019-01-08 13:58, David Holmes wrote: >>> On 8/01/2019 9:59 pm, Erik Österlund wrote: >>>> Hi David, >>>> >>>> The required synchronization is 
that the _subklass link is >>>> read/written with at least acquire/release semantics, >>>> correspondingly. And now they are. (when appending, the link gets >>>> written with a conservative CAS, and the link is loaded with >>>> load_acquire). >>> >>> Okay. I took a look inside append_to_sibling_list and see there is >>> lots of ordering control in there. >>> >>> Aside: why do you need an Atomic::store in set_next_sibling ?? >> >> Because it is read concurrently. Despite being read concurrently, >> they do not need load_acquire, because the entries read are strictly >> older than the entry you call it on, due to prepending in the list. >> So therefore, the acquire of the _subklass link protects both the >> _subklass and all _siblings it has. But I still want the atomicity to >> avoid e.g. word tearing. Now you won't get word tearing anyway >> because compilers are nice to us, but by annotating it as Atomic, we >> can eventually get better guarantees about that once we plug in >> Atomic to C++11 atomics. > > Okay it's late here and I'm tired, but this seems excessively > conservative. Atomic::store should only be needed for 64-bit > non-pointer types (in case of 32-bit system) or an unaligned access > that isn't guaranteed atomic by the platform. Otherwise we'd need > Atomic::store and Atomic::load all over the place! Right. The compiler is nice enough to give us atomic accesses as you mention. But since C++11 (which we are planning to upgrade past soonish I think), the standard is explicit about stating you can't assume you will not get word tearing on volatile accesses, even on naturally aligned word sized primitives. Only with native C++ atomics do you get those guarantees. Personally I think that is a really evil compatibility issue. I guess in practice compilers continue working fine anyway, because there is no good reason to break code just for the fun of it. 
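A minimal sketch of the pattern described above, assuming C++11 std::atomic rather than HotSpot's Atomic/OrderAccess API. Node, Parent, prepend and sum are hypothetical names standing in for the Klass links, not the actual HotSpot code. The head link is written with a (conservative, sequentially consistent) CAS and read with an acquire load, while the sibling link only needs atomicity, so it uses relaxed accesses:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a Klass entry in the sibling list.
struct Node {
  int value;
  std::atomic<Node*> next_sibling{nullptr};  // plays the role of _next_sibling
  explicit Node(int v) : value(v) {}
};

// Hypothetical stand-in for the superclass that owns the list head.
struct Parent {
  std::atomic<Node*> subklass{nullptr};  // plays the role of _subklass

  // Prepend a node. The sibling store is relaxed: it only has to be
  // untorn, because the CAS on the head publishes the node together
  // with every (strictly older) entry reachable from it.
  void prepend(Node* n) {
    Node* head = subklass.load(std::memory_order_relaxed);
    do {
      n->next_sibling.store(head, std::memory_order_relaxed);
    } while (!subklass.compare_exchange_weak(head, n));  // seq_cst CAS
  }

  // Reader: one acquire load of the head, relaxed loads along the chain.
  int sum() const {
    int total = 0;
    for (Node* n = subklass.load(std::memory_order_acquire); n != nullptr;
         n = n->next_sibling.load(std::memory_order_relaxed)) {
      total += n->value;
    }
    return total;
  }
};
```

Under these assumptions a reader that observes a given head also observes every older sibling, which is why the chain itself carries no acquire; declaring the link as an atomic is what rules out word tearing by the standard rather than by compiler goodwill.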
Anyway, this is why in code I write, I try to use Atomic::load/store as good practice when I need atomicity, to reduce headache later on when we reroute Atomic::load/store to std::atomic. /Erik > David > >> Thanks, >> /Erik >> >>> Thanks, >>> David >>> >>>> Thanks, >>>> /Erik >>>> >>>> On 2019-01-08 02:49, David Holmes wrote: >>>>> Hi Coleen, >>>>> >>>>> On 8/01/2019 5:50 am, coleen.phillimore at oracle.com wrote: >>>>>> Summary: Set InstanceKlass::loaded before adding classes to the >>>>>> subklass list, which can be read concurrently by the compiler. >>>>> >>>>> I think you need a storestore barrier to ensure the new order is >>>>> preserved. >>>>> >>>>> Cheers, >>>>> David >>>>> >>>>>> Thanks to Erik for the diagnosis and suggested fix. See bug >>>>>> comments for more details. >>>>>> >>>>>> Tested with hs-tier1-3, 6 and 8. >>>>>> >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8215575.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8215575 >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>> >> From zgu at redhat.com Tue Jan 8 14:19:07 2019 From: zgu at redhat.com (zgu at redhat.com) Date: Tue, 08 Jan 2019 09:19:07 -0500 Subject: [Fwd: Submit repo] References: <1546950303.3477.48.camel@redhat.com> Message-ID: <1546957147.3477.50.camel@redhat.com> Could anyone from Oracle help? Thanks, -Zhengyu -------- Forwarded Message -------- From: zgu at redhat.com To: ops at openjdk.java.net Subject: Submit repo Date: Tue, 08 Jan 2019 07:25:03 -0500 Hi, I submitted two tests [1][2] yesterday. After almost 24 hours, I have yet to receive any results; could someone take a look? 
Thanks, -Zhengyu [1] https://mail.openjdk.java.net/pipermail/jdk-submit-changes/2019-January/004668.html [2] https://mail.openjdk.java.net/pipermail/jdk-submit-changes/2019-January/004669.html From dms at samersoff.net Tue Jan 8 14:28:34 2019 From: dms at samersoff.net (Dmitry Samersoff) Date: Tue, 8 Jan 2019 17:28:34 +0300 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: <75687bd1-6fc2-51c4-557a-ffaa3f582499@redhat.com> References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <8736q8smt7.fsf@oldenburg2.str.redhat.com> <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com> <575efd28-6644-ee8c-846d-3787f41ab9ce@oracle.com> <75687bd1-6fc2-51c4-557a-ffaa3f582499@redhat.com> Message-ID: <8bb894e1-5776-de24-0938-4ec3c297345b@samersoff.net> On 05.01.2019 14:31, Andrew Haley wrote: My $.02 > It's that the question of "is this the same > file?" is extremely difficult to answer definitively. I second Andrew on this concern, and I trust a human more than a compiler here. A human mistake could be recognized and fixed easily, at worst at the cost of a simple Python script to verify guards. But if the compiler makes the wrong decision to include/not include a file, then the problem will cost a fortune to resolve. In addition, the ability to manually define macros to enforce a specific order of inclusion is sometimes very useful. -Dmitry > On 1/4/19 3:59 PM, Erik Österlund wrote: > >> On 2019-01-04 15:59, David Lloyd wrote: >>> In addition, it was pointed out to me that if, for some reason, a >>> header file ends up in more than one location on the include path, >>> #pragma once will (probably, as it's not standardized) allow it to >>> be included twice, which #ifdef guards avoid. This is perhaps not >>> a real concern in this particular code base though. 
>> >> That sounds like a bug. Just because something is implementation >> defined, doesn't make it okay or expected to not work. Do you know >> how to reproduce this, and on which platform/compiler/version? >> Obviously, if this was an issue in our code base, one would quickly >> notice it doesn't build. > > It's not a bug, exactly. It's that the question of "is this the same > file?" is extremely difficult to answer definitively. Not all > filesystems give you a reliable way to answer that question. Sure, you > can kludge around the problem with modification times and maybe even a > collision-free hash, but getting it really correct is not going to be > efficient, and not even possible until that question is rigorously > defined. There's a good reason why #pragma once still isn't standard. > From erik.osterlund at oracle.com Tue Jan 8 14:42:18 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Tue, 8 Jan 2019 15:42:18 +0100 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: <8bb894e1-5776-de24-0938-4ec3c297345b@samersoff.net> References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <8736q8smt7.fsf@oldenburg2.str.redhat.com> <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com> <575efd28-6644-ee8c-846d-3787f41ab9ce@oracle.com> <75687bd1-6fc2-51c4-557a-ffaa3f582499@redhat.com> <8bb894e1-5776-de24-0938-4ec3c297345b@samersoff.net> Message-ID: <694db30d-7d8d-14b9-50c3-1b3097efdd84@oracle.com> Hi Dmitry, On 2019-01-08 15:28, Dmitry Samersoff wrote: > On 05.01.2019 14:31, Andrew Haley wrote: > > My $.2 > >> It's that the question of "is this the same >> file?" is extremely difficult to answer definitively. > I'm second to Andrew with this concern and I trust a human more than a > compiler here. 
> > Human mistake could be recognized and fixed easily, at the cost of > simple python script to verify guards at worst. > > But if the compiler makes the wrong decision to include/not include a > file, then the problem will cost a fortune to resolve. Compilers are deterministic; humans are not. If a compiler implementation of #pragma once is broken, that would not go unnoticed. The build will deterministically fail due to duplicate definitions of whatever is in that header. You would never be able to finish a build that has weird bugs silently introduced. > In addition, the ability to manually define macros to enforce a specific > order of inclusion is sometimes very useful. I'm not sure what you mean by this. 1) Include guards vs pragma once has nothing to do with the include order. They are just different ways of folding the file contents after duplicated inclusions. 2) You can't perform #include in macros - that is not allowed by the preprocessor. You can only use macros to generate the string of your include, such as how we include OS specific headers, but you cannot use macros to generate #include directives that are then evaluated. 3) If you depend on the order of includes, your code is very shady and needs fixing. /Erik > -Dmitry > >> On 1/4/19 3:59 PM, Erik Österlund wrote: >> >>> On 2019-01-04 15:59, David Lloyd wrote: >>>> In addition, it was pointed out to me that if, for some reason, a >>>> header file ends up in more than one location on the include path, >>>> #pragma once will (probably, as it's not standardized) allow it to >>>> be included twice, which #ifdef guards avoid. This is perhaps not >>>> a real concern in this particular code base though. >>> That sounds like a bug. Just because something is implementation >>> defined, doesn't make it okay or expected to not work. Do you know >>> how to reproduce this, and on which platform/compiler/version? >>> Obviously, if this was an issue in our code base, one would quickly >>> notice it doesn't build. 
>> It's not a bug, exactly. It's that the question of "is this the same
>> file?" is extremely difficult to answer definitively. Not all
>> filesystems give you a reliable way to answer that question. Sure, you
>> can kludge around the problem with modification times and maybe even a
>> collision-free hash, but getting it really correct is not going to be
>> efficient, and not even possible until that question is rigorously
>> defined. There's a good reason why #pragma once still isn't standard.
>>

From sgehwolf at redhat.com Tue Jan 8 15:15:41 2019
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Tue, 08 Jan 2019 16:15:41 +0100
Subject: RFR (XS): 8216366: Add rationale to PER_CPU_SHARES define
In-Reply-To: <7273109D-9CDE-4958-9BCE-60143B1EF605@oracle.com>
References: <55552298b5cb113a39be4d40d096cbf0bb3a3e36.camel@redhat.com>
 <7273109D-9CDE-4958-9BCE-60143B1EF605@oracle.com>
Message-ID:

On Tue, 2019-01-08 at 08:31 -0500, Bob Vandette wrote:
> I don't know how folks feel about embedding possible bit-rotting URLs in the sources but
> it looks good to me.

I've considered that. Then concluded that URLs might be useful even if
they bitrot (in times like wayback machine and such).

> Thanks for doing this.

Thanks for the review Bob!

Any Reviewers care to comment on this? Can I consider this a trivial
change?

Thanks,
Severin

> Bob.
>
>
> > On Jan 8, 2019, at 8:22 AM, Severin Gehwolf
> > wrote:
> >
> > Hi,
> >
> > Could I get reviews for this comment-only change, please. It's
> > meant to
> > clarify as to why a magic number of 1024 is being used for
> > PER_CPU_SHARES define for the JVM container support.
> >
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8216366
> > webrev:
> > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8216366/webrev.01/
> >
> > More info is in this thread:
> > http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-January/036087.html
> >
> > Thanks,
> > Severin
> >
> >

From per.liden at oracle.com Tue Jan 8 15:25:20 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 8 Jan 2019 16:25:20 +0100
Subject: RFR (tedious) 8216022: Use #pragma once
In-Reply-To:
References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com>
 <3727d1da-256c-54b7-2d9b-f819ef08cfa4@oracle.com>
Message-ID: <09cdb516-53cf-c072-dc0e-dd80aef6f344@oracle.com>

Hi Coleen,

On 1/7/19 5:16 PM, coleen.phillimore at oracle.com wrote:
>
>
> On 1/7/19 3:40 AM, Per Liden wrote:
>> Hi Coleen,
>>
>> On 1/3/19 3:31 AM, coleen.phillimore at oracle.com wrote:
>>>
>>> Here is the webrev and bug link.
>>>
>>> open webrev at http://cr.openjdk.java.net/~coleenp/8216022.01/webrev
>>
>> Looks like your script is now leaving an extra empty line at the end
>> of all files, which wasn't there before. For example:
>>
>> --- old/src/hotspot/share/gc/z/zAddress.hpp    2019-01-02
>> 16:41:04.209075410 -0500
>> +++ new/src/hotspot/share/gc/z/zAddress.hpp    2019-01-02
>> 16:41:03.957075419 -0500
>> @@ -64,4 +63,3 @@
>>    static void flip_to_remapped();
>>  };
>>
>> -#endif // SHARE_GC_Z_ZADDRESS_HPP
>>
>> Should be:
>>
>> --- old/src/hotspot/share/gc/z/zAddress.hpp    2019-01-02
>> 16:41:04.209075410 -0500
>> +++ new/src/hotspot/share/gc/z/zAddress.hpp    2019-01-02
>> 16:41:03.957075419 -0500
>> @@ -64,4 +63,3 @@
>>    static void flip_to_remapped();
>>  };
>> -
>> -#endif // SHARE_GC_Z_ZADDRESS_HPP
>>
>>
>> Could you please fix that?
>
> Hi, I fixed the trailing blank lines (in some cases several lines) and
> had to hand patch files that ended with line continuation '\' from some
> macro.  I found these in some globals files like g1_globals.hpp and one
> ci file, that I can't find anymore.
>
> http://cr.openjdk.java.net/~coleenp/8216022.diffs.03

Thanks for fixing. Looks good.

And for the record, I'm for using #pragma once instead of manually typed
include guards. The number of mistyped include guards we've seen over
the years is enough to convince me that people in general aren't very
good at getting this right, and we should let our tool chain handle this
for us.

cheers,
Per

>
> Thanks,
> Coleen
>>
>> Thanks!
>> Per
>>
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8216022
>>>
>>> On 1/2/19 9:16 PM, coleen.phillimore at oracle.com wrote:
>>>> Summary: change include guards to #pragma once, except in generated
>>>> header files.
>>>>
>>>> Tested with mach5 for linux-x64{-debug}, solaris-sparc, macosx-x64,
>>>> windows-x64, built aarch64 with cross compiler, and zero.
>>>>
>>>> Ran tier1 and 2 tests.
>>>>
>>>> The webrev is huge but there are only 3 lines changed in each header
>>>> file.  So click on the patch.
>>>>
>>>> I'll update the copyright headers with a script with the commit.
>>>> Also, will do this after the shenandoah copyright headers are fixed.
>>>>
>>>> Adrian: I included you to check your platforms.
>>>>
>>>> Happy New Year!
>>>> Coleen
>>>
>

From shade at redhat.com Tue Jan 8 15:51:54 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 8 Jan 2019 16:51:54 +0100
Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name
In-Reply-To: <1366d544-baca-e9db-b59c-0daf2b054464@oracle.com>
References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com>
 <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com>
 <1366d544-baca-e9db-b59c-0daf2b054464@oracle.com>
Message-ID:

On 1/8/19 12:27 PM, David Holmes wrote:
> On 8/01/2019 7:56 pm, Aleksey Shipilev wrote:
>> On 1/8/19 3:15 AM, David Holmes wrote:
>>> It seems somewhat awkward to me to have two different code paths for initializing the
>>> java.lang.Class name field.
>>> Can this be restructured a little more (change Class.getName()) so that
>>> the VM always initializes "name" and then JVM_GetClassName could just call java_lang_Class::name,
>>> rather than duplicating the logic?
>>
>> Mmm. I am afraid to do this eagerly because of more memory footprint and potential bootstrapping
>> issues.
>
> I said nothing about doing this eagerly. All I'm suggesting is that instead of getName() doing:
>
> if (this.name == null)
>   this.name = getName0(); // calls JVM_GetClassName
> return this.name;
>
> it just does:
>
> if (this.name == null)
>   getName0(); // calls JVM_GetClassName
> return this.name;

Oh. I had this option on the table when doing the patch, but I disregarded it as a dirty hack, because
calling the getter for side effects *only* is awkward.

Handling concurrency on the Java side also looks simpler. There is a benign race on Class.name, and
to remain benign, it should _not_ do the second read. In other words, null-checking this.name, and
then returning the second read of this.name is not entirely correct. The original code is checking
and returning the local, not the second heap read:

    public String getName() {
        String name = this.name;
        if (name == null)
            this.name = name = getName0();
        return name;
    }

This, I think, leaves us with taking the return value from getName0.

If we wanted to handle JVM_GetClassName better, we could consider calling java_lang_Class::name from
JVM_GetClassName, thus going through the cached path. That would do the two stores, though: one
through javaClasses, and another on the Java side. Seeing how there is only a single use of
JVM_GetClassName, and that use is already-cached Class.getName0, I see no reason to do the excessive
write.
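
A minimal sketch of the single-read pattern discussed above, for readers who want to try it outside the JDK. Everything here is an assumption made for illustration: LazyName is a hypothetical stand-in for java.lang.Class, computeName() stands in for the native getName0(), and the call counter exists only so the caching can be observed. It is not JDK code.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for java.lang.Class; not JDK code.
class LazyName {
    // Plain (non-volatile) cache field, like Class.name in the discussion.
    private String name;

    // Counts slow-path invocations; for illustration only.
    final AtomicInteger slowPathCalls = new AtomicInteger();

    // Hypothetical stand-in for the native getName0().
    private String computeName() {
        slowPathCalls.incrementAndGet();
        return "com.example.Demo";
    }

    public String getName() {
        String n = this.name;      // the only read of the field
        if (n == null) {
            // Racing threads may both get here; each stores an equal value,
            // which is what keeps the race benign.
            this.name = n = computeName();
        }
        return n;                  // return the local, never a second field read
    }

    public static void main(String[] args) {
        LazyName c = new LazyName();
        System.out.println(c.getName());
        System.out.println(c.getName());
        System.out.println("slow path calls: " + c.slowPathCalls.get());
    }
}
```

Returning the local means the value that was null-checked is the value that is returned; a variant that checks this.name and then returns this.name performs two independent racy reads of the field.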
-Aleksey

From adinn at redhat.com Tue Jan 8 15:54:05 2019
From: adinn at redhat.com (Andrew Dinn)
Date: Tue, 8 Jan 2019 15:54:05 +0000
Subject: RFR (XS): 8216366: Add rationale to PER_CPU_SHARES define
In-Reply-To:
References: <55552298b5cb113a39be4d40d096cbf0bb3a3e36.camel@redhat.com>
 <7273109D-9CDE-4958-9BCE-60143B1EF605@oracle.com>
Message-ID: <2c8bea05-e9ce-7562-7215-9b602f6ec055@redhat.com>

On 08/01/2019 15:15, Severin Gehwolf wrote:
> On Tue, 2019-01-08 at 08:31 -0500, Bob Vandette wrote:
>> I don't know how folks feel about embedding possible bit-rotting URLs in the sources but
>> it looks good to me.
>
> I've considered that. Then concluded that URLs might be useful even if
> they bitrot (in times like wayback machine and such).
>
>> Thanks for doing this.
>
> Thanks for the review Bob!
>
> Any Reviewers care to comment on this? Can I consider this a trivial
> change?

It looks pretty trivial to me. If not then consider this a review :-)

regards,

Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

From coleen.phillimore at oracle.com Tue Jan 8 16:08:35 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 8 Jan 2019 11:08:35 -0500
Subject: RFR (S) 8215575: C2 crash: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded
In-Reply-To: <76a2edc1-7834-33d2-3876-7b52bd95bb37@oracle.com>
References: <76a2edc1-7834-33d2-3876-7b52bd95bb37@oracle.com>
Message-ID:

Hi David,

My original version had a storestore but Erik convinced me that is
unneeded since the subklass and sibling lists are what are read
concurrently and were the fields that needed the ordering, not
necessarily this one.  If we backport this to 11, we have to add
barriers to _subklass and _next_sibling like Erik has added.

Does the rest of the change look good?

Thanks,
Coleen

On 1/7/19 8:49 PM, David Holmes wrote:
> Hi Coleen,
>
> On 8/01/2019 5:50 am, coleen.phillimore at oracle.com wrote:
>> Summary: Set InstanceKlass::loaded before adding classes to the
>> subklass list, which can be read concurrently by the compiler.
>
> I think you need a storestore barrier to ensure the new order is
> preserved.
>
> Cheers,
> David
>
>> Thanks to Erik for the diagnosis and suggested fix.  See bug comments
>> for more details.
>>
>> Tested with hs-tier1-3, 6 and 8.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8215575.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8215575
>>
>> Thanks,
>> Coleen

From coleen.phillimore at oracle.com Tue Jan 8 16:09:11 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 8 Jan 2019 11:09:11 -0500
Subject: RFR (S) 8215575: C2 crash: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded
In-Reply-To: <4becf4ef-93e7-2195-0aed-ba16d16e7925@oracle.com>
References: <4becf4ef-93e7-2195-0aed-ba16d16e7925@oracle.com>
Message-ID: <9b5e05fe-c2ab-d21a-0e6c-202968fe17d9@oracle.com>

Erik, Thank you for your reply to David and the review and discussion
for this fix.

Coleen

On 1/8/19 7:22 AM, Erik Österlund wrote:
> Hi Coleen,
>
> Looks good.
>
> /Erik
>
> On 2019-01-07 20:50, coleen.phillimore at oracle.com wrote:
>> Summary: Set InstanceKlass::loaded before adding classes to the
>> subklass list, which can be read concurrently by the compiler.
>>
>> Thanks to Erik for the diagnosis and suggested fix.  See bug comments
>> for more details.
>>
>> Tested with hs-tier1-3, 6 and 8.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8215575.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8215575
>>
>> Thanks,
>> Coleen
>

From sgehwolf at redhat.com Tue Jan 8 16:08:16 2019
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Tue, 08 Jan 2019 17:08:16 +0100
Subject: RFR (XS): 8216366: Add rationale to PER_CPU_SHARES define
In-Reply-To: <2c8bea05-e9ce-7562-7215-9b602f6ec055@redhat.com>
References: <55552298b5cb113a39be4d40d096cbf0bb3a3e36.camel@redhat.com>
 <7273109D-9CDE-4958-9BCE-60143B1EF605@oracle.com>
 <2c8bea05-e9ce-7562-7215-9b602f6ec055@redhat.com>
Message-ID: <4067c64af2566398fd210703ab8722b42c657422.camel@redhat.com>

On Tue, 2019-01-08 at 15:54 +0000, Andrew Dinn wrote:
> On 08/01/2019 15:15, Severin Gehwolf wrote:
> > On Tue, 2019-01-08 at 08:31 -0500, Bob Vandette wrote:
> > > I don't know how folks feel about embedding possible bit-rotting URLs in the sources but
> > > it looks good to me.
> >
> > I've considered that. Then concluded that URLs might be useful even if
> > they bitrot (in times like wayback machine and such).
> >
> > > Thanks for doing this.
> >
> > Thanks for the review Bob!
> >
> > Any Reviewers care to comment on this? Can I consider this a trivial
> > change?
>
> It looks pretty trivial to me. If not then consider this a review :-)

Thanks Andrew!

Cheers,
Severin

From coleen.phillimore at oracle.com Tue Jan 8 16:40:43 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 8 Jan 2019 11:40:43 -0500
Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name
In-Reply-To: <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com>
References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com>
 <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com>
Message-ID: <1d480205-a1d0-0c3a-c1ee-6da6484d2068@oracle.com>

http://cr.openjdk.java.net/~shade/8216302/webrev.02/src/hotspot/share/classfile/javaClasses.cpp.udiff.html

+ oop class_oop = holder->java_mirror();

This is a naked oop.  You should use Handle and HandleMark in this
function.

1347 oop java_lang_Class::name(oop java_class, TRAPS) {

So is this.  You should pass a Handle to this.

Otherwise, this looks really good.

Thanks,
Coleen

On 1/8/19 4:56 AM, Aleksey Shipilev wrote:
> On 1/8/19 3:15 AM, David Holmes wrote:
>> It seems somewhat awkward to me to have two different code paths for initializing the
>> java.lang.Class name field. Can this be restructured a little more (change Class.getName()) so that
>> the VM always initializes "name" and then JVM_GetClassName could just call java_lang_Class::name,
>> rather than duplicating the logic?
> Mmm. I am afraid to do this eagerly because of more memory footprint and potential bootstrapping
> issues. Also, I want to keep this open to implement a crazy footprint-reducing idea: nulling the
> fields like Class.name to conserve footprint at expense of additional call to reinstate the value
> afterwards.
>
>> Specific comments:
>>
>> src/hotspot/share/classfile/javaClasses.cpp
>>
>> !     Klass* k = as_Klass(java_class);
>> !     assert(k->is_klass(), "just checking");
>> !     name = k->external_name();
>>
>> as_Klass already has the requisite assertions so there was no reason to change this part of the
>> code. I see that jvm.cpp already contains the same redundant logic.
> Right.
> Ditched that change.
>
>> Copyright years need updating.
> Updated.
>
> Also made the test more up-to-the-point, with clearing the Class.name cache explicitly.
>
> New webrev:
> http://cr.openjdk.java.net/~shade/8216302/webrev.02/
>
> Thanks,
> -Aleksey
>

From coleen.phillimore at oracle.com Tue Jan 8 16:48:04 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 8 Jan 2019 11:48:04 -0500
Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name
In-Reply-To: <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com>
References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com>
 <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com>
Message-ID:

On 1/8/19 4:56 AM, Aleksey Shipilev wrote:
> On 1/8/19 3:15 AM, David Holmes wrote:
>> It seems somewhat awkward to me to have two different code paths for initializing the
>> java.lang.Class name field. Can this be restructured a little more (change Class.getName()) so that
>> the VM always initializes "name" and then JVM_GetClassName could just call java_lang_Class::name,
>> rather than duplicating the logic?
> Mmm. I am afraid to do this eagerly because of more memory footprint and potential bootstrapping
> issues. Also, I want to keep this open to implement a crazy footprint-reducing idea: nulling the
> fields like Class.name to conserve footprint at expense of additional call to reinstate the value
> afterwards.

I agree we shouldn't initialize the name field eagerly when creating a
class.  I also looked at this code path:

    public String getName() {
        String name = this.name;
        if (name == null)
            this.name = name = getName0();
        return name;
    }

It looks like when we call JVM_GetClassName, we're initializing the
Class.name field by the caller. Maybe could rewrite the java/lang/Class
version to be:

    public String getName() {
        String name = this.name;
        if (name == null)
            name = getName0();  // this initializes this.name yuck.
        return name;
    }

and have JVM_GetClassName call java_lang_Class::name() to do the
initialization.  Seems not worth it just to avoid duplicating these
lines in both java_lang_Class::name() and JVM_GetClassName.

+ const char* name = java_lang_Class::as_external_name(JNIHandles::resolve(cls));
+ oop result = StringTable::intern((char*)name, CHECK_NULL);

Coleen

>> Specific comments:
>>
>> src/hotspot/share/classfile/javaClasses.cpp
>>
>> !     Klass* k = as_Klass(java_class);
>> !     assert(k->is_klass(), "just checking");
>> !     name = k->external_name();
>>
>> as_Klass already has the requisite assertions so there was no reason to change this part of the
>> code. I see that jvm.cpp already contains the same redundant logic.
> Right. Ditched that change.
>
>> Copyright years need updating.
> Updated.
>
> Also made the test more up-to-the-point, with clearing the Class.name cache explicitly.
>
> New webrev:
> http://cr.openjdk.java.net/~shade/8216302/webrev.02/
>
> Thanks,
> -Aleksey
>

From shade at redhat.com Tue Jan 8 16:55:53 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 8 Jan 2019 17:55:53 +0100
Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name
In-Reply-To: <1d480205-a1d0-0c3a-c1ee-6da6484d2068@oracle.com>
References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com>
 <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com>
 <1d480205-a1d0-0c3a-c1ee-6da6484d2068@oracle.com>
Message-ID:

On 1/8/19 5:40 PM, coleen.phillimore at oracle.com wrote:
> http://cr.openjdk.java.net/~shade/8216302/webrev.02/src/hotspot/share/classfile/javaClasses.cpp.udiff.html
>
> + oop class_oop = holder->java_mirror();
>
> This is a naked oop.  You should use Handle and HandleMark in this function.

Mmm. But it is the same oop we are already handling in the old code... What's the rule here?

> 1347 oop java_lang_Class::name(oop java_class, TRAPS) {
>
> So is this.  You should pass a Handle to this.
I don't understand the "should" part. Why should it be handle-ized? javaClasses are expected to work properly with naked oops, because we are not expected to get to safepoint in the middle of it, no? And we are storing oops to the heap itself, not in any VM structure. In the same StackTraceElement::fill_in, we do e.g. this without any handles: oop loader = holder->class_loader(); if (loader != NULL) { oop loader_name = java_lang_ClassLoader::name(loader); if (loader_name != NULL) java_lang_StackTraceElement::set_classLoaderName(element(), loader_name); } -Aleksey From jcbeyler at google.com Tue Jan 8 18:05:32 2019 From: jcbeyler at google.com (JC Beyler) Date: Tue, 8 Jan 2019 10:05:32 -0800 Subject: RFR (L) 8213501 : Deploy ExceptionJniWrapper for a few tests In-Reply-To: References: <895ef766-9c96-7185-4222-178379629ce4@oracle.com> <04a464fa-c1c8-5d86-3633-0b532840561c@oracle.com> <7ef06464-a614-8941-bb51-ce1c467889b2@oracle.com> <45341168-e7e0-90d1-449f-210500882b8f@oracle.com> <55283958-de3d-07f2-51e3-ad34c5046a96@oracle.com> <31613f88-5f7d-938d-e9f6-69cdaf857268@oracle.com> <839301b7-c247-df3b-e485-283e8bb7388b@oracle.com> <95fe277d-ba6e-4fec-77aa-d1f1051751aa@oracle.com> <72bf2f4a-5bf7-98de-5f00-68485072923d@oracle.com> Message-ID: Happy new year all! Could I get a final LGTM for version 6? Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/ Bug: https://bugs.openjdk.java.net/browse/JDK-8213501 Thanks! Jc On Mon, Dec 17, 2018 at 8:43 AM JC Beyler wrote: > Hi all, > > I don't believe I got actual LGTM for this version: > > > Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8213501 > > > It removed the namespaces and uses explicit static instead :) > > Thanks! 
> Jc > > On Wed, Dec 12, 2018 at 8:06 PM JC Beyler wrote: > >> So did I Alexey but with David & Serguei preferring static, it seems more >> reasonable to go down their route :-) >> >> So here is the latest webrev with static instead of an anonymous >> namespace: >> >> Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8213501 >> >> Let me know what you think, can I get a webrev 06 review? >> >> Thanks! >> Jc >> >> On Wed, Dec 12, 2018 at 3:10 PM Alex Menkov >> wrote: >> >>> Hm.. >>> I considered unnamed namespaces "C++ style" (and static globals as "C >>> style"). >>> Static globals were deprecated in C++ (but some time ago the deprecation >>> was reverted). >>> >>> --alex >>> >>> On 12/12/2018 13:55, serguei.spitsyn at oracle.com wrote: >>> > Agreed. >>> > >>> > Thanks, >>> > Serguei >>> > >>> > >>> > On 12/12/18 13:52, David Holmes wrote: >>> >> FWIW I think namespaces are overkill in all of this test code and >>> just >>> >> obfuscates things - the declaration is easily missed. A static >>> >> variable in a .cpp is clearly a global variable to the file. >>> >> >>> >> Cheers, >>> >> David >>> >> >>> >> >>> >> >>> >> On 13/12/2018 5:37 am, serguei.spitsyn at oracle.com wrote: >>> >>> Hi Jc, >>> >>> >>> >>> >>> >>> On 12/11/18 21:16, JC Beyler wrote: >>> >>>> Hi all, >>> >>>> >>> >>>> Here is the new webrev with the TEST.groups change. Serguei, let me >>> >>>> know if I convinced you with the static vs anonymous namespaces or >>> >>>> if you'd still rather have a "static" for now :-) >>> >>> >>> >>> >>> >>> What do you think about this post? : >>> >>> >>> https://stackoverflow.com/questions/11623451/static-vs-non-static-variables-in-namespace >>> >>> >>> >>> >>> >>>> >>> >>>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.05/ >>> >>>> >>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8213501 >>> >>> >>> >>> The update looks fine. 
>>> >>> >>> >>> Thanks, >>> >>> Serguei >>> >>> >>> >>> >>> >>> Thanks, >>> >>> Serguei >>> >>> >>> >>>> >>> >>>> Thanks again for the reviews! >>> >>>> Jc >>> >>>> >>> >>>> On Mon, Dec 10, 2018 at 3:10 PM JC Beyler >> >>>> > wrote: >>> >>>> >>> >>>> Hi Serguei, >>> >>>> >>> >>>> Yes basically it is equivalent :) I can put them in but they are >>> >>>> not required. The norm actually wanted to deprecate it but then >>> >>>> remembered that C compatibility would require the static >>> key-word >>> >>>> for this case [1] >>> >>>> >>> >>>> So, really, they are not required here and will amount to the >>> same >>> >>>> thing: only that file can refer to them and you cannot get to >>> them >>> >>>> without a globally available method to return a pointer to them >>> >>>> (ie same as a static variable in C). >>> >>>> >>> >>>> I can put static if it makes it easier to see but, by being in >>> an >>> >>>> anonymous namespace they are only available for the file's >>> >>>> translation unit. For example: >>> >>>> >>> >>>> $ cat main.cpp >>> >>>> >>> >>>> int totally_global; >>> >>>> static int explictly_static; >>> >>>> >>> >>>> namespace { >>> >>>> int implicitly_static; >>> >>>> } >>> >>>> >>> >>>> void foo(); >>> >>>> int main() { >>> >>>> foo(); >>> >>>> } >>> >>>> >>> >>>> $ g++ -O3 main.cpp -c >>> >>>> $ nm main.o >>> >>>> U _GLOBAL_OFFSET_TABLE_ >>> >>>> 0000000000000000 T main >>> >>>> 0000000000000000 B totally_global >>> >>>> U _Z3foov >>> >>>> >>> >>>> As you can see, the static and anonymous namespace variables are >>> >>>> not in the file due to not being used. If you were to use them, >>> >>>> you'd see them show up as something like: >>> >>>> 0000000000000008 b _ZL17explicitly_static >>> >>>> 0000000000000004 b _ZN12_GLOBAL__N_117implicitly_staticE >>> >>>> >>> >>>> Where again, it shows that it is mangling the names so that no >>> >>>> external usage can happen without tinkering. 
>>> >>>> >>> >>>> Hopefully that helps :-), >>> >>>> Jc >>> >>>> >>> >>>> [1] >>> >>>> http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1012 >>> >>>> >>> >>>> >>> >>>> On Mon, Dec 10, 2018 at 2:04 PM serguei.spitsyn at oracle.com >>> >>>> >> >>>> > wrote: >>> >>>> >>> >>>> Hi Jc, >>> >>>> >>> >>>> I had little experience with the C++ namespaces. >>> >>>> My understanding is that static in this context should mean >>> >>>> internal linkage. >>> >>>> >>> >>>> Thanks, >>> >>>> Serguei >>> >>>> >>> >>>> >>> >>>> On 12/10/18 13:57, JC Beyler wrote: >>> >>>>> Hi Serguei, >>> >>>>> >>> >>>>> The variables and functions are in a anonymous namespace; >>> my >>> >>>>> understanding of C++ is that this is equivalent to putting >>> it >>> >>>>> as static.Hence, I didn't add them there. Does that make >>> >>>>> sense? >>> >>>>> >>> >>>>> Thanks! >>> >>>>> Jc >>> >>>>> >>> >>>>> On Mon, Dec 10, 2018 at 1:33 PM serguei.spitsyn at oracle.com >>> >>>>> >>> >>>>> >> >>>>> > wrote: >>> >>>>> >>> >>>>> Hi Jc, >>> >>>>> >>> >>>>> It looks good in general. >>> >>>>> One question though. >>> >>>>> >>> >>>>> >>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.03a_04/test/hotspot/jtreg/vmTestbase/nsk/share/ExceptionCheckingJniEnv/exceptionjni001/exceptionjni001.cpp.html >>> >>>>> >>> >>>>> >>> >>>>> I wonder if the variables and functions have to be >>> static. >>> >>>>> >>> >>>>> Thanks, >>> >>>>> Serguei >>> >>>>> >>> >>>>> >>> >>>>> On 12/5/18 11:36, JC Beyler wrote: >>> >>>>>> Hi all, >>> >>>>>> >>> >>>>>> My apologies to having to come back for another review >>> >>>>>> for this change: I ran into a snag when trying to pull >>> >>>>>> the latest changes compared to the base I was working >>> >>>>>> on. I basically forgot that there was an issue with >>> >>>>>> snprintf and that I had solved it via JDK-8213622. 
>>> >>>>>> >>> >>>>>> Could I have a new review of this webrev: >>> >>>>>> Webrev: >>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.04/ >>> >>>>>> >>> >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8213501 >>> >>>>>> Incremental from the port of webrev.03 that got LGTMs: >>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03a_04/ >>> >>>>>> >>> >>>>>> >>> >>>>>> A few comments on this because it took me a while to >>> get >>> >>>>>> things in a state I thought was good: >>> >>>>>> - I had to implement an itoa method, do we have >>> >>>>>> something like that in the test base (remember that >>> >>>>>> JDK-8213622 could not use sprintf due to being in the >>> >>>>>> test code)? >>> >>>>>> >>> >>>>>> - The differences here compared to the one you all >>> >>>>>> reviewed are: >>> >>>>>> - I found that adding to the strlen/memcpy error >>> >>>>>> prone and thought that I would try to make it less so. >>> >>>>>> If you want to compare, I extended the strlen/memcpy >>> >>>>>> with the new format to show you if you prefer [1] >>> >>>>>> - Note that the diff between the "old >>> >>>>>> extended way from [1]" to the webrev.04 can be found >>> >>>>>> in [2] >>> >>>>>> >>> >>>>>> - I added a test to test the exception wrapper in >>> >>>>>> tests :); I'm not sure it is deemed useful or not but >>> >>>>>> helped me assure myself that I was not doing things >>> >>>>>> wrong; you can find the base test file here [3]; >>> should >>> >>>>>> we have this or not? 
(I know that normally we don't >>> add >>> >>>>>> tests to vmTestbase but thought this might be an >>> >>>>>> exception) >>> >>>>>> >>> >>>>>> Thanks for your help and my apologies for the snag, >>> >>>>>> Jc >>> >>>>>> >>> >>>>>> [1]: >>> >>>>>> >>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03a/test/hotspot/jtreg/vmTestbase/nsk/share/jni/ExceptionCheckingJniEnv.cpp.udiff.html >>> >>>>>> >>> >>>>>> < >>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.03a/test/hotspot/jtreg/vmTestbase/nsk/share/jni/ExceptionCheckingJniEnv.cpp.udiff.html> >>> >>> >>>>>> >>> >>>>>> [2]: >>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03a_04 >>> >>>>>> >>> >>>>>> [3] >>> >>>>>> >>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.04/test/hotspot/jtreg/vmTestbase/nsk/share/ExceptionCheckingJniEnv/exceptionjni001/exceptionjni001.cpp.html >>> >>>>>> >>> >>>>>> < >>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.04/test/hotspot/jtreg/vmTestbase/nsk/share/ExceptionCheckingJniEnv/exceptionjni001/exceptionjni001.cpp.html> >>> >>> >>>>>> >>> >>>>>> >>> >>>>>> On Mon, Dec 3, 2018 at 11:29 PM David Holmes >>> >>>>>> >> >>>>>> > wrote: >>> >>>>>> >>> >>>>>> Looks fine to me. >>> >>>>>> >>> >>>>>> Thanks, >>> >>>>>> David >>> >>>>>> >>> >>>>>> On 4/12/2018 4:04 pm, JC Beyler wrote: >>> >>>>>> > Hi both, >>> >>>>>> > >>> >>>>>> > Thanks for the reviews! Since Serguei did not >>> >>>>>> insist on get_basename, I >>> >>>>>> > went for get_dirname since the method is a local >>> >>>>>> static method and won't >>> >>>>>> > have its name start spreading, I think it's ok >>> too. >>> >>>>>> > >>> >>>>>> > For the naming of the local variable, the idea >>> >>>>>> initially was to use the >>> >>>>>> > same name as the local variable for JNIEnv >>> already >>> >>>>>> used to reduce the >>> >>>>>> > code change. 
Since I'm now adding the line macro >>> >>>>>> at the end anyway, this >>> >>>>>> > does not matter anymore so I converged all local >>> >>>>>> variables to "jni". >>> >>>>>> > >>> >>>>>> > So, without further ado, here is the new >>> version: >>> >>>>>> > Webrev: >>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03/ >>> >>>>>> >>> >>>>>> > Bug: >>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8213501 >>> >>>>>> > >>> >>>>>> > This passes the various tests changed by the >>> >>>>>> webrev on my dev machine. >>> >>>>>> > >>> >>>>>> > Let me know what you think, >>> >>>>>> > Jc >>> >>>>>> > >>> >>>>>> > On Mon, Dec 3, 2018 at 8:40 PM >>> >>>>>> serguei.spitsyn at oracle.com >>> >>>>>> >>> >>>>>> > >> >>>>>> > >>> >>>>>> >> >>>>>> >>> >>>>>> > >> >>>>>> >> wrote: >>> >>>>>> > >>> >>>>>> > On 12/3/18 20:15, Chris Plummer wrote: >>> >>>>>> > > Hi JC, >>> >>>>>> > > >>> >>>>>> > > Overall it looks good. A few naming nits >>> >>>>>> thought: >>> >>>>>> > > >>> >>>>>> > > In bi01t001.cpp, why have you declared >>> the >>> >>>>>> > ExceptionCheckingJniEnvPtr >>> >>>>>> > > using jni_env(jni). Elsewhere you use >>> >>>>>> jni(jni_env) and rename the >>> >>>>>> > > method argument passed in from jni to >>> >>>>>> jni_env. >>> >>>>>> > > >>> >>>>>> > > Related to this, I also noticed in some >>> >>>>>> files that already are using >>> >>>>>> > > ExceptionCheckingJniEnvPtr, such as >>> >>>>>> CharArrayCriticalLocker.cpp, you >>> >>>>>> > > delcared it as env(jni_env). So that >>> means >>> >>>>>> there are 3 different >>> >>>>>> > names >>> >>>>>> > > you have used for the >>> >>>>>> ExceptionCheckingJniEnvPtr local variable. >>> >>>>>> > They >>> >>>>>> > > should be consistent. >>> >>>>>> > > >>> >>>>>> > > Also, can you rename get_basename() to >>> >>>>>> get_dirname()? 
I know Serguei >>> >>>>>> > > suggested get_basename() a while back, >>> but >>> >>>>>> unless "basename" is >>> >>>>>> > > commonly used for this purpose, I think >>> >>>>>> "dirname" is more self >>> >>>>>> > > explanatory. >>> >>>>>> > >>> >>>>>> > In general, I'm Okay with get_dirname(). >>> >>>>>> > Just to mention dirname can be both short or >>> >>>>>> full, so it is a little >>> >>>>>> > confusing as well. >>> >>>>>> > It is the reason why the get_basename() was >>> >>>>>> suggested. >>> >>>>>> > However, I do not insist on get_basename() >>> nor >>> >>>>>> get_full_dirname(). :) >>> >>>>>> > >>> >>>>>> > Thanks, >>> >>>>>> > Serguei >>> >>>>>> > >>> >>>>>> > >>> >>>>>> > > thanks, >>> >>>>>> > > >>> >>>>>> > > Chris >>> >>>>>> > > >>> >>>>>> > > On 12/2/18 10:29 PM, David Holmes wrote: >>> >>>>>> > >> Hi Jc, >>> >>>>>> > >> >>> >>>>>> > >> I've been lurking on this one and have >>> had >>> >>>>>> a look through. I'm okay >>> >>>>>> > >> with the FatalError approach for the >>> tests >>> >>>>>> - we don't expect >>> >>>>>> > anything >>> >>>>>> > >> to go wrong in a well written test in a >>> >>>>>> correctly functioning VM. >>> >>>>>> > >> >>> >>>>>> > >> Thanks, >>> >>>>>> > >> David >>> >>>>>> > >> >>> >>>>>> > >> >>> >>>>>> > >> >>> >>>>>> > >> On 3/12/2018 3:24 pm, JC Beyler wrote: >>> >>>>>> > >>> Hi all, >>> >>>>>> > >>> >>> >>>>>> > >>> Would someone on the GC or runtime team >>> >>>>>> be motivated to give >>> >>>>>> > this a >>> >>>>>> > >>> review? :) >>> >>>>>> > >>> >>> >>>>>> > >>> It would be much appreciated! 
>>> >>>>>> > >>> >>> >>>>>> > >>> Webrev: >>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.02/ >>> >>>>>> >>> >>>>>> > >>> Bug: >>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8213501 >>> >>>>>> > >>> >>> >>>>>> > >>> Thanks for your help, >>> >>>>>> > >>> Jc >>> >>>>>> > >>> >>> >>>>>> > >>> On Tue, Nov 27, 2018 at 4:36 PM JC >>> Beyler >>> >>>>>> >>> >>>>>> > >> >>>>>> > >>> >>>>>> > >>> >> >>>>>> >>> >>>>>> >> >>>>>> >>> wrote: >>> >>>>>> > >>> >>> >>>>>> > >>> Hi Chris, >>> >>>>>> > >>> >>> >>>>>> > >>> Yes I was waiting for another >>> review >>> >>>>>> since you had explicitly >>> >>>>>> > >>> asked :) >>> >>>>>> > >>> >>> >>>>>> > >>> And sounds good that when someone >>> >>>>>> from GC or runtime gives a >>> >>>>>> > >>> review, >>> >>>>>> > >>> I'll wait for your full review on >>> the >>> >>>>>> webrev.02! >>> >>>>>> > >>> >>> >>>>>> > >>> Thanks again for your help, >>> >>>>>> > >>> Jc >>> >>>>>> > >>> >>> >>>>>> > >>> >>> >>>>>> > >>> On Tue, Nov 27, 2018 at 12:48 PM >>> >>>>>> Chris Plummer >>> >>>>>> > >>> >> >>>>>> >>> >>>>>> >> >>>>>> > >>> >>>>>> > >> >>>>>> >>> >>>>>> >> >>>>>> >>> >>> >>>>>> > wrote: >>> >>>>>> > >>> >>> >>>>>> > >>> Hi JC, >>> >>>>>> > >>> >>> >>>>>> > >>> I think it would be good to >>> get a >>> >>>>>> review from the gc or >>> >>>>>> > runtime >>> >>>>>> > >>> teams, since this also affects >>> >>>>>> their tests. >>> >>>>>> > >>> >>> >>>>>> > >>> Also, once we are settled on >>> this >>> >>>>>> FatalError approach, >>> >>>>>> > I still >>> >>>>>> > >>> need to give your webrev-02 a >>> >>>>>> full review. I only >>> >>>>>> > skimmed over >>> >>>>>> > >>> parts of it (I did look at all >>> >>>>>> the changes in webrevo-01). 
>>> >>>>>> > >>> >>> >>>>>> > >>> thanks, >>> >>>>>> > >>> >>> >>>>>> > >>> Chris >>> >>>>>> > >>> >>> >>>>>> > >>> On 11/27/18 8:58 AM, >>> >>>>>> serguei.spitsyn at oracle.com >>> >>>>>> >>> >>>>>> > >> >>>>>> > >>> >>>>>> > >>> >> >>>>>> >>> >>>>>> > >> >>>>>> >> wrote: >>> >>>>>> > >>>> Hi Jc, >>> >>>>>> > >>>> >>> >>>>>> > >>>> I've already reviewed this >>> too. >>> >>>>>> > >>>> >>> >>>>>> > >>>> Thanks, >>> >>>>>> > >>>> Serguei >>> >>>>>> > >>>> >>> >>>>>> > >>>> >>> >>>>>> > >>>> On 11/27/18 06:56, JC Beyler >>> >>>>>> wrote: >>> >>>>>> > >>>>> Thanks Chris, >>> >>>>>> > >>>>> >>> >>>>>> > >>>>> Anybody else motivated to look at >>> this >>> >>>>>> and review it? :) >>> >>>>>> > >>>>> Jc >>> >>>>>> > >>>>> >>> >>>>>> > >>>>> On Mon, Nov 26, 2018 at 1:26 >>> PM >>> >>>>>> Chris Plummer >>> >>>>>> > >>>>> >> >>>>>> >>> >>>>>> > >> >>>>>> > >>> >>>>>> >> >>>>>> >>> >>>>>> > >> >>>>>> >>> >>> >>>>>> > >>>>> wrote: >>> >>>>>> > >>>>> >>> >>>>>> > >>>>> Hi JC, >>> >>>>>> > >>>>> >>> >>>>>> > >>>>> I'm ok with the FatalError approach, >>> >>>>>> but would >>> >>>>>> > like to >>> >>>>>> > >>>>> hear opinions from others also. >>> >>>>>> > >>>>> >>> >>>>>> > >>>>> thanks, >>> >>>>>> > >>>>> >>> >>>>>> > >>>>> Chris >>> >>>>>> > >>>>> >>> >>>>>> > >>>>> On 11/21/18 8:19 AM, JC Beyler wrote: >>> >>>>>> > >>>>>> Hi Chris, >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> Thanks for taking the >>> time >>> >>>>>> to look at it and yes you >>> >>>>>> > >>>>>> have raised exactly why >>> >>>>>> the webrev is between two >>> >>>>>> > >>>>>> worlds: in cases where a >>> >>>>>> fatal error on failure is >>> >>>>>> > >>>>>> wanted, should we >>> simplify >>> >>>>>> the code to remove >>> >>>>>> > the return >>> >>>>>> > >>>>>> tests since we do them >>> >>>>>> internally? 
Now that I've >>> >>>>>> > looked >>> >>>>>> > >>>>>> around for non-fatal >>> >>>>>> cases, I think the answer >>> >>>>>> > is yes, >>> >>>>>> > >>>>>> it simplifies the code >>> >>>>>> while maintaining the checks. >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> I looked a bit and it >>> >>>>>> seems that I can't find >>> >>>>>> > easily a >>> >>>>>> > >>>>>> case where the test >>> >>>>>> accepts a JNI failure to >>> >>>>>> > then move >>> >>>>>> > >>>>>> on. Therefore, perhaps, >>> >>>>>> for now, the fail with a >>> >>>>>> > Fatal >>> >>>>>> > >>>>>> is enough and we can >>> work >>> >>>>>> on the tests to clean >>> >>>>>> > them up? >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> That means that this is >>> >>>>>> the new webrev with only >>> >>>>>> > Fatal >>> >>>>>> > >>>>>> and cleans up the tests >>> so >>> >>>>>> that it is no longer in >>> >>>>>> > >>>>>> between two worlds: >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> Webrev: >>> >>>>>> > >>>>>> >>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.02/ >>> >>>>>> >>> >>>>>> > >>>>>> >>> >>>>>> >>> >>>>>> > >>>>>> Bug: >>> >>>>>> > >>> https://bugs.openjdk.java.net/browse/JDK-8213501 >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> (This passes testing on >>> my >>> >>>>>> dev machine for all the >>> >>>>>> > >>>>>> modified tests) >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> with the example you >>> >>>>>> provided, it now looks like: >>> >>>>>> > >>>>>> >>> >>>>>> > >>> >>>>>> >>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html >>> >>>>>> >>> >>>>>> < >>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html> >>> >>> >>>>>> >>> >>>>>> > >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> >>> >>>>>> > >>> >>>>>> < >>> 
http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html> >>> >>> >>>>>> >>> >>>>>> > >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> Where it does, to me at >>> >>>>>> least, seem cleaner and less >>> >>>>>> > >>>>>> "noisy". >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> Let me know what you >>> think, >>> >>>>>> > >>>>>> Jc >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> On Tue, Nov 20, 2018 at >>> >>>>>> 9:33 PM Chris Plummer >>> >>>>>> > >>>>>> < >>> chris.plummer at oracle.com >>> >>>>>> >>> >>>>>> > >> >>>>>> > >>> >>>>>> > >>>>>> >> >>>>>> >>> >>>>>> > >> >>>>>> >>> wrote: >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> Hi JC, >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> Sorry about the >>> delay. >>> >>>>>> I had to go back an >>> >>>>>> > look at >>> >>>>>> > >>>>>> the initial 8210842 >>> >>>>>> webrev and RFR thread to see >>> >>>>>> > >>>>>> what this was >>> >>>>>> initially all about. >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> In general the >>> changes >>> >>>>>> look good. >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> I don't have a good >>> >>>>>> answer to your >>> >>>>>> > >>>>>> FatalError/NonFatalError question. It >>> makes >>> >>>>>> > the code >>> >>>>>> > >>>>>> a lot cleaner to use >>> >>>>>> FatalError, but then it >>> >>>>>> > is a >>> >>>>>> > >>>>>> behavior change, and >>> >>>>>> you also need to deal with >>> >>>>>> > >>>>>> tests that >>> >>>>>> intentionally induce errors (do >>> >>>>>> > you have >>> >>>>>> > >>>>>> an example of that). >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> In any case, right >>> now >>> >>>>>> your webrev seems to be >>> >>>>>> > >>>>>> between two worlds. >>> >>>>>> You are producing >>> >>>>>> > FatalError, >>> >>>>>> > >>>>>> but still checking >>> >>>>>> results. 
Here's a good >>> >>>>>> > example: >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> >>> >>>>>> > >>> >>>>>> >>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html >>> >>>>>> >>> >>>>>> < >>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html> >>> >>> >>>>>> >>> >>>>>> > >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> >>> >>>>>> > >>> >>>>>> < >>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html> >>> >>> >>>>>> >>> >>>>>> > >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> I'm not sure if this >>> >>>>>> is just a temporary >>> >>>>>> > state until >>> >>>>>> > >>>>>> it was decided which >>> >>>>>> approach to take. >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> thanks, >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> Chris >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> On 11/20/18 2:14 PM, >>> >>>>>> JC Beyler wrote: >>> >>>>>> > >>>>>>> Hi all, >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> Chris thought it >>> made >>> >>>>>> sense to have more >>> >>>>>> > eyes on >>> >>>>>> > >>>>>>> this change than >>> just >>> >>>>>> serviceability as it will >>> >>>>>> > >>>>>>> modify to tests >>> that >>> >>>>>> are not only >>> >>>>>> > serviceability >>> >>>>>> > >>>>>>> tests so I've moved >>> >>>>>> this to conversation >>> >>>>>> > here :) >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> For convenience, >>> I've >>> >>>>>> copy-pasted the >>> >>>>>> > initial RFR: >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> Could I have a >>> review >>> >>>>>> for the extension and >>> >>>>>> > usage >>> >>>>>> > >>>>>>> of the >>> >>>>>> ExceptionJniWrapper. 
This adds lines and >>> >>>>>> > >>>>>>> filenames to the >>> end >>> >>>>>> of the wrapper JNI >>> >>>>>> > methods, >>> >>>>>> > >>>>>>> adds tracing, and >>> >>>>>> throws an error if need >>> >>>>>> > be. I've >>> >>>>>> > >>>>>>> ported the gc/lock >>> >>>>>> files to use the new >>> >>>>>> > >>>>>>> TRACE_JNI_CALL >>> add-on >>> >>>>>> and I've ported a few >>> >>>>>> > of the >>> >>>>>> > >>>>>>> tests that were >>> >>>>>> already changed for the >>> >>>>>> > assignment >>> >>>>>> > >>>>>>> webrev for >>> >>>>>> JDK-8212884. >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> Webrev: >>> >>>>>> > >>>>>>> >>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.01 >>> >>>>>> >>> >>>>>> > >>>>>>> >>> >>>>>> >>> >>>>>> > >>>>>>> Bug: >>> >>>>>> > >>>>>>> >>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8213501 >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> For illustration, >>> if >>> >>>>>> I force an error to the >>> >>>>>> > >>>>>>> AP04/ap04t03 test >>> and >>> >>>>>> set the verbosity on, >>> >>>>>> > I get >>> >>>>>> > >>>>>>> something like: >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> >> Calling JNI >>> method >>> >>>>>> FindClass from >>> >>>>>> > >>>>>>> ap04t003.cpp:343 >>> >>>>>> > >>>>>>> >> Calling with >>> these >>> >>>>>> parameter(s): >>> >>>>>> > >>>>>>> java/lang/Threadd >>> >>>>>> > >>>>>>> Wait for thread to >>> >>>>>> finish >>> >>>>>> > >>>>>>> << Called JNI >>> method >>> >>>>>> FindClass from >>> >>>>>> > >>>>>>> ap04t003.cpp:343 >>> >>>>>> > >>>>>>> Exception in thread >>> >>>>>> "Thread-0" >>> >>>>>> > >>>>>>> java.lang.NoClassDefFoundError: >>> >>>>>> > java/lang/Threadd >>> >>>>>> > >>>>>>> at >>> >>>>>> > >>>>>>> >>> >>>>>> > >>> >>>>>> >>> nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native >>> >>>>>> >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> Method) >>> >>>>>> > >>>>>>> at >>> >>>>>> > >>>>>>> >>> >>>>>> > >>> >>>>>> >>> nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140) >>> >>> 
>>>>>> >>> >>>>>> > >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> at >>> >>>>>> > >>>>>>> >>> >>>>>> > >>> >>>>>> >>> nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201) >>> >>>>>> >>> >>>>>> > >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> Caused by: >>> >>>>>> java.lang.ClassNotFoundException: >>> >>>>>> > >>>>>>> java.lang.Threadd >>> >>>>>> > >>>>>>> at >>> >>>>>> > >>>>>>> >>> >>>>>> > >>> >>>>>> >>> java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583) >>> >>> >>>>>> >>> >>>>>> > >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> at >>> >>>>>> > >>>>>>> >>> >>>>>> > >>> >>>>>> >>> java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) >>> >>> >>>>>> >>> >>>>>> > >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> at >>> >>>>>> > >>>>>>> >>> >>>>>> java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) >>> >>>>>> > >>>>>>> ... 3 more >>> >>>>>> > >>>>>>> FATAL ERROR in >>> native >>> >>>>>> method: JNI method >>> >>>>>> > FindClass >>> >>>>>> > >>>>>>> : internal error >>> from >>> >>>>>> ap04t003.cpp:343 >>> >>>>>> > >>>>>>> at >>> >>>>>> > >>>>>>> >>> >>>>>> > >>> >>>>>> >>> nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native >>> >>>>>> >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> Method) >>> >>>>>> > >>>>>>> at >>> >>>>>> > >>>>>>> >>> >>>>>> > >>> >>>>>> >>> nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140) >>> >>> >>>>>> >>> >>>>>> > >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> at >>> >>>>>> > >>>>>>> >>> >>>>>> > >>> >>>>>> >>> nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201) >>> >>>>>> >>> >>>>>> > >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> Questions/comments >>> I >>> >>>>>> have about this are: >>> >>>>>> > >>>>>>> - Do we want to >>> >>>>>> force fatal errors when a JNI >>> >>>>>> > >>>>>>> call fails in >>> >>>>>> general? 
Most of these tests >>> >>>>>> > do the >>> >>>>>> > >>>>>>> right thing and >>> test >>> >>>>>> the return of the JNI >>> >>>>>> > calls, >>> >>>>>> > >>>>>>> for example: >>> >>>>>> > >>>>>>> thrClass = >>> >>>>>> > jni->FindClass("java/lang/Threadd", >>> >>>>>> > >>>>>>> TRACE_JNI_CALL); >>> >>>>>> > >>>>>>> if (thrClass == >>> >>>>>> NULL) { >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> but now the wrapper >>> >>>>>> actually would do a >>> >>>>>> > fatal if >>> >>>>>> > >>>>>>> the FindClass call >>> >>>>>> would return a nullptr, >>> >>>>>> > so we >>> >>>>>> > >>>>>>> could remove that >>> >>>>>> test altogether. What do you >>> >>>>>> > >>>>>>> think? >>> >>>>>> > >>>>>>> - I prefer to >>> >>>>>> leave them as the tests then >>> >>>>>> > >>>>>>> become closer to >>> what >>> >>>>>> real users would have in >>> >>>>>> > >>>>>>> their code and is >>> the >>> >>>>>> "recommended" way of >>> >>>>>> > doing it >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> - The >>> alternative >>> >>>>>> is to use the >>> >>>>>> > NonFatalError I >>> >>>>>> > >>>>>>> added which then >>> just >>> >>>>>> prints out that something >>> >>>>>> > >>>>>>> went wrong, letting >>> >>>>>> the test continue. Question >>> >>>>>> > >>>>>>> will be what should >>> >>>>>> be the default? The >>> >>>>>> > fatal or >>> >>>>>> > >>>>>>> the non-fatal error >>> >>>>>> handling? 
>>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> On a different >>> >>>>>> subject: >>> >>>>>> > >>>>>>> - On the new >>> tests, >>> >>>>>> I've removed the >>> >>>>>> > >>>>>>> NSK_JNI_VERIFY >>> since >>> >>>>>> the JNI wrapper >>> >>>>>> > handles the >>> >>>>>> > >>>>>>> tracing and the >>> >>>>>> verify in almost the same >>> >>>>>> > way; only >>> >>>>>> > >>>>>>> difference I can >>> >>>>>> really tell is that the >>> >>>>>> > complain >>> >>>>>> > >>>>>>> method from NSK >>> has a >>> >>>>>> max complain before >>> >>>>>> > stopping >>> >>>>>> > >>>>>>> to "complain"; I >>> have >>> >>>>>> not added that part >>> >>>>>> > of the >>> >>>>>> > >>>>>>> code in this webrev >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> Once we decide on >>> >>>>>> these, I can continue on the >>> >>>>>> > >>>>>>> files from >>> >>>>>> JDK-8212884 and then do both the >>> >>>>>> > >>>>>>> assignment in an if >>> >>>>>> extraction followed-by this >>> >>>>>> > >>>>>>> type of webrev in >>> an >>> >>>>>> easier fashion. >>> >>>>>> > Depending on >>> >>>>>> > >>>>>>> decisions here, >>> >>>>>> NSK*VERIFY can be deprecated as >>> >>>>>> > >>>>>>> well as we go >>> forward. >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> Thanks! >>> >>>>>> > >>>>>>> Jc >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> On Mon, Nov 19, >>> 2018 >>> >>>>>> at 11:34 AM Chris Plummer >>> >>>>>> > >>>>>>> < >>> chris.plummer at oracle.com >>> >>>>>> >>> >>>>>> > >> >>>>>> > >>> >>>>>> > >>>>>>> >> >>>>>> >>> >>>>>> > >> >>>>>> >>> wrote: >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> On 11/19/18 >>> 10:07 >>> >>>>>> AM, JC Beyler wrote: >>> >>>>>> > >>>>>>>> Hi all, >>> >>>>>> > >>>>>>>> >>> >>>>>> > >>>>>>>> @David/Chris: >>> >>>>>> should I then push this >>> >>>>>> > RFR to >>> >>>>>> > >>>>>>>> the hotspot >>> >>>>>> mailing or the runtime >>> >>>>>> > one? 
For >>> >>>>>> > >>>>>>>> what it's >>> worth, >>> >>>>>> a lot of the tests >>> >>>>>> > under the >>> >>>>>> > >>>>>>>> vmTestbase are >>> >>>>>> jvmti so the review also >>> >>>>>> > >>>>>>>> affects >>> >>>>>> serviceability; it just turns >>> >>>>>> > out I >>> >>>>>> > >>>>>>>> started with >>> the >>> >>>>>> GC originally and >>> >>>>>> > then hit >>> >>>>>> > >>>>>>>> some other >>> tests >>> >>>>>> I had touched via the >>> >>>>>> > >>>>>>>> assignment >>> >>>>>> extraction. >>> >>>>>> > >>>>>>> I think hotspot >>> >>>>>> would be best. >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> Chris >>> >>>>>> > >>>>>>>> >>> >>>>>> > >>>>>>>> @Serguei: Done >>> >>>>>> for the method >>> >>>>>> > renaming, for >>> >>>>>> > >>>>>>>> the indent, >>> are >>> >>>>>> you talking about >>> >>>>>> > going from >>> >>>>>> > >>>>>>>> the 8-indent >>> to >>> >>>>>> 4-indent? If so, would >>> >>>>>> > it not >>> >>>>>> > >>>>>>>> just be better >>> >>>>>> to do a new JBS bug and >>> >>>>>> > do the >>> >>>>>> > >>>>>>>> whole files in >>> >>>>>> one go? I ask because >>> >>>>>> > >>>>>>>> otherwise, it >>> >>>>>> will look a bit weird to >>> >>>>>> > have >>> >>>>>> > >>>>>>>> parts of the >>> >>>>>> file as 8-indent and others >>> >>>>>> > >>>>>>>> 4-indent? >>> >>>>>> > >>>>>>>> >>> >>>>>> > >>>>>>>> Thanks for >>> >>>>>> looking at it! >>> >>>>>> > >>>>>>>> Jc >>> >>>>>> > >>>>>>>> >>> >>>>>> > >>>>>>>> On Mon, Nov >>> 19, >>> >>>>>> 2018 at 1:25 AM >>> >>>>>> > >>>>>>>> serguei.spitsyn at oracle.com >>> >>>>>> >>> >>>>>> >> >>>>>> > >>> >>>>>> > >>>>>>>> >> serguei.spitsyn at oracle.com >>> >>>>>> >>> >>>>>> > >> >>>>>> >> >>> >>>>>> > >>>>>>>> >> >>>>>> >>> >>>>>> > >> >>>>>> > >>> >>>>>> > >>>>>>>> >> serguei.spitsyn at oracle.com >>> >>>>>> >>> >>>>>> > >> >>>>>> >>> wrote: >>> >>>>>> > >>>>>>>> >>> >>>>>> > >>>>>>>> Hi Jc, >>> >>>>>> > >>>>>>>> >>> >>>>>> > >>>>>>>> We have to >>> >>>>>> start this review >>> >>>>>> > anyway. 
:) >>> >>>>>> > >>>>>>>> It looks >>> >>>>>> good to me in general. >>> >>>>>> > >>>>>>>> Thank you >>> >>>>>> for your consistency in this >>> >>>>>> > >>>>>>>> >>> refactoring! >>> >>>>>> > >>>>>>>> >>> >>>>>> > >>>>>>>> Some minor >>> >>>>>> comments. >>> >>>>>> > >>>>>>>> >>> >>>>>> > >>>>>>>> >>> >>>>>> > >>> >>>>>> >>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.00/test/hotspot/jtreg/vmTestbase/nsk/share/jni/ExceptionCheckingJniEnv.cpp.udiff.html >>> >>>>>> >>> >>>>>> > >>> >>>>>> > >>>>>>>> >>> >>>>>> > >>>>>>>> +static >>> >>>>>> const char* >>> >>>>>> > remove_folders(const >>> >>>>>> > >>>>>>>> char* >>> >>>>>> fullname) { I'd suggest to >>> >>>>>> > rename >>> >>>>>> > >>>>>>>> the >>> function >>> >>>>>> name to something >>> >>>>>> > traditional >>> >>>>>> > >>>>>>>> like >>> >>>>>> get_basename. Otherwise, it >>> >>>>>> > sounds >>> >>>>>> > >>>>>>>> like this >>> >>>>>> function has to really >>> >>>>>> > remove >>> >>>>>> > >>>>>>>> folders. >>> :) >>> >>>>>> Also, all *Locker.cpp have >>> >>>>>> > >>>>>>>> wrong >>> indent >>> >>>>>> in the bodies of if >>> >>>>>> > and while >>> >>>>>> > >>>>>>>> >>> statements. >>> >>>>>> Could this be fixed >>> >>>>>> > with the >>> >>>>>> > >>>>>>>> >>> refactoring? >>> >>>>>> I did not look on how >>> >>>>>> > this >>> >>>>>> > >>>>>>>> impacts >>> the >>> >>>>>> tests other than >>> >>>>>> > >>>>>>>> serviceability. >>> Thanks, >>> >>>>>> Serguei >>> >>>>>> > >>>>>>>> >>> >>>>>> > >>>>>>>> >>> >>>>>> > >>>>>>>> On >>> 11/16/18 >>> >>>>>> 19:43, JC Beyler wrote: >>> >>>>>> > >>>>>>>>> Hi all, >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> Anybody >>> >>>>>> motivated to review this? 
:) >>> >>>>>> > >>>>>>>>> Jc >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> On Wed, Nov 7, >>> >>>>>> 2018 at 9:53 PM JC >>> >>>>>> > Beyler >>> >>>>>> > >>>>>>>>> >> >>>>>> >>> >>>>>> > >> >>>>>> > >>> >>>>>> > >>>>>>>>> >> >>>>>> >>> >>>>>> > >> >>>>>> >>> wrote: >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> Hi all, >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> Could I >>> have >>> >>>>>> a review for the >>> >>>>>> > >>>>>>>>> extension >>> >>>>>> and usage of the >>> >>>>>> > >>>>>>>>> ExceptionJniWrapper. This >>> >>>>>> > adds lines >>> >>>>>> > >>>>>>>>> and >>> >>>>>> filenames to the end of the >>> >>>>>> > >>>>>>>>> wrapper >>> JNI >>> >>>>>> methods, adds >>> >>>>>> > tracing, >>> >>>>>> > >>>>>>>>> and throws >>> >>>>>> an error if need >>> >>>>>> > be. I've >>> >>>>>> > >>>>>>>>> ported the >>> >>>>>> gc/lock files to >>> >>>>>> > use the >>> >>>>>> > >>>>>>>>> new >>> >>>>>> TRACE_JNI_CALL add-on and >>> >>>>>> > I've >>> >>>>>> > >>>>>>>>> ported a >>> few >>> >>>>>> of the tests >>> >>>>>> > that were >>> >>>>>> > >>>>>>>>> already >>> >>>>>> changed for the >>> >>>>>> > assignment >>> >>>>>> > >>>>>>>>> webrev for >>> >>>>>> JDK-8212884. 
>>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> Webrev: >>> >>>>>> > >>>>>>>>> >>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.00/ >>> >>>>>> >>> >>>>>> > >>>>>>>>> >>> >>>>>> >>> >>>>>> > >>>>>>>>> Bug: >>> >>>>>> > >>>>>>>>> >>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8213501 >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> For >>> >>>>>> illustration, if I force >>> >>>>>> > an error >>> >>>>>> > >>>>>>>>> to the >>> >>>>>> AP04/ap04t03 test and >>> >>>>>> > set the >>> >>>>>> > >>>>>>>>> verbosity >>> >>>>>> on, I get something >>> >>>>>> > like: >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> >> Calling >>> >>>>>> JNI method >>> >>>>>> > FindClass from >>> >>>>>> > >>>>>>>>> ap04t003.cpp:343 >>> >>>>>> > >>>>>>>>> >> Calling >>> >>>>>> with these >>> >>>>>> > parameter(s): >>> >>>>>> > >>>>>>>>> java/lang/Threadd >>> >>>>>> > >>>>>>>>> Wait for >>> >>>>>> thread to finish >>> >>>>>> > >>>>>>>>> << Called >>> >>>>>> JNI method >>> >>>>>> > FindClass from >>> >>>>>> > >>>>>>>>> ap04t003.cpp:343 >>> >>>>>> > >>>>>>>>> Exception >>> in >>> >>>>>> thread "Thread-0" >>> >>>>>> > >>>>>>>>> java.lang.NoClassDefFoundError: >>> >>>>>> > >>>>>>>>> java/lang/Threadd >>> >>>>>> > >>>>>>>>> at >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>> >>>>>> >>> nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native >>> >>>>>> >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> Method) >>> >>>>>> > >>>>>>>>> at >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>> >>>>>> >>> nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140) >>> >>> >>>>>> >>> >>>>>> > >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> at >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>> >>>>>> >>> nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201) >>> >>>>>> >>> >>>>>> > >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> Caused by: >>> >>>>>> > >>>>>>>>> java.lang.ClassNotFoundException: >>> >>>>>> > >>>>>>>>> java.lang.Threadd >>> >>>>>> > >>>>>>>>> at >>> >>>>>> > 
>>>>>>>>> >>> >>>>>> > >>> >>>>>> >>> java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583) >>> >>> >>>>>> >>> >>>>>> > >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> at >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>> >>>>>> >>> java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) >>> >>> >>>>>> >>> >>>>>> > >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> at >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>> >>>>>> java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) >>> >>>>>> > >>>>>>>>> ... 3 more >>> >>>>>> > >>>>>>>>> FATAL >>> ERROR >>> >>>>>> in native method: JNI >>> >>>>>> > >>>>>>>>> method >>> >>>>>> FindClass : internal error >>> >>>>>> > >>>>>>>>> from >>> >>>>>> ap04t003.cpp:343 >>> >>>>>> > >>>>>>>>> at >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>> >>>>>> >>> nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native >>> >>>>>> >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> Method) >>> >>>>>> > >>>>>>>>> at >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>> >>>>>> >>> nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140) >>> >>> >>>>>> >>> >>>>>> > >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> at >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>> >>>>>> >>> nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201) >>> >>>>>> >>> >>>>>> > >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> Questions/comments I have about >>> >>>>>> > >>>>>>>>> this are: >>> >>>>>> > >>>>>>>>> - Do we >>> >>>>>> want to force fatal >>> >>>>>> > errors >>> >>>>>> > >>>>>>>>> when a JNI >>> >>>>>> call fails in general? 
>>> >>>>>> > >>>>>>>>> Most of >>> >>>>>> these tests do the right >>> >>>>>> > >>>>>>>>> thing and >>> >>>>>> test the return of >>> >>>>>> > the JNI >>> >>>>>> > >>>>>>>>> calls, for >>> >>>>>> example: >>> >>>>>> > >>>>>>>>> thrClass = >>> >>>>>> > >>>>>>>>> jni->FindClass("java/lang/Threadd", >>> >>>>>> > >>>>>>>>> TRACE_JNI_CALL); >>> >>>>>> > >>>>>>>>> if >>> >>>>>> (thrClass == NULL) { >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> but now >>> the >>> >>>>>> wrapper actually >>> >>>>>> > would do >>> >>>>>> > >>>>>>>>> a fatal if >>> >>>>>> the FindClass call >>> >>>>>> > would >>> >>>>>> > >>>>>>>>> return a >>> >>>>>> nullptr, so we could >>> >>>>>> > remove >>> >>>>>> > >>>>>>>>> that test >>> >>>>>> altogether. What do >>> >>>>>> > you >>> >>>>>> > >>>>>>>>> think? >>> >>>>>> > >>>>>>>>> - I >>> >>>>>> prefer to leave them >>> >>>>>> > as the >>> >>>>>> > >>>>>>>>> tests then >>> >>>>>> become closer to >>> >>>>>> > what real >>> >>>>>> > >>>>>>>>> users >>> would >>> >>>>>> have in their >>> >>>>>> > code and is >>> >>>>>> > >>>>>>>>> the >>> >>>>>> "recommended" way of doing it >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> - The >>> >>>>>> alternative is to >>> >>>>>> > use the >>> >>>>>> > >>>>>>>>> NonFatalError I added >>> >>>>>> which >>> >>>>>> > then just >>> >>>>>> > >>>>>>>>> prints out >>> >>>>>> that something >>> >>>>>> > went wrong, >>> >>>>>> > >>>>>>>>> letting >>> the >>> >>>>>> test continue. >>> >>>>>> > Question >>> >>>>>> > >>>>>>>>> will be >>> what >>> >>>>>> should be the >>> >>>>>> > default? >>> >>>>>> > >>>>>>>>> The fatal >>> or >>> >>>>>> the non-fatal error >>> >>>>>> > >>>>>>>>> handling? 
>>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> On a >>> >>>>>> different subject: >>> >>>>>> > >>>>>>>>> - On the >>> >>>>>> new tests, I've >>> >>>>>> > removed >>> >>>>>> > >>>>>>>>> the >>> >>>>>> NSK_JNI_VERIFY since the JNI >>> >>>>>> > >>>>>>>>> wrapper >>> >>>>>> handles the tracing >>> >>>>>> > and the >>> >>>>>> > >>>>>>>>> verify in >>> >>>>>> almost the same >>> >>>>>> > way; only >>> >>>>>> > >>>>>>>>> >>> difference I >>> >>>>>> can really tell >>> >>>>>> > is that >>> >>>>>> > >>>>>>>>> the >>> complain >>> >>>>>> method from NSK >>> >>>>>> > has a >>> >>>>>> > >>>>>>>>> max >>> complain >>> >>>>>> before stopping to >>> >>>>>> > >>>>>>>>> >>> "complain"; >>> >>>>>> I have not added that >>> >>>>>> > >>>>>>>>> part of >>> the >>> >>>>>> code in this webrev >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> Once we >>> >>>>>> decide on these, I can >>> >>>>>> > >>>>>>>>> continue >>> on >>> >>>>>> the files from >>> >>>>>> > >>>>>>>>> >>> JDK-8212884 >>> >>>>>> and then do both the >>> >>>>>> > >>>>>>>>> assignment >>> >>>>>> in an if extraction >>> >>>>>> > >>>>>>>>> >>> followed-by >>> >>>>>> this type of >>> >>>>>> > webrev in an >>> >>>>>> > >>>>>>>>> easier >>> >>>>>> fashion. Depending on >>> >>>>>> > >>>>>>>>> decisions >>> >>>>>> here, NSK*VERIFY can be >>> >>>>>> > >>>>>>>>> deprecated >>> >>>>>> as well as we go >>> >>>>>> > forward. 
>>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> Thank you >>> >>>>>> for the >>> >>>>>> > reviews/comments :) >>> >>>>>> > >>>>>>>>> Jc >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> >>> >>>>>> > >>>>>>>>> -- >>> >>>>>> > >>>>>>>>> Thanks, >>> >>>>>> > >>>>>>>>> Jc >>> >>>>>> > >>>>>>>> >>> >>>>>> > >>>>>>>> >>> >>>>>> > >>>>>>>> >>> >>>>>> > >>>>>>>> -- >>> >>>>>> > >>>>>>>> Thanks, >>> >>>>>> > >>>>>>>> Jc >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> >>> >>>>>> > >>>>>>> -- >>> >>>>>> > >>>>>>> Thanks, >>> >>>>>> > >>>>>>> Jc >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> >>> >>>>>> > >>>>>> -- >>> >>>>>> > >>>>>> Thanks, >>> >>>>>> > >>>>>> Jc >>> >>>>>> > >>>>> >>> >>>>>> > >>>>> >>> >>>>>> > >>>>> >>> >>>>>> > >>>>> >>> >>>>>> > >>>>> -- >>> >>>>>> > >>>>> Thanks, >>> >>>>>> > >>>>> Jc >>> >>>>>> > >>>> >>> >>>>>> > >>> >>> >>>>>> > >>> >>> >>>>>> > >>> >>> >>>>>> > >>> -- >>> >>>>>> > >>> Thanks, >>> >>>>>> > >>> Jc >>> >>>>>> > >>> >>> >>>>>> > >>> >>> >>>>>> > >>> >>> >>>>>> > >>> -- >>> >>>>>> > >>> >>> >>>>>> > >>> Thanks, >>> >>>>>> > >>> Jc >>> >>>>>> > > >>> >>>>>> > > >>> >>>>>> > >>> >>>>>> > >>> >>>>>> > >>> >>>>>> > -- >>> >>>>>> > >>> >>>>>> > Thanks, >>> >>>>>> > Jc >>> >>>>>> >>> >>>>>> >>> >>>>>> >>> >>>>>> -- >>> >>>>>> Thanks, >>> >>>>>> Jc >>> >>>>> >>> >>>>> >>> >>>>> >>> >>>>> -- >>> >>>>> Thanks, >>> >>>>> Jc >>> >>>> >>> >>>> >>> >>>> >>> >>>> -- >>> >>>> Thanks, >>> >>>> Jc >>> >>>> >>> >>>> >>> >>>> >>> >>>> -- >>> >>>> >>> >>>> Thanks, >>> >>>> Jc >>> >>> >>> > >>> >> >> >> -- >> >> Thanks, >> Jc >> > > > -- > > Thanks, > Jc > -- Thanks, Jc From coleen.phillimore at oracle.com Tue Jan 8 18:20:36 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 8 Jan 2019 13:20:36 -0500 Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name In-Reply-To: References: 
<06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com> <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com> <1d480205-a1d0-0c3a-c1ee-6da6484d2068@oracle.com>
Message-ID:

On 1/8/19 11:55 AM, Aleksey Shipilev wrote:
> On 1/8/19 5:40 PM, coleen.phillimore at oracle.com wrote:
>> http://cr.openjdk.java.net/~shade/8216302/webrev.02/src/hotspot/share/classfile/javaClasses.cpp.udiff.html
>>
>> + oop class_oop = holder->java_mirror();
>>
>> This is a naked oop. You should use Handle and HandleMark in this function.
> Mmm. But it is the same oop we are already handling in the old code... What's the rule here?

The StringTable::intern() could take out a lock. It might not anymore but putting things like this relieves the need to look into the function to see whether it does or not.

>
>> 1347 oop java_lang_Class::name(oop java_class, TRAPS) {
>>
>> So is this. You should pass a Handle to this.
> I don't understand the "should" part. Why should it be handle-ized? javaClasses are expected to work
> properly with naked oops, because we are not expected to get to safepoint in the middle of it, no?

We could though if StringTable::intern takes out a lock or stops for a safepoint (while resizing with the new concurrent hashtable).

> And we are storing oops to the heap itself, not in any VM structure.
>
> In the same StackTraceElement::fill_in, we do e.g. this without any handles:
>
>   oop loader = holder->class_loader();
>   if (loader != NULL) {
>     oop loader_name = java_lang_ClassLoader::name(loader);
>     if (loader_name != NULL)
>       java_lang_StackTraceElement::set_classLoaderName(element(), loader_name);
>   }

Having the oops on the stack and getting a safepoint is the definition of "unhandled oops". If the oop isn't used below any calls involving it, it is safe. If there are intervening calls (especially ones with TRAPS), they should be put in Handle so that they are known safe without having to know the implementation details of all the functions that are called.
There is little overhead if there is a HandleMark near the Handles.

Thanks,
Coleen

> -Aleksey
>

From coleen.phillimore at oracle.com Tue Jan 8 18:26:11 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 8 Jan 2019 13:26:11 -0500
Subject: RFR (tedious) 8216022: Use #pragma once
In-Reply-To: <09cdb516-53cf-c072-dc0e-dd80aef6f344@oracle.com>
References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <3727d1da-256c-54b7-2d9b-f819ef08cfa4@oracle.com> <09cdb516-53cf-c072-dc0e-dd80aef6f344@oracle.com>
Message-ID:

On 1/8/19 10:25 AM, Per Liden wrote:
> Hi Coleen,
>
> On 1/7/19 5:16 PM, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 1/7/19 3:40 AM, Per Liden wrote:
>>> Hi Coleen,
>>>
>>> On 1/3/19 3:31 AM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Here is the webrev and bug link.
>>>>
>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8216022.01/webrev
>>>
>>> Looks like your script is now leaving an extra empty line at the end
>>> of all files, which wasn't there before. For example:
>>>
>>> --- old/src/hotspot/share/gc/z/zAddress.hpp    2019-01-02
>>> 16:41:04.209075410 -0500
>>> +++ new/src/hotspot/share/gc/z/zAddress.hpp    2019-01-02
>>> 16:41:03.957075419 -0500
>>> @@ -64,4 +63,3 @@
>>>    static void flip_to_remapped();
>>>  };
>>>
>>> -#endif // SHARE_GC_Z_ZADDRESS_HPP
>>>
>>> Should be:
>>>
>>> --- old/src/hotspot/share/gc/z/zAddress.hpp    2019-01-02
>>> 16:41:04.209075410 -0500
>>> +++ new/src/hotspot/share/gc/z/zAddress.hpp    2019-01-02
>>> 16:41:03.957075419 -0500
>>> @@ -64,4 +63,3 @@
>>>    static void flip_to_remapped();
>>>  };
>>> -
>>> -#endif // SHARE_GC_Z_ZADDRESS_HPP
>>>
>>>
>>> Could you please fix that?
>>
>> Hi, I fixed the trailing blank lines (in some cases several lines)
>> and had to hand patch files that ended with line continuation '\'
>> from some macro. I found these in some globals files like
>> g1_globals.hpp and one ci file, that I can't find anymore.
>> >> http://cr.openjdk.java.net/~coleenp/8216022.diffs.03 > > Thanks for fixing. Looks good. And for the record, I'm for using > #pragma once instead of manually typed include guards. The number of > miss-typed include guards we've seen over the years is enough to > convince me that people in general aren't very good at getting this > right, and we should let our tool chain handle this for us. Thank you, Per. Coleen > > cheers, > Per > >> >> Thanks, >> Coleen >>> >>> Thanks! >>> Per >>> >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8216022 >>>> >>>> On 1/2/19 9:16 PM, coleen.phillimore at oracle.com wrote: >>>>> Summary: change include guards to #pragma once, except in >>>>> generated header files. >>>>> >>>>> Tested with mach5 for linux-x64{-debug}, solaris-sparc, >>>>> macosx-x64, windows-x64, built aarch64 with cross compiler, and zero. >>>>> >>>>> Ran tier1 and 2 tests. >>>>> >>>>> The webrev is huge but there are only 3 lines changed in each >>>>> header file.? So click on the patch. >>>>> >>>>> I'll update the copyright headers with a script with the commit. >>>>> Also, will do this after the shenandoah copyright headers are fixed. >>>>> >>>>> Adrian: I included you to check your platforms. >>>>> >>>>> Happy New Year! >>>>> Coleen >>>> >> From shade at redhat.com Tue Jan 8 18:55:29 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 8 Jan 2019 19:55:29 +0100 Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name In-Reply-To: References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com> <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com> Message-ID: <8d41a444-4b1b-f176-9cd0-374649e1c270@redhat.com> On 1/8/19 5:48 PM, coleen.phillimore at oracle.com wrote: > I agree we shouldn't initialize the name field eagerly when creating a class.? I also looked at this > code path: > > ??? public String getName() { > ??????? String name = this.name; > ??????? if (name == null) > ??????????? this.name = name = getName0(); > ??????? 
return name;
>     }
>
> It looks like when we call JVM_GetClassName, we're initializing the Class.name field by the caller.
>
> Maybe could rewrite the java/lang/Class version to be:
>
>     public String getName() {
>         String name = this.name;
>         if (name == null)
>             name = getName0();  // this initializes this.name yuck.
>         return name;
>     }
>
> and have JVM_GetClassName call java_lang_Class::name() to do the initialization.  Seems not worth
> it just to avoid duplicating these lines in both java_lang_Class::name() and JVM_GetClassName.

Right. I also think it does not worth it. My reason is given in another reply in this thread:
  http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-January/036152.html

I'll handelize the oop and keep the rest as is.

-Aleksey

From sangheon.kim at oracle.com Tue Jan 8 19:15:28 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Tue, 8 Jan 2019 11:15:28 -0800
Subject: RFR (S/M): 8213827: NUMA heap allocation does not respect process membind/interleave settings [Was: Re: [PATCH] JDK NUMA Interleaving issue]
In-Reply-To:
References: <6e5b102d07b4ceded09115a649be020410240fe7.camel@oracle.com> <9bea7b0957bbfc2f0ac34306ee162f2d98e44bfe.camel@oracle.com> <99164b92f47f264978339ed327da9d41098a7e1d.camel@oracle.com>
Message-ID: <10ecfa0f-eb78-869a-4d5a-991f55ec57ea@oracle.com>

Hi Thomas,

On 12/13/18 2:33 AM, Thomas Schatzl wrote:
> Hi Amit,
> On Thu, 2018-12-13 at 15:11 +0530, amith pawar wrote:
>> Hi Thomas,
>>
>> Please find the attached patch updated as per your suggestion.
>> If everything OK then can you please commit this to repo ?
> looks good. We will need a second reviewer though, I am going to ask
> around.
>
> Latest webrev:
> http://cr.openjdk.java.net/~tschatzl/8213827/webrev.3/

Webrev.3 looks good to me.
I have some minor nits:
----------------------------------------
src/hotspot/os/linux/os_linux.cpp
5012       
for (int node = 0; node < Linux::numa_max_node(); node++) {
- Looks like 'node <= Linux::numa_max_node()' is the right one to print the last node?

----------------------------------------
src/hotspot/os/linux/os_linux.hpp
 271   enum Numa_allocation_policy{
- Looking at 'enum' at os.hpp, we use Camel style.
- There is a missing space before '{'.
- As usual, copyright year updates. I know it was correct when you posted. :)

Thanks,
Sangheon

>
> Thanks,
>   Thomas
>
>

From shade at redhat.com Tue Jan 8 19:41:29 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 8 Jan 2019 20:41:29 +0100
Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name
In-Reply-To:
References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com> <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com> <1d480205-a1d0-0c3a-c1ee-6da6484d2068@oracle.com>
Message-ID: <972380fa-0b06-4139-aae5-8f3bc6d01e04@redhat.com>

On 1/8/19 7:20 PM, coleen.phillimore at oracle.com wrote:
> Having the oops on the stack and getting a safepoint is the definition of "unhandled oops".  If
> the oop isn't used below any calls involving it, it is safe.  If there are intervening calls
> (especially ones with TRAPS), they should be put in Handle so that they are known safe without
> having to know the implementation details of all the functions that are called.  There is little
> overhead if there is a HandleMark near the Handles.

Thanks for the explanation. So, something like this would do?
  http://cr.openjdk.java.net/~shade/8216302/webrev.03/

Passes new test and hotspot tier1. If this patch looks okay, I would put it through jdk-submit.

It is a bit slower than previous version, but still much better than the current baseline:

Benchmark            (depth)  Mode  Cnt    Score   Error  Units

# Old patch
StackTraceBench.test       1  avgt   15   14.450 ± 0.136  us/op
StackTraceBench.test      10  avgt   15   20.182 ± 0.088  us/op
StackTraceBench.test     100  avgt   15   77.107 ± 0.632  us/op
StackTraceBench.test    1000  avgt   15  647.128 ± 6.159  us/op

# New patch
StackTraceBench.test       1  avgt   15   14.841 ± 0.095  us/op
StackTraceBench.test      10  avgt   15   20.708 ± 0.114  us/op
StackTraceBench.test     100  avgt   15   78.628 ± 0.621  us/op
StackTraceBench.test    1000  avgt   15  659.408 ± 6.015  us/op

-Aleksey

From coleen.phillimore at oracle.com Tue Jan 8 19:47:27 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 8 Jan 2019 14:47:27 -0500
Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name
In-Reply-To: <972380fa-0b06-4139-aae5-8f3bc6d01e04@redhat.com>
References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com> <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com> <1d480205-a1d0-0c3a-c1ee-6da6484d2068@oracle.com> <972380fa-0b06-4139-aae5-8f3bc6d01e04@redhat.com>
Message-ID: <08bc2019-6219-9bbe-28e8-02e70b96eae3@oracle.com>

On 1/8/19 2:41 PM, Aleksey Shipilev wrote:
> On 1/8/19 7:20 PM, coleen.phillimore at oracle.com wrote:
>> Having the oops on the stack and getting a safepoint is the definition of "unhandled oops".  If
>> the oop isn't used below any calls involving it, it is safe.  If there are intervening calls
>> (especially ones with TRAPS), they should be put in Handle so that they are known safe without
>> having to know the implementation details of all the functions that are called.  There is little
>> overhead if there is a HandleMark near the Handles.
> Thanks for the explanation. So, something like this would do?
>   http://cr.openjdk.java.net/~shade/8216302/webrev.03/

Yes, this looks better.  I can't imagine why it would be measurably slower, but beats faster with low frequency bugs.

http://cr.openjdk.java.net/~shade/8216302/webrev.03/test/hotspot/jtreg/runtime/StackTrace/StackTraceClassCache.java.html

I have heard that the copyright should also have the Oracle line on it.

Thanks,
Coleen

>
> Passes new test and hotspot tier1. If this patch looks okay, I would put it through jdk-submit.
> > It is a bit slower than previous version, but still much better than the current baseline: > > Benchmark (depth) Mode Cnt Score Error Units > > # Old patch > StackTraceBench.test 1 avgt 15 14.450 ? 0.136 us/op > StackTraceBench.test 10 avgt 15 20.182 ? 0.088 us/op > StackTraceBench.test 100 avgt 15 77.107 ? 0.632 us/op > StackTraceBench.test 1000 avgt 15 647.128 ? 6.159 us/op > > # New patch > StackTraceBench.test 1 avgt 15 14.841 ? 0.095 us/op > StackTraceBench.test 10 avgt 15 20.708 ? 0.114 us/op > StackTraceBench.test 100 avgt 15 78.628 ? 0.621 us/op > StackTraceBench.test 1000 avgt 15 659.408 ? 6.015 us/op > > -Aleksey > From coleen.phillimore at oracle.com Tue Jan 8 19:54:48 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 8 Jan 2019 14:54:48 -0500 Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name In-Reply-To: <8d41a444-4b1b-f176-9cd0-374649e1c270@redhat.com> References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com> <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com> <8d41a444-4b1b-f176-9cd0-374649e1c270@redhat.com> Message-ID: On 1/8/19 1:55 PM, Aleksey Shipilev wrote: > On 1/8/19 5:48 PM, coleen.phillimore at oracle.com wrote: >> I agree we shouldn't initialize the name field eagerly when creating a class.? I also looked at this >> code path: >> >> ??? public String getName() { >> ??????? String name = this.name; >> ??????? if (name == null) >> ??????????? this.name = name = getName0(); >> ??????? return name; >> ??? } >> >> It looks like when we call JVM_GetClassName, we're initializing the Class.name field by the caller. >> >> Maybe could rewrite the java/lang/Class version to be: >> >> ??? public String getName() { >> ??????? String name = this.name; >> ??????? if (name == null) >> ??????????? name = getName0();? // this initializes this.name yuck. >> ??????? return name; >> ??? } >> >> and have JVM_GetClassName call java_lang_Class::name() to do the initialization.?? 
Seems not worth >> it just to avoid duplicating these lines in both java_lang_Class::name() and JVM_GetClassName. > Right. I also think it does not worth it. My reason is given in another reply in this thread: > http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-January/036152.html Right, saw that.? I agreed. Coleen > > I'll handelize the oop and keep the rest as is. > > -Aleksey > > > > From shade at redhat.com Tue Jan 8 19:56:22 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 8 Jan 2019 20:56:22 +0100 Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name In-Reply-To: <08bc2019-6219-9bbe-28e8-02e70b96eae3@oracle.com> References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com> <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com> <1d480205-a1d0-0c3a-c1ee-6da6484d2068@oracle.com> <972380fa-0b06-4139-aae5-8f3bc6d01e04@redhat.com> <08bc2019-6219-9bbe-28e8-02e70b96eae3@oracle.com> Message-ID: On 1/8/19 8:47 PM, coleen.phillimore at oracle.com wrote: >> Thanks for the explanation. So, something like this would do? >> ?? http://cr.openjdk.java.net/~shade/8216302/webrev.03/ > > Yes, this looks better.? I can't imagine why it would be measurably slower, but beats faster with > low frequency bugs. Right. I think dealing with handles has non-zero cost. Targeted benchmarks are really sensitive to most costs. Pushed the patch to jdk-submit. > http://cr.openjdk.java.net/~shade/8216302/webrev.03/test/hotspot/jtreg/runtime/StackTrace/StackTraceClassCache.java.html > > I have heard that the copyright should also have the Oracle line on it. There are plenty of Red Hat-added files with only Red Hat copyrights; this is the guidance I remember having. StackTraceClassCache follows it. 
-Aleksey From david.holmes at oracle.com Tue Jan 8 22:18:58 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 9 Jan 2019 08:18:58 +1000 Subject: RFR (S) 8215575: C2 crash: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded In-Reply-To: References: <76a2edc1-7834-33d2-3876-7b52bd95bb37@oracle.com> Message-ID: <0edf4fed-fa0d-d1ea-1e17-91765ab087b3@oracle.com> Hi Coleen, On 9/01/2019 2:08 am, coleen.phillimore at oracle.com wrote: > > Hi David, > > My original version had a storestore but Erik convinced me that is > unneeded since the subklass and sibling lists are what are read > concurrently and were the fields that needed the ordering, not > necessarily this one.??? If we backport this to 11, we have to add > barriers to _subklass and _next_sibling like Erik has added. > > Does the rest of the change look good? Yes. Thanks, David > Thanks, > Coleen > > On 1/7/19 8:49 PM, David Holmes wrote: >> Hi Coleen, >> >> On 8/01/2019 5:50 am, coleen.phillimore at oracle.com wrote: >>> Summary: Set InstanceKlass::loaded before adding classes to the >>> subklass list, which can be read concurrently by the compiler. >> >> I think you need a storestore barrier to ensure the new order is >> preserved. >> >> Cheers, >> David >> >>> Thanks to Erik for the diagnosis and suggested fix.? See bug comments >>> for more details. >>> >>> Tested with hs-tier1-3, 6 and 8. 
>>>
>>> open webrev at http://cr.openjdk.java.net/~coleenp/8215575.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8215575
>>>
>>> Thanks,
>>> Coleen
>

From david.holmes at oracle.com Tue Jan 8 22:35:22 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 9 Jan 2019 08:35:22 +1000
Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name
In-Reply-To:
References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com> <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com> <1366d544-baca-e9db-b59c-0daf2b054464@oracle.com>
Message-ID:

On 9/01/2019 1:51 am, Aleksey Shipilev wrote:
> On 1/8/19 12:27 PM, David Holmes wrote:
>> On 8/01/2019 7:56 pm, Aleksey Shipilev wrote:
>>> On 1/8/19 3:15 AM, David Holmes wrote:
>>>> It seems somewhat awkward to me to have two different code paths for initializing the
>>>> java.lang.Class name field. Can this be restructured a little more (change Class.getName()) so that
>>>> the VM always initializes "name" and then JVM_GetClassName could just call java_lang_Class::name,
>>>> rather than duplicating the logic?
>>>
>>> Mmm. I am afraid to do this eagerly because of more memory footprint and potential bootstrapping
>>> issues.
>>
>> I said nothing about doing this eagerly. All I'm suggesting is that instead of getName() doing:
>>
>> if (this.name == null)
>> getName0(); // calls JVM_GetClassName
>> return this.name;
>>
>> it just does:
>>
>> if (this.name == null)
>>   getName0(); // calls JVM_GetClassName
>> return this.name;
>
> Oh. I had this option on the table when doing the patch, but I disregarded it as dirty hack, because
> calling the getter for side effects *only* is awkward. Handling concurrency on Java side also looks
> simpler. There is a benign race on Class.name, and to remain benign, it should _not_ do the second
> read. In other words, null-checking this.name, and then returning the second read of this.name is not
> entirely correct.
>
> The original code is checking and returning the local, not the second heap read:
>
>   public String getName() {
>       String name = this.name;
>       if (name == null)
>           this.name = name = getName0();
>       return name;
>   }
>
> This, I think, leaves us with taking the return value from getName0.

Sure, introduce a local as needed.

>
> If we wanted to handle JVM_GetClassName better, we could consider calling java_lang_Class::name from
> JVM_GetClassName, thus going through the cached path. That would do the two stores, though: one
> through javaClasses, and another on Java side. Seeing how there is only a single use of
> JVM_GetClassName, and that use is already-cached Class.getName0, I see no reason to do the excessive
> write.

What I don't like is the fact that there is an illusion that the name field is only set by getName() when in fact it can also be set by the VM. So you already have a very hidden side-effect. If JVM_GetClassName is used to call java_lang_Class::name and set the name field then at least you can document the side-effect.

David

> -Aleksey
>

From david.holmes at oracle.com Tue Jan 8 22:40:28 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 9 Jan 2019 08:40:28 +1000
Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name
In-Reply-To:
References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com> <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com> <8d41a444-4b1b-f176-9cd0-374649e1c270@redhat.com>
Message-ID:

On 9/01/2019 5:54 am, coleen.phillimore at oracle.com wrote:
> On 1/8/19 1:55 PM, Aleksey Shipilev wrote:
>> On 1/8/19 5:48 PM, coleen.phillimore at oracle.com wrote:
>>> I agree we shouldn't initialize the name field eagerly when creating
>>> a class.  I also looked at this
>>> code path:
>>>
>>>      public String getName() {
>>>          String name = this.name;
>>>          if (name == null)
>>>              this.name = name = getName0();
>>>          return name;
>>>      
} >>> >>> It looks like when we call JVM_GetClassName, we're initializing the >>> Class.name field by the caller. >>> >>> Maybe could rewrite the java/lang/Class version to be: >>> >>> ???? public String getName() { >>> ???????? String name = this.name; >>> ???????? if (name == null) >>> ???????????? name = getName0();? // this initializes this.name yuck. >>> ???????? return name; >>> ???? } >>> >>> and have JVM_GetClassName call java_lang_Class::name() to do the >>> initialization.?? Seems not worth >>> it just to avoid duplicating these lines in both >>> java_lang_Class::name() and JVM_GetClassName. >> Right. I also think it does not worth it. My reason is given in >> another reply in this thread: >> >> http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-January/036152.html >> > > Right, saw that.? I agreed. I really don't like the fact the VM is now setting the name field and there's nothing in the Java code to give any indication that this is happening. At a minimum a comment should be added, as is done with other class members that get accessed directly by the VM. I also think core-libs folk should be having a say here. David ----- > Coleen >> >> I'll handelize the oop and keep the rest as is. >> >> -Aleksey >> >> >> >> > From david.holmes at oracle.com Tue Jan 8 23:11:42 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 9 Jan 2019 09:11:42 +1000 Subject: Error placementdelmatch when compiling OpenJDK 11 with Oracle Developer Studio In-Reply-To: <20190108-152754.13hbpbdro-2lai@mailcc08> References: <20190108-152754.13hbpbdro-2lai@mailcc08> Message-ID: <6bed0312-efa1-1fff-dc47-fa7994089304@oracle.com> Hi Michael, I've bcc'd jdk-dev and am moving this to hotspot-dev. The official Solaris compiler for OpenJDK is still SS12u4 so we aren't seeing this issue exposed by SS12u5. I think JDK-8164651 may have been closed in error thinking it was the same issue as JDK-8196880, but I think they are different. 
David ----- On 9/01/2019 12:27 am, Michael Kebe wrote: > Hi, > > I am trying to compile OpenJDK 11 from the hg source (http://hg.openjdk.java.net/jdk-updates/jdk11u), but I get this error: > > "..../jdk11u/src/hotspot/share/adlc/arena.cpp", line 60: Error, placementdelmatch: Placement operator new refers to non-placement operator delete. > "..../jdk11u/src/hotspot/share/adlc/arena.cpp", line 67: Error, placementdelmatch: Placement operator new refers to non-placement operator delete. > "..../jdk11u/src/hotspot/share/adlc/arena.cpp", line 97: Error, placementdelmatch: Placement operator new refers to non-placement operator delete. > > I found this issue https://bugs.openjdk.java.net/browse/JDK-8164651. > It says in a comment, that is fixed, but I used the bleeding edge from the mercurial repository. > > Is the support for Solaris dropped? > > Additional info from the configure script: > > A new configuration has been successfully created in > /..../jdk11u/build/solaris-sparcv9-normal-server-release > using configure arguments '--with-boot-jdk=../jdk-11.0.1'. 
>
> Configuration summary:
> * Debug level: release
> * HS debug level: product
> * JVM variants: server
> * JVM features: server: 'cds cmsgc compiler1 compiler2 dtrace epsilongc g1gc jfr jni-check jvmci jvmti management nmt parallelgc serialgc services vm-structs'
> * OpenJDK target: OS: solaris, CPU architecture: sparc, address length: 64
> * Version string: 11.0.1-internal+0-adhoc.sysa.jdk11u (11.0.1-internal)
>
> Tools summary:
> * Boot JDK: java version "11.0.1" 2018-10-16 LTS Java(TM) SE Runtime Environment 18.9 (build 11.0.1+13-LTS) Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.1+13-LTS, mixed mode) (at /..../jdk-11.0.1)
> * Toolchain: solstudio (Oracle Solaris Studio)
> * C Compiler: Version 5.14 (at /opt/developerstudio12.5/bin/cc)
> * C++ Compiler: Version 5.14 (at /opt/developerstudio12.5/bin/CC)
>
> Build performance summary:
> * Cores to use: 16
> * Memory limit: 20480 MB
>
>
> Michael
>
>
>
> Hüttenwerke Krupp Mannesmann GmbH, Ehinger Str. 200, D-47259 Duisburg
> Geschäftsführung: Dr. Herbert Eichelkraut, Dr. Gerhard Erdmann, Carsten Laakmann
> Vorsitzender des Aufsichtsrats: Prof. Dr.-Ing. 
Heinz Jörg Fuhrmann
> Sitz der Gesellschaft: Duisburg
> Eintragung im Handelsregister: Amtsgericht Duisburg HRB 4716
> http://www.hkm.de
>

From coleen.phillimore at oracle.com Tue Jan 8 23:17:00 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 8 Jan 2019 18:17:00 -0500
Subject: RFR (S) 8215575: C2 crash: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded
In-Reply-To: <0edf4fed-fa0d-d1ea-1e17-91765ab087b3@oracle.com>
References: <76a2edc1-7834-33d2-3876-7b52bd95bb37@oracle.com> <0edf4fed-fa0d-d1ea-1e17-91765ab087b3@oracle.com>
Message-ID: <6694bbcc-cd7e-57f1-32ea-0c042af7946c@oracle.com>

On 1/8/19 5:18 PM, David Holmes wrote:
> Hi Coleen,
>
> On 9/01/2019 2:08 am, coleen.phillimore at oracle.com wrote:
>>
>> Hi David,
>>
>> My original version had a storestore but Erik convinced me that is
>> unneeded since the subklass and sibling lists are what are read
>> concurrently and were the fields that needed the ordering, not
>> necessarily this one.  If we backport this to 11, we have to add
>> barriers to _subklass and _next_sibling like Erik has added.
>>
>> Does the rest of the change look good?
>
> Yes.

Thanks!
Coleen

>
> Thanks,
> David
>
>> Thanks,
>> Coleen
>>
>> On 1/7/19 8:49 PM, David Holmes wrote:
>>> Hi Coleen,
>>>
>>> On 8/01/2019 5:50 am, coleen.phillimore at oracle.com wrote:
>>>> Summary: Set InstanceKlass::loaded before adding classes to the
>>>> subklass list, which can be read concurrently by the compiler.
>>>
>>> I think you need a storestore barrier to ensure the new order is
>>> preserved.
>>>
>>> Cheers,
>>> David
>>>
>>>> Thanks to Erik for the diagnosis and suggested fix.  See bug
>>>> comments for more details.
>>>>
>>>> Tested with hs-tier1-3, 6 and 8.
>>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8215575.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8215575 >>>> >>>> Thanks, >>>> Coleen >> From mikael.vidstedt at oracle.com Tue Jan 8 23:15:52 2019 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Tue, 8 Jan 2019 15:15:52 -0800 Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name In-Reply-To: References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com> <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com> <8d41a444-4b1b-f176-9cd0-374649e1c270@redhat.com> Message-ID: Perhaps getName0 and JVM_GetClassName should be renamed to reflect that they actually are mutating state? Cheers, Mikael > On Jan 8, 2019, at 2:40 PM, David Holmes wrote: > > On 9/01/2019 5:54 am, coleen.phillimore at oracle.com wrote: >> On 1/8/19 1:55 PM, Aleksey Shipilev wrote: >>> On 1/8/19 5:48 PM, coleen.phillimore at oracle.com wrote: >>>> I agree we shouldn't initialize the name field eagerly when creating a class. I also looked at this >>>> code path: >>>> >>>> public String getName() { >>>> String name = this.name; >>>> if (name == null) >>>> this.name = name = getName0(); >>>> return name; >>>> } >>>> >>>> It looks like when we call JVM_GetClassName, we're initializing the Class.name field by the caller. >>>> >>>> Maybe could rewrite the java/lang/Class version to be: >>>> >>>> public String getName() { >>>> String name = this.name; >>>> if (name == null) >>>> name = getName0(); // this initializes this.name yuck. >>>> return name; >>>> } >>>> >>>> and have JVM_GetClassName call java_lang_Class::name() to do the initialization. Seems not worth >>>> it just to avoid duplicating these lines in both java_lang_Class::name() and JVM_GetClassName. >>> Right. I also think it does not worth it. My reason is given in another reply in this thread: >>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-January/036152.html >> Right, saw that. I agreed. 
> > I really don't like the fact the VM is now setting the name field and there's nothing in the Java code to give any indication that this is happening. At a minimum a comment should be added, as is done with other class members that get accessed directly by the VM. > > I also think core-libs folk should be having a say here. > > David > ----- > >> Coleen >>> >>> I'll handelize the oop and keep the rest as is. >>> >>> -Aleksey From mandy.chung at oracle.com Tue Jan 8 23:24:01 2019 From: mandy.chung at oracle.com (Mandy Chung) Date: Tue, 8 Jan 2019 15:24:01 -0800 Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name In-Reply-To: References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com> <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com> <8d41a444-4b1b-f176-9cd0-374649e1c270@redhat.com> Message-ID: <6a039665-2f22-8651-5276-3f9dddbbc481@oracle.com> On 1/8/19 2:40 PM, David Holmes wrote: > On 9/01/2019 5:54 am, coleen.phillimore at oracle.com wrote: >> On 1/8/19 1:55 PM, Aleksey Shipilev wrote: >>> On 1/8/19 5:48 PM, coleen.phillimore at oracle.com wrote: >>>> I agree we shouldn't initialize the name field eagerly when >>>> creating a class.? I also looked at this >>>> code path: >>>> >>>> ???? public String getName() { >>>> ???????? String name = this.name; >>>> ???????? if (name == null) >>>> ???????????? this.name = name = getName0(); >>>> ???????? return name; >>>> ???? } >>>> >>>> It looks like when we call JVM_GetClassName, we're initializing the >>>> Class.name field by the caller. >>>> >>>> Maybe could rewrite the java/lang/Class version to be: >>>> >>>> ???? public String getName() { >>>> ???????? String name = this.name; >>>> ???????? if (name == null) >>>> ???????????? name = getName0();? // this initializes this.name yuck. >>>> ???????? return name; >>>> ???? } >>>> >>>> and have JVM_GetClassName call java_lang_Class::name() to do the >>>> initialization.?? 
Seems not worth >>>> it just to avoid duplicating these lines in both >>>> java_lang_Class::name() and JVM_GetClassName. >>> Right. I also think it does not worth it. My reason is given in >>> another reply in this thread: >>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-January/036152.html >>> >> >> Right, saw that.? I agreed. > > I really don't like the fact the VM is now setting the name field and > there's nothing in the Java code to give any indication that this is > happening. At a minimum a comment should be added, as is done with > other class members that get accessed directly by the VM. > > I also think core-libs folk should be having a say here. Catching up on this thread... Two ways setting the Class::name field isn't pleasant.? What about: public String getName() { ?? String name = this.name; ?? return name != null ? name : initClassName(); } where JVM_InitClassName will call java_lang_Class::name(). Mandy From coleen.phillimore at oracle.com Tue Jan 8 23:21:47 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 8 Jan 2019 18:21:47 -0500 Subject: RFR (S) 8215575: C2 crash: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded In-Reply-To: References: <76a2edc1-7834-33d2-3876-7b52bd95bb37@oracle.com> <73969a98-1e0d-f5fa-14b3-adf87ee3b933@oracle.com> <9ca5c50b-48a2-5883-3108-a34724d1fada@oracle.com> <48cc306d-e9b5-eae3-457b-114e8a9f4947@oracle.com> Message-ID: On 1/8/19 9:14 AM, Erik ?sterlund wrote: > Hi David, > > On 2019-01-08 14:15, David Holmes wrote: >> On 8/01/2019 11:08 pm, Erik ?sterlund wrote: >>> Hi David, >>> >>> On 2019-01-08 13:58, David Holmes wrote: >>>> On 8/01/2019 9:59 pm, Erik ?sterlund wrote: >>>>> Hi David, >>>>> >>>>> The required synchronization is that the _subklass link is >>>>> read/written with at least acquire/release semantics, >>>>> correspondingly. And now they are. 
(when appending, the link gets >>>>> written with a conservative CAS, and the link is loaded with >>>>> load_acquire). >>>> >>>> Okay. I took a look inside append_to_sibling_list and see there is >>>> lots of ordering control in there. >>>> >>>> Aside: why do you need an Atomic::store in set_next_sibling ?? >>> >>> Because it is read concurrently. Despite being read concurrently, >>> they do not need load_acquire, because the entries read are strictly >>> older than the entry you call it on, due to prepending in the list. >>> So therefore, the acquire of the _subklass link protects both the >>> _subklass and all _siblings it has. But I still want the atomicity >>> to avoid e.g. word tearing. Now you won't get word tearing anyway >>> because compilers are nice to us, but by annotating it as Atomic, we >>> can eventually get better guarantees about that once we plug in >>> Atomic to C++11 atomics. >> >> Okay it's late here and I'm tired, but this seems excessively >> conservative. Atomic::store should only be needed for 64-bit >> non-pointer types (in case of 32-bit system) or an unaligned access >> that isn't guaranteed atomic by the platform. Otherwise we'd need >> Atomic::store and Atomic::load all over the place! > > Right. The compiler is nice enough to give us atomic accesses as you > mention. But since C++11 (which we are planning to upgrade past > soonish I think), the standard is explicit about stating you can't > assume you will not get word tearing on volatile accesses, even on > naturally aligned word sized primitives. Only with native C++ atomics > do you get those guarantees. Personally I think that is a really evil > compatibility issue. I guess in practice compilers continue working > fine anyway, because there is no good reason to break code just for > the fun of it. This is discomforting, unless the compiler warns you about your volatile accesses without atomics. 
Coleen > > Anyway, this is why in code I write, I try to use Atomic::load/store > as good practice when I need atomicity, to reduce headache later on > when we reroute Atomic::load/store to std::atomic. > > /Erik > >> David >> >>> Thanks, >>> /Erik >>> >>>> Thanks, >>>> David >>>> >>>>> Thanks, >>>>> /Erik >>>>> >>>>> On 2019-01-08 02:49, David Holmes wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> On 8/01/2019 5:50 am, coleen.phillimore at oracle.com wrote: >>>>>>> Summary: Set InstanceKlass::loaded before adding classes to the >>>>>>> subklass list, which can be read concurrently by the compiler. >>>>>> >>>>>> I think you need a storestore barrier to ensure the new order is >>>>>> preserved. >>>>>> >>>>>> Cheers, >>>>>> David >>>>>> >>>>>>> Thanks to Erik for the diagnosis and suggested fix.? See bug >>>>>>> comments for more details. >>>>>>> >>>>>>> Tested with hs-tier1-3, 6 and 8. >>>>>>> >>>>>>> open webrev at >>>>>>> http://cr.openjdk.java.net/~coleenp/8215575.01/webrev >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8215575 >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>> >>> > From shade at redhat.com Wed Jan 9 00:08:10 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 9 Jan 2019 01:08:10 +0100 Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name In-Reply-To: <6a039665-2f22-8651-5276-3f9dddbbc481@oracle.com> References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com> <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com> <8d41a444-4b1b-f176-9cd0-374649e1c270@redhat.com> <6a039665-2f22-8651-5276-3f9dddbbc481@oracle.com> Message-ID: <2addf665-d49f-563b-649e-a88c1139de56@redhat.com> On 1/9/19 12:24 AM, Mandy Chung wrote: >> I really don't like the fact the VM is now setting the name field and there's nothing in the Java >> code to give any indication that this is happening. At a minimum a comment should be added, as is >> done with other class members that get accessed directly by the VM. 
>>
>> I also think core-libs folk should be having a say here.
>
> Catching up on this thread...
>
> Two ways setting the Class::name field isn't pleasant.  What about:
>
> public String getName() {
>     String name = this.name;
>     return name != null ? name : initClassName();
> }
>
> where JVM_InitClassName will call java_lang_Class::name().

Mmm. Should we really change the jvm.h here? Does that involve CSR? It would have ripple effects on
Graal, potential backports (You'd want this in 11, right? I would. This is a visible perf regression
since 8), etc. Note that after handelizing java_lang_Class::name, we cannot simply call it from
JVM_GetClassName.

Do we really think this is worth the hassle like this:
  http://cr.openjdk.java.net/~shade/8216302/webrev.XX/

I'd rather prefer to document this:

diff -r 7c99f0c51412 src/java.base/share/classes/java/lang/Class.java
--- a/src/java.base/share/classes/java/lang/Class.java  Tue Jan 08 20:27:23 2019 +0100
+++ b/src/java.base/share/classes/java/lang/Class.java  Wed Jan 09 00:20:24 2019 +0100
@@ -801,5 +801,6 @@
     }

-    // cache the name to reduce the number of calls into the VM
+    // Cache the name to reduce the number of calls into the VM.
+    // This field can be set by VM itself without the call to getName0.
     private transient String name;
     private native String getName0();

...accept that cache can be set on different paths, and move on.

-Aleksey

From david.holmes at oracle.com  Wed Jan  9 00:59:17 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 9 Jan 2019 10:59:17 +1000
Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name
In-Reply-To: <2addf665-d49f-563b-649e-a88c1139de56@redhat.com>
References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com>
 <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com>
 <8d41a444-4b1b-f176-9cd0-374649e1c270@redhat.com>
 <6a039665-2f22-8651-5276-3f9dddbbc481@oracle.com>
 <2addf665-d49f-563b-649e-a88c1139de56@redhat.com>
Message-ID: <1ae9e923-55b5-c006-e7a2-0603faf365d5@oracle.com>

On 9/01/2019 10:08 am, Aleksey Shipilev wrote:
> On 1/9/19 12:24 AM, Mandy Chung wrote:
>>> I really don't like the fact the VM is now setting the name field and there's nothing in the Java
>>> code to give any indication that this is happening. At a minimum a comment should be added, as is
>>> done with other class members that get accessed directly by the VM.
>>>
>>> I also think core-libs folk should be having a say here.
>>
>> Catching up on this thread...
>>
>> Two ways setting the Class::name field isn't pleasant.  What about:
>>
>> public String getName() {
>>     String name = this.name;
>>     return name != null ? name : initClassName();
>> }
>>
>> where JVM_InitClassName will call java_lang_Class::name().
>
> Mmm. Should we really change the jvm.h here? Does that involve CSR?

jvm.h is a private interface between the OpenJDK core libraries and the
OpenJDK VM (Hotspot), and does not require a CSR request when changed.

> It would have ripple effects on
> Graal, potential backports (You'd want this in 11, right? I would. This is a visible perf regression
> since 8), etc. Note that after handelizing java_lang_Class::name, we cannot simply call it from
> JVM_GetClassName.
>
> Do we really think this is worth the hassle like this:
>   http://cr.openjdk.java.net/~shade/8216302/webrev.XX/

That version works for me.

> I'd rather prefer to document this:
>
> diff -r 7c99f0c51412 src/java.base/share/classes/java/lang/Class.java
> --- a/src/java.base/share/classes/java/lang/Class.java  Tue Jan 08 20:27:23 2019 +0100
> +++ b/src/java.base/share/classes/java/lang/Class.java  Wed Jan 09 00:20:24 2019 +0100
> @@ -801,5 +801,6 @@
>      }
>
> -    // cache the name to reduce the number of calls into the VM
> +    // Cache the name to reduce the number of calls into the VM.
> +    // This field can be set by VM itself without the call to getName0.
>      private transient String name;
>      private native String getName0();
>
> ...accept that cache can be set on different paths, and move on.

That would meet my minimum level for acceptance.

Cheers,
David

> -Aleksey
>

From claes.redestad at oracle.com  Wed Jan  9 01:13:43 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Wed, 9 Jan 2019 02:13:43 +0100
Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name
In-Reply-To: <2addf665-d49f-563b-649e-a88c1139de56@redhat.com>
References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com>
 <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com>
 <8d41a444-4b1b-f176-9cd0-374649e1c270@redhat.com>
 <6a039665-2f22-8651-5276-3f9dddbbc481@oracle.com>
 <2addf665-d49f-563b-649e-a88c1139de56@redhat.com>
Message-ID: <2f384e36-e406-af93-ff4c-5fc54b391ff9@oracle.com>

On 2019-01-09 01:08, Aleksey Shipilev wrote:
> You'd want this in 11, right? I would

+1

/Claes

From mandy.chung at oracle.com  Wed Jan  9 01:12:57 2019
From: mandy.chung at oracle.com (Mandy Chung)
Date: Tue, 8 Jan 2019 17:12:57 -0800
Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name
In-Reply-To: <1ae9e923-55b5-c006-e7a2-0603faf365d5@oracle.com>
References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com>
 <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com>
 <8d41a444-4b1b-f176-9cd0-374649e1c270@redhat.com>
 <6a039665-2f22-8651-5276-3f9dddbbc481@oracle.com>
 <2addf665-d49f-563b-649e-a88c1139de56@redhat.com>
 <1ae9e923-55b5-c006-e7a2-0603faf365d5@oracle.com>
Message-ID: <7f2d1db1-7387-0a09-e869-ec3ebce68c42@oracle.com>

On 1/8/19 4:59 PM, David Holmes wrote:
> On 9/01/2019 10:08 am, Aleksey Shipilev wrote:
>> On 1/9/19 12:24 AM, Mandy Chung wrote:
>>>> I really don't like the fact the VM is now setting the name field
>>>> and there's nothing in the Java
>>>> code to give any indication that this is happening. At a minimum a
>>>> comment should be added, as is
>>>> done with other class members that get accessed directly by the VM.
>>>>
>>>> I also think core-libs folk should be having a say here.
>>>
>>> Catching up on this thread...
>>>
>>> Two ways setting the Class::name field isn't pleasant.  What about:
>>>
>>> public String getName() {
>>>     String name = this.name;
>>>     return name != null ? name : initClassName();
>>> }
>>>
>>> where JVM_InitClassName will call java_lang_Class::name().
>>
>> Mmm. Should we really change the jvm.h here? Does that involve CSR?
>
> jvm.h is a private interface between the OpenJDK core libraries and
> the OpenJDK VM (Hotspot), and does not require a CSR request when
> changed.

Yup.  CSR is not required for changes in this private interface.

>
>> It would have ripple effects on
>> Graal, potential backports (You'd want this in 11, right? I would.
>> This is a visible perf regression
>> since 8), etc.
>> Note that after handelizing java_lang_Class::name, we
>> cannot simply call it from
>> JVM_GetClassName.
>>
>> Do we really think this is worth the hassle like this:
>>    http://cr.openjdk.java.net/~shade/8216302/webrev.XX/
>
> That version works for me.

Thanks for making this change.  I prefer this version which makes the
code very clear what it does.

Thanks
Mandy

From david.holmes at oracle.com  Wed Jan  9 03:40:42 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 9 Jan 2019 13:40:42 +1000
Subject: RFR(m): 8214271: Fast primitive to wake many threads
In-Reply-To: <755aaf5b-8a49-ef5a-65ce-18550547a91b@oracle.com>
References: <010211e3-93a6-80b9-678c-c84b08812e43@oracle.com>
 <70669453-e317-a30d-8d5a-e5b938b83c41@oracle.com>
 <4fb6cd22-cdd0-2419-c863-24b250ac0b16@oracle.com>
 <2a2679cc-b0e0-f8d0-7336-8666e1a42950@oracle.com>
 <01873a0f-a0fb-18b9-f7d4-98bb638e9b57@oracle.com>
 <41f5252b-3eb9-9a9e-70e5-49f6d8f9d670@oracle.com>
 <755aaf5b-8a49-ef5a-65ce-18550547a91b@oracle.com>
Message-ID:

Hi Robbin,

No further significant comments, lets just see how this plays out.

Some minor nits:

src/hotspot/share/utilities/waitBarrier.hpp

!  // A primary goal of the WaitBarrier implementation is to disarm all waiting

s/disarm/wake/

That was one place a global replace shouldn't have been applied. :)

!  // - Calling disarm() guarantees any thread calling or called wait(tag) will

"or called" is not grammatically correct. Perhaps:

// - Calling disarm() guarantees any thread now calling or that has called wait(tag) will

// Guarantees any thread called wait() will be awake when it returns.

s/called/that called/

---

src/hotspot/share/utilities/waitBarrier_generic.cpp

!  // disarm store must not float below.

s/float/sink/

74   // API specifies wake() must provides a trailing fence.

s/wake/disarm/
s/provides/provide/

81   // API specifies wait() must provides a trailing fence.

s/provides/provide/

Thanks,
David

On 8/01/2019 8:42 pm, Robbin Ehn wrote:
> Hi David,
>
> On 1/2/19 12:35 AM, David Holmes wrote:
>>>> Further this sounds like a race that could lead to bugs if not used
>>>> very carefully ie. you can't assume between disarm() and wake() that
>>>> all threads are blocked.
>>>
>>> I didn't realize how subtle this is. I think your original comment that
>>> disarm/wake should be one operation was spot on.
>>> Investigating... thinking... testing... yes I think this will work,
>>> fixed!
>>> Sorry for not looking more into this before.
>>
>> I'm now curious how this will actually work in the context of the
>> safepoint changes?
>
> Since code already handle this 'invariant' with threads not being block
> between disarm() and wake(), doing it one operation just very slightly
> increases the chance that a thread will be blocked when we actually can
> handle it to be running, but reduces the chance to hit a false positive
> TLH poll.
> (with TLH we have a two-step un-synchronizing out of safepoints where we
> must change global safepoint state before changing the thread polling
> state)
>
> (I have some thoughts on simplifying TLH/safepoint states)
>
>> Nit: I would have kept disarm() rather than wake() as I like the
>> arm/disarm duality.
>
> Yes, me too. Not sure why I did the opposite, fixed!
>
>>
>>    void GenericWaitBarrier::wait(int barrier_tag) {
>>      assert(barrier_tag != 0, "Trying to wait on disarmed value");
>> +   if (barrier_tag == 0 && barrier_tag != _barrier_tag) {
>> +     OrderAccess::fence();
>> +     return;
>> +   }
>>
>> I don't understand what the above is doing. A barrier_tag of 0 is a
>> programming error caught during testing in debug builds. You don't
>> need to account for it being 0 in product because this isn't something
>> that can come in from an external source - we have full code control
>> here. And even if you want to be this paranoid why would you need the
>> fence?
> > Fixed, but kept the fence, since we say we are providing a trailing fence. > Otherwise I would like to add that exception to the description of wait(). > > Including Dan's comments: > Full: http://cr.openjdk.java.net/~rehn/8214271/6/full/webrev/ > Inc : http://cr.openjdk.java.net/~rehn/8214271/6/inc/webrev/ > > > Thanks, Robbin > >> >> Thanks, >> David >> ----- >> >>> Full: >>> http://cr.openjdk.java.net/~rehn/8214271/5/full/webrev/ >>> >>> gtest passes thousands of loops locally and hundreds in mach5. >>> >>> Thanks, Robbin >>> >>>> >>>> Thanks, >>>> David >>>> >>>>>> >>>>>> s/Implementation/Implementations/ >>>>> >>>>> Fixed >>>>> >>>>>> >>>>>> The fourth line is no longer needed. >>>>> >>>>> Above is the reason I would like to keep the fourth line, since >>>>> only if you call >>>>> both disarm() and wake() you have that guarantee that waiter >>>>> threads will >>>>> return. >>>>> >>>>> Thanks, Robbin >>>>> >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> >>>>>>> Inc: >>>>>>> http://cr.openjdk.java.net/~rehn/8214271/4/inc/webrev/ >>>>>>> >>>>>>> Full: >>>>>>> http://cr.openjdk.java.net/~rehn/8214271/4/full/webrev/ >>>>>>> >>>>>>> /Robbin >>>>>>> >>>>>>>> >>>>>>>> Otherwise this all looks good! >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> ----- >>>>>>>> >>>>>>>> >>>>>>>>> Full: >>>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/3/full/webrev/ >>>>>>>>> >>>>>>>>> Thanks, Robbin >>>>>>>>> >>>>>>>>> On 11/23/18 5:55 PM, Robbin Ehn wrote: >>>>>>>>>> Forgot RFR in subject. >>>>>>>>>> >>>>>>>>>> /Robbin >>>>>>>>>> >>>>>>>>>> On 2018-11-23 17:51, Robbin Ehn wrote: >>>>>>>>>>> Hi all, please review. >>>>>>>>>>> >>>>>>>>>>> When a safepoint is ended we need a way to get back to 100% >>>>>>>>>>> utilization as fast >>>>>>>>>>> as possible. 100% utilization means no idle cpu in the system >>>>>>>>>>> if there is a >>>>>>>>>>> JavaThread that could be executed. The traditional ways to >>>>>>>>>>> wake many, e.g. 
>>>>>>>>>>> semaphore, pthread_cond, is not implemented with a single >>>>>>>>>>> syscall instead they >>>>>>>>>>> typical do one syscall per thread to wake. >>>>>>>>>>> >>>>>>>>>>> This change-set contains that primitive, the WaitBarrier, and >>>>>>>>>>> a gtest for it. >>>>>>>>>>> No actual users, which is in coming patches. >>>>>>>>>>> >>>>>>>>>>> The WaitBarrier solves by doing a cooperative semaphore >>>>>>>>>>> posting, threads woken >>>>>>>>>>> will also post. On Linux we can instead directly use a futex >>>>>>>>>>> and with one >>>>>>>>>>> syscall wake all. Depending on how many threads and cpus the >>>>>>>>>>> performance vary, >>>>>>>>>>> but a good utilization of the machine, just on the edge of >>>>>>>>>>> saturated, the time to reach 100% utilization is around 3 >>>>>>>>>>> times faster with the WaitBarrier (where futex is faster than >>>>>>>>>>> semaphore). >>>>>>>>>>> >>>>>>>>>>> Webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/webrev/ >>>>>>>>>>> >>>>>>>>>>> CR: >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8214271 >>>>>>>>>>> >>>>>>>>>>> Passes 100 iterations of gtest on our platforms, both >>>>>>>>>>> fastdebug and release. >>>>>>>>>>> And have been stable when used in safepoints (t1-8) (coming >>>>>>>>>>> patches). >>>>>>>>>>> >>>>>>>>>>> Thanks, Robbin From david.holmes at oracle.com Wed Jan 9 07:31:41 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 9 Jan 2019 17:31:41 +1000 Subject: RFR(M): 8216265: [testbug] Introduce Platform.sharedLibraryPathVariableName() and adapt all tests. In-Reply-To: <9349eed214ce46ee81868840c0dbd54d@sap.com> References: <9349eed214ce46ee81868840c0dbd54d@sap.com> Message-ID: <6277c580-0397-d4ab-7b03-7721544048ff@oracle.com> Hi Goetz, Overall this looks okay to me. A few nits below ... On 7/01/2019 11:07 pm, Lindenmaier, Goetz wrote: > Hi, > > Different operating systems use different names for the environment variable > that contains the search paths for native libraries. 
This path is used in > a row of tests. A switch over all OSes is needed to find out the proper variable > name in each test using it. > > This change introduces a central function > Platform.sharedLibraryPathVariableName() > that returns "LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH", "PATH" or > "LIBPATH" depending on the current OS. > This change also adapts all usages of these variables in the tests to call > this function. > Because of the change to KDC.java I had to add @library /test/lib > to much more tests than where I had to do the underlying change. Ouch! that was unpleasant. :( > The change also replaces local checking for path separators by > File.pathSeparator in jdk/com/sun/jdi/PrivateTransportTest.java. > > The change depends on "8215975: [testbug] Adapt nsk tests to > the PPC, S390 and AIX platforms." which will be moved from jdk12 > to jdk soon. > > Please review: > http://cr.openjdk.java.net/~goetz/wr19/8216265-PathVar/01/ test/hotspot/jtreg/gtest/GTestWrapper.java 75 env.put(pathVar, path + ":" + ldLibraryPath); Shouldn't ":" be File.pathSeparator? --- test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/InheritedChannelTest.java Copyright year needs updating. --- test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/InheritedChannelTest.java 70 private static final Path pathEnvVar The variable isn't an env var, it's just a path - I suggest libraryPath. 101 System.out.println(Platform.sharedLibraryPathVariableName() + "=" + pathEnvVar); ... 114 env.put(Platform.sharedLibraryPathVariableName(), pathEnvVar.toString()); I suggest storing the name in a local to avoid the second call. --- test/jdk/tools/launcher/JliLaunchTest.java 57 env.compute(pathEnvVar, (k, v) -> (v == null) ? libdir : libdir + ":" + v); Shouldn't ":" be File.pathSeparator? --- test/jdk/tools/launcher/Test7029048.java 39 import jdk.test.lib.Platform; Why do you need this? --- test/jdk/vm/JniInvocationTest.java This is a Mac only test so no changes needed. 
--- test/lib/jdk/test/lib/Platform.java The javadoc comments is unnecessary as we don't generate javadoc here. I see you copied the preceding sharedLibraryExt() style. The @return is superfluous. Thanks, David > Best regards, > Goetz. > From matthias.baesken at sap.com Wed Jan 9 07:51:01 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Wed, 9 Jan 2019 07:51:01 +0000 Subject: Error placementdelmatch when compiling OpenJDK 11 with Oracle Developer Studio In-Reply-To: <6bed0312-efa1-1fff-dc47-fa7994089304@oracle.com> References: <20190108-152754.13hbpbdro-2lai@mailcc08> <6bed0312-efa1-1fff-dc47-fa7994089304@oracle.com> Message-ID: Hello Michael, I've seen the arena.cpp compile errors you mentioned as well with SS12u6 , an update to the latest patch of SS12u6 resolved the issue . So you might want to try an update to the latest patch of SS12u5 (or to the latest patch of SS12u6 if this is possible for you) , this could remove the error. However our nightly builds still use SS12u4 . Best regards, Matthias > -----Original Message----- > From: jdk-dev On Behalf Of David > Holmes > Sent: Mittwoch, 9. Januar 2019 00:12 > To: Michael Kebe ; hotspot-dev developers > > Subject: Re: Error placementdelmatch when compiling OpenJDK 11 with > Oracle Developer Studio > > Hi Michael, > > I've bcc'd jdk-dev and am moving this to hotspot-dev. > > The official Solaris compiler for OpenJDK is still SS12u4 so we aren't > seeing this issue exposed by SS12u5. > > I think JDK-8164651 may have been closed in error thinking it was the > same issue as JDK-8196880, but I think they are different. > > David > ----- > > On 9/01/2019 12:27 am, Michael Kebe wrote: > > Hi, > > > > I am trying to compile OpenJDK 11 from the hg source > (http://hg.openjdk.java.net/jdk-updates/jdk11u), but I get this error: > > > > "..../jdk11u/src/hotspot/share/adlc/arena.cpp", line 60: Error, > placementdelmatch: Placement operator new refers to non-placement > operator delete. 
> > "..../jdk11u/src/hotspot/share/adlc/arena.cpp", line 67: Error, > placementdelmatch: Placement operator new refers to non-placement > operator delete. > > "..../jdk11u/src/hotspot/share/adlc/arena.cpp", line 97: Error, > placementdelmatch: Placement operator new refers to non-placement > operator delete. > > > > I found this issue https://bugs.openjdk.java.net/browse/JDK-8164651. > > It says in a comment, that is fixed, but I used the bleeding edge from the > mercurial repository. > > > > Is the support for Solaris dropped? > > > > Additional info from the configure script: > > > > A new configuration has been successfully created in > > /..../jdk11u/build/solaris-sparcv9-normal-server-release > > using configure arguments '--with-boot-jdk=../jdk-11.0.1'. > > > > Configuration summary: > > * Debug level: release > > * HS debug level: product > > * JVM variants: server > > * JVM features: server: 'cds cmsgc compiler1 compiler2 dtrace epsilongc > g1gc jfr jni-check jvmci jvmti management nmt parallelgc serialgc services > vm-structs' > > * OpenJDK target: OS: solaris, CPU architecture: sparc, address length: 64 > > * Version string: 11.0.1-internal+0-adhoc.sysa.jdk11u (11.0.1-internal) > > > > Tools summary: > > * Boot JDK: java version "11.0.1" 2018-10-16 LTS Java(TM) SE Runtime > Environment 18.9 (build 11.0.1+13-LTS) Java HotSpot(TM) 64-Bit Server VM > 18.9 (build 11.0.1+13-LTS, mixed mode) (at /..../jdk-11.0.1) > > * Toolchain: solstudio (Oracle Solaris Studio) > > * C Compiler: Version 5.14 (at /opt/developerstudio12.5/bin/cc) > > * C++ Compiler: Version 5.14 (at /opt/developerstudio12.5/bin/CC) > > > > Build performance summary: > > * Cores to use: 16 > > * Memory limit: 20480 MB > > > > > > Michael > > > > > > > > H?ttenwerke Krupp Mannesmann GmbH, Ehinger Str. 200, D-47259 > Duisburg > > Gesch?ftsf?hrung: Dr. Herbert Eichelkraut, Dr. Gerhard Erdmann, Carsten > Laakmann > > Vorsitzender des Aufsichtsrats: Prof. Dr.-Ing. 
> > Heinz Jörg Fuhrmann
> > Sitz der Gesellschaft: Duisburg
> > Eintragung im Handelsregister: Amtsgericht Duisburg HRB 4716
> > http://www.hkm.de
> >

From erik.osterlund at oracle.com  Wed Jan  9 09:51:25 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Wed, 9 Jan 2019 10:51:25 +0100
Subject: RFR (S) 8215575: C2 crash: assert(get_instanceKlass()->is_loaded()) failed: must be at least loaded
In-Reply-To:
References: <76a2edc1-7834-33d2-3876-7b52bd95bb37@oracle.com>
 <73969a98-1e0d-f5fa-14b3-adf87ee3b933@oracle.com>
 <9ca5c50b-48a2-5883-3108-a34724d1fada@oracle.com>
 <48cc306d-e9b5-eae3-457b-114e8a9f4947@oracle.com>
Message-ID: <0d0bb067-9d42-73d4-2fed-cc47aefe362a@oracle.com>

Hi Coleen,

On 2019-01-09 00:21, coleen.phillimore at oracle.com wrote:
>
>
> On 1/8/19 9:14 AM, Erik Österlund wrote:
>> Hi David,
>>
>> On 2019-01-08 14:15, David Holmes wrote:
>>> On 8/01/2019 11:08 pm, Erik Österlund wrote:
>>>> Hi David,
>>>>
>>>> On 2019-01-08 13:58, David Holmes wrote:
>>>>> On 8/01/2019 9:59 pm, Erik Österlund wrote:
>>>>>> Hi David,
>>>>>>
>>>>>> The required synchronization is that the _subklass link is
>>>>>> read/written with at least acquire/release semantics,
>>>>>> correspondingly. And now they are. (when appending, the link gets
>>>>>> written with a conservative CAS, and the link is loaded with
>>>>>> load_acquire).
>>>>>
>>>>> Okay. I took a look inside append_to_sibling_list and see there is
>>>>> lots of ordering control in there.
>>>>>
>>>>> Aside: why do you need an Atomic::store in set_next_sibling ??
>>>>
>>>> Because it is read concurrently. Despite being read concurrently,
>>>> they do not need load_acquire, because the entries read are
>>>> strictly older than the entry you call it on, due to prepending in
>>>> the list. So therefore, the acquire of the _subklass link protects
>>>> both the _subklass and all _siblings it has. But I still want the
>>>> atomicity to avoid e.g. word tearing.
>>>> Now you won't get word
>>>> tearing anyway because compilers are nice to us, but by annotating
>>>> it as Atomic, we can eventually get better guarantees about that
>>>> once we plug in Atomic to C++11 atomics.
>>>
>>> Okay it's late here and I'm tired, but this seems excessively
>>> conservative. Atomic::store should only be needed for 64-bit
>>> non-pointer types (in case of 32-bit system) or an unaligned access
>>> that isn't guaranteed atomic by the platform. Otherwise we'd need
>>> Atomic::store and Atomic::load all over the place!
>>
>> Right. The compiler is nice enough to give us atomic accesses as you
>> mention. But since C++11 (which we are planning to upgrade past
>> soonish I think), the standard is explicit about stating you can't
>> assume you will not get word tearing on volatile accesses, even on
>> naturally aligned word sized primitives. Only with native C++ atomics
>> do you get those guarantees. Personally I think that is a really evil
>> compatibility issue. I guess in practice compilers continue working
>> fine anyway, because there is no good reason to break code just for
>> the fun of it.
>
> This is discomforting, unless the compiler warns you about your
> volatile accesses without atomics.

That is discomforting indeed. The compiler does not warn about using
volatile, because volatile is perfectly well defined, just not for
concurrency at all. And the compiler doesn't know that you are using it
for that. So instead, this is treated like all other C++ undefined
behaviour: you have to just remember the whole standard in the back of
your head all the time when you are coding, max out on your coffee, and
figure out that things are undefined behaviour and we are really not
allowed to do it, despite compilers saying everything is fine. And it
will probably take us years to get rid of these assumptions. That's why
I'm already annotating the intentions in my code to make that easier.
Hopefully compilers give us a break until we get there.
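
[Editorial sketch, not part of the original message.] The pattern discussed in this thread can be illustrated with standard C++11 atomics: sibling links are written and read with relaxed atomics (atomicity only, so no word tearing), while the head link is published with a release CAS and read with an acquire load. The type Node and the helper names below are hypothetical stand-ins, not HotSpot's Klass or its Atomic:: API, and the sketch assumes a simple prepend-only list.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a Klass-like node; not the HotSpot type.
struct Node {
  std::atomic<Node*> subklass{nullptr};      // head of the child list
  std::atomic<Node*> next_sibling{nullptr};  // link to the next (older) child
};

// Relaxed store: guarantees atomicity (no tearing) but imposes no ordering.
// Ordering comes from the release CAS that later publishes the node.
inline void set_next_sibling(Node* n, Node* s) {
  n->next_sibling.store(s, std::memory_order_relaxed);
}

// Relaxed load: safe because every sibling reachable from a node is older
// than the node itself, and the node was obtained via an acquire load.
inline Node* next_sibling(const Node* n) {
  return n->next_sibling.load(std::memory_order_relaxed);
}

// Prepend child to parent's list, publishing it with a release CAS.
inline void prepend_subklass(Node* parent, Node* child) {
  assert(parent != nullptr && child != nullptr);
  Node* head = parent->subklass.load(std::memory_order_relaxed);
  do {
    // On CAS failure, head is refreshed, so the link is rewritten each retry.
    set_next_sibling(child, head);
  } while (!parent->subklass.compare_exchange_weak(
               head, child,
               std::memory_order_release,
               std::memory_order_relaxed));
}

// Acquire load pairs with the release CAS above.
inline Node* first_subklass(const Node* parent) {
  return parent->subklass.load(std::memory_order_acquire);
}
```

A reader would call first_subklass() once and then walk the list with next_sibling(); the single acquire on the head is what makes the relaxed sibling loads sufficient under the prepend-only assumption.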

/Erik

> Coleen
>>
>> Anyway, this is why in code I write, I try to use Atomic::load/store
>> as good practice when I need atomicity, to reduce headache later on
>> when we reroute Atomic::load/store to std::atomic.
>>
>> /Erik
>>
>>> David
>>>
>>>> Thanks,
>>>> /Erik
>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> Thanks,
>>>>>> /Erik
>>>>>>
>>>>>> On 2019-01-08 02:49, David Holmes wrote:
>>>>>>> Hi Coleen,
>>>>>>>
>>>>>>> On 8/01/2019 5:50 am, coleen.phillimore at oracle.com wrote:
>>>>>>>> Summary: Set InstanceKlass::loaded before adding classes to the
>>>>>>>> subklass list, which can be read concurrently by the compiler.
>>>>>>>
>>>>>>> I think you need a storestore barrier to ensure the new order is
>>>>>>> preserved.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> David
>>>>>>>
>>>>>>>> Thanks to Erik for the diagnosis and suggested fix.  See bug
>>>>>>>> comments for more details.
>>>>>>>>
>>>>>>>> Tested with hs-tier1-3, 6 and 8.
>>>>>>>>
>>>>>>>> open webrev at
>>>>>>>> http://cr.openjdk.java.net/~coleenp/8215575.01/webrev
>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8215575
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>
>>>>
>>
>

From goetz.lindenmaier at sap.com  Wed Jan  9 10:34:24 2019
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Wed, 9 Jan 2019 10:34:24 +0000
Subject: RFR(M): 8216265: [testbug] Introduce Platform.sharedLibraryPathVariableName() and adapt all tests.
In-Reply-To: <6277c580-0397-d4ab-7b03-7721544048ff@oracle.com>
References: <9349eed214ce46ee81868840c0dbd54d@sap.com>
 <6277c580-0397-d4ab-7b03-7721544048ff@oracle.com>
Message-ID: <129ed17946754b9c896fa41dd44d031f@sap.com>

Hi David,

thanks for looking at my change. It was asked for by Gary when he
reviewed https://bugs.openjdk.java.net/browse/JDK-8215975

New webrev:
http://cr.openjdk.java.net/~goetz/wr19/8216265-PathVar/02-incremental/
http://cr.openjdk.java.net/~goetz/wr19/8216265-PathVar/02/

See my comments inline below.

Best regards,
  Goetz.

> test/hotspot/jtreg/gtest/GTestWrapper.java
>
>   75         env.put(pathVar, path + ":" + ldLibraryPath);
>
> Shouldn't ":" be File.pathSeparator?
Fixed.

> test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/InheritedChannelTest.java
>
> Copyright year needs updating.
Done.

> test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/InheritedChannelTest.java
>
>   70     private static final Path pathEnvVar
>
> The variable isn't an env var, it's just a path - I suggest libraryPath.
A cleanup not directly related. But makes sense, done.

>  101
> System.out.println(Platform.sharedLibraryPathVariableName() + "=" +
> pathEnvVar);
> ...
>  114         env.put(Platform.sharedLibraryPathVariableName(),
> pathEnvVar.toString());
>
> I suggest storing the name in a local to avoid the second call.
Done.

> test/jdk/tools/launcher/JliLaunchTest.java
>
>   57             env.compute(pathEnvVar, (k, v) -> (v == null) ? libdir
> : libdir + ":" + v);
>
> Shouldn't ":" be File.pathSeparator?
This is because there is anyways a switch about the OS.
Did some more cleaning up.

> test/jdk/tools/launcher/Test7029048.java
>
>   39 import jdk.test.lib.Platform;
>
> Why do you need this?
Removed.

> test/jdk/vm/JniInvocationTest.java
>
> This is a Mac only test so no changes needed.
I would like to change this anyways. I think this makes it
look more consistent.

> test/lib/jdk/test/lib/Platform.java
>
> The javadoc comments is unnecessary as we don't generate javadoc here. I
> see you copied the preceding sharedLibraryExt() style. The @return is
> superfluous.
Changed. Better?

>
> Thanks,
> David
>
> > Best regards,
> >    Goetz.
> > From amith.pawar at gmail.com Wed Jan 9 11:30:46 2019 From: amith.pawar at gmail.com (amith pawar) Date: Wed, 9 Jan 2019 17:00:46 +0530 Subject: RFR (S/M): 8213827: NUMA heap allocation does not respect process membind/interleave settings [Was: Re: [PATCH] JDK NUMA Interleaving issue] In-Reply-To: <10ecfa0f-eb78-869a-4d5a-991f55ec57ea@oracle.com> References: <6e5b102d07b4ceded09115a649be020410240fe7.camel@oracle.com> <9bea7b0957bbfc2f0ac34306ee162f2d98e44bfe.camel@oracle.com> <99164b92f47f264978339ed327da9d41098a7e1d.camel@oracle.com> <10ecfa0f-eb78-869a-4d5a-991f55ec57ea@oracle.com> Message-ID: Hi Sangheon, Thanks for reviewing and updated with suggested changes. please check. Thanks, Amit Pawar On Wed, Jan 9, 2019 at 12:45 AM wrote: > Hi Thomas, > > On 12/13/18 2:33 AM, Thomas Schatzl wrote: > > Hi Amit, > On Thu, 2018-12-13 at 15:11 +0530, amith pawar wrote: > > Hi Thomas, > > Please find the attached patch updated as per your suggestion. > If everything OK then can you please commit this to repo ? > > looks good. We will need a second reviewer though, I am going to ask > around. > > Latest webrev:http://cr.openjdk.java.net/~tschatzl/8213827/webrev.3/ > > Webrev.3 looks good to me. > > I have some minor nits: > ---------------------------------------- > src/hotspot/os/linux/os_linux.cpp > 5012 for (int node = 0; node < Linux::numa_max_node(); node++) { > - Looks like 'node <*=* Linux::numa_max_node()' is the right one to print > the latest node? > > ---------------------------------------- > src/hotspot/os/linux/os_linux.hpp > 271 enum Numa_allocation_policy{ > - Looking at 'enum' at os.hpp, we use Camel style. > - There are missing space before '{'. > > - As usual, copyright year updates. I know it was correct when you posted. 
> :) > > Thanks, > Sangheon > > > Thanks, > Thomas > > > > > -- With best regards, amit pawar -------------- next part -------------- diff -r 7d8676b2487f src/hotspot/os/linux/os_linux.cpp --- a/src/hotspot/os/linux/os_linux.cpp Wed Jan 09 10:19:54 2019 +0100 +++ b/src/hotspot/os/linux/os_linux.cpp Wed Jan 09 16:48:57 2019 +0530 @@ -33,6 +33,7 @@ #include "compiler/disassembler.hpp" #include "interpreter/interpreter.hpp" #include "logging/log.hpp" +#include "logging/logStream.hpp" #include "memory/allocation.inline.hpp" #include "memory/filemap.hpp" #include "oops/oop.inline.hpp" @@ -2780,7 +2781,7 @@ // Get the total number of nodes in the system including nodes without memory. for (node = 0; node <= highest_node_number; node++) { - if (isnode_in_existing_nodes(node)) { + if (is_node_in_existing_nodes(node)) { num_nodes++; } } @@ -2796,7 +2797,7 @@ // node number. If the nodes have been bound explicitly using numactl membind, // then allocate memory from those nodes only. for (int node = 0; node <= highest_node_number; node++) { - if (Linux::isnode_in_bound_nodes((unsigned int)node)) { + if (Linux::is_node_in_bound_nodes((unsigned int)node)) { ids[i++] = node; } } @@ -2899,11 +2900,15 @@ libnuma_dlsym(handle, "numa_distance"))); set_numa_get_membind(CAST_TO_FN_PTR(numa_get_membind_func_t, libnuma_v2_dlsym(handle, "numa_get_membind"))); + set_numa_get_interleave_mask(CAST_TO_FN_PTR(numa_get_interleave_mask_func_t, + libnuma_v2_dlsym(handle, "numa_get_interleave_mask"))); if (numa_available() != -1) { set_numa_all_nodes((unsigned long*)libnuma_dlsym(handle, "numa_all_nodes")); set_numa_all_nodes_ptr((struct bitmask **)libnuma_dlsym(handle, "numa_all_nodes_ptr")); set_numa_nodes_ptr((struct bitmask **)libnuma_dlsym(handle, "numa_nodes_ptr")); + set_numa_interleave_bitmask(_numa_get_interleave_mask()); + set_numa_membind_bitmask(_numa_get_membind()); // Create an index -> node mapping, since nodes are not always consecutive _nindex_to_node = new (ResourceObj::C_HEAP, 
mtInternal) GrowableArray(0, true); rebuild_nindex_to_node_map(); @@ -2929,7 +2934,7 @@ nindex_to_node()->clear(); for (int node = 0; node <= highest_node_number; node++) { - if (Linux::isnode_in_existing_nodes(node)) { + if (Linux::is_node_in_existing_nodes(node)) { nindex_to_node()->append(node); } } @@ -2966,16 +2971,16 @@ // the closest configured node. Check also if node is bound, i.e. it's allowed // to allocate memory from the node. If it's not allowed, map cpus in that node // to the closest node from which memory allocation is allowed. - if (!isnode_in_configured_nodes(nindex_to_node()->at(i)) || - !isnode_in_bound_nodes(nindex_to_node()->at(i))) { + if (!is_node_in_configured_nodes(nindex_to_node()->at(i)) || + !is_node_in_bound_nodes(nindex_to_node()->at(i))) { closest_distance = INT_MAX; // Check distance from all remaining nodes in the system. Ignore distance // from itself, from another non-configured node, and from another non-bound // node. for (size_t m = 0; m < node_num; m++) { if (m != i && - isnode_in_configured_nodes(nindex_to_node()->at(m)) && - isnode_in_bound_nodes(nindex_to_node()->at(m))) { + is_node_in_configured_nodes(nindex_to_node()->at(m)) && + is_node_in_bound_nodes(nindex_to_node()->at(m))) { distance = numa_distance(nindex_to_node()->at(i), nindex_to_node()->at(m)); // If a closest node is found, update. 
There is always at least one // configured and bound node in the system so there is always at least @@ -3030,9 +3035,13 @@ os::Linux::numa_bitmask_isbitset_func_t os::Linux::_numa_bitmask_isbitset; os::Linux::numa_distance_func_t os::Linux::_numa_distance; os::Linux::numa_get_membind_func_t os::Linux::_numa_get_membind; +os::Linux::numa_get_interleave_mask_func_t os::Linux::_numa_get_interleave_mask; +os::Linux::Numa_allocation_policy os::Linux::_current_numa_policy; unsigned long* os::Linux::_numa_all_nodes; struct bitmask* os::Linux::_numa_all_nodes_ptr; struct bitmask* os::Linux::_numa_nodes_ptr; +struct bitmask* os::Linux::_numa_interleave_bitmask; +struct bitmask* os::Linux::_numa_membind_bitmask; bool os::pd_uncommit_memory(char* addr, size_t size) { uintptr_t res = (uintptr_t) ::mmap(addr, size, PROT_NONE, @@ -4944,6 +4953,74 @@ OSContainer::init(); } +void os::Linux::numa_init() { + + // Java can be invoked as + // 1. Without numactl and heap will be allocated/configured on all nodes as + // per the system policy. + // 2. With numactl --interleave: + // Use numa_get_interleave_mask(v2) API to get nodes bitmask. The same + // API for membind case bitmask is reset. + // Interleave is only hint and Kernel can fallback to other nodes if + // no memory is available on the target nodes. + // 3. With numactl --membind: + // Use numa_get_membind(v2) API to get nodes bitmask. The same API for + // interleave case returns bitmask of all nodes. + // numa_all_nodes_ptr holds bitmask of all nodes. + // numa_get_interleave_mask(v2) and numa_get_membind(v2) APIs returns correct + // bitmask when externally configured to run on all or fewer nodes. + + if (!Linux::libnuma_init()) { + UseNUMA = false; + } else { + if ((Linux::numa_max_node() < 1) || Linux::is_bound_to_single_node()) { + // If there's only one node (they start from 0) or if the process + // is bound explicitly to a single node using membind, disable NUMA. 
+ UseNUMA = false; + } else { + + LogTarget(Info,os) log; + LogStream ls(log); + + Linux::set_configured_numa_policy (Linux::identify_numa_policy()); + + struct bitmask* bmp = Linux::_numa_membind_bitmask; + const char* numa_mode = "membind"; + + if (Linux::is_running_in_interleave_mode()) { + bmp = Linux::_numa_interleave_bitmask; + numa_mode = "interleave"; + } + + ls.print("UseNUMA is enabled and invoked in '%s' mode." + " Heap will be configured using NUMA memory nodes:", numa_mode); + + for (int node = 0; node <= Linux::numa_max_node(); node++) { + if (Linux::_numa_bitmask_isbitset(bmp, node)) { + ls.print(" %d", node); + } + } + } + } + + if (UseParallelGC && UseNUMA && UseLargePages && !can_commit_large_page_memory()) { + // With SHM and HugeTLBFS large pages we cannot uncommit a page, so there's no way + // we can make the adaptive lgrp chunk resizing work. If the user specified both + // UseNUMA and UseLargePages (or UseSHM/UseHugeTLBFS) on the command line - warn + // and disable adaptive resizing. + if (UseAdaptiveSizePolicy || UseAdaptiveNUMAChunkSizing) { + warning("UseNUMA is not fully compatible with SHM/HugeTLBFS large pages, " + "disabling adaptive resizing (-XX:-UseAdaptiveSizePolicy -XX:-UseAdaptiveNUMAChunkSizing)"); + UseAdaptiveSizePolicy = false; + UseAdaptiveNUMAChunkSizing = false; + } + } + + if (!UseNUMA && ForceNUMA) { + UseNUMA = true; + } +} + // this is called _after_ the global arguments have been parsed jint os::init_2(void) { @@ -4988,32 +5065,7 @@ Linux::glibc_version(), Linux::libpthread_version()); if (UseNUMA) { - if (!Linux::libnuma_init()) { - UseNUMA = false; - } else { - if ((Linux::numa_max_node() < 1) || Linux::isbound_to_single_node()) { - // If there's only one node (they start from 0) or if the process - // is bound explicitly to a single node using membind, disable NUMA. 
- UseNUMA = false; - } - } - - if (UseParallelGC && UseNUMA && UseLargePages && !can_commit_large_page_memory()) { - // With SHM and HugeTLBFS large pages we cannot uncommit a page, so there's no way - // we can make the adaptive lgrp chunk resizing work. If the user specified both - // UseNUMA and UseLargePages (or UseSHM/UseHugeTLBFS) on the command line - warn - // and disable adaptive resizing. - if (UseAdaptiveSizePolicy || UseAdaptiveNUMAChunkSizing) { - warning("UseNUMA is not fully compatible with SHM/HugeTLBFS large pages, " - "disabling adaptive resizing (-XX:-UseAdaptiveSizePolicy -XX:-UseAdaptiveNUMAChunkSizing)"); - UseAdaptiveSizePolicy = false; - UseAdaptiveNUMAChunkSizing = false; - } - } - - if (!UseNUMA && ForceNUMA) { - UseNUMA = true; - } + Linux::numa_init(); } if (MaxFDLimit) { diff -r 7d8676b2487f src/hotspot/os/linux/os_linux.hpp --- a/src/hotspot/os/linux/os_linux.hpp Wed Jan 09 10:19:54 2019 +0100 +++ b/src/hotspot/os/linux/os_linux.hpp Wed Jan 09 16:48:57 2019 +0530 @@ -211,6 +211,7 @@ // none present private: + static void numa_init(); static void expand_stack_to(address bottom); typedef int (*sched_getcpu_func_t)(void); @@ -222,6 +223,7 @@ typedef void (*numa_interleave_memory_func_t)(void *start, size_t size, unsigned long *nodemask); typedef void (*numa_interleave_memory_v2_func_t)(void *start, size_t size, struct bitmask* mask); typedef struct bitmask* (*numa_get_membind_func_t)(void); + typedef struct bitmask* (*numa_get_interleave_mask_func_t)(void); typedef void (*numa_set_bind_policy_func_t)(int policy); typedef int (*numa_bitmask_isbitset_func_t)(struct bitmask *bmp, unsigned int n); @@ -239,9 +241,12 @@ static numa_bitmask_isbitset_func_t _numa_bitmask_isbitset; static numa_distance_func_t _numa_distance; static numa_get_membind_func_t _numa_get_membind; + static numa_get_interleave_mask_func_t _numa_get_interleave_mask; static unsigned long* _numa_all_nodes; static struct bitmask* _numa_all_nodes_ptr; static struct bitmask* 
_numa_nodes_ptr; + static struct bitmask* _numa_interleave_bitmask; + static struct bitmask* _numa_membind_bitmask; static void set_sched_getcpu(sched_getcpu_func_t func) { _sched_getcpu = func; } static void set_numa_node_to_cpus(numa_node_to_cpus_func_t func) { _numa_node_to_cpus = func; } @@ -255,10 +260,21 @@ static void set_numa_bitmask_isbitset(numa_bitmask_isbitset_func_t func) { _numa_bitmask_isbitset = func; } static void set_numa_distance(numa_distance_func_t func) { _numa_distance = func; } static void set_numa_get_membind(numa_get_membind_func_t func) { _numa_get_membind = func; } + static void set_numa_get_interleave_mask(numa_get_interleave_mask_func_t func) { _numa_get_interleave_mask = func; } static void set_numa_all_nodes(unsigned long* ptr) { _numa_all_nodes = ptr; } static void set_numa_all_nodes_ptr(struct bitmask **ptr) { _numa_all_nodes_ptr = (ptr == NULL ? NULL : *ptr); } static void set_numa_nodes_ptr(struct bitmask **ptr) { _numa_nodes_ptr = (ptr == NULL ? NULL : *ptr); } + static void set_numa_interleave_bitmask(struct bitmask* ptr) { _numa_interleave_bitmask = ptr ; } + static void set_numa_membind_bitmask(struct bitmask* ptr) { _numa_membind_bitmask = ptr ; } static int sched_getcpu_syscall(void); + + enum Numa_allocation_policy { + not_initialized, + membind, + interleave + }; + static Numa_allocation_policy _current_numa_policy; + public: static int sched_getcpu() { return _sched_getcpu != NULL ? _sched_getcpu() : -1; } static int numa_node_to_cpus(int node, unsigned long *buffer, int bufferlen) { @@ -272,11 +288,35 @@ static int numa_tonode_memory(void *start, size_t size, int node) { return _numa_tonode_memory != NULL ? _numa_tonode_memory(start, size, node) : -1; } + + static bool is_running_in_interleave_mode() { + return _current_numa_policy == interleave ? 
true : false; + } + + static void set_configured_numa_policy(Numa_allocation_policy numa_policy) { + _current_numa_policy = numa_policy ; + } + + static Numa_allocation_policy identify_numa_policy() { + Numa_allocation_policy current_policy = membind; + for (int node = 0; node <= Linux::numa_max_node() ; node++) { + if (Linux::_numa_bitmask_isbitset(Linux::_numa_interleave_bitmask, node)) { + current_policy = interleave; + } + } + return current_policy ; + } + static void numa_interleave_memory(void *start, size_t size) { // Use v2 api if available - if (_numa_interleave_memory_v2 != NULL && _numa_all_nodes_ptr != NULL) { - _numa_interleave_memory_v2(start, size, _numa_all_nodes_ptr); - } else if (_numa_interleave_memory != NULL && _numa_all_nodes != NULL) { + if (_numa_interleave_memory_v2 != NULL && _numa_membind_bitmask != NULL) { + // Use interleave bitmask while running interleave mode. + if (is_running_in_interleave_mode()) { + _numa_interleave_memory_v2(start, size, _numa_interleave_bitmask); + } else if (_numa_membind_bitmask != NULL) { + _numa_interleave_memory_v2(start, size, _numa_membind_bitmask); + } + } else { _numa_interleave_memory(start, size, _numa_all_nodes); } } @@ -291,14 +331,14 @@ static int get_node_by_cpu(int cpu_id); static int get_existing_num_nodes(); // Check if numa node is configured (non-zero memory node). - static bool isnode_in_configured_nodes(unsigned int n) { + static bool is_node_in_configured_nodes(unsigned int n) { if (_numa_bitmask_isbitset != NULL && _numa_all_nodes_ptr != NULL) { return _numa_bitmask_isbitset(_numa_all_nodes_ptr, n); } else return false; } // Check if numa node exists in the system (including zero memory nodes). 
- static bool isnode_in_existing_nodes(unsigned int n) { + static bool is_node_in_existing_nodes(unsigned int n) { if (_numa_bitmask_isbitset != NULL && _numa_nodes_ptr != NULL) { return _numa_bitmask_isbitset(_numa_nodes_ptr, n); } else if (_numa_bitmask_isbitset != NULL && _numa_all_nodes_ptr != NULL) { @@ -317,16 +357,19 @@ return false; } // Check if node is in bound node set. - static bool isnode_in_bound_nodes(int node) { - if (_numa_get_membind != NULL && _numa_bitmask_isbitset != NULL) { - return _numa_bitmask_isbitset(_numa_get_membind(), node); - } else { - return false; + static bool is_node_in_bound_nodes(int node) { + if (_numa_bitmask_isbitset != NULL) { + if (is_running_in_interleave_mode()) { + return _numa_bitmask_isbitset(_numa_interleave_bitmask, node); + } else { + return _numa_membind_bitmask != NULL ? _numa_bitmask_isbitset(_numa_membind_bitmask, node) : false; + } } + return false; } // Check if bound to only one numa node. // Returns true if bound to a single numa node, otherwise returns false. - static bool isbound_to_single_node() { + static bool is_bound_to_single_node() { int nodes = 0; struct bitmask* bmp = NULL; unsigned int node = 0; From david.holmes at oracle.com Wed Jan 9 12:27:54 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 9 Jan 2019 22:27:54 +1000 Subject: RFR(M): 8216265: [testbug] Introduce Platform.sharedLibraryPathVariableName() and adapt all tests. In-Reply-To: <129ed17946754b9c896fa41dd44d031f@sap.com> References: <9349eed214ce46ee81868840c0dbd54d@sap.com> <6277c580-0397-d4ab-7b03-7721544048ff@oracle.com> <129ed17946754b9c896fa41dd44d031f@sap.com> Message-ID: Hi Goetz, On 9/01/2019 8:34 pm, Lindenmaier, Goetz wrote: > Hi David, > > thanks for looking at my change. > It was asked for by Gary when he reviewed https://bugs.openjdk.java.net/browse/JDK-8215975 > > New webrev: > http://cr.openjdk.java.net/~goetz/wr19/8216265-PathVar/02-incremental/ Looks good. 
Two further pre-existing nits spotted: test/hotspot/jtreg/gtest/GTestWrapper.java ! * Copyright (c) 2016, 2019 Oracle Need a comma after 2019. Ditto for: test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/InheritedChannelTest.java Actually I now see quite a number of files missing the comma so I'll file a general bug to fix that. Thanks, David > http://cr.openjdk.java.net/~goetz/wr19/8216265-PathVar/02/ > > See my comments inline below. > > Best regards, > Goetz. > >> test/hotspot/jtreg/gtest/GTestWrapper.java >> >> 75 env.put(pathVar, path + ":" + ldLibraryPath); >> >> Shouldn't ":" be File.pathSeparator? > Fixed. > >> test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/Inherited >> ChannelTest.java >> >> Copyright year needs updating. > Done. > >> test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/Inherited >> ChannelTest.java >> >> 70 private static final Path pathEnvVar >> >> The variable isn't an env var, it's just a path - I suggest libraryPath. > A cleanup not directly related. But makes sense, done. > >> 101 >> System.out.println(Platform.sharedLibraryPathVariableName() + "=" + >> pathEnvVar); >> ... >> 114 env.put(Platform.sharedLibraryPathVariableName(), >> pathEnvVar.toString()); >> >> I suggest storing the name in a local to avoid the second call. > Done. > >> test/jdk/tools/launcher/JliLaunchTest.java >> >> 57 env.compute(pathEnvVar, (k, v) -> (v == null) ? libdir >> : libdir + ":" + v); >> >> Shouldn't ":" be File.pathSeparator? > This is because there is anyways a switch about the OS. > Did some more cleaning up. > >> test/jdk/tools/launcher/Test7029048.java >> >> 39 import jdk.test.lib.Platform; >> >> Why do you need this? > Removed. > >> test/jdk/vm/JniInvocationTest.java >> >> This is a Mac only test so no changes needed. > I would like to change this anyways. I think this makes > it look more consistent. 
> >> test/lib/jdk/test/lib/Platform.java >> >> The javadoc comments is unnecessary as we don't generate javadoc here. I >> see you copied the preceding sharedLibraryExt() style. The @return is >> superfluous. > Changed. Better? > > >> >> Thanks, >> David >> >>> Best regards, >>> Goetz. >>> From goetz.lindenmaier at sap.com Wed Jan 9 12:51:32 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 9 Jan 2019 12:51:32 +0000 Subject: RFR(M): 8216265: [testbug] Introduce Platform.sharedLibraryPathVariableName() and adapt all tests. In-Reply-To: References: <9349eed214ce46ee81868840c0dbd54d@sap.com> <6277c580-0397-d4ab-7b03-7721544048ff@oracle.com> <129ed17946754b9c896fa41dd44d031f@sap.com> Message-ID: <7cefd8a46ae647969894c43cab72bc88@sap.com> Hi David, I fixed these locally. Best regards, Goetz. > -----Original Message----- > From: David Holmes > Sent: Mittwoch, 9. Januar 2019 13:28 > To: Lindenmaier, Goetz ; 'hotspot- > dev at openjdk.java.net' ; > gary.adams at oracle.com > Subject: Re: RFR(M): 8216265: [testbug] Introduce > Platform.sharedLibraryPathVariableName() and adapt all tests. > > Hi Goetz, > > On 9/01/2019 8:34 pm, Lindenmaier, Goetz wrote: > > Hi David, > > > > thanks for looking at my change. > > It was asked for by Gary when he reviewed > https://bugs.openjdk.java.net/browse/JDK-8215975 > > > > New webrev: > > http://cr.openjdk.java.net/~goetz/wr19/8216265-PathVar/02-incremental/ > > Looks good. Two further pre-existing nits spotted: > > test/hotspot/jtreg/gtest/GTestWrapper.java > > ! * Copyright (c) 2016, 2019 Oracle > > Need a comma after 2019. > > Ditto for: > > test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/Inherited > ChannelTest.java > > Actually I now see quite a number of files missing the comma so > I'll file a general bug to fix that. > > Thanks, > David > > > > http://cr.openjdk.java.net/~goetz/wr19/8216265-PathVar/02/ > > > > See my comments inline below. > > > > Best regards, > > Goetz. 
> > > >> test/hotspot/jtreg/gtest/GTestWrapper.java > >> > >> 75 env.put(pathVar, path + ":" + ldLibraryPath); > >> > >> Shouldn't ":" be File.pathSeparator? > > Fixed. > > > >> > test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/Inherited > >> ChannelTest.java > >> > >> Copyright year needs updating. > > Done. > > > >> > test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/Inherited > >> ChannelTest.java > >> > >> 70 private static final Path pathEnvVar > >> > >> The variable isn't an env var, it's just a path - I suggest libraryPath. > > A cleanup not directly related. But makes sense, done. > > > >> 101 > >> System.out.println(Platform.sharedLibraryPathVariableName() + "=" + > >> pathEnvVar); > >> ... > >> 114 env.put(Platform.sharedLibraryPathVariableName(), > >> pathEnvVar.toString()); > >> > >> I suggest storing the name in a local to avoid the second call. > > Done. > > > >> test/jdk/tools/launcher/JliLaunchTest.java > >> > >> 57 env.compute(pathEnvVar, (k, v) -> (v == null) ? libdir > >> : libdir + ":" + v); > >> > >> Shouldn't ":" be File.pathSeparator? > > This is because there is anyways a switch about the OS. > > Did some more cleaning up. > > > >> test/jdk/tools/launcher/Test7029048.java > >> > >> 39 import jdk.test.lib.Platform; > >> > >> Why do you need this? > > Removed. > > > >> test/jdk/vm/JniInvocationTest.java > >> > >> This is a Mac only test so no changes needed. > > I would like to change this anyways. I think this makes > > it look more consistent. > > > >> test/lib/jdk/test/lib/Platform.java > >> > >> The javadoc comments is unnecessary as we don't generate javadoc here. > I > >> see you copied the preceding sharedLibraryExt() style. The @return is > >> superfluous. > > Changed. Better? > > > > > >> > >> Thanks, > >> David > >> > >>> Best regards, > >>> Goetz. 
> >>>

From robbin.ehn at oracle.com Wed Jan 9 12:57:13 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 9 Jan 2019 13:57:13 +0100
Subject: RFR(m): 8214271: Fast primitive to wake many threads
In-Reply-To:
References: <010211e3-93a6-80b9-678c-c84b08812e43@oracle.com>
 <70669453-e317-a30d-8d5a-e5b938b83c41@oracle.com>
 <4fb6cd22-cdd0-2419-c863-24b250ac0b16@oracle.com>
 <2a2679cc-b0e0-f8d0-7336-8666e1a42950@oracle.com>
 <01873a0f-a0fb-18b9-f7d4-98bb638e9b57@oracle.com>
 <41f5252b-3eb9-9a9e-70e5-49f6d8f9d670@oracle.com>
 <755aaf5b-8a49-ef5a-65ce-18550547a91b@oracle.com>
Message-ID:

Hi David,

On 1/9/19 4:40 AM, David Holmes wrote:
> Hi Robbin,
>
> No further significant comments, lets just see how this plays out.

Yes, thanks! Fixed nits. If Dan have something more non-trivial I'll publish a v7.

/Robbin

>
> Some minor nits:
>
> src/hotspot/share/utilities/waitBarrier.hpp
>
> ! // A primary goal of the WaitBarrier implementation is to disarm all waiting
>
> s/disarm/wake/
>
> That was one place a global replace shouldn't have been applied. :)

Fixed!

>
> ! //    - Calling disarm() guarantees any thread calling or called wait(tag) will
>
> "or called" is not grammatically correct. Perhaps:
>
> //    - Calling disarm() guarantees any thread now calling or that has called
> wait(tag) will
>

Fixed!

>
>     // Guarantees any thread called wait() will be awake when it returns.
>
> s/called/that called/
>

Fixed!

> ---
>
> src/hotspot/share/utilities/waitBarrier_generic.cpp
>
> !   // disarm store must not float below.
>
> s/float/sink/
>

Fixed!

> 74   // API specifies wake() must provides a trailing fence.
>
> s/wake/disarm/
> s/provides/provide/

Fixed

>
>  81     // API specifies wait() must provides a trailing fence.
>
> s/provides/provide/

Fixed

>
> Thanks,
> David
>
> On 8/01/2019 8:42 pm, Robbin Ehn wrote:
>> Hi David,
>>
>> On 1/2/19 12:35 AM, David Holmes wrote:
>>>>> Further this sounds like a race that could lead to bugs if not used very
>>>>> carefully ie. you can't assume between disarm() and wake() that all threads
>>>>> are blocked.
>>>>
>>>> I didn't realize how subtle this is. I think your original comment that
>>>> disarm/wake should be one operation was spot on.
>>>> Investigating... thinking... testing... yes I think this will work, fixed!
>>>> Sorry for not looking more into this before.
>>>
>>> I'm now curious how this will actually work in the context of the safepoint
>>> changes?
>>
>> Since code already handle this 'invariant' with threads not being block
>> between disarm() and wake(), doing it one operation just very slightly
>> increases the chance that a thread will be blocked when we actually can handle
>> it to be running, but reduces the chance to hit a false positive TLH poll.
>> (with TLH we have a two-step un-synchronizing out of safepoints where we must
>> change global safepoint state before changing the thread polling state)
>>
>> (I have some thoughts on simplifying TLH/safepoint states)
>>
>>> Nit: I would have kept disarm() rather than wake() as I like the arm/disarm
>>> duality.
>>
>> Yes, me too. Not sure why I did the opposite, fixed!
>>
>>>
>>>    void GenericWaitBarrier::wait(int barrier_tag) {
>>>      assert(barrier_tag != 0, "Trying to wait on disarmed value");
>>> +   if (barrier_tag == 0 && barrier_tag != _barrier_tag) {
>>> +     OrderAccess::fence();
>>> +     return;
>>> +   }
>>>
>>> I don't understand what the above is doing. A barrier_tag of 0 is a
>>> programming error caught during testing in debug builds. You don't need to
>>> account for it being 0 in product because this isn't something that can come
>>> in from an external source - we have full code control here. And even if you
>>> want to be this paranoid why would you need the fence?
>>
>> Fixed, but kept the fence, since we say we are providing a trailing fence.
>> Otherwise I would like to add that exception to the description of wait().
>> >> Including Dan's comments: >> Full: http://cr.openjdk.java.net/~rehn/8214271/6/full/webrev/ >> Inc : http://cr.openjdk.java.net/~rehn/8214271/6/inc/webrev/ >> >> >> Thanks, Robbin >> >>> >>> Thanks, >>> David >>> ----- >>> >>>> Full: >>>> http://cr.openjdk.java.net/~rehn/8214271/5/full/webrev/ >>>> >>>> gtest passes thousands of loops locally and hundreds in mach5. >>>> >>>> Thanks, Robbin >>>> >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>>> >>>>>>> s/Implementation/Implementations/ >>>>>> >>>>>> Fixed >>>>>> >>>>>>> >>>>>>> The fourth line is no longer needed. >>>>>> >>>>>> Above is the reason I would like to keep the fourth line, since only if >>>>>> you call >>>>>> both disarm() and wake() you have that guarantee that waiter threads will >>>>>> return. >>>>>> >>>>>> Thanks, Robbin >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> >>>>>>>> Inc: >>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/4/inc/webrev/ >>>>>>>> >>>>>>>> Full: >>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/4/full/webrev/ >>>>>>>> >>>>>>>> /Robbin >>>>>>>> >>>>>>>>> >>>>>>>>> Otherwise this all looks good! >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> ----- >>>>>>>>> >>>>>>>>> >>>>>>>>>> Full: >>>>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/3/full/webrev/ >>>>>>>>>> >>>>>>>>>> Thanks, Robbin >>>>>>>>>> >>>>>>>>>> On 11/23/18 5:55 PM, Robbin Ehn wrote: >>>>>>>>>>> Forgot RFR in subject. >>>>>>>>>>> >>>>>>>>>>> /Robbin >>>>>>>>>>> >>>>>>>>>>> On 2018-11-23 17:51, Robbin Ehn wrote: >>>>>>>>>>>> Hi all, please review. >>>>>>>>>>>> >>>>>>>>>>>> When a safepoint is ended we need a way to get back to 100% >>>>>>>>>>>> utilization as fast >>>>>>>>>>>> as possible. 100% utilization means no idle cpu in the system if >>>>>>>>>>>> there is a >>>>>>>>>>>> JavaThread that could be executed. The traditional ways to wake >>>>>>>>>>>> many, e.g. 
>>>>>>>>>>>> semaphore, pthread_cond, is not implemented with a single syscall >>>>>>>>>>>> instead they >>>>>>>>>>>> typical do one syscall per thread to wake. >>>>>>>>>>>> >>>>>>>>>>>> This change-set contains that primitive, the WaitBarrier, and a >>>>>>>>>>>> gtest for it. >>>>>>>>>>>> No actual users, which is in coming patches. >>>>>>>>>>>> >>>>>>>>>>>> The WaitBarrier solves by doing a cooperative semaphore posting, >>>>>>>>>>>> threads woken >>>>>>>>>>>> will also post. On Linux we can instead directly use a futex and >>>>>>>>>>>> with one >>>>>>>>>>>> syscall wake all. Depending on how many threads and cpus the >>>>>>>>>>>> performance vary, >>>>>>>>>>>> but a good utilization of the machine, just on the edge of >>>>>>>>>>>> saturated, the time to reach 100% utilization is around 3 times >>>>>>>>>>>> faster with the WaitBarrier (where futex is faster than semaphore). >>>>>>>>>>>> >>>>>>>>>>>> Webrev: >>>>>>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/webrev/ >>>>>>>>>>>> >>>>>>>>>>>> CR: >>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8214271 >>>>>>>>>>>> >>>>>>>>>>>> Passes 100 iterations of gtest on our platforms, both fastdebug and >>>>>>>>>>>> release. >>>>>>>>>>>> And have been stable when used in safepoints (t1-8) (coming patches). 
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks, Robbin

From coleen.phillimore at oracle.com Wed Jan 9 13:45:20 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 9 Jan 2019 08:45:20 -0500
Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name
In-Reply-To: <7f2d1db1-7387-0a09-e869-ec3ebce68c42@oracle.com>
References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com>
 <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com>
 <8d41a444-4b1b-f176-9cd0-374649e1c270@redhat.com>
 <6a039665-2f22-8651-5276-3f9dddbbc481@oracle.com>
 <2addf665-d49f-563b-649e-a88c1139de56@redhat.com>
 <1ae9e923-55b5-c006-e7a2-0603faf365d5@oracle.com>
 <7f2d1db1-7387-0a09-e869-ec3ebce68c42@oracle.com>
Message-ID: <1bcc2c51-9931-e6b1-bedb-816bb4d89e33@oracle.com>

+1 for the XX version.
Coleen

On 1/8/19 8:12 PM, Mandy Chung wrote:
>
>
> On 1/8/19 4:59 PM, David Holmes wrote:
>> On 9/01/2019 10:08 am, Aleksey Shipilev wrote:
>>> On 1/9/19 12:24 AM, Mandy Chung wrote:
>>>>> I really don't like the fact the VM is now setting the name field
>>>>> and there's nothing in the Java
>>>>> code to give any indication that this is happening. At a minimum a
>>>>> comment should be added, as is
>>>>> done with other class members that get accessed directly by the VM.
>>>>>
>>>>> I also think core-libs folk should be having a say here.
>>>>
>>>> Catching up on this thread...
>>>>
>>>> Two ways setting the Class::name field isn't pleasant. What about:
>>>>
>>>> public String getName() {
>>>>     String name = this.name;
>>>>     return name != null ? name : initClassName();
>>>> }
>>>>
>>>> where JVM_InitClassName will call java_lang_Class::name().
>>>
>>> Mmm. Should we really change the jvm.h here? Does that involve CSR?
>>
>> jvm.h is a private interface between the OpenJDK core libraries and
>> the OpenJDK VM (Hotspot), and does not require a CSR request when
>> changed.
>
> Yup. CSR is not required for changes in this private interface.
>>
>>> It would have ripple effects on
>>> Graal, potential backports (You'd want this in 11, right? I would.
>>> This is a visible perf regression
>>> since 8), etc. Note that after handelizing java_lang_Class::name, we
>>> cannot simply call it from
>>> JVM_GetClassName.
>>>
>>> Do we really think this is worth the hassle like this:
>>> http://cr.openjdk.java.net/~shade/8216302/webrev.XX/
>>
>> That version works for me.
>
> Thanks for making this change. I prefer this version which makes the
> code very clear what it does.
>
> Thanks
> Mandy

From shade at redhat.com Wed Jan 9 14:01:28 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 9 Jan 2019 15:01:28 +0100
Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name
In-Reply-To: <1bcc2c51-9931-e6b1-bedb-816bb4d89e33@oracle.com>
References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com>
 <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com>
 <8d41a444-4b1b-f176-9cd0-374649e1c270@redhat.com>
 <6a039665-2f22-8651-5276-3f9dddbbc481@oracle.com>
 <2addf665-d49f-563b-649e-a88c1139de56@redhat.com>
 <1ae9e923-55b5-c006-e7a2-0603faf365d5@oracle.com>
 <7f2d1db1-7387-0a09-e869-ec3ebce68c42@oracle.com>
 <1bcc2c51-9931-e6b1-bedb-816bb4d89e33@oracle.com>
Message-ID: <1216c095-12a4-1088-b911-be02a1be2a8a@redhat.com>

On 1/9/19 2:45 PM, coleen.phillimore at oracle.com wrote:
> +1 for the XX version.
> Coleen Okay, this version passes the tests (build, new test, hotspot tier1, jdk-submit), I am going to push it, if there are no objections -- it is the XX version, but with updated copyrights: http://cr.openjdk.java.net/~shade/8216302/webrev.05/ Thanks, -Aleksey From mandy.chung at oracle.com Wed Jan 9 16:24:58 2019 From: mandy.chung at oracle.com (Mandy Chung) Date: Wed, 9 Jan 2019 08:24:58 -0800 Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name In-Reply-To: <1216c095-12a4-1088-b911-be02a1be2a8a@redhat.com> References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com> <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com> <8d41a444-4b1b-f176-9cd0-374649e1c270@redhat.com> <6a039665-2f22-8651-5276-3f9dddbbc481@oracle.com> <2addf665-d49f-563b-649e-a88c1139de56@redhat.com> <1ae9e923-55b5-c006-e7a2-0603faf365d5@oracle.com> <7f2d1db1-7387-0a09-e869-ec3ebce68c42@oracle.com> <1bcc2c51-9931-e6b1-bedb-816bb4d89e33@oracle.com> <1216c095-12a4-1088-b911-be02a1be2a8a@redhat.com> Message-ID: On 1/9/19 6:01 AM, Aleksey Shipilev wrote: > On 1/9/19 2:45 PM, coleen.phillimore at oracle.com wrote: >> +1 for the XX version. >> Coleen > Okay, this version passes the tests (build, new test, hotspot tier1, jdk-submit), I am going to push > it, if there are no objections -- it is the XX version, but with updated copyrights: > http://cr.openjdk.java.net/~shade/8216302/webrev.05/ > +1 Thanks. Mandy From coleen.phillimore at oracle.com Wed Jan 9 16:47:36 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 9 Jan 2019 11:47:36 -0500 Subject: RFR: 8215889: assert(!_unloading) failed: This oop is not available to unloading class loader data with ZGC In-Reply-To: References: Message-ID: http://cr.openjdk.java.net/~eosterlund/8215889/webrev.00/src/hotspot/share/ci/ciMethodData.cpp.udiff.html - - { // During translation a safepoint can happen or VM lock can be taken (e.g., Compile_lock). 
- MutexUnlocker ml(mdo->extra_data_lock());
- data_dst.translate_from(&data_src);
- }
+ data_dst.translate_from(&data_src);

You don't need to unlock extra_data_lock because prepare_metadata() has
already added the Method* to ci_metadata (which can take out the
Compile_lock).

Adding Method* to ci_metadata will cause the Method* to not be unloaded.
This is my commentary. I think your comments about this are sufficient.

I wish PrepareExtraDataClosure was in oops/methodData.cpp but it can't
be because it has to add Method* to ci_metadata handles.

This looks good.

Thanks,
Coleen

On 1/7/19 4:51 AM, Erik Österlund wrote:
> Hi,
>
> There are SpeculativeTrapData entries in the extra data space of MDOs
> that are currently not being checked for stale Method* entries due to
> concurrent class unloading.
>
> The fix involves lazily cleaning SpeculativeTrapData entries during
> ciMethodData::load_extra_data(), which unpacks the extra data from the
> source MDO to the ci copy of the MDO, that the compiler subsequently
> uses as reference during the ongoing compilation, and needs to have
> live metadata only.
>
> A new ciMethodData::prepare_metadata() method is added to ci MDO
> mirrors that lazily cleans the extra data space and pre-caches the
> ciEnv with all the metadata it encounters. When creating ciMethod
> handles, the Compile_lock might be taken, which strictly requires
> safepoint checking. Therefore, prepare_metadata() loops until it can
> pre-cache all live metadata without any cache misses, because that
> implies the subsequent code copying the MDO can not safepoint while
> extracting the extra data from the MDO, which is a requirement as 1) a
> safepoint may invalidate the metadata again, 2) both the cleaning
> (from the concurrent GC thread) and extraction (from the compiler
> thread) must be done under the mdo->extra_data_lock().
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8215889
>
> Webrev:
> http://cr.openjdk.java.net/~eosterlund/8215889/webrev.00/
>
> Testing: hs-tier1-6, and a bunch of local testing, including 24 hours
> kitchensink in fastdebug.
>
> Thanks,
> /Erik

From coleen.phillimore at oracle.com Wed Jan 9 17:02:41 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 9 Jan 2019 12:02:41 -0500
Subject: RFR (tedious) 8216022: Use #pragma once
In-Reply-To: <34accc10-660a-3b96-dcd4-a72fe9e8b127@redhat.com>
References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com>
 <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com>
 <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com>
 <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com>
 <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com>
 <8736q8smt7.fsf@oldenburg2.str.redhat.com>
 <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com>
 <575efd28-6644-ee8c-846d-3787f41ab9ce@oracle.com>
 <75687bd1-6fc2-51c4-557a-ffaa3f582499@redhat.com>
 <2c73cea1-74ae-8c98-6efc-9855aa444605@oracle.com>
 <34accc10-660a-3b96-dcd4-a72fe9e8b127@redhat.com>
Message-ID: <64d0e1d8-34f1-87ca-8492-b8e8c7ac4da3@oracle.com>

I'm replying to this message but my reply applies to the one by Dmitry
and Thomas as well.

Internally at Oracle, many of us like and support the #pragma once
change. But as an open source project, we need to consider strongly the
opinions of all contributors, or at least the ones that have spoken up
here and are also Reviewers. Also, we cannot practically verify all the
different platforms that OpenJDK can be built on, whether supported by
an organization or not.

Some of the arguments against #pragma once seem like situations that are
difficult or impossible to observe in our build, and Erik's replies are
well reasoned. Even so, I have to think that this discomfort with this
change is greater than the cut/paste annoyance of creating include
guards. Include guards are a de facto industry standard way of solving
this problem.
This is a practical argument against continuing with this change at this time. Therefore, I'm withdrawing this change. Thanks, Coleen On 1/7/19 4:39 AM, Andrew Haley wrote: > On 1/5/19 7:24 PM, Erik Österlund wrote: >> I see that is a similarity indeed. But there are important differences. >> >> The main difference is that compiler internal ABI for atomics on ARMv7 >> and PPC (which was my particular concern in that conversation), a) do >> have incompatible bindings that are allowed by the standard described in >> papers with proposed bindings (as I pointed out then), >> >> b) would be really dangerous if it subtly changed because it could >> go undetected for a long time before anyone noticed strange crashes >> because of it. > There are two possibilities: either it'd happen by accident or > deliberately. By accident is just a code generation bug, no different > from any other, and of course we're always at risk from those. > Deliberately would require a lot of discussion because it'd break > binary compatibility. So I don't believe it. > > But that doesn't matter, I'm satisfied: purely hypothetical but > implausible arguments about what compilers might do with less than > fully standardized features are off the table. > >> We essentially rely on the generated machine code to have an exact >> machine code binding that is compatible. And for what it's worth, I >> am okay with changing the x64 Atomic/OrderAccess implementation to >> use compiler intrinsics. Because there is essentially no risk due to >> the nature of the ISA. >> However, hypothetical differences in whether symbolic references are >> followed or not for #pragma once would lead to HotSpot either >> building or not, depending on whether it relies on that or not >> (pretty sure it doesn't), and never cause bugs to silently infect >> the binary. Conversely, not using #pragma once and relying on all >> files getting the manually typed include guards right, seems more >> dangerous to me.
>> >> So the atomics reliance comes with a risk, the #pragma once reliance >> does not - it removes a risk. > OK, so the argument is not hypothetical at all, but purely practical. > >> If we truly stop relying on compiler features that are >> implementation defined, when there are no risks involved, we would >> end up crippled and get nothing done. > Yes. We should use the compiler to help us as much as possible, reduce > our code complexity, and reduce our maintenance costs. > From erik.osterlund at oracle.com Wed Jan 9 17:03:02 2019 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Wed, 9 Jan 2019 18:03:02 +0100 Subject: RFR: 8215889: assert(!_unloading) failed: This oop is not available to unloading class loader data with ZGC In-Reply-To: References: Message-ID: Hi Coleen, Thanks for the review! /Erik > On 9 Jan 2019, at 17:47, coleen.phillimore at oracle.com wrote: > > > http://cr.openjdk.java.net/~eosterlund/8215889/webrev.00/src/hotspot/share/ci/ciMethodData.cpp.udiff.html > > - > - { // During translation a safepoint can happen or VM lock can be taken (e.g., Compile_lock). > - MutexUnlocker ml(mdo->extra_data_lock()); > - data_dst.translate_from(&data_src); > - } > + data_dst.translate_from(&data_src); > > You don't need to unlock extra_data_lock because prepare_metadata() has already added the Method* to ci_metadata (which can take out the Compile_lock). > > Adding Method* to ci_metadata will cause the Method* to not be unloaded. This is my commentary. I think your comments about this are sufficient. > > I wish PrepareExtraDataClosure was in oops/methodData.cpp but it can't be because it has to add Method* to ci_metadata handles. > > This looks good. > > Thanks, > Coleen > >> On 1/7/19 4:51 AM, Erik Österlund wrote: >> Hi, >> >> There are SpeculativeTrapData entries in the extra data space of MDOs that are currently not being checked for stale Method* entries due to concurrent class unloading.
>> >> The fix involves lazily cleaning SpeculativeTrapData entries during ciMethodData::load_extra_data(), which unpacks the extra data from the source MDO to the ci copy of the MDO, that the compiler subsequently uses as reference during the ongoing compilation, and needs to have live metadata only. >> >> A new ciMethodData::prepare_metadata() method is added to ci MDO mirrors that lazily cleans the extra data space and pre-caches the ciEnv with all the metadata it encounters. When creating ciMethod handles, the Compile_lock might be taken, which strictly requires safepoint checking. Therefore, prepare_metadata() loops until it can pre-cache all live metadata without any cache misses, because that implies the subsequent code copying the MDO can not safepoint while extracting the extra data from the MDO, which is a requirement as 1) a safepoint may invalidate the metadata again, 2) both the cleaning (from the concurrent GC thread) and extraction (from the compiler thread) must be done under the mdo->extra_data_lock(). >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8215889 >> >> Webrev: >> http://cr.openjdk.java.net/~eosterlund/8215889/webrev.00/ >> >> Testing: hs-tier1-6, and a bunch of local testing, including 24 hours kitchensink in fastdebug. 
>> >> Thanks, >> /Erik > From david.lloyd at redhat.com Wed Jan 9 17:15:25 2019 From: david.lloyd at redhat.com (David Lloyd) Date: Wed, 9 Jan 2019 11:15:25 -0600 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: <64d0e1d8-34f1-87ca-8492-b8e8c7ac4da3@oracle.com> References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <8736q8smt7.fsf@oldenburg2.str.redhat.com> <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com> <575efd28-6644-ee8c-846d-3787f41ab9ce@oracle.com> <75687bd1-6fc2-51c4-557a-ffaa3f582499@redhat.com> <2c73cea1-74ae-8c98-6efc-9855aa444605@oracle.com> <34accc10-660a-3b96-dcd4-a72fe9e8b127@redhat.com> <64d0e1d8-34f1-87ca-8492-b8e8c7ac4da3@oracle.com> Message-ID: If there was some way to push for and achieve "#pragma once" inclusion in the upstream C/C++ standards, I think that would change the arguments substantially. On Wed, Jan 9, 2019 at 11:03 AM wrote: > > > I'm replying to this message but my reply applies to the one by Dmitry > and Thomas as well. > > Internally at Oracle, many of us like and support the #pragma once > change. But as an open source project, we need to consider strongly the > opinions of all contributors, or at least the ones that have spoken up > here and are also Reviewers. Also, we cannot practically verify all the > different platforms that OpenJDK can be built on, whether supported by > an organization or not. > > Some of the arguments against #pragma once seem like situations that are > difficult or impossible to observe in our build, and Erik's replies are > well reasoned. Even so, I have to think that this discomfort with this > change is greater than the cut/paste annoyance of creating include > guards. Include guards are a de facto industry standard way of solving > this problem.
> > This is a practical argument against continuing with this change at this > time. Therefore, I'm withdrawing this change. > > Thanks, > Coleen > > On 1/7/19 4:39 AM, Andrew Haley wrote: > > On 1/5/19 7:24 PM, Erik Österlund wrote: > >> I see that is a similarity indeed. But there are important differences. > >> > >> The main difference is that compiler internal ABI for atomics on ARMv7 > >> and PPC (which was my particular concern in that conversation), a) do > >> have incompatible bindings that are allowed by the standard described in > >> papers with proposed bindings (as I pointed out then), > >> > >> b) would be really dangerous if it subtly changed because it could > >> go undetected for a long time before anyone noticed strange crashes > >> because of it. > > There are two possibilities: either it'd happen by accident or > > deliberately. By accident is just a code generation bug, no different > > from any other, and of course we're always at risk from those. > > Deliberately would require a lot of discussion because it'd break > > binary compatibility. So I don't believe it. > > > > But that doesn't matter, I'm satisfied: purely hypothetical but > > implausible arguments about what compilers might do with less than > > fully standardized features are off the table. > > > >> We essentially rely on the generated machine code to have an exact > >> machine code binding that is compatible. And for what it's worth, I > >> am okay with changing the x64 Atomic/OrderAccess implementation to > >> use compiler intrinsics. Because there is essentially no risk due to > >> the nature of the ISA. > >> However, hypothetical differences in whether symbolic references are > >> followed or not for #pragma once would lead to HotSpot either > >> building or not, depending on whether it relies on that or not > >> (pretty sure it doesn't), and never cause bugs to silently infect > >> the binary.
Conversely, not using #pragma once and relying on all > >> files getting the manually typed include guards right, seems more > >> dangerous to me. > >> > >> So the atomics reliance comes with a risk, the #pragma once reliance > >> does not - it removes a risk. > > OK, so the argument is not hypothetical at all, but purely practical. > > > >> If we truly stop relying on compiler features that are > >> implementation defined, when there are no risks involved, we would > >> end up crippled and get nothing done. > > Yes. We should use the compiler to help us as much as possible, reduce > > our code complexity, and reduce our maintenance costs. > > > -- - DML From fweimer at redhat.com Wed Jan 9 18:05:51 2019 From: fweimer at redhat.com (Florian Weimer) Date: Wed, 09 Jan 2019 19:05:51 +0100 Subject: RFR (tedious) 8216022: Use #pragma once In-Reply-To: (David Lloyd's message of "Wed, 9 Jan 2019 11:15:25 -0600") References: <9250036e-8696-6103-6c3f-513fa11ffebd@oracle.com> <5262362A-1F21-4339-BFDC-7DB81C61D977@oracle.com> <99994f41-284d-0b3f-1b90-27c0b7933620@redhat.com> <165692e4-030c-38fd-e8ec-f7f6c467e928@oracle.com> <8fa77621-1de6-45fa-8f27-90eac1b39b8e@redhat.com> <8736q8smt7.fsf@oldenburg2.str.redhat.com> <72149c30-198b-4953-3ff5-db9d41a13c02@redhat.com> <575efd28-6644-ee8c-846d-3787f41ab9ce@oracle.com> <75687bd1-6fc2-51c4-557a-ffaa3f582499@redhat.com> <2c73cea1-74ae-8c98-6efc-9855aa444605@oracle.com> <34accc10-660a-3b96-dcd4-a72fe9e8b127@redhat.com> <64d0e1d8-34f1-87ca-8492-b8e8c7ac4da3@oracle.com> Message-ID: <87r2dloge8.fsf@oldenburg2.str.redhat.com> * David Lloyd: > If there was some way to push for and achieve "#pragma once" inclusion > in the upstream C/C++ standards, I think that would change the > arguments substantially. I believe modules (any variant) will behave in this way: importing a module will always be idempotent. I would expect it to be unlikely that a similar preprocessor feature would be accepted at this point.
The value of standardization is minimal anyway because it cannot describe when two files are the same: The standard does not know about hard links, symbolic links, bind mounts, reparse points, and other things that are out there. But that would be required for complete interoperability between independent implementations. Thanks, Florian From harold.seigel at oracle.com Wed Jan 9 18:56:25 2019 From: harold.seigel at oracle.com (Harold David Seigel) Date: Wed, 9 Jan 2019 13:56:25 -0500 Subject: RFR 8207964: [TESTBUG] Change stressTime to default to 30 for nsk tests Message-ID: Hi, Please review this fix to change the default stress time for hotspot vmTestbase tests from 60 seconds to 30 seconds. Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8207964/webrev/index.html JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8207964 The fix was tested by running Mach5 hotspot tiers 1-5 on Linux-x64, Windows, Solaris, and Mac OS X. Thanks, Harold From shade at redhat.com Wed Jan 9 19:29:15 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 9 Jan 2019 20:29:15 +0100 Subject: RFR (S) 8216302: StackTraceElement::fill_in can use cached Class.name In-Reply-To: References: <06c190f8-cdd9-ecd2-8e36-0285e1bb1060@oracle.com> <4b943c73-d6c2-4751-219c-c47d9bf77ad3@redhat.com> <8d41a444-4b1b-f176-9cd0-374649e1c270@redhat.com> <6a039665-2f22-8651-5276-3f9dddbbc481@oracle.com> <2addf665-d49f-563b-649e-a88c1139de56@redhat.com> <1ae9e923-55b5-c006-e7a2-0603faf365d5@oracle.com> <7f2d1db1-7387-0a09-e869-ec3ebce68c42@oracle.com> <1bcc2c51-9931-e6b1-bedb-816bb4d89e33@oracle.com> <1216c095-12a4-1088-b911-be02a1be2a8a@redhat.com> Message-ID: <4b275a39-003b-2d6a-ca9c-10dae40a959d@redhat.com> On 1/9/19 5:24 PM, Mandy Chung wrote: > On 1/9/19 6:01 AM, Aleksey Shipilev wrote: >> On 1/9/19 2:45 PM, coleen.phillimore at oracle.com wrote: >>> +1 for the XX version.
>>> Coleen >> Okay, this version passes the tests (build, new test, hotspot tier1, jdk-submit), I am going to push >> it, if there are no objections -- it is the XX version, but with updated copyrights: >> http://cr.openjdk.java.net/~shade/8216302/webrev.05/ >> > > +1 Okay, pushed. Thanks everyone. -Aleksey From sangheon.kim at oracle.com Wed Jan 9 19:33:46 2019 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Wed, 9 Jan 2019 11:33:46 -0800 Subject: RFR (S/M): 8213827: NUMA heap allocation does not respect process membind/interleave settings [Was: Re: [PATCH] JDK NUMA Interleaving issue] In-Reply-To: References: <6e5b102d07b4ceded09115a649be020410240fe7.camel@oracle.com> <9bea7b0957bbfc2f0ac34306ee162f2d98e44bfe.camel@oracle.com> <99164b92f47f264978339ed327da9d41098a7e1d.camel@oracle.com> <10ecfa0f-eb78-869a-4d5a-991f55ec57ea@oracle.com> Message-ID: <3b8edd37-80cd-0f06-55ed-326972db98de@oracle.com> Hi Amith, On 1/9/19 3:30 AM, amith pawar wrote: > Hi Sangheon, > > Thanks for reviewing and updated with suggested changes. please check. Thank you for addressing my comments. But I can't see below comments addressed: >> - Looking at 'enum' at os.hpp, we use Camel style. I meant to change from 'Numa_allocation_policy' to 'NumaAllocationPolicy'. >> - As usual, copyright year updates. I know it was correct when you >> posted. :) Looking at the latest source code, only os_linux.hpp needs a new copyright year. - * Copyright (c) 1999, 2018, Oracle and/or its affiliates. All rights reserved. + * Copyright (c) 1999, 2019, Oracle and/or its affiliates. All rights reserved. Looking at the v5, + ls.print("UseNUMA is enabled and invoked in '%s' mode." + " Heap will be configured using NUMA memory nodes:", numa_mode); There is one more space before " Heap.... ", please remove it. I see the latest version that Thomas posted is v3, but your attached version is v5. :) In addition, it would be better to provide webrev instead of a patch.
( http://openjdk.java.net/guide/codeReview.html ) Thanks, Sangheon > > Thanks, > Amit Pawar > > On Wed, Jan 9, 2019 at 12:45 AM > wrote: > > Hi Thomas, > > On 12/13/18 2:33 AM, Thomas Schatzl wrote: >> Hi Amit, >> On Thu, 2018-12-13 at 15:11 +0530, amith pawar wrote: >>> Hi Thomas, >>> >>> Please find the attached patch updated as per your suggestion. >>> If everything is OK then can you please commit this to repo ? >> looks good. We will need a second reviewer though, I am going to ask >> around. >> >> Latest webrev: >> http://cr.openjdk.java.net/~tschatzl/8213827/webrev.3/ >> > Webrev.3 looks good to me. > > I have some minor nits: > ---------------------------------------- > src/hotspot/os/linux/os_linux.cpp > 5012 for (int node = 0; node < Linux::numa_max_node(); node++) { > - Looks like 'node <= Linux::numa_max_node()' is the right one > to print the latest node? > > ---------------------------------------- > src/hotspot/os/linux/os_linux.hpp > 271 enum Numa_allocation_policy{ > - Looking at 'enum' at os.hpp, we use Camel style. > - There is a missing space before '{'. > > - As usual, copyright year updates. I know it was correct when you > posted. :) > > Thanks, > Sangheon > > >> Thanks, >> Thomas >> >> > > > > -- > With best regards, > amit pawar From david.holmes at oracle.com Wed Jan 9 21:28:49 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 10 Jan 2019 07:28:49 +1000 Subject: RFR 8207964: [TESTBUG] Change stressTime to default to 30 for nsk tests In-Reply-To: References: Message-ID: Hi Harold, cc'd serviceability as a lot of nsk tests are in that area. On 10/01/2019 4:56 am, Harold David Seigel wrote: > Hi, > > Please review this fix to change the default stress time for hotspot > vmTestbase tests from 60 seconds to 30 seconds. Which tests actually use this default value? > Open Webrev: > http://cr.openjdk.java.net/~hseigel/bug_8207964/webrev/index.html The actual change to the default appears correct.
I just don't know what impact this is going to have on any actual tests. Thanks, David > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8207964 > > The fix was tested by running Mach5 hotspot tiers 1-5 on Linux-x64, > Windows, Solaris, and Mac OS X. > > Thanks, Harold > From coleen.phillimore at oracle.com Wed Jan 9 21:46:55 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 9 Jan 2019 16:46:55 -0500 Subject: RFR 8207964: [TESTBUG] Change stressTime to default to 30 for nsk tests In-Reply-To: References: Message-ID: <895d8aeb-6ee6-492c-bc6f-805e62f530b1@oracle.com> On 1/9/19 4:28 PM, David Holmes wrote: > Hi Harold, > > cc'd serviceability as a lot of nsk tests are in that area. > > On 10/01/2019 4:56 am, Harold David Seigel wrote: >> Hi, >> >> Please review this fix to change the default stress time for hotspot >> vmTestbase tests from 60 seconds to 30 seconds. > > Which tests actually use this default value? All tests that don't pass -stressTime eg: vmTestbase/metaspace/stressHierarchy tests. The closed tests also don't pass -stressTime except a couple. Some of the GC unloading tests pass -stressTime 180. > >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8207964/webrev/index.html > > The actual change to the default appears correct. > > I just don't know what impact this is going to have on any actual tests. > I ran some of these manually setting -stressTime 30 like vmTestbase/metaspace/stressDictionary/StressDictionary.java and it ran enough iterations to do the work needed in the test. That's why I suggested making the default lower. This change looks good to me. Thanks! Coleen > Thanks, > David > >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8207964 >> >> The fix was tested by running Mach5 hotspot tiers 1-5 on Linux-x64, >> Windows, Solaris, and Mac OS X.
>> >> Thanks, Harold >> From david.holmes at oracle.com Wed Jan 9 21:50:12 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 10 Jan 2019 07:50:12 +1000 Subject: RFR 8207964: [TESTBUG] Change stressTime to default to 30 for nsk tests In-Reply-To: <895d8aeb-6ee6-492c-bc6f-805e62f530b1@oracle.com> References: <895d8aeb-6ee6-492c-bc6f-805e62f530b1@oracle.com> Message-ID: <145a4f18-18f8-f662-9af0-936d7eea6db8@oracle.com> On 10/01/2019 7:46 am, coleen.phillimore at oracle.com wrote: > On 1/9/19 4:28 PM, David Holmes wrote: >> Hi Harold, >> >> cc'd serviceability as a lot of nsk tests are in that area. >> >> On 10/01/2019 4:56 am, Harold David Seigel wrote: >>> Hi, >>> >>> Please review this fix to change the default stress time for hotspot >>> vmTestbase tests from 60 seconds to 30 seconds. >> >> Which tests actually use this default value? > > All tests that don't pass -stressTime eg: > vmTestbase/metaspace/stressHierarchy tests. > The closed tests also don't pass -stressTime except a couple. > > Some of the GC unloading tests pass -stressTime 180. Let me rephrase my question :) What nsk tests actually use stressTime to control their execution? I can't tell if we are affecting 10 tests or 10,000 with this change. :) Thanks, David >> >>> Open Webrev: >>> http://cr.openjdk.java.net/~hseigel/bug_8207964/webrev/index.html >> >> The actual change to the default appears correct. >> >> I just don't know what impact this is going to have on any actual tests. >> > > I ran some of these manually setting -stressTime 30 like > vmTestbase/metaspace/stressDictionary/StressDictionary.java and it ran > enough iterations to do the work needed in the test. That's why I > suggested making the default lower. > > This change looks good to me. > Thanks! > Coleen >> Thanks, >> David >> >>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8207964 >>> >>> The fix was tested by running Mach5 hotspot tiers 1-5 on Linux-x64, >>> Windows, Solaris, and Mac OS X.
>>> >>> Thanks, Harold >>> > From david.holmes at oracle.com Thu Jan 10 01:00:25 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 10 Jan 2019 11:00:25 +1000 Subject: RFR (S): 8214816: os::read() should not transition to _thread_blocked with safepoint check on Solaris Message-ID: Bug: https://bugs.openjdk.java.net/browse/JDK-8214816 webrev: http://cr.openjdk.java.net/~dholmes/8214816/webrev/ Please see the bug report for detailed background. In short summary all platforms now have the same os::read and os::read_at that doesn't do safepoint checks. Most of the changes are code deletions: - removed unused os::restartable_read() method - removed os::read from all os_*.cpp files and added shared inline definition in os.inline.hpp (checked all callsites already have os.inline.hpp included) - removed os::read_at from all non-Windows os_*.cpp files and added shared definition in os_posix.cpp (simple wrapper to pread()) - fixed the return type of os::read and os::read_at to be ssize_t not size_t. Also fixed os::read error handling in src/hotspot/share/compiler/directivesParser.cpp, and filed JDK-8216461 to have a JFR usage of os::read_at fixed. Also changed src/hotspot/share/runtime/arguments.cpp to use os::read as it no longer needs to avoid the thread-state-transition. Arguably we could go the other way here and remove os::read completely and use the native ::read on all platforms - there are already uses of ::read elsewhere in the code. 
Testing: Mach5 tiers 1 - 3 Thanks, David From david.holmes at oracle.com Thu Jan 10 01:54:33 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 10 Jan 2019 11:54:33 +1000 Subject: RFR 8207964: [TESTBUG] Change stressTime to default to 30 for nsk tests In-Reply-To: <145a4f18-18f8-f662-9af0-936d7eea6db8@oracle.com> References: <895d8aeb-6ee6-492c-bc6f-805e62f530b1@oracle.com> <145a4f18-18f8-f662-9af0-936d7eea6db8@oracle.com> Message-ID: On 10/01/2019 7:50 am, David Holmes wrote: > On 10/01/2019 7:46 am, coleen.phillimore at oracle.com wrote: >> On 1/9/19 4:28 PM, David Holmes wrote: >>> Hi Harold, >>> >>> cc'd serviceability as a lot of nsk tests are in that area. >>> >>> On 10/01/2019 4:56 am, Harold David Seigel wrote: >>>> Hi, >>>> >>>> Please review this fix to change the default stress time for hotspot >>>> vmTestbase tests from 60 seconds to 30 seconds. >>> >>> Which tests actually use this default value? >> >> All tests that don't pass -stressTime eg: >> vmTestbase/metaspace/stressHierarchy tests. >> The closed tests also don't pass -stressTime except a couple. >> >> Some of the GC unloading tests pass -stressTime 180. > > Let me rephrase my question :) What nsk tests actually use stressTime to > control their execution? I can't tell if we are affecting 10 tests or 10,000 > with this change. :) Poking around a bit this is a hard question to answer. The stressTime is used by the Stresser (AFAICS) and I see 48 uses of the Stresser in the various test support files, but those files can in turn be used by multiple tests. Anyway this wasn't a blocking query. 30 seconds should be long enough in general, and a number of tests explicitly bump the time up. This may help reduce overall test execution time. It will be interesting to see if this impacts timeouts. Thanks, David ----- > > Thanks, > David > >>> >>>> Open Webrev: >>>> http://cr.openjdk.java.net/~hseigel/bug_8207964/webrev/index.html >>> >>> The actual change to the default appears correct.
>>> >>> I just don't know what impact this is going to have on any actual tests. >>> >> >> I ran some of these manually setting -stressTime 30 like >> vmTestbase/metaspace/stressDictionary/StressDictionary.java and it ran >> enough iterations to do the work needed in the test. That's why I >> suggested making the default lower. >> >> This change looks good to me. >> Thanks! >> Coleen >>> Thanks, >>> David >>> >>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8207964 >>>> >>>> The fix was tested by running Mach5 hotspot tiers 1-5 on Linux-x64, >>>> Windows, Solaris, and Mac OS X. >>>> >>>> Thanks, Harold >>>> >> From jiangli.zhou at oracle.com Thu Jan 10 05:27:52 2019 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Wed, 9 Jan 2019 21:27:52 -0800 Subject: RFR (S): 8214816: os::read() should not transition to _thread_blocked with safepoint check on Solaris In-Reply-To: References: Message-ID: <8a5c5305-a624-3994-e681-d847671f0772@oracle.com> Hi David, The history lesson on os::read is fascinating! The cleanup looks nice to me. The copyright year in src/hotspot/os/windows/os_windows.cpp also needs an update. Thanks, Jiangli On 1/9/19 5:00 PM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8214816 > webrev: http://cr.openjdk.java.net/~dholmes/8214816/webrev/ > > Please see the bug report for detailed background. In short summary > all platforms now have the same os::read and os::read_at that doesn't > do safepoint checks. Most of the changes are code deletions: > > - removed unused os::restartable_read() method > - removed os::read from all os_*.cpp files and added shared inline > definition in os.inline.hpp (checked all callsites already have > os.inline.hpp included) > - removed os::read_at from all non-Windows os_*.cpp files and added > shared definition in os_posix.cpp (simple wrapper to pread()) > - fixed the return type of os::read and os::read_at to be ssize_t not > size_t.
> > Also fixed os::read error handling in > src/hotspot/share/compiler/directivesParser.cpp, and filed JDK-8216461 > to have a JFR usage of os::read_at fixed. > > Also changed src/hotspot/share/runtime/arguments.cpp to use os::read > as it no longer needs to avoid the thread-state-transition. Arguably > we could go the other way here and remove os::read completely and use > the native ::read on all platforms - there are already uses of ::read > elsewhere in the code. > > Testing: Mach5 tiers 1 - 3 > > Thanks, > David From amith.pawar at gmail.com Thu Jan 10 11:18:59 2019 From: amith.pawar at gmail.com (amith pawar) Date: Thu, 10 Jan 2019 16:48:59 +0530 Subject: RFR (S/M): 8213827: NUMA heap allocation does not respect process membind/interleave settings [Was: Re: [PATCH] JDK NUMA Interleaving issue] In-Reply-To: <3b8edd37-80cd-0f06-55ed-326972db98de@oracle.com> References: <6e5b102d07b4ceded09115a649be020410240fe7.camel@oracle.com> <9bea7b0957bbfc2f0ac34306ee162f2d98e44bfe.camel@oracle.com> <99164b92f47f264978339ed327da9d41098a7e1d.camel@oracle.com> <10ecfa0f-eb78-869a-4d5a-991f55ec57ea@oracle.com> <3b8edd37-80cd-0f06-55ed-326972db98de@oracle.com> Message-ID: Hi Sangheon, Thanks again. I have done the required changes and created webrev. Please use following link to download the same as gmail is not allowing to attach. https://drive.google.com/open?id=1QzmW6LdmKbBNHp4-hlcIr9DKY7anUMBy Thanks, Amit On Thu, Jan 10, 2019 at 1:03 AM wrote: > Hi Amith, > > On 1/9/19 3:30 AM, amith pawar wrote: > > Hi Sangheon, > > Thanks for reviewing and updated with suggested changes. please check. > > Thank you for addressing my comments. > But I can't see below comments addressed: > > - Looking at 'enum' at os.hpp, we use Camel style. > > I meant to change from 'Numa_allocation_policy' to 'NumaAllocationPolicy'. > > - As usual, copyright year updates. I know it was correct when you posted. > :) > > Looking at the latest source code, only os_linux.hpp needs a new copyright > year. 
> - * Copyright (c) 1999, 2018, Oracle and/or its affiliates. All rights > reserved. > + * Copyright (c) 1999, 2019, Oracle and/or its affiliates. All rights > reserved. > > Looking at the v5, > + ls.print("UseNUMA is enabled and invoked in '%s' mode." > + " Heap will be configured using NUMA memory nodes:", > numa_mode); > There is one more space before " Heap.... ", please remove it. > > I see the latest version that Thomas posted is v3, but your attached > version is v5. :) > > In addition, it would be better to provide webrev instead of a patch. ( > http://openjdk.java.net/guide/codeReview.html ) > > Thanks, > Sangheon > > > Thanks, > Amit Pawar > > On Wed, Jan 9, 2019 at 12:45 AM wrote: > >> Hi Thomas, >> >> On 12/13/18 2:33 AM, Thomas Schatzl wrote: >> >> Hi Amit, >> On Thu, 2018-12-13 at 15:11 +0530, amith pawar wrote: >> >> Hi Thomas, >> >> Please find the attached patch updated as per your suggestion. >> If everything is OK then can you please commit this to repo ? >> >> looks good. We will need a second reviewer though, I am going to ask >> around. >> >> Latest webrev: http://cr.openjdk.java.net/~tschatzl/8213827/webrev.3/ >> >> Webrev.3 looks good to me. >> >> I have some minor nits: >> ---------------------------------------- >> src/hotspot/os/linux/os_linux.cpp >> 5012 for (int node = 0; node < Linux::numa_max_node(); node++) { >> - Looks like 'node <= Linux::numa_max_node()' is the right one to >> print the latest node? >> >> ---------------------------------------- >> src/hotspot/os/linux/os_linux.hpp >> 271 enum Numa_allocation_policy{ >> - Looking at 'enum' at os.hpp, we use Camel style. >> - There is a missing space before '{'. >> >> - As usual, copyright year updates. I know it was correct when you >> posted.
:) >> >> Thanks, >> Sangheon >> >> >> Thanks, >> Thomas >> >> >> >> >> > > -- > With best regards, > amit pawar > > > -- With best regards, amit pawar From coleen.phillimore at oracle.com Thu Jan 10 12:34:48 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 10 Jan 2019 07:34:48 -0500 Subject: RFR (also tedious) 8216167: Update include guards to reflect correct directories In-Reply-To: <5572e32f-ab2f-f02d-533a-535896612e74@oracle.com> References: <1284b762-6bba-4154-2b69-4433a1ccbb1e@oracle.com> <696310B2-19C7-45BE-A52E-C121D7CBB3D2@oracle.com> <5572e32f-ab2f-f02d-533a-535896612e74@oracle.com> Message-ID: No takers? This fixes the include guard to match the file name in 1540 header files. If you add a header file, please use the file name after src/hotspot as the include guard name from now on (exclude the VM that used to be there). http://cr.openjdk.java.net/~coleenp/8216167.01.diff.02 Thanks, Coleen On 1/4/19 6:19 PM, coleen.phillimore at oracle.com wrote: > > > On 1/4/19 4:52 PM, Kim Barrett wrote: >>> On Jan 4, 2019, at 10:36 AM, coleen.phillimore at oracle.com wrote: >>> >>> Summary: Use script and some manual fixup to fix directory names in >>> include guards. >>> >>> Makes include guards match the current directory rooted at >>> src/hotspot (removes VM_ in most cases). >>> >>> This should be low risk. Tested with mach5 tier1 and tier2. >>> >>> https://bugs.openjdk.java.net/browse/JDK-8216167 >>> http://cr.openjdk.java.net/~coleenp/8216167.01.diff >>> >>> I didn't generate a webrev as a space concern for >>> cr.openjdk.java.net and nobody should click on it. Script is posted >>> in bug. Will update and check copyright headers with hg commit. >>> >>> Thanks, >>> Coleen >> There are incorrect changes in >> src/hotspot/cpu/arm/globalDefinitions_arm.hpp > > Thank you for finding this. >> The script is not being careful to *only* modify #include guards. I >> didn't look for other similar problems.
>> (This is an example of why I suggested rolling your own script might >> not actually be easier than using >> the guardonce utilities.) >> > I looked through again and didn't see any other problems. I'm not > planning on productizing my script, which admittedly is too simple for > the general job. But I still found it useful and entertaining (to me) > for this particular task. > > http://cr.openjdk.java.net/~coleenp/8216167.01.diff.02 > > Thanks, > Coleen From nils.eliasson at oracle.com Thu Jan 10 14:28:13 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Thu, 10 Jan 2019 15:28:13 +0100 Subject: RFR: 8215889: assert(!_unloading) failed: This oop is not available to unloading class loader data with ZGC In-Reply-To: References: Message-ID: <2fa54159-1a67-2210-d2be-4cc97177d012@oracle.com> Hi Erik, Nice work! Looks good. // Nils On 2019-01-07 10:51, Erik Österlund wrote: > Hi, > > There are SpeculativeTrapData entries in the extra data space of MDOs > that are currently not being checked for stale Method* entries due to > concurrent class unloading. > > The fix involves lazily cleaning SpeculativeTrapData entries during > ciMethodData::load_extra_data(), which unpacks the extra data from the > source MDO to the ci copy of the MDO, that the compiler subsequently > uses as reference during the ongoing compilation, and needs to have > live metadata only. > > A new ciMethodData::prepare_metadata() method is added to ci MDO > mirrors that lazily cleans the extra data space and pre-caches the > ciEnv with all the metadata it encounters. When creating ciMethod > handles, the Compile_lock might be taken, which strictly requires > safepoint checking.
Therefore, prepare_metadata() loops until it can > pre-cache all live metadata without any cache misses, because that > implies the subsequent code copying the MDO can not safepoint while > extracting the extra data from the MDO, which is a requirement as 1) a > safepoint may invalidate the metadata again, 2) both the cleaning > (from the concurrent GC thread) and extraction (from the compiler > thread) must be done under the mdo->extra_data_lock(). > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8215889 > > Webrev: > http://cr.openjdk.java.net/~eosterlund/8215889/webrev.00/ > > Testing: hs-tier1-6, and a bunch of local testing, including 24 hours > kitchensink in fastdebug. > > Thanks, > /Erik From erik.osterlund at oracle.com Thu Jan 10 14:56:43 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 10 Jan 2019 15:56:43 +0100 Subject: RFR (also tedious) 8216167: Update include guards to reflect correct directories In-Reply-To: References: <1284b762-6bba-4154-2b69-4433a1ccbb1e@oracle.com> <696310B2-19C7-45BE-A52E-C121D7CBB3D2@oracle.com> <5572e32f-ab2f-f02d-533a-535896612e74@oracle.com> Message-ID: Hi Coleen, share/gc/shared/barrierSet.inline.hpp Was added recently, needs an update. share/prims/wbtestmethods/parserTests.hpp Still has VM prefix in include guard, and lacks a whitespace in the #endif part. share/gc/parallel/psGenerationCounters.hpp Has a leading newline, which I think should be removed if we are fixing incorrect trailing newlines. share/opto/adlcVMDeps.hpp Seems to do something wrong now. This file is included both by libjvm.so and the adlc parser binary. 
It conditionally includes memory/allocation.hpp if it is built in
libjvm.so, which is figured out by checking if the adlc arena allocation
header has already been included or not with the following hack:

#ifndef SHARE_VM_ADLC_ARENA_HPP
#include "memory/allocation.hpp"
#endif

You changed SHARE_VM_ADLC_ARENA_HPP here to SHARE_OPTO_ADLCVMDEPS_HPP in
the patch you posted, probably because your script thought it was an
include guard. But it is not. It's an ugly hack.

So with the changes you proposed, the condition would always be true,
which is not intended. It should check for SHARE_ADLC_ARENA_HPP instead
now.

Otherwise this looks... I don't know... okay. But #pragma once would have
been so much better. :c

Thanks,
/Erik

On 2019-01-10 13:34, coleen.phillimore at oracle.com wrote:
>
> No takers? This fixes the include guard to match the file name in
> 1540 header files.
>
> If you add a header file, please use the file name after src/hotspot
> as the include guard name from now on (exclude the VM that used to be
> there).
>
> http://cr.openjdk.java.net/~coleenp/8216167.01.diff.02
>
> Thanks,
> Coleen
>
> On 1/4/19 6:19 PM, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 1/4/19 4:52 PM, Kim Barrett wrote:
>>>> On Jan 4, 2019, at 10:36 AM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Summary: Use script and some manual fixup to fix directores names
>>>> in include guards.
>>>>
>>>> Makes include guards match the current directory rooted at
>>>> src/hotspot (removes VM_ in most cases).
>>>>
>>>> This should be low risk. Tested with mach5 tier1 and tier2.
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8216167
>>>> http://cr.openjdk.java.net/~coleenp/8216167.01.diff
>>>>
>>>> I didn't generate a webrev as a space concern for
>>>> cr.openjdk.java.net and nobody should click on it. Script is
>>>> posted in bug. Will update and check copyright headers with hg
>>>> commit.
>>>>
>>>> Thanks,
>>>> Coleen
>>> There are incorrect changes in
>>> src/hotspot/cpu/arm/globalDefinitions_arm.hpp
>>
>> Thank you for finding this.
>>> The script is not being careful to *only* modify #include guards. I
>>> didn't look for other similar problems.
>>> (This is an example of why I suggested rolling your own script might
>>> not actually be easier than using
>>> the guardonce utilities.)
>>>
>> I looked through again and didn't see any other problems. I'm not
>> planning on productizing my script, which admittedly is too simple
>> for the general job. But I still found it useful and entertaining
>> (to me) for this particular task.
>>
>> http://cr.openjdk.java.net/~coleenp/8216167.01.diff.02
>>
>> Thanks,
>> Coleen
>

From erik.osterlund at oracle.com Thu Jan 10 14:57:15 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Thu, 10 Jan 2019 15:57:15 +0100
Subject: RFR: 8215889: assert(!_unloading) failed: This oop is not available to unloading class loader data with ZGC
In-Reply-To: <2fa54159-1a67-2210-d2be-4cc97177d012@oracle.com>
References: <2fa54159-1a67-2210-d2be-4cc97177d012@oracle.com>
Message-ID: <655d5eac-93fd-7bd4-05f2-0058b2d5f5c8@oracle.com>

Hi Nils,

Thanks for the review.

/Erik

On 2019-01-10 15:28, Nils Eliasson wrote:
> Hi Erik,
>
> Nice work! Looks good.
>
> // Nils
>
> On 2019-01-07 10:51, Erik Österlund wrote:
>> Hi,
>>
>> There are SpeculativeTrapData entries in the extra data space of MDOs
>> that are currently not being checked for stale Method* entries due to
>> concurrent class unloading.
>>
>> The fix involves lazily cleaning SpeculativeTrapData entries during
>> ciMethodData::load_extra_data(), which unpacks the extra data from
>> the source MDO to the ci copy of the MDO, that the compiler
>> subsequently uses as reference during the ongoing compilation, and
>> needs to have live metadata only.
>>
>> A new ciMethodData::prepare_metadata() method is added to ci MDO
>> mirrors that lazily cleans the extra data space and pre-caches the
>> ciEnv with all the metadata it encounters. When creating ciMethod
>> handles, the Compile_lock might be taken, which strictly requires
>> safepoint checking. Therefore, prepare_metadata() loops until it can
>> pre-cache all live metadata without any cache misses, because that
>> implies the subsequent code copying the MDO can not safepoint while
>> extracting the extra data from the MDO, which is a requirement as 1)
>> a safepoint may invalidate the metadata again, 2) both the cleaning
>> (from the concurrent GC thread) and extraction (from the compiler
>> thread) must be done under the mdo->extra_data_lock().
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8215889
>>
>> Webrev:
>> http://cr.openjdk.java.net/~eosterlund/8215889/webrev.00/
>>
>> Testing: hs-tier1-6, and a bunch of local testing, including 24 hours
>> kitchensink in fastdebug.
>>
>> Thanks,
>> /Erik

From harold.seigel at oracle.com Thu Jan 10 15:12:43 2019
From: harold.seigel at oracle.com (Harold David Seigel)
Date: Thu, 10 Jan 2019 10:12:43 -0500
Subject: RFR 8207964: [TESTBUG] Change stressTime to default to 30 for nsk tests
In-Reply-To:
References: <895d8aeb-6ee6-492c-bc6f-805e62f530b1@oracle.com>
 <145a4f18-18f8-f662-9af0-936d7eea6db8@oracle.com>
Message-ID: <838900c6-9422-b315-3abe-8d7ef8018977@oracle.com>

Hi Coleen, David,

Thanks for the reviews! I'll push the change and then see if I can
determine how many tests are affected by this change.

Thanks, Harold

On 1/9/2019 8:54 PM, David Holmes wrote:
> On 10/01/2019 7:50 am, David Holmes wrote:
>> On 10/01/2019 7:46 am, coleen.phillimore at oracle.com wrote:
>>> On 1/9/19 4:28 PM, David Holmes wrote:
>>>> Hi Harold,
>>>>
>>>> cc'd serviceability as a lot of nsk tests are in that area.
>>>>
>>>> On 10/01/2019 4:56 am, Harold David Seigel wrote:
>>>>> Hi,
>>>>>
>>>>> Please review this fix to change the default stress time for
>>>>> hotspot vmTestbase tests from 60 seconds to 30 seconds.
>>>>
>>>> Which tests actually use this default value?
>>>
>>> All tests that don't pass -stressTime eg:
>>> vmTestbase/metaspace/stressHierarchy tests.
>>> The closed tests also don't pass -stressTime except a couple.
>>>
>>> Some of the GC unloading tests pass -stressTime 180.
>>
>> Let me rephrase my question :) What nsk tests actually use stressTime
>> to control their execution? I can't tell if we affecting 10 tests or
>> 10,000 with this change. :)
>
> Poking around a bit this is a hard question to answer. The stressTime
> is used by the Stresser (AFAICS) and I see 48 uses of the Stresser in
> the various test support files, but those files can in turn be used by
> multiple tests.
>
> Anyway this wasn't a blocking query. 30 seconds should be long enough
> in general, and a number of tests explicitly bump the time up. This
> may help reduce overall test execution time.
>
> It will be interesting to see if this impacts timeouts.
>
> Thanks,
> David
> -----
>
>>
>> Thanks,
>> David
>>
>>>>
>>>>> Open Webrev:
>>>>> http://cr.openjdk.java.net/~hseigel/bug_8207964/webrev/index.html
>>>>
>>>> The actual change to the default appears correct.
>>>>
>>>> I just don't know what impact this is going to have on any actual
>>>> tests.
>>>>
>>>
>>> I ran some of these manually setting -stressTime 30 like
>>> vmTestbase/metaspace/stressDictionary/StressDictionary.java and it
>>> ran enough iterations to do the work needed in the test. That's why
>>> I suggested making the default lower.
>>>
>>> This change looks good to me.
>>> Thanks!
>>> Coleen
>>>> Thanks,
>>>> David
>>>>
>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8207964
>>>>>
>>>>> The fix was tested by running Mach5 hotspot tiers 1-5 on
>>>>> Linux-x64, Windows, Solaris, and Mac OS X.
>>>>>
>>>>> Thanks, Harold
>>>>>
>>>

From shade at redhat.com Thu Jan 10 15:21:20 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 10 Jan 2019 16:21:20 +0100
Subject: RFR (S) 8216308: StackTraceElement::fill_in can use injected Class source-file
Message-ID: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com>

RFE:
  https://bugs.openjdk.java.net/browse/JDK-8216308

Fix:
  http://cr.openjdk.java.net/~shade/8216308/webrev.01/

This is another patch that removes the use of SymbolTable on hot path in
stack trace creation. We can inject Class.source_file field to cache the
source file name. Some caution is needed to properly handle invalidation
when redefinition happens. This makes stack trace generation
significantly faster, and finally better than it used to even before
StackWalker and StringTable-related regressions in 9 and 11.

Benchmark             (depth)  Mode  Cnt    Score   Error  Units

# 8u
StackTraceBench.test        1  avgt   15   10.851 ± 0.075  us/op
StackTraceBench.test       10  avgt   15   15.325 ± 0.089  us/op
StackTraceBench.test      100  avgt   15   59.717 ± 0.449  us/op
StackTraceBench.test     1000  avgt   15  529.020 ± 3.654  us/op

# jdk/jdk baseline
StackTraceBench.test        1  avgt   15   15.077 ± 0.065  us/op
StackTraceBench.test       10  avgt   15   21.153 ± 0.123  us/op
StackTraceBench.test      100  avgt   15   80.758 ± 0.363  us/op
StackTraceBench.test     1000  avgt   15  674.888 ± 4.985  us/op

# jdk/jdk patched
StackTraceBench.test        1  avgt   15    8.892 ± 0.064  us/op
StackTraceBench.test       10  avgt   15   12.010 ± 0.079  us/op
StackTraceBench.test      100  avgt   15   43.091 ± 0.254  us/op
StackTraceBench.test     1000  avgt   15  353.194 ± 2.040  us/op

Testing: hotspot tier1, jdk-submit, ad-hoc benchmarks

Thanks,
-Aleksey

From coleen.phillimore at oracle.com Thu Jan 10 15:20:14 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 10 Jan 2019 10:20:14 -0500
Subject: RFR (also tedious) 8216167: Update include guards to reflect correct directories
In-Reply-To:
References: <1284b762-6bba-4154-2b69-4433a1ccbb1e@oracle.com>
 <696310B2-19C7-45BE-A52E-C121D7CBB3D2@oracle.com>
 <5572e32f-ab2f-f02d-533a-535896612e74@oracle.com>
Message-ID: <30ab40d1-b18c-15b5-438a-14a078ebbd9e@oracle.com>

On 1/10/19 9:56 AM, Erik Österlund wrote:
> Hi Coleen,
>
> share/gc/shared/barrierSet.inline.hpp Was added recently, needs an
> update.

Fixed by hand.

> share/prims/wbtestmethods/parserTests.hpp Still has VM prefix in
> include guard, and lacks a whitespace in the #endif part.

I already fixed this. I updated 02 with the new version so reload...
>
> share/gc/parallel/psGenerationCounters.hpp Has a leading newline,
> which I think should be removed if we are fixing incorrect trailing
> newlines.

Great, yes, I removed leading blank line.
> share/opto/adlcVMDeps.hpp Seems to do something wrong now. This file
> is included both by libjvm.so and the adlc parser binary. It
> conditionally includes memory/allocation.hpp if it is built in
> libjvm.so, which is figured out by checking if the adlc arena
> allocation header has already been included or not with the following
> hack:
>
> #ifndef SHARE_VM_ADLC_ARENA_HPP
> #include "memory/allocation.hpp"
> #endif
>
> You changed SHARE_VM_ADLC_ARENA_HPP here to SHARE_OPTO_ADLCVMDEPS_HPP
> in the patch you posted, probably because your script thought it was
> an include guard. But it is not. It's an ugly hack.
>
> So with the changes you proposed, the condition would always be true,
> which is not intended. It should check for SHARE_ADLC_ARENA_HPP
> instead now.

This file. I fixed it with the #pragma once change, then forgot which
one it was.
This file doesn't need to include allocation if it doesn't use AllStatic
so I changed it like this:

diff --git a/src/hotspot/share/opto/adlcVMDeps.hpp b/src/hotspot/share/opto/adlcVMDeps.hpp
--- a/src/hotspot/share/opto/adlcVMDeps.hpp
+++ b/src/hotspot/share/opto/adlcVMDeps.hpp
@@ -22,20 +22,18 @@
  *
  */

-#ifndef SHARE_VM_OPTO_ADLCVMDEPS_HPP
-#define SHARE_VM_OPTO_ADLCVMDEPS_HPP
+#ifndef SHARE_OPTO_ADLCVMDEPS_HPP
+#define SHARE_OPTO_ADLCVMDEPS_HPP
+

 // adlcVMDeps.hpp is used by both adlc and vm builds.
-// Only include allocation.hpp when we're not building adlc.
-#ifndef SHARE_VM_ADLC_ARENA_HPP
-#include "memory/allocation.hpp"
-#endif
+// Don't inherit from AllStatic to avoid including memory/allocation.hpp.

 // Declare commonly known constant and data structures between the
 // ADLC and the VM
 //

-class AdlcVMDeps : public AllStatic {
+class AdlcVMDeps {   // AllStatic
  public:
   // Mirror of TypeFunc types
   enum { Control, I_O, Memory, FramePtr, ReturnAdr, Parms };
@@ -52,4 +50,4 @@
   static const char* none_reloc_type() { return "relocInfo::none"; }
 };

-#endif // SHARE_VM_OPTO_ADLCVMDEPS_HPP
+#endif // SHARE_OPTO_ADLCVMDEPS_HPP

>
> Otherwise this looks... I don't know... okay. But #pragma once would
> have been so much better. :c

Yeah. I know. Thank you for looking through all of this.

Coleen
>
> Thanks,
> /Erik
>
> On 2019-01-10 13:34, coleen.phillimore at oracle.com wrote:
>>
>> No takers? This fixes the include guard to match the file name in
>> 1540 header files.
>>
>> If you add a header file, please use the file name after src/hotspot
>> as the include guard name from now on (exclude the VM that used to be
>> there).
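
[Editorial illustration, not part of the original message.] The naming
rule quoted above (guard name = path below src/hotspot, without the old
VM component) can be sketched as a small helper. The function name
include_guard_for is hypothetical, and the character mapping (upper-case
letters and digits, everything else becomes '_') is an assumption
inferred from the one pair visible in this thread,
share/opto/adlcVMDeps.hpp and SHARE_OPTO_ADLCVMDEPS_HPP. It is not the
script posted in the bug.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: derive an include guard name from a header path.
// Assumed rule: drop a leading "src/hotspot/", upper-case alphanumerics,
// and turn every other character ('/', '.', '_') into '_'.
inline std::string include_guard_for(const std::string& path) {
  const std::string prefix = "src/hotspot/";
  // Strip the prefix only when the path actually starts with it.
  const std::size_t start =
      (path.compare(0, prefix.size(), prefix) == 0) ? prefix.size() : 0;
  std::string guard;
  guard.reserve(path.size() - start);
  for (std::size_t i = start; i < path.size(); ++i) {
    const unsigned char c = static_cast<unsigned char>(path[i]);
    guard.push_back(std::isalnum(c) ? static_cast<char>(std::toupper(c)) : '_');
  }
  return guard;
}
```

Under these assumptions, include_guard_for("src/hotspot/share/opto/adlcVMDeps.hpp")
yields SHARE_OPTO_ADLCVMDEPS_HPP, matching the guard in the diff above.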
>>
>> http://cr.openjdk.java.net/~coleenp/8216167.01.diff.02
>>
>> Thanks,
>> Coleen
>>
>> On 1/4/19 6:19 PM, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>> On 1/4/19 4:52 PM, Kim Barrett wrote:
>>>>> On Jan 4, 2019, at 10:36 AM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> Summary: Use script and some manual fixup to fix directores names
>>>>> in include guards.
>>>>>
>>>>> Makes include guards match the current directory rooted at
>>>>> src/hotspot (removes VM_ in most cases).
>>>>>
>>>>> This should be low risk. Tested with mach5 tier1 and tier2.
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8216167
>>>>> http://cr.openjdk.java.net/~coleenp/8216167.01.diff
>>>>>
>>>>> I didn't generate a webrev as a space concern for
>>>>> cr.openjdk.java.net and nobody should click on it. Script is
>>>>> posted in bug. Will update and check copyright headers with hg
>>>>> commit.
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>> There are incorrect changes in
>>>> src/hotspot/cpu/arm/globalDefinitions_arm.hpp
>>>
>>> Thank you for finding this.
>>>> The script is not being careful to *only* modify #include guards.
>>>> I didn't look for other similar problems.
>>>> (This is an example of why I suggested rolling your own script
>>>> might not actually be easier than using
>>>> the guardonce utilities.)
>>>>
>>> I looked through again and didn't see any other problems. I'm not
>>> planning on productizing my script, which admittedly is too simple
>>> for the general job. But I still found it useful and entertaining
>>> (to me) for this particular task.
>>>
>>> http://cr.openjdk.java.net/~coleenp/8216167.01.diff.02
>>>
>>> Thanks,
>>> Coleen
>>
>

From erik.osterlund at oracle.com Thu Jan 10 15:29:58 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Thu, 10 Jan 2019 16:29:58 +0100
Subject: RFR (also tedious) 8216167: Update include guards to reflect correct directories
In-Reply-To: <30ab40d1-b18c-15b5-438a-14a078ebbd9e@oracle.com>
References: <1284b762-6bba-4154-2b69-4433a1ccbb1e@oracle.com>
 <696310B2-19C7-45BE-A52E-C121D7CBB3D2@oracle.com>
 <5572e32f-ab2f-f02d-533a-535896612e74@oracle.com>
 <30ab40d1-b18c-15b5-438a-14a078ebbd9e@oracle.com>
Message-ID: <6a1764d7-7423-d219-5633-1866d313efbb@oracle.com>

Hi Coleen,

Reviewed.

/Erik

On 2019-01-10 16:20, coleen.phillimore at oracle.com wrote:
>
>
> On 1/10/19 9:56 AM, Erik Österlund wrote:
>> Hi Coleen,
>>
>> share/gc/shared/barrierSet.inline.hpp Was added recently, needs an
>> update.
>
> Fixed by hand.
>
>> share/prims/wbtestmethods/parserTests.hpp Still has VM prefix in
>> include guard, and lacks a whitespace in the #endif part.
>
> I already fixed this. I updated 02 with the new version so reload...
>>
>> share/gc/parallel/psGenerationCounters.hpp Has a leading newline,
>> which I think should be removed if we are fixing incorrect trailing
>> newlines.
>
> Great, yes, I removed leading blank line.
>> share/opto/adlcVMDeps.hpp Seems to do something wrong now. This file
>> is included both by libjvm.so and the adlc parser binary. It
>> conditionally includes memory/allocation.hpp if it is built in
>> libjvm.so, which is figured out by checking if the adlc arena
>> allocation header has already been included or not with the following
>> hack:
>>
>> #ifndef SHARE_VM_ADLC_ARENA_HPP
>> #include "memory/allocation.hpp"
>> #endif
>>
>> You changed SHARE_VM_ADLC_ARENA_HPP here to SHARE_OPTO_ADLCVMDEPS_HPP
>> in the patch you posted, probably because your script thought it was
>> an include guard. But it is not. It's an ugly hack.
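
[Editorial illustration, not part of the original message.] The hack
described in the quoted review probes another header's include guard
macro to decide whether that header was already included. A minimal
sketch of why such a probe is fragile once guards are renamed: a probe
that still tests the old macro name never sees it defined. The constant
names below are hypothetical, and the first block only stands in for the
adlc arena header; SHARE_ADLC_ARENA_HPP is the new guard name the review
says the probe should test.

```cpp
#include <cassert>

// Stand-in for the adlc arena header after the rename. The review says its
// guard is now SHARE_ADLC_ARENA_HPP (previously SHARE_VM_ADLC_ARENA_HPP).
#ifndef SHARE_ADLC_ARENA_HPP
#define SHARE_ADLC_ARENA_HPP
#endif

// Probe written against the old guard name. Nothing defines that macro any
// more, so the #ifndef branch is taken even though the arena header is present.
#ifndef SHARE_VM_ADLC_ARENA_HPP
constexpr bool kStaleProbeWouldInclude = true;
#else
constexpr bool kStaleProbeWouldInclude = false;
#endif

// Probe updated to the current guard name: it sees the arena header.
#ifndef SHARE_ADLC_ARENA_HPP
constexpr bool kUpdatedProbeWouldInclude = true;
#else
constexpr bool kUpdatedProbeWouldInclude = false;
#endif

static_assert(kStaleProbeWouldInclude, "stale probe never sees the renamed guard");
static_assert(!kUpdatedProbeWouldInclude, "updated probe sees the arena header");
```

A probe like this has to be updated by hand whenever the probed header's
guard changes, which is the kind of dependency a guard-renaming script
cannot be expected to recognize on its own.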
>>
>> So with the changes you proposed, the condition would always be true,
>> which is not intended. It should check for SHARE_ADLC_ARENA_HPP
>> instead now.
>
> This file. I fixed it with the #pragma once change, then forgot which
> one it was. This file doesn't need to include allocation if it
> doesn't use AllStatic so I changed it like this:
>
> diff --git a/src/hotspot/share/opto/adlcVMDeps.hpp
> b/src/hotspot/share/opto/adlcVMDeps.hpp
> --- a/src/hotspot/share/opto/adlcVMDeps.hpp
> +++ b/src/hotspot/share/opto/adlcVMDeps.hpp
> @@ -22,20 +22,18 @@
>   *
>   */
>
> -#ifndef SHARE_VM_OPTO_ADLCVMDEPS_HPP
> -#define SHARE_VM_OPTO_ADLCVMDEPS_HPP
> +#ifndef SHARE_OPTO_ADLCVMDEPS_HPP
> +#define SHARE_OPTO_ADLCVMDEPS_HPP
> +
>
>  // adlcVMDeps.hpp is used by both adlc and vm builds.
> -// Only include allocation.hpp when we're not building adlc.
> -#ifndef SHARE_VM_ADLC_ARENA_HPP
> -#include "memory/allocation.hpp"
> -#endif
> +// Don't inherit from AllStatic to avoid including
> memory/allocation.hpp.
>
>  // Declare commonly known constant and data structures between the
>  // ADLC and the VM
>  //
>
> -class AdlcVMDeps : public AllStatic {
> +class AdlcVMDeps {   // AllStatic
>   public:
>    // Mirror of TypeFunc types
>    enum { Control, I_O, Memory, FramePtr, ReturnAdr, Parms };
> @@ -52,4 +50,4 @@
>    static const char* none_reloc_type() { return "relocInfo::none"; }
>  };
>
> -#endif // SHARE_VM_OPTO_ADLCVMDEPS_HPP
> +#endif // SHARE_OPTO_ADLCVMDEPS_HPP
>
>
>>
>> Otherwise this looks... I don't know... okay. But #pragma once would
>> have been so much better. :c
>
> Yeah. I know. Thank you for looking through all of this.
>
> Coleen
>>
>> Thanks,
>> /Erik
>>
>> On 2019-01-10 13:34, coleen.phillimore at oracle.com wrote:
>>>
>>> No takers? This fixes the include guard to match the file name in
>>> 1540 header files.
>>>
>>> If you add a header file, please use the file name after src/hotspot
>>> as the include guard name from now on (exclude the VM that used to
>>> be there).
>>>
>>> http://cr.openjdk.java.net/~coleenp/8216167.01.diff.02
>>>
>>> Thanks,
>>> Coleen
>>>
>>> On 1/4/19 6:19 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>>
>>>> On 1/4/19 4:52 PM, Kim Barrett wrote:
>>>>>> On Jan 4, 2019, at 10:36 AM, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>> Summary: Use script and some manual fixup to fix directores names
>>>>>> in include guards.
>>>>>>
>>>>>> Makes include guards match the current directory rooted at
>>>>>> src/hotspot (removes VM_ in most cases).
>>>>>>
>>>>>> This should be low risk. Tested with mach5 tier1 and tier2.
>>>>>>
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8216167
>>>>>> http://cr.openjdk.java.net/~coleenp/8216167.01.diff
>>>>>>
>>>>>> I didn't generate a webrev as a space concern for
>>>>>> cr.openjdk.java.net and nobody should click on it. Script is
>>>>>> posted in bug. Will update and check copyright headers with hg
>>>>>> commit.
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>> There are incorrect changes in
>>>>> src/hotspot/cpu/arm/globalDefinitions_arm.hpp
>>>>
>>>> Thank you for finding this.
>>>>> The script is not being careful to *only* modify #include guards.
>>>>> I didn't look for other similar problems.
>>>>> (This is an example of why I suggested rolling your own script
>>>>> might not actually be easier than using
>>>>> the guardonce utilities.)
>>>>>
>>>> I looked through again and didn't see any other problems. I'm not
>>>> planning on productizing my script, which admittedly is too simple
>>>> for the general job. But I still found it useful and entertaining
>>>> (to me) for this particular task.
>>>>
>>>> http://cr.openjdk.java.net/~coleenp/8216167.01.diff.02
>>>>
>>>> Thanks,
>>>> Coleen
>>>
>>
>

From coleen.phillimore at oracle.com Thu Jan 10 15:32:57 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 10 Jan 2019 10:32:57 -0500
Subject: RFR (also tedious) 8216167: Update include guards to reflect correct directories
In-Reply-To: <6a1764d7-7423-d219-5633-1866d313efbb@oracle.com>
References: <1284b762-6bba-4154-2b69-4433a1ccbb1e@oracle.com>
 <696310B2-19C7-45BE-A52E-C121D7CBB3D2@oracle.com>
 <5572e32f-ab2f-f02d-533a-535896612e74@oracle.com>
 <30ab40d1-b18c-15b5-438a-14a078ebbd9e@oracle.com>
 <6a1764d7-7423-d219-5633-1866d313efbb@oracle.com>
Message-ID:

Thank you, Erik!
Coleen

On 1/10/19 10:29 AM, Erik Österlund wrote:
> Hi Coleen,
>
> Reviewed.
>
> /Erik
>
> On 2019-01-10 16:20, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 1/10/19 9:56 AM, Erik Österlund wrote:
>>> Hi Coleen,
>>>
>>> share/gc/shared/barrierSet.inline.hpp Was added recently, needs an
>>> update.
>>
>> Fixed by hand.
>>
>>> share/prims/wbtestmethods/parserTests.hpp Still has VM prefix in
>>> include guard, and lacks a whitespace in the #endif part.
>>
>> I already fixed this. I updated 02 with the new version so reload...
>>>
>>> share/gc/parallel/psGenerationCounters.hpp Has a leading newline,
>>> which I think should be removed if we are fixing incorrect trailing
>>> newlines.
>>
>> Great, yes, I removed leading blank line.
>>> share/opto/adlcVMDeps.hpp Seems to do something wrong now. This file
>>> is included both by libjvm.so and the adlc parser binary. It
>>> conditionally includes memory/allocation.hpp if it is built in
>>> libjvm.so, which is figured out by checking if the adlc arena
>>> allocation header has already been included or not with the
>>> following hack:
>>>
>>> #ifndef SHARE_VM_ADLC_ARENA_HPP
>>> #include "memory/allocation.hpp"
>>> #endif
>>>
>>> You changed SHARE_VM_ADLC_ARENA_HPP here to
>>> SHARE_OPTO_ADLCVMDEPS_HPP in the patch you posted, probably because
>>> your script thought it was an include guard. But it is not. It's an
>>> ugly hack.
>>>
>>> So with the changes you proposed, the condition would always be
>>> true, which is not intended. It should check for
>>> SHARE_ADLC_ARENA_HPP instead now.
>>
>> This file. I fixed it with the #pragma once change, then forgot
>> which one it was. This file doesn't need to include allocation if
>> it doesn't use AllStatic so I changed it like this:
>>
>> diff --git a/src/hotspot/share/opto/adlcVMDeps.hpp
>> b/src/hotspot/share/opto/adlcVMDeps.hpp
>> --- a/src/hotspot/share/opto/adlcVMDeps.hpp
>> +++ b/src/hotspot/share/opto/adlcVMDeps.hpp
>> @@ -22,20 +22,18 @@
>>   *
>>   */
>>
>> -#ifndef SHARE_VM_OPTO_ADLCVMDEPS_HPP
>> -#define SHARE_VM_OPTO_ADLCVMDEPS_HPP
>> +#ifndef SHARE_OPTO_ADLCVMDEPS_HPP
>> +#define SHARE_OPTO_ADLCVMDEPS_HPP
>> +
>>
>>  // adlcVMDeps.hpp is used by both adlc and vm builds.
>> -// Only include allocation.hpp when we're not building adlc.
>> -#ifndef SHARE_VM_ADLC_ARENA_HPP
>> -#include "memory/allocation.hpp"
>> -#endif
>> +// Don't inherit from AllStatic to avoid including
>> memory/allocation.hpp.
>>
>>  // Declare commonly known constant and data structures between the
>>  // ADLC and the VM
>>  //
>>
>> -class AdlcVMDeps : public AllStatic {
>> +class AdlcVMDeps {   // AllStatic
>>   public:
>>    // Mirror of TypeFunc types
>>    enum { Control, I_O, Memory, FramePtr, ReturnAdr, Parms };
>> @@ -52,4 +50,4 @@
>>    static const char* none_reloc_type() { return "relocInfo::none"; }
>>  };
>>
>> -#endif // SHARE_VM_OPTO_ADLCVMDEPS_HPP
>> +#endif // SHARE_OPTO_ADLCVMDEPS_HPP
>>
>>
>>>
>>> Otherwise this looks... I don't know... okay. But #pragma once would
>>> have been so much better. :c
>>
>> Yeah. I know. Thank you for looking through all of this.
>>
>> Coleen
>>>
>>> Thanks,
>>> /Erik
>>>
>>> On 2019-01-10 13:34, coleen.phillimore at oracle.com wrote:
>>>>
>>>> No takers? This fixes the include guard to match the file name in
>>>> 1540 header files.
>>>>
>>>> If you add a header file, please use the file name after
>>>> src/hotspot as the include guard name from now on (exclude the VM
>>>> that used to be there).
>>>>
>>>> http://cr.openjdk.java.net/~coleenp/8216167.01.diff.02
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>> On 1/4/19 6:19 PM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>>
>>>>> On 1/4/19 4:52 PM, Kim Barrett wrote:
>>>>>>> On Jan 4, 2019, at 10:36 AM, coleen.phillimore at oracle.com wrote:
>>>>>>>
>>>>>>> Summary: Use script and some manual fixup to fix directores
>>>>>>> names in include guards.
>>>>>>>
>>>>>>> Makes include guards match the current directory rooted at
>>>>>>> src/hotspot (removes VM_ in most cases).
>>>>>>>
>>>>>>> This should be low risk. Tested with mach5 tier1 and tier2.
>>>>>>>
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8216167
>>>>>>> http://cr.openjdk.java.net/~coleenp/8216167.01.diff
>>>>>>>
>>>>>>> I didn't generate a webrev as a space concern for
>>>>>>> cr.openjdk.java.net and nobody should click on it. Script is
>>>>>>> posted in bug. Will update and check copyright headers with hg
>>>>>>> commit.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>> There are incorrect changes in
>>>>>> src/hotspot/cpu/arm/globalDefinitions_arm.hpp
>>>>>
>>>>> Thank you for finding this.
>>>>>> The script is not being careful to *only* modify #include
>>>>>> guards. I didn't look for other similar problems.
>>>>>> (This is an example of why I suggested rolling your own script
>>>>>> might not actually be easier than using
>>>>>> the guardonce utilities.)
>>>>>>
>>>>> I looked through again and didn't see any other problems. I'm not
>>>>> planning on productizing my script, which admittedly is too simple
>>>>> for the general job. But I still found it useful and entertaining
>>>>> (to me) for this particular task.
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/8216167.01.diff.02
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>
>>>
>>
>

From daniel.daugherty at oracle.com Thu Jan 10 15:57:11 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 10 Jan 2019 10:57:11 -0500
Subject: RFR(m): 8214271: Fast primitive to wake many threads
In-Reply-To:
References: <010211e3-93a6-80b9-678c-c84b08812e43@oracle.com>
 <70669453-e317-a30d-8d5a-e5b938b83c41@oracle.com>
 <4fb6cd22-cdd0-2419-c863-24b250ac0b16@oracle.com>
 <2a2679cc-b0e0-f8d0-7336-8666e1a42950@oracle.com>
 <01873a0f-a0fb-18b9-f7d4-98bb638e9b57@oracle.com>
 <41f5252b-3eb9-9a9e-70e5-49f6d8f9d670@oracle.com>
 <755aaf5b-8a49-ef5a-65ce-18550547a91b@oracle.com>
Message-ID: <2f31bdb7-71cd-be4d-7135-03dd4e9dd0b1@oracle.com>

> Including Dan's comments:
> Full: http://cr.openjdk.java.net/~rehn/8214271/6/full/webrev/
> Inc : http://cr.openjdk.java.net/~rehn/8214271/6/inc/webrev/

src/hotspot/os/linux/waitBarrier_linux.cpp
    No comments.

src/hotspot/os/linux/waitBarrier_linux.hpp
    No comments.

src/hotspot/share/utilities/waitBarrier.hpp
    No comments beyond David's.

src/hotspot/share/utilities/waitBarrier_generic.cpp
    No comments beyond David's.

src/hotspot/share/utilities/waitBarrier_generic.hpp
    No comments.

test/hotspot/gtest/utilities/test_waitBarrier.cpp
    No comments.

Thumbs up!

Dan

On 1/9/19 7:57 AM, Robbin Ehn wrote:
> Hi David,
>
> On 1/9/19 4:40 AM, David Holmes wrote:
>> Hi Robbin,
>>
>> No further significant comments, lets just see how this plays out.
>
> Yes, thanks!
>
> Fixed nits.
>
> If Dan have something more non-trivial I'll publish a v7.
>
> /Robbin
>
>>
>> Some minor nits:
>>
>> src/hotspot/share/utilities/waitBarrier.hpp
>>
>> ! // A primary goal of the WaitBarrier implementation is to disarm
>> all waiting
>>
>> s/disarm/wake/
>>
>> That was one place a global replace shouldn't have been applied. :)
>
> Fixed!
>
>>
>> ! //    - Calling disarm() guarantees any thread calling or called
>> wait(tag) will
>>
>> "or called" is not grammatically correct. Perhaps:
>>
>> //    - Calling disarm() guarantees any thread now calling or that
>> has called wait(tag) will
>>
>
> Fixed!
>
>>
>>      // Guarantees any thread called wait() will be awake when it
>> returns.
>>
>> s/called/that called/
>>
>
> Fixed!
>
>> ---
>>
>> src/hotspot/share/utilities/waitBarrier_generic.cpp
>>
>> !   // disarm store must not float below.
>>
>> s/float/sink/
>>
>
> Fixed!
>
>> 74   // API specifies wake() must provides a trailing fence.
>>
>> s/wake/disarm/
>> s/provides/provide/
>
> Fixed
>
>>
>>   81     // API specifies wait() must provides a trailing fence.
>>
>> s/provides/provide/
>
> Fixed
>
>>
>> Thanks,
>> David
>>
>> On 8/01/2019 8:42 pm, Robbin Ehn wrote:
>>> Hi David,
>>>
>>> On 1/2/19 12:35 AM, David Holmes wrote:
>>>>>> Further this sounds like a race that could lead to bugs if not
>>>>>> used very carefully ie. you can't assume between disarm() and
>>>>>> wake() that all threads are blocked.
>>>>>
>>>>> I didn't realize how subtle this is. I think your original comment
>>>>> that
>>>>> disarm/wake should be one operation was spot on.
>>>>> Investigating... thinking... testing... yes I think this will
>>>>> work, fixed!
>>>>> Sorry for not looking more into this before.
>>>>
>>>> I'm now curious how this will actually work in the context of the
>>>> safepoint changes?
>>>
>>> Since code already handle this 'invariant' with threads not being
>>> block between disarm() and wake(), doing it one operation just very
>>> slightly increases the chance that a thread will be blocked when we
>>> actually can handle it to be running, but reduces the chance to hit
>>> a false positive TLH poll.
>>> (with TLH we have a two-step un-synchronizing out of safepoints
>>> where we must change global safepoint state before changing the
>>> thread polling state)
>>>
>>> (I have some thoughts on simplifying TLH/safepoint states)
>>>
>>>> Nit: I would have kept disarm() rather than wake() as I like the
>>>> arm/disarm duality.
>>>
>>> Yes, me too. Not sure why I did the opposite, fixed!
>>>
>>>>
>>>>    void GenericWaitBarrier::wait(int barrier_tag) {
>>>>      assert(barrier_tag != 0, "Trying to wait on disarmed value");
>>>> +   if (barrier_tag == 0 && barrier_tag != _barrier_tag) {
>>>> +     OrderAccess::fence();
>>>> +     return;
>>>> +   }
>>>>
>>>> I don't understand what the above is doing. A barrier_tag of 0 is a
>>>> programming error caught during testing in debug builds. You don't
>>>> need to account for it being 0 in product because this isn't
>>>> something that can come in from an external source - we have full
>>>> code control here. And even if you want to be this paranoid why
>>>> would you need the fence?
>>>
>>> Fixed, but kept the fence, since we say we are providing a trailing
>>> fence.
>>> Otherwise I would like to add that exception to the description of
>>> wait().
>>>
>>> Including Dan's comments:
>>> Full: http://cr.openjdk.java.net/~rehn/8214271/6/full/webrev/
>>> Inc : http://cr.openjdk.java.net/~rehn/8214271/6/inc/webrev/
>>>
>>>
>>> Thanks, Robbin
>>>
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> Full:
>>>>> http://cr.openjdk.java.net/~rehn/8214271/5/full/webrev/
>>>>>
>>>>> gtest passes thousands of loops locally and hundreds in mach5.
>>>>> >>>>> Thanks, Robbin >>>>> >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>>> >>>>>>>> s/Implementation/Implementations/ >>>>>>> >>>>>>> Fixed >>>>>>> >>>>>>>> >>>>>>>> The fourth line is no longer needed. >>>>>>> >>>>>>> Above is the reason I would like to keep the fourth line, since >>>>>>> only if you call >>>>>>> both disarm() and wake() you have that guarantee that waiter >>>>>>> threads will >>>>>>> return. >>>>>>> >>>>>>> Thanks, Robbin >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>> >>>>>>>>> Inc: >>>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/4/inc/webrev/ >>>>>>>>> >>>>>>>>> Full: >>>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/4/full/webrev/ >>>>>>>>> >>>>>>>>> /Robbin >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Otherwise this all looks good! >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>>> ----- >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Full: >>>>>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/3/full/webrev/ >>>>>>>>>>> >>>>>>>>>>> Thanks, Robbin >>>>>>>>>>> >>>>>>>>>>> On 11/23/18 5:55 PM, Robbin Ehn wrote: >>>>>>>>>>>> Forgot RFR in subject. >>>>>>>>>>>> >>>>>>>>>>>> /Robbin >>>>>>>>>>>> >>>>>>>>>>>> On 2018-11-23 17:51, Robbin Ehn wrote: >>>>>>>>>>>>> Hi all, please review. >>>>>>>>>>>>> >>>>>>>>>>>>> When a safepoint is ended we need a way to get back to >>>>>>>>>>>>> 100% utilization as fast >>>>>>>>>>>>> as possible. 100% utilization means no idle cpu in the >>>>>>>>>>>>> system if there is a >>>>>>>>>>>>> JavaThread that could be executed. The traditional ways to >>>>>>>>>>>>> wake many, e.g. >>>>>>>>>>>>> semaphore, pthread_cond, is not implemented with a single >>>>>>>>>>>>> syscall instead they >>>>>>>>>>>>> typical do one syscall per thread to wake. >>>>>>>>>>>>> >>>>>>>>>>>>> This change-set contains that primitive, the WaitBarrier, >>>>>>>>>>>>> and a gtest for it. >>>>>>>>>>>>> No actual users, which is in coming patches. 
>>>>>>>>>>>>> >>>>>>>>>>>>> The WaitBarrier solves by doing a cooperative semaphore >>>>>>>>>>>>> posting, threads woken >>>>>>>>>>>>> will also post. On Linux we can instead directly use a >>>>>>>>>>>>> futex and with one >>>>>>>>>>>>> syscall wake all. Depending on how many threads and cpus >>>>>>>>>>>>> the performance vary, >>>>>>>>>>>>> but a good utilization of the machine, just on the edge of >>>>>>>>>>>>> saturated, the time to reach 100% utilization is around 3 >>>>>>>>>>>>> times faster with the WaitBarrier (where futex is faster >>>>>>>>>>>>> than semaphore). >>>>>>>>>>>>> >>>>>>>>>>>>> Webrev: >>>>>>>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/webrev/ >>>>>>>>>>>>> >>>>>>>>>>>>> CR: >>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8214271 >>>>>>>>>>>>> >>>>>>>>>>>>> Passes 100 iterations of gtest on our platforms, both >>>>>>>>>>>>> fastdebug and release. >>>>>>>>>>>>> And have been stable when used in safepoints (t1-8) >>>>>>>>>>>>> (coming patches). >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, Robbin From markus.gronlund at oracle.com Thu Jan 10 18:19:18 2019 From: markus.gronlund at oracle.com (Markus Gronlund) Date: Thu, 10 Jan 2019 10:19:18 -0800 (PST) Subject: RFR (S): 8214816: os::read() should not transition to _thread_blocked with safepoint check on Solaris In-Reply-To: References: Message-ID: <1b51143f-cbf7-4429-bcbd-9b4a45240fe6@default> Hi David, Looks good, thanks for detailing the historical context related to this. Markus -----Original Message----- From: David Holmes Sent: den 10 januari 2019 02:00 To: hotspot-dev developers Subject: RFR (S): 8214816: os::read() should not transition to _thread_blocked with safepoint check on Solaris Bug: https://bugs.openjdk.java.net/browse/JDK-8214816 webrev: http://cr.openjdk.java.net/~dholmes/8214816/webrev/ Please see the bug report for detailed background. In short summary all platforms now have the same os::read and os::read_at that doesn't do safepoint checks. 
Most of the changes are code deletions: - removed unused os::restartable_read() method - removed os::read from all os_*.cpp files and added shared inline definition in os.inline.hpp (checked all callsites already have os.inline.hpp included) - removed os::read_at from all non-Windows os_*.cpp files and added shared definition in os_posix.cpp (simple wrapper to pread()) - fixed the return type of os::read and os::read_at to be ssize_t not size_t. Also fixed os::read error handling in src/hotspot/share/compiler/directivesParser.cpp, and filed JDK-8216461 to have a JFR usage of os::read_at fixed. Also changed src/hotspot/share/runtime/arguments.cpp to use os::read as it no longer needs to avoid the thread-state-transition. Arguably we could go the other way here and remove os::read completely and use the native ::read on all platforms - there are already uses of ::read elsewhere in the code. Testing: Mach5 tiers 1 - 3 Thanks, David From david.holmes at oracle.com Thu Jan 10 20:42:43 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 11 Jan 2019 06:42:43 +1000 Subject: RFR (S): 8214816: os::read() should not transition to _thread_blocked with safepoint check on Solaris In-Reply-To: <1b51143f-cbf7-4429-bcbd-9b4a45240fe6@default> References: <1b51143f-cbf7-4429-bcbd-9b4a45240fe6@default> Message-ID: On 11/01/2019 4:19 am, Markus Gronlund wrote: > Hi David, > > Looks good, thanks for detailing the historical context related to this. Thanks for looking at this Markus. David ----- > Markus > > -----Original Message----- > From: David Holmes > Sent: den 10 januari 2019 02:00 > To: hotspot-dev developers > Subject: RFR (S): 8214816: os::read() should not transition to _thread_blocked with safepoint check on Solaris > > Bug: https://bugs.openjdk.java.net/browse/JDK-8214816 > webrev: http://cr.openjdk.java.net/~dholmes/8214816/webrev/ > > Please see the bug report for detailed background. 
In short summary all platforms now have the same os::read and os::read_at that doesn't do safepoint checks. Most of the changes are code deletions: > > - removed unused os::restartable_read() method > - removed os::read from all os_*.cpp files and added shared inline definition in os.inline.hpp (checked all callsites already have os.inline.hpp included) > - removed os::read_at from all non-Windows os_*.cpp files and added shared definition in os_posix.cpp (simple wrapper to pread()) > - fixed the return type of os::read and os::read_at to be ssize_t not size_t. > > Also fixed os::read error handling in > src/hotspot/share/compiler/directivesParser.cpp, and filed JDK-8216461 to have a JFR usage of os::read_at fixed. > > Also changed src/hotspot/share/runtime/arguments.cpp to use os::read as it no longer needs to avoid the thread-state-transition. Arguably we could go the other way here and remove os::read completely and use the native ::read on all platforms - there are already uses of ::read elsewhere in the code. > > Testing: Mach5 tiers 1 - 3 > > Thanks, > David > From kim.barrett at oracle.com Thu Jan 10 20:41:03 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 10 Jan 2019 15:41:03 -0500 Subject: RFR (also tedious) 8216167: Update include guards to reflect correct directories In-Reply-To: References: <1284b762-6bba-4154-2b69-4433a1ccbb1e@oracle.com> <696310B2-19C7-45BE-A52E-C121D7CBB3D2@oracle.com> <5572e32f-ab2f-f02d-533a-535896612e74@oracle.com> Message-ID: > On Jan 10, 2019, at 7:34 AM, coleen.phillimore at oracle.com wrote: > > > No takers? This fixes the include guard to match the file name in 1540 header files. > > If you add a header file, please use the file name after src/hotspot as the include guard name from now on (exclude the VM that used to be there). > > http://cr.openjdk.java.net/~coleenp/8216167.01.diff.02 This deals with my issue (from the #pragma once thread) about the share/include files by excluding them from the changes. 
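
The include-guard naming rule quoted above can be sketched as a small helper. This is an illustration only: `include_guard_for` is a hypothetical function, not part of the patch or of the HotSpot build, and it assumes the guard is simply the header path below src/hotspot, upper-cased, with every non-alphanumeric character replaced by an underscore.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not from the patch): derives an include guard name
// from a header path given relative to src/hotspot, for example
// "share/utilities/waitBarrier.hpp".
std::string include_guard_for(const std::string& path_after_src_hotspot) {
  std::string guard;
  guard.reserve(path_after_src_hotspot.size());
  for (unsigned char c : path_after_src_hotspot) {
    // Letters and digits are upper-cased; every other character
    // ('/', '.', '_', ...) is written as '_'.
    guard += std::isalnum(c) ? static_cast<char>(std::toupper(c)) : '_';
  }
  return guard;
}
```

Under these assumptions, share/utilities/waitBarrier.hpp would get the guard SHARE_UTILITIES_WAITBARRIER_HPP, with no VM component in the name.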
The pre-existing guard name for share/include/cds.h might not be good, but that's not a problem with this change.

This has also dealt with the incorrect changes to globalDefinitions_arm.hpp that I previously mentioned.

Looks like Erik has done a good job of checking your patch. I don't have anything to add to that. Looks good to me.

From david.holmes at oracle.com Thu Jan 10 20:41:39 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 11 Jan 2019 06:41:39 +1000
Subject: RFR (S): 8214816: os::read() should not transition to _thread_blocked with safepoint check on Solaris
In-Reply-To: <8a5c5305-a624-3994-e681-d847671f0772@oracle.com>
References: <8a5c5305-a624-3994-e681-d847671f0772@oracle.com>
Message-ID:

On 10/01/2019 3:27 pm, Jiangli Zhou wrote:
> Hi David,
>
> The history lesson on os::read is fascinating! The cleanup looks nice to
> me.

Thanks for looking at it Jiangli!

> src/hotspot/os/windows/os_windows.cpp copyright year also needs update.

Fixed.

Thanks,
David

> Thanks,
>
> Jiangli
>
>
> On 1/9/19 5:00 PM, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8214816
>> webrev: http://cr.openjdk.java.net/~dholmes/8214816/webrev/
>>
>> Please see the bug report for detailed background. In short summary
>> all platforms now have the same os::read and os::read_at that doesn't
>> do safepoint checks. Most of the changes are code deletions:
>>
>> - removed unused os::restartable_read() method
>> - removed os::read from all os_*.cpp files and added shared inline
>> definition in os.inline.hpp (checked all callsites already have
>> os.inline.hpp included)
>> - removed os::read_at from all non-Windows os_*.cpp files and added
>> shared definition in os_posix.cpp (simple wrapper to pread())
>> - fixed the return type of os::read and os::read_at to be ssize_t not
>> size_t.
>>
>> Also fixed os::read error handling in
>> src/hotspot/share/compiler/directivesParser.cpp, and filed JDK-8216461
>> to have a JFR usage of os::read_at fixed.
>> >> Also changed src/hotspot/share/runtime/arguments.cpp to use os::read >> as it no longer needs to avoid the thread-state-transition. Arguably >> we could go the other way here and remove os::read completely and use >> the native ::read on all platforms - there are already uses of ::read >> elsewhere in the code. >> >> Testing: Mach5 tiers 1 - 3 >> >> Thanks, >> David > From coleen.phillimore at oracle.com Thu Jan 10 20:43:29 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 10 Jan 2019 15:43:29 -0500 Subject: RFR (also tedious) 8216167: Update include guards to reflect correct directories In-Reply-To: References: <1284b762-6bba-4154-2b69-4433a1ccbb1e@oracle.com> <696310B2-19C7-45BE-A52E-C121D7CBB3D2@oracle.com> <5572e32f-ab2f-f02d-533a-535896612e74@oracle.com> Message-ID: <9e9d3780-79b6-a83c-54c1-206e7d16e48f@oracle.com> On 1/10/19 3:41 PM, Kim Barrett wrote: >> On Jan 10, 2019, at 7:34 AM, coleen.phillimore at oracle.com wrote: >> >> >> No takers? This fixes the include guard to match the file name in 1540 header files. >> >> If you add a header file, please use the file name after src/hotspot as the include guard name from now on (exclude the VM that used to be there). >> >> http://cr.openjdk.java.net/~coleenp/8216167.01.diff.02 > This deals with my issue (from the #pragma once thread) about the share/include files by excluding them from the changes. > > The pre-existing guard name for share/include/cds.h might not be good, but that?s not a problem with this change. > > This has also dealt with the incorrect changes to globalDefinitions_arm.hpp that I previously mentioned. > > Looks like Erik has done a good job of checking your patch. I don?t have anything to add to that. Looks good to me. > > Thanks Kim. 
Coleen From david.holmes at oracle.com Fri Jan 11 00:22:54 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 11 Jan 2019 10:22:54 +1000 Subject: RFR (S) 8216308: StackTraceElement::fill_in can use injected Class source-file In-Reply-To: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com> References: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com> Message-ID: <75513caf-5b58-7eed-f7c6-a147facba77a@oracle.com> Hi Aleksey, On 11/01/2019 1:21 am, Aleksey Shipilev wrote: > RFE: > https://bugs.openjdk.java.net/browse/JDK-8216308 > > Fix: > http://cr.openjdk.java.net/~shade/8216308/webrev.01/ > > This is another patch that removes the use of SymbolTable on hot path in stack trace creation. We > can inject Class.source_file field to cache the source file name. Some caution is needed to properly > handle invalidation when redefinition happens. I'm struggling a bit with the redefinition logic. IIRC redefinition can only happen at a safepoint so if there are concurrent calls to fillInStackTrace that involve a given class Foo, then they must all see the same version of Foo, and we can not have the case where one execution of the code is clearing the stale cache, while another is setting it to the new value - right? That said, IIRC Coleen stated that intern can lead to a safepoint, which would then invalidate the existing redefinition logic because we would get the line number after the intern and it may now be incorrect. So I think we have to reorder the code so the get_line_number occurs before the call to intern. I'm also very unclear about how the redefinition case is currently handled. It seems that we will normally intern NULL (and presumably get a NULL or empty-string oop?) unless ShowHiddenFrames is set, in which case we use the unknown_class_name() - regardless of whether the frame is actually hidden or not! This seems broken to me. (Separate bug to fix that is okay if it is indeed broken.) 
A couple of comments on comments:

2616       if (source != NULL) {
2617         // Class was not redefined, can trust its cache.
2618         if (source_file == NULL) {

Can you expand the comment as follows:

// Class was not redefined. We can trust its cache if set,
// else we have to initialize it.

2622       } else {
2623         // Dump the cache in case class had it: it must be have been redefined.
2624         if (source_file != NULL) {

Can you change the comment to be more consistent with the previous one:

// Class was redefined. Dump the cache if it was set.

Thanks,
David
-----

> This makes stack trace generation significantly faster, and finally better than it used to even
> before StackWalker and StringTable-related regressions in 9 and 11.
>
> Benchmark             (depth)  Mode  Cnt    Score   Error  Units
>
> # 8u
> StackTraceBench.test        1  avgt   15   10.851 ± 0.075  us/op
> StackTraceBench.test       10  avgt   15   15.325 ± 0.089  us/op
> StackTraceBench.test      100  avgt   15   59.717 ± 0.449  us/op
> StackTraceBench.test     1000  avgt   15  529.020 ± 3.654  us/op
>
> # jdk/jdk baseline
> StackTraceBench.test        1  avgt   15   15.077 ± 0.065  us/op
> StackTraceBench.test       10  avgt   15   21.153 ± 0.123  us/op
> StackTraceBench.test      100  avgt   15   80.758 ± 0.363  us/op
> StackTraceBench.test     1000  avgt   15  674.888 ± 4.985  us/op
>
> # jdk/jdk patched
> StackTraceBench.test        1  avgt   15    8.892 ± 0.064  us/op
> StackTraceBench.test       10  avgt   15   12.010 ± 0.079  us/op
> StackTraceBench.test      100  avgt   15   43.091 ± 0.254  us/op
> StackTraceBench.test     1000  avgt   15  353.194 ± 2.040  us/op
>
> Testing: hotspot tier1, jdk-submit, ad-hoc benchmarks
>
> Thanks,
> -Aleksey
>

From joe.darcy at oracle.com Fri Jan 11 06:13:43 2019
From: joe.darcy at oracle.com (Joe Darcy)
Date: Thu, 10 Jan 2019 22:13:43 -0800
Subject: JDK 12 RFR of JDK-8213299: runtime/appcds/jigsaw/classpathtests/EmptyClassInBootClassPath.java failed with java.lang.NoSuchMethodException
Message-ID: <5e5ae7b3-af70-81d4-0bc1-c56fd2b20165@oracle.com>

Hello,

Please review the changes to fix:

    JDK-8213299: runtime/appcds/jigsaw/classpathtests/EmptyClassInBootClassPath.java failed with java.lang.NoSuchMethodException
    http://cr.openjdk.java.net/~darcy/8213299.0/

For background, in the changes made for

    JDK-6304578: (reflect) toGenericString fails to print bounds of type variables on generic methods

the revised logic in methodToString which is used to build messages for exceptions mistakenly omits adding information about method parameters. The HotSpot test EmptyClassInBootClassPath.java examines information about the exception messages and fails due to this omission.

Thanks to Sergei Tsypanov for noticing the root cause of this issue.

Patch below.

Thanks,

-Joe

--- old/src/java.base/share/classes/java/lang/Class.java 2019-01-10 21:28:40.586005000 -0800
+++ new/src/java.base/share/classes/java/lang/Class.java 2019-01-10 21:28:40.338005000 -0800
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 1994, 2018, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 1994, 2019, Oracle and/or its affiliates. All rights reserved.
  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
  *
  * This code is free software; you can redistribute it and/or modify it
@@ -3420,8 +3420,8 @@
         StringBuilder sb = new StringBuilder();
         sb.append(getName() + "." + name + "(");
         if (argTypes != null) {
-            Stream.of(argTypes).map(c -> {return (c == null) ? "null" : c.getName();}).
-                collect(Collectors.joining(","));
+            sb.append(Stream.of(argTypes).map(c -> {return (c == null) ? "null" : c.getName();}).
+              collect(Collectors.joining(",")));
         }
         sb.append(")");
         return sb.toString();
--- old/test/hotspot/jtreg/ProblemList.txt    2019-01-10 21:28:41.210005000 -0800
+++ new/test/hotspot/jtreg/ProblemList.txt    2019-01-10 21:28:40.954005000 -0800
@@ -85,7 +85,6 @@

 runtime/appcds/javaldr/GCSharedStringsDuringDump.java 8208778 macosx-x64
 runtime/appcds/javaldr/GCDuringDump.java 8208778 macosx-x64
-runtime/appcds/jigsaw/classpathtests/EmptyClassInBootClassPath.java 8213299 generic-all
 runtime/CompressedOops/UseCompressedOops.java 8079353 generic-all
 runtime/handshake/HandshakeWalkSuspendExitTest.java 8214174 generic-all
 runtime/RedefineTests/RedefineRunningMethods.java 8208778 macosx-x64

From david.holmes at oracle.com Fri Jan 11 06:31:17 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 11 Jan 2019 16:31:17 +1000
Subject: JDK 12 RFR of JDK-8213299: runtime/appcds/jigsaw/classpathtests/EmptyClassInBootClassPath.java failed with java.lang.NoSuchMethodException
In-Reply-To: <5e5ae7b3-af70-81d4-0bc1-c56fd2b20165@oracle.com>
References: <5e5ae7b3-af70-81d4-0bc1-c56fd2b20165@oracle.com>
Message-ID: <93d9c722-80f8-e891-6e5d-ff3371d8f2ca@oracle.com>

Looks good Joe!

Thanks,
David

On 11/01/2019 4:13 pm, Joe Darcy wrote:
> Hello,
>
> Please review the changes to fix:
>
>     JDK-8213299:
> runtime/appcds/jigsaw/classpathtests/EmptyClassInBootClassPath.java
> failed with java.lang.NoSuchMethodException
>     http://cr.openjdk.java.net/~darcy/8213299.0/
>
> For background, in the changes made for
>
>     JDK-6304578: (reflect) toGenericString fails to print bounds of
> type variables on generic methods
>
> the revised logic in methodToString which is used to build messages for
> exceptions mistakenly omits adding information about method parameters.
> The HotSpot test EmptyClassInBootClassPath.java examines information
> about the exception messages and fails due to this omission.
>
> Thanks to Sergei Tsypanov for noticing the root cause of this issue.
>
> Patch below.
> > Thanks, > > -Joe > > --- old/src/java.base/share/classes/java/lang/Class.java 2019-01-10 > 21:28:40.586005000 -0800 > +++ new/src/java.base/share/classes/java/lang/Class.java 2019-01-10 > 21:28:40.338005000 -0800 > @@ -1,5 +1,5 @@ > ?/* > - * Copyright (c) 1994, 2018, Oracle and/or its affiliates. All rights > reserved. > + * Copyright (c) 1994, 2019, Oracle and/or its affiliates. All rights > reserved. > ? * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. > ? * > ? * This code is free software; you can redistribute it and/or modify it > @@ -3420,8 +3420,8 @@ > ???????? StringBuilder sb = new StringBuilder(); > ???????? sb.append(getName() + "." + name + "("); > ???????? if (argTypes != null) { > -??????????? Stream.of(argTypes).map(c -> {return (c == null) ? "null" : > c.getName();}). > -??????????????? collect(Collectors.joining(",")); > +??????????? sb.append(Stream.of(argTypes).map(c -> {return (c == null) > ? "null" : c.getName();}). > +??? ??? ????? collect(Collectors.joining(","))); > ???????? } > ???????? sb.append(")"); > ???????? return sb.toString(); > --- old/test/hotspot/jtreg/ProblemList.txt??? 2019-01-10 > 21:28:41.210005000 -0800 > +++ new/test/hotspot/jtreg/ProblemList.txt??? 
2019-01-10
> 21:28:40.954005000 -0800
> @@ -85,7 +85,6 @@
>
>  runtime/appcds/javaldr/GCSharedStringsDuringDump.java 8208778 macosx-x64
>  runtime/appcds/javaldr/GCDuringDump.java 8208778 macosx-x64
> -runtime/appcds/jigsaw/classpathtests/EmptyClassInBootClassPath.java
> 8213299 generic-all
>  runtime/CompressedOops/UseCompressedOops.java 8079353 generic-all
>  runtime/handshake/HandshakeWalkSuspendExitTest.java 8214174 generic-all
>  runtime/RedefineTests/RedefineRunningMethods.java 8208778 macosx-x64
>

From shade at redhat.com Fri Jan 11 09:10:25 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 11 Jan 2019 10:10:25 +0100
Subject: RFR (S) 8216308: StackTraceElement::fill_in can use injected Class source-file
In-Reply-To: <75513caf-5b58-7eed-f7c6-a147facba77a@oracle.com>
References: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com> <75513caf-5b58-7eed-f7c6-a147facba77a@oracle.com>
Message-ID: <0c2b1d41-8e34-2011-d630-1534bf0823f0@redhat.com>

On 1/11/19 1:22 AM, David Holmes wrote:
> Hi Aleksey,
>
> On 11/01/2019 1:21 am, Aleksey Shipilev wrote:
>> RFE:
>>    https://bugs.openjdk.java.net/browse/JDK-8216308
>>
>> Fix:
>>    http://cr.openjdk.java.net/~shade/8216308/webrev.01/
>>
>> This is another patch that removes the use of SymbolTable on hot path in stack trace creation. We
>> can inject Class.source_file field to cache the source file name. Some caution is needed to properly
>> handle invalidation when redefinition happens.
>
> I'm struggling a bit with the redefinition logic. IIRC redefinition can only happen at a safepoint
> so if there are concurrent calls to fillInStackTrace that involve a given class Foo, then they must
> all see the same version of Foo, and we can not have the case where one execution of the code is
> clearing the stale cache, while another is setting it to the new value - right?

Mmm. I *hope* so.
But, since we are reading the source_file into local, NULL-checking it, and only then using it, whatever happens with the class cache should not have immediate effect, and current (racy) caller would use the non-NULL value even if cache is being concurrently cleared. (There are silly C/C++ memory model issues that may still expose us to NULL even after NULL-check, e.g. by re-reading the memory instead of using the local, but that would break lots of other places too, I think) > That said, IIRC Coleen stated that intern can lead to a safepoint, which would then invalidate the > existing redefinition logic because we would get the line number after the intern and it may now be > incorrect. So I think we have to reorder the code so the get_line_number occurs before the call to > intern. Yeah, looks like it. Well, if that is so, we need to do that move in a separate bug and backport it. But I'd like someone more savvy in whole redefinition deal to see what is up. This patch can wait that fix, and apply the caching on top. > I'm also very unclear about how the redefinition case is currently handled. It seems that we will > normally intern NULL (and presumably get a NULL or empty-string oop?) unless ShowHiddenFrames is > set, in which case we use the unknown_class_name() - regardless of whether the frame is actually > hidden or not! This seems broken to me. (Separate bug to fix that is okay if it is indeed broken.) I *guess* that was the tradeoff for returning the nulls transiently while class was momentarily redefined... This patch tried to maintain whatever current behavior there is. > A couple of comments on comments: Thanks, these are fixed in-place. 
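
The caching and invalidation pattern discussed in this thread can be sketched with a toy model. This is only a sketch under stated assumptions: `ClassMirror`, `Interner` and `source_file_for_frame` are hypothetical stand-ins rather than HotSpot types, the model is single-threaded, and it says nothing about safepoints or memory ordering. It shows the cache being read once into a local, filled on first use, and dumped when the frame belongs to a redefined class.

```cpp
#include <set>
#include <string>

// Toy stand-in for a class mirror with an injected cache field.
struct ClassMirror {
  const std::string* source_file = nullptr;  // cached, interned file name
};

// Toy stand-in for string interning; counts calls so the saving is visible.
struct Interner {
  int calls = 0;
  std::set<std::string> pool;  // std::set keeps element addresses stable
  const std::string* intern(const char* s) {
    ++calls;
    return &*pool.insert(std::string(s)).first;
  }
};

// Returns the file name to report for one stack frame.
// `current_source` is null when the frame belongs to a redefined (stale)
// version of the class, mirroring the `source != NULL` test in the review.
const std::string* source_file_for_frame(ClassMirror& mirror,
                                         const char* current_source,
                                         Interner& interner) {
  // Read the cache once into a local and only use the local afterwards.
  const std::string* source_file = mirror.source_file;
  if (current_source != nullptr) {
    // Class was not redefined. We can trust its cache if set,
    // else we have to initialize it.
    if (source_file == nullptr) {
      source_file = interner.intern(current_source);
      mirror.source_file = source_file;
    }
  } else {
    // Class was redefined. Dump the cache if it was set.
    if (source_file != nullptr) {
      mirror.source_file = nullptr;
      source_file = nullptr;
    }
  }
  return source_file;
}
```

In this model, repeated calls for the same unredefined class intern the name only once, which is the saving the patch is after; what the real code reports for a redefined frame is outside the scope of the sketch.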
-Aleksey From robbin.ehn at oracle.com Fri Jan 11 09:20:38 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 11 Jan 2019 10:20:38 +0100 Subject: RFR(m): 8214271: Fast primitive to wake many threads In-Reply-To: <2f31bdb7-71cd-be4d-7135-03dd4e9dd0b1@oracle.com> References: <010211e3-93a6-80b9-678c-c84b08812e43@oracle.com> <70669453-e317-a30d-8d5a-e5b938b83c41@oracle.com> <4fb6cd22-cdd0-2419-c863-24b250ac0b16@oracle.com> <2a2679cc-b0e0-f8d0-7336-8666e1a42950@oracle.com> <01873a0f-a0fb-18b9-f7d4-98bb638e9b57@oracle.com> <41f5252b-3eb9-9a9e-70e5-49f6d8f9d670@oracle.com> <755aaf5b-8a49-ef5a-65ce-18550547a91b@oracle.com> <2f31bdb7-71cd-be4d-7135-03dd4e9dd0b1@oracle.com> Message-ID: Thanks Dan! /Robbin On 1/10/19 4:57 PM, Daniel D. Daugherty wrote: > > Including Dan's comments: > > Full: http://cr.openjdk.java.net/~rehn/8214271/6/full/webrev/ > > Inc : http://cr.openjdk.java.net/~rehn/8214271/6/inc/webrev/ > > src/hotspot/os/linux/waitBarrier_linux.cpp > ??? No comments. > > src/hotspot/os/linux/waitBarrier_linux.hpp > ??? No comments. > > src/hotspot/share/utilities/waitBarrier.hpp > ??? No comments beyond David's. > > src/hotspot/share/utilities/waitBarrier_generic.cpp > ??? No comments beyond David's. > > src/hotspot/share/utilities/waitBarrier_generic.hpp > ??? No comments. > > test/hotspot/gtest/utilities/test_waitBarrier.cpp > ??? No comments. > > Thumbs up! > > Dan > > > > On 1/9/19 7:57 AM, Robbin Ehn wrote: >> Hi David, >> >> On 1/9/19 4:40 AM, David Holmes wrote: >>> Hi Robbin, >>> >>> No further significant comments, lets just see how this plays out. >> >> Yes, thanks! >> >> Fixed nits. >> >> If Dan have something more non-trivial I'll publish a v7. >> >> /Robbin >> >>> >>> Some minor nits: >>> >>> src/hotspot/share/utilities/waitBarrier.hpp >>> >>> ! // A primary goal of the WaitBarrier implementation is to disarm all waiting >>> >>> s/disarm/wake/ >>> >>> That was one place a global replace shouldn't have been applied. :) >> >> Fixed! >> >>> >>> ! 
//??? - Calling disarm() guarantees any thread calling or called wait(tag) >>> will >>> >>> "or called" is not grammatically correct. Perhaps: >>> >>> //??? - Calling disarm() guarantees any thread now calling or that has called >>> wait(tag) will >>> >> >> Fixed! >> >>> >>> ???? // Guarantees any thread called wait() will be awake when it returns. >>> >>> s/called/that called/ >>> >> >> Fixed! >> >>> --- >>> >>> src/hotspot/share/utilities/waitBarrier_generic.cpp >>> >>> !?? // disarm store must not float below. >>> >>> s/float/sink/ >>> >> >> Fixed! >> >>> 74?? // API specifies wake() must provides a trailing fence. >>> >>> s/wake/disarm/ >>> s/provides/provide/ >> >> Fixed >> >>> >>> ??81???? // API specifies wait() must provides a trailing fence. >>> >>> s/provides/provide/ >> >> Fixed >> >>> >>> Thanks, >>> David >>> >>> On 8/01/2019 8:42 pm, Robbin Ehn wrote: >>>> Hi David, >>>> >>>> On 1/2/19 12:35 AM, David Holmes wrote: >>>>>>> Further this sounds like a race that could lead to bugs if not used very >>>>>>> carefully ie. you can't assume between disarm() and wake() that all >>>>>>> threads are blocked. >>>>>> >>>>>> I didn't realize how subtle this is. I think your original comment that >>>>>> disarm/wake should be one operation was spot on. >>>>>> Investigating... thinking... testing... yes I think this will work, fixed! >>>>>> Sorry for not looking more into this before. >>>>> >>>>> I'm now curious how this will actually work in the context of the safepoint >>>>> changes? >>>> >>>> Since code already handle this 'invariant' with threads not being block >>>> between disarm() and wake(), doing it one operation just very slightly >>>> increases the chance that a thread will be blocked when we actually can >>>> handle it to be running, but reduces the chance to hit a false positive TLH >>>> poll. 
>>>> (with TLH we have a two-step un-synchronizing out of safepoints where we >>>> must change global safepoint state before changing the thread polling state) >>>> >>>> (I have some thoughts on simplifying TLH/safepoint states) >>>> >>>>> Nit: I would have kept disarm() rather than wake() as I like the arm/disarm >>>>> duality. >>>> >>>> Yes, me too. Not sure why I did the opposite, fixed! >>>> >>>>> >>>>> ?? void GenericWaitBarrier::wait(int barrier_tag) { >>>>> ???? assert(barrier_tag != 0, "Trying to wait on disarmed value"); >>>>> +?? if (barrier_tag == 0 && barrier_tag != _barrier_tag) { >>>>> +???? OrderAccess::fence(); >>>>> +???? return; >>>>> +?? } >>>>> >>>>> I don't understand what the above is doing. A barrier_tag of 0 is a >>>>> programming error caught during testing in debug builds. You don't need to >>>>> account for it being 0 in product because this isn't something that can >>>>> come in from an external source - we have full code control here. And even >>>>> if you want to be this paranoid why would you need the fence? >>>> >>>> Fixed, but kept the fence, since we say we are providing a trailing fence. >>>> Otherwise I would like to add that exception to the description of wait(). >>>> >>>> Including Dan's comments: >>>> Full: http://cr.openjdk.java.net/~rehn/8214271/6/full/webrev/ >>>> Inc : http://cr.openjdk.java.net/~rehn/8214271/6/inc/webrev/ >>>> >>>> >>>> Thanks, Robbin >>>> >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>>> Full: >>>>>> http://cr.openjdk.java.net/~rehn/8214271/5/full/webrev/ >>>>>> >>>>>> gtest passes thousands of loops locally and hundreds in mach5. >>>>>> >>>>>> Thanks, Robbin >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>>> >>>>>>>>> s/Implementation/Implementations/ >>>>>>>> >>>>>>>> Fixed >>>>>>>> >>>>>>>>> >>>>>>>>> The fourth line is no longer needed. 
>>>>>>>> >>>>>>>> Above is the reason I would like to keep the fourth line, since only if >>>>>>>> you call >>>>>>>> both disarm() and wake() you have that guarantee that waiter threads will >>>>>>>> return. >>>>>>>> >>>>>>>> Thanks, Robbin >>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>> >>>>>>>>>> Inc: >>>>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/4/inc/webrev/ >>>>>>>>>> >>>>>>>>>> Full: >>>>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/4/full/webrev/ >>>>>>>>>> >>>>>>>>>> /Robbin >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Otherwise this all looks good! >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>>>> ----- >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Full: >>>>>>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/3/full/webrev/ >>>>>>>>>>>> >>>>>>>>>>>> Thanks, Robbin >>>>>>>>>>>> >>>>>>>>>>>> On 11/23/18 5:55 PM, Robbin Ehn wrote: >>>>>>>>>>>>> Forgot RFR in subject. >>>>>>>>>>>>> >>>>>>>>>>>>> /Robbin >>>>>>>>>>>>> >>>>>>>>>>>>> On 2018-11-23 17:51, Robbin Ehn wrote: >>>>>>>>>>>>>> Hi all, please review. >>>>>>>>>>>>>> >>>>>>>>>>>>>> When a safepoint is ended we need a way to get back to 100% >>>>>>>>>>>>>> utilization as fast >>>>>>>>>>>>>> as possible. 100% utilization means no idle cpu in the system if >>>>>>>>>>>>>> there is a >>>>>>>>>>>>>> JavaThread that could be executed. The traditional ways to wake >>>>>>>>>>>>>> many, e.g. >>>>>>>>>>>>>> semaphore, pthread_cond, is not implemented with a single syscall >>>>>>>>>>>>>> instead they >>>>>>>>>>>>>> typical do one syscall per thread to wake. >>>>>>>>>>>>>> >>>>>>>>>>>>>> This change-set contains that primitive, the WaitBarrier, and a >>>>>>>>>>>>>> gtest for it. >>>>>>>>>>>>>> No actual users, which is in coming patches. >>>>>>>>>>>>>> >>>>>>>>>>>>>> The WaitBarrier solves by doing a cooperative semaphore posting, >>>>>>>>>>>>>> threads woken >>>>>>>>>>>>>> will also post. On Linux we can instead directly use a futex and >>>>>>>>>>>>>> with one >>>>>>>>>>>>>> syscall wake all. 
Depending on how many threads and cpus the
>>>>>>>>>>>>>> performance vary,
>>>>>>>>>>>>>> but a good utilization of the machine, just on the edge of
>>>>>>>>>>>>>> saturated, the time to reach 100% utilization is around 3 times
>>>>>>>>>>>>>> faster with the WaitBarrier (where futex is faster than semaphore).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~rehn/8214271/webrev/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> CR:
>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8214271
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Passes 100 iterations of gtest on our platforms, both fastdebug
>>>>>>>>>>>>>> and release.
>>>>>>>>>>>>>> And have been stable when used in safepoints (t1-8) (coming patches).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks, Robbin
>

From matthias.baesken at sap.com Fri Jan 11 12:54:40 2019
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Fri, 11 Jan 2019 12:54:40 +0000
Subject: RFR 8007606 : Handle realloc() failure in unix/native/libnet/net_util_md.c correctly
In-Reply-To:
References:
Message-ID:

(and btw. There seem to be a few other places in the coding where the realloc return value is not checked, see for example:

jdk/src/hotspot/share/adlc/forms.cpp

  50 void NameList::addName(const char *name) {
  51   if (_cur == _max) _names =(const char**)realloc(_names,(_max *=2)*sizeof(char*));
  52   _names[_cur++] = name;
  53 }

jdk/src/java.base/share/native/libjimage/imageFile.cpp

 225 // Add a new image entry to the table.
 226 void ImageFileReaderTable::add(ImageFileReader* image) {
 227   if (_count == _max) {
 228     _max += _growth;
 229     _table = static_cast(realloc(_table, _max * sizeof(ImageFileReader*)));
 230   }
 231   _table[_count++] = image;
 232 }

However I think adjustment of those code parts is out of scope of 8007606.

Best regards, Matthias

> -----Original Message-----
> From: Baesken, Matthias
> Sent: Freitag, 11.
Januar 2019 13:43 > To: net-dev at openjdk.java.net > Subject: Re: RFR 8007606 : Handle realloc() failure in > unix/native/libnet/net_util_md.c correctly > > Hi Ivan, > > Shouldn't you reset localifsSize to 0 in case of the early return ? The > comment says localifsSize is the size of the array so the size of the array is 0 > again after freeing. > > > 637 static struct localinterface *localifs = 0; > 638 static int localifsSize = 0; /* size of array */ > 639 static int nifs = 0; /* number of entries used in array */ > > ... > > 679 if (localifsTemp == 0) { > 680 free(localifs); > 681 localifs = 0; > 682 nifs = 0; > 683 fclose(f); > 684 return; > 685 } > > > > > Best regards, Matthias > > > > > Date: Thu, 10 Jan 2019 20:29:08 -0800 > > From: Ivan Gerasimov > > To: "net-dev at openjdk.java.net" > > Subject: RFR 8007606 : Handle realloc() failure in > > unix/native/libnet/net_util_md.c correctly > > Message-ID: <3dc3c26b-fea7-2538-2c7a-bfa623f2fc86 at oracle.com> > > Content-Type: text/plain; charset=utf-8; format=flowed > > > > Hello! > > > > This seems to be the last use of realloc() without proper handling of a > > failure. > > > > Would you please help review a trivial fix? > > > > BUGURL: https://bugs.openjdk.java.net/browse/JDK-8007606 > > WEBREV: http://cr.openjdk.java.net/~igerasim/8007606/00/webrev/ > > > > Thanks in advance! > > > > -- > > With kind regards, > > Ivan Gerasimov > > > > From sgehwolf at redhat.com Fri Jan 11 13:35:00 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Fri, 11 Jan 2019 14:35:00 +0100 Subject: RFR(xs): 8216559: [JFR] Native libraries not correctly parsed from /proc/self/maps Message-ID: <4172e9c3682846f8a1132f7d3718ceb57b119359.camel@redhat.com> Hi, Could I please get a review of this tiny fix for native libraries parsing from /proc//maps? It's a Linux-only change and seems to surface only on certain systems (with 3-digit device ids in /proc). 
The issue was noticed because TestNativeLibrariesEvent.java was failing for me.

What's the problem? The Linux impl of os::get_loaded_modules_info reads /proc/self/maps and scans each line into variables via sscanf(). The format for the device field allowed up to 5 characters. In my case it was 6 characters long. As a result, the last digit of the device id got interpreted as the inode value and the inode value as the name. Thus, the failing test was showing strange library names like "25847051".

According to the spec, major:minor device numbers may be up to 3 digits in length. Hence, I propose to change the format from '%5s' to '%7s': 3 digits for major/minor each, plus one for ':'.

Bug: https://bugs.openjdk.java.net/browse/JDK-8216559
webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8216559/webrev.01/

Testing: jdk_jfr tests on Linux x86_64 including TestNativeLibrariesEvent.java, which now passes.

Thoughts?

Thanks,
Severin

From thomas.schatzl at oracle.com Fri Jan 11 14:27:52 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 11 Jan 2019 15:27:52 +0100
Subject: RFR (S/M): 8213827: NUMA heap allocation does not respect process membind/interleave settings [Was: Re: [PATCH] JDK NUMA Interleaving issue]
In-Reply-To:
References: <6e5b102d07b4ceded09115a649be020410240fe7.camel@oracle.com> <9bea7b0957bbfc2f0ac34306ee162f2d98e44bfe.camel@oracle.com> <99164b92f47f264978339ed327da9d41098a7e1d.camel@oracle.com> <10ecfa0f-eb78-869a-4d5a-991f55ec57ea@oracle.com> <3b8edd37-80cd-0f06-55ed-326972db98de@oracle.com>
Message-ID: <2bc2fce2403324739a030d929b847fece95b0e25.camel@oracle.com>

Hi all,

I prepared new webrevs with the suggestions from Sangheon, and I think the changes Amit did:

http://cr.openjdk.java.net/~tschatzl/8213827/webrev.3_to_4/ (diff)
http://cr.openjdk.java.net/~tschatzl/8213827/webrev.4/ (full)

There were additional fixes:
- crash at startup with -XX:+UseNUMA since numa_interleave_memory() is called before NUMA support has been initialized due
to some earlier refactoring (the cause is a wrong library initialization order at startup imho, but that is to be handled separately).
- also CamelCased the enum values.
- removed some overly complicated code ("return (a == b) ? true : false;" and one other).
- removed some more random spacing.

This patch passes hs-tier1-5 without additional failures now. Some manual testing showed that it seems to do the right thing too now.

@Amith: please check if the change still works on your applications and still gives the expected performance improvements.

Thanks,
Thomas

On Thu, 2019-01-10 at 16:48 +0530, amith pawar wrote:
> Hi Sangheon,
>
> Thanks again. I have done the required changes and created webrev.
> Please use following link to download the same as gmail is not
> allowing to attach.
>
> https://drive.google.com/open?id=1QzmW6LdmKbBNHp4-hlcIr9DKY7anUMBy
>
> Thanks,
> Amit
>
> On Thu, Jan 10, 2019 at 1:03 AM wrote:
> > Hi Amith,
> >
> > On 1/9/19 3:30 AM, amith pawar wrote:
> > > Hi Sangheon,
> > >
> > > Thanks for reviewing and updated with suggested changes. please
> > > check.
> > Thank you for addressing my comments.
> > But I can't see below comments addressed:
> >
> > - Looking at 'enum' at os.hpp, we use Camel style.
> > I meant to change from 'Numa_allocation_policy' to
> > 'NumaAllocationPolicy'.
> >
> > - As usual, copyright year updates. I know it was correct when
> > > > you posted. :)
> > Looking at the latest source code, only os_linux.hpp needs a new
> > copyright year.
> > - * Copyright (c) 1999, 2018, Oracle and/or its affiliates. All
> > rights reserved.
> > + * Copyright (c) 1999, 2019, Oracle and/or its affiliates. All
> > rights reserved.
> >
> > Looking at the v5,
> > + ls.print("UseNUMA is enabled and invoked in '%s' mode."
> > + " Heap will be configured using NUMA memory
> > nodes:", numa_mode);
> > There is one more space before " Heap.... ", please remove it.
> > > > I see the latest version that Thomas posted is v3, but your > > attached version is v5. :) > > > > In addition, it would be better to provide webrev instead of a > > patch. ( http://openjdk.java.net/guide/codeReview.html ) > > > > Thanks, > > Sangheon > > > > > Thanks, > > > Amit Pawar > > > > > > On Wed, Jan 9, 2019 at 12:45 AM wrote: > > > > Hi Thomas, > > > > > > > > On 12/13/18 2:33 AM, Thomas Schatzl wrote: > > > > > Hi Amit, > > > > > On Thu, 2018-12-13 at 15:11 +0530, amith pawar wrote: > > > > > > Hi Thomas, > > > > > > > > > > > > Please find the attached patch updated as per your > > > > > > suggestion. > > > > > > If everything OK then can you please commit this to repo ? > > > > > > > > > > looks good. We will need a second reviewer though, I am > > > > > going to ask > > > > > around. > > > > > > > > > > Latest webrev: > > > > > http://cr.openjdk.java.net/~tschatzl/8213827/webrev.3/ > > > > Webrev.3 looks good to me. > > > > > > > > I have some minor nits: > > > > ---------------------------------------- > > > > src/hotspot/os/linux/os_linux.cpp > > > > 5012 for (int node = 0; node < Linux::numa_max_node(); > > > > node++) { > > > > - Looks like 'node <= Linux::numa_max_node()' is the right one > > > > to print the latest node? > > > > > > > > ---------------------------------------- > > > > src/hotspot/os/linux/os_linux.hpp > > > > 271 enum Numa_allocation_policy{ > > > > - Looking at 'enum' at os.hpp, we use Camel style. > > > > - There are missing space before '{'. > > > > > > > > - As usual, copyright year updates. I know it was correct when > > > > you posted. 
:) > > > > > > > > Thanks, > > > > Sangheon > > > > > > > > > > > > > Thanks, > > > > > Thomas > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > With best regards, > > > amit pawar > > > > From markus.gronlund at oracle.com Fri Jan 11 16:43:38 2019 From: markus.gronlund at oracle.com (Markus Gronlund) Date: Fri, 11 Jan 2019 08:43:38 -0800 (PST) Subject: RFR(xs): 8216559: [JFR] Native libraries not correctly parsed from /proc/self/maps In-Reply-To: <4172e9c3682846f8a1132f7d3718ceb57b119359.camel@redhat.com> References: <4172e9c3682846f8a1132f7d3718ceb57b119359.camel@redhat.com> Message-ID: <3243fba1-f880-451e-9ef7-fecd2381f806@default> Hi Severin, Looks good, thanks for fixing. Markus -----Original Message----- From: Severin Gehwolf Sent: den 11 januari 2019 14:35 To: hotspot-dev ; hotspot-jfr-dev at openjdk.java.net Subject: RFR(xs): 8216559: [JFR] Native libraries not correctly parsed from /proc/self/maps Hi, Could I please get a review of this tiny fix for native libraries parsing from /proc//maps? It's a Linux-only change and seems to surface only on certain systems (with 3-digit device ids in /proc). The issue has been noticed since TestNativeLibrariesEvent.java was failing for me. What's the problem? The Linux impl of os::get_loaded_modules_info reads /proc/self/maps and scans each line into variables via sscanf(). The format for the device pointer allowed up to 5 characters. In my case it was 6 characters long. As a result, the last digit of the device id got interpreted as inode value and the inode value as name. Thus, the failing test was showing strange library names like "25847051". According to the spec, major:minor device numbers may be up to 3-digits in length. Hence, I propose to change the format from '%5s' to '%7s': 3 digits for major/minor each, plus one for ':'. 
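
To illustrate the truncation described above, here is a minimal sketch in C. It is not the HotSpot code: the struct, the function names, the exact format string and the sample line are assumptions made up for this illustration. It only shows how a too-narrow field width for the device column shifts the remaining fields.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields of interest from one maps-style line (illustrative, not HotSpot's types). */
struct maps_entry {
    char dev[8];     /* "major:minor": up to 7 characters plus NUL */
    long inode;
    char name[256];
};

/* Device field limited to 5 characters, as described for the old format. */
int parse_with_dev_width_5(const char *line, struct maps_entry *e) {
    unsigned long long base, top, offset;
    char perms[5];
    assert(line != NULL && e != NULL);
    memset(e, 0, sizeof(*e));
    return sscanf(line, "%llx-%llx %4s %llx %5s %ld %255s",
                  &base, &top, perms, &offset, e->dev, &e->inode, e->name);
}

/* Device field widened to 7 characters: 3 + 3 digits plus the ':' separator. */
int parse_with_dev_width_7(const char *line, struct maps_entry *e) {
    unsigned long long base, top, offset;
    char perms[5];
    assert(line != NULL && e != NULL);
    memset(e, 0, sizeof(*e));
    return sscanf(line, "%llx-%llx %4s %llx %7s %ld %255s",
                  &base, &top, perms, &offset, e->dev, &e->inode, e->name);
}
```

For a hypothetical line such as "7f0000000000-7f0000021000 r-xp 00000000 103:02 25847051 /usr/lib/libexample.so", the width-5 variant stops the device field at "103:0", reads the leftover "2" as the inode and reports "25847051" as the name. The width-7 variant reads "103:02", inode 25847051 and the real path. Both calls still return 7 converted fields, so the return value alone does not reveal the problem.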
Bug: https://bugs.openjdk.java.net/browse/JDK-8216559 webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8216559/webrev.01/ Testing: jdk_jfr tests on Linux x86_64 including TestNativeLibrariesEvent.java which now passes. Thoughts? Thanks, Severin From sgehwolf at redhat.com Fri Jan 11 17:15:37 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Fri, 11 Jan 2019 18:15:37 +0100 Subject: RFR(xs): 8216559: [JFR] Native libraries not correctly parsed from /proc/self/maps In-Reply-To: <3243fba1-f880-451e-9ef7-fecd2381f806@default> References: <4172e9c3682846f8a1132f7d3718ceb57b119359.camel@redhat.com> <3243fba1-f880-451e-9ef7-fecd2381f806@default> Message-ID: <24c042a989d2bdaf5703a25d7ffc158d45883cd1.camel@redhat.com> Hi, On Fri, 2019-01-11 at 08:43 -0800, Markus Gronlund wrote: > Hi Severin, > > Looks good, thanks for fixing. Thanks for the review, Markus! Any other reviewers or is this trivial enough for me to push with 1 review? Thanks, Severin > Markus > > -----Original Message----- > From: Severin Gehwolf > Sent: den 11 januari 2019 14:35 > To: hotspot-dev ; > hotspot-jfr-dev at openjdk.java.net > Subject: RFR(xs): 8216559: [JFR] Native libraries not correctly > parsed from /proc/self/maps > > Hi, > > Could I please get a review of this tiny fix for native libraries > parsing from /proc//maps? It's a Linux-only change and seems to > surface only on certain systems (with 3-digit device ids in /proc). > The issue has been noticed since TestNativeLibrariesEvent.java was > failing for me. > > What's the problem? The Linux impl of os::get_loaded_modules_info > reads /proc/self/maps and scans each line into variables via > sscanf(). The format for the device pointer allowed up to 5 > characters. In my case it was 6 characters long. As a result, the > last digit of the device id got interpreted as inode value and the > inode value as name. Thus, the failing test was showing strange > library names like "25847051". 
> > According to the spec, major:minor device numbers may be up to 3- > digits in length. Hence, I propose to change the format from '%5s' to > '%7s': 3 digits for major/minor each, plus one for ':'. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8216559 > webrev: > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8216559/webrev.01/ > > Testing: jdk_jfr tests on Linux x86_64 including > TestNativeLibrariesEvent.java > which now passes. > > Thoughts? > > Thanks, > Severin > From jesper.wilhelmsson at oracle.com Fri Jan 11 17:29:38 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Fri, 11 Jan 2019 18:29:38 +0100 Subject: RFR(xs): 8216559: [JFR] Native libraries not correctly parsed from /proc/self/maps In-Reply-To: <24c042a989d2bdaf5703a25d7ffc158d45883cd1.camel@redhat.com> References: <4172e9c3682846f8a1132f7d3718ceb57b119359.camel@redhat.com> <3243fba1-f880-451e-9ef7-fecd2381f806@default> <24c042a989d2bdaf5703a25d7ffc158d45883cd1.camel@redhat.com> Message-ID: <89F8D62A-1150-4501-90AD-70DD3654187C@oracle.com> Looks good and trivial enough :-) /Jesper > On 11 Jan 2019, at 18:15, Severin Gehwolf wrote: > > Hi, > > On Fri, 2019-01-11 at 08:43 -0800, Markus Gronlund wrote: >> Hi Severin, >> >> Looks good, thanks for fixing. > > Thanks for the review, Markus! > > Any other reviewers or is this trivial enough for me to push with 1 > review? > > Thanks, > Severin > >> Markus >> >> -----Original Message----- >> From: Severin Gehwolf >> Sent: den 11 januari 2019 14:35 >> To: hotspot-dev ; >> hotspot-jfr-dev at openjdk.java.net >> Subject: RFR(xs): 8216559: [JFR] Native libraries not correctly >> parsed from /proc/self/maps >> >> Hi, >> >> Could I please get a review of this tiny fix for native libraries >> parsing from /proc//maps? It's a Linux-only change and seems to >> surface only on certain systems (with 3-digit device ids in /proc). >> The issue has been noticed since TestNativeLibrariesEvent.java was >> failing for me. 
>> >> What's the problem? The Linux impl of os::get_loaded_modules_info >> reads /proc/self/maps and scans each line into variables via >> sscanf(). The format for the device pointer allowed up to 5 >> characters. In my case it was 6 characters long. As a result, the >> last digit of the device id got interpreted as inode value and the >> inode value as name. Thus, the failing test was showing strange >> library names like "25847051". >> >> According to the spec, major:minor device numbers may be up to 3- >> digits in length. Hence, I propose to change the format from '%5s' to >> '%7s': 3 digits for major/minor each, plus one for ':'. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8216559 >> webrev: >> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8216559/webrev.01/ >> >> Testing: jdk_jfr tests on Linux x86_64 including >> TestNativeLibrariesEvent.java >> which now passes. >> >> Thoughts? >> >> Thanks, >> Severin >> > From sgehwolf at redhat.com Fri Jan 11 17:39:16 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Fri, 11 Jan 2019 18:39:16 +0100 Subject: RFR(xs): 8216559: [JFR] Native libraries not correctly parsed from /proc/self/maps In-Reply-To: <89F8D62A-1150-4501-90AD-70DD3654187C@oracle.com> References: <4172e9c3682846f8a1132f7d3718ceb57b119359.camel@redhat.com> <3243fba1-f880-451e-9ef7-fecd2381f806@default> <24c042a989d2bdaf5703a25d7ffc158d45883cd1.camel@redhat.com> <89F8D62A-1150-4501-90AD-70DD3654187C@oracle.com> Message-ID: On Fri, 2019-01-11 at 18:29 +0100, jesper.wilhelmsson at oracle.com wrote: > Looks good and trivial enough :-) > /Jesper Thanks for the review, Jesper! Cheers, Severin > > > On 11 Jan 2019, at 18:15, Severin Gehwolf wrote: > > > > Hi, > > > > On Fri, 2019-01-11 at 08:43 -0800, Markus Gronlund wrote: > > > Hi Severin, > > > > > > Looks good, thanks for fixing. > > > > Thanks for the review, Markus! > > > > Any other reviewers or is this trivial enough for me to push with 1 > > review? 
> > > > Thanks, > > Severin > > > > > Markus > > > > > > -----Original Message----- > > > From: Severin Gehwolf > > > Sent: den 11 januari 2019 14:35 > > > To: hotspot-dev ; > > > hotspot-jfr-dev at openjdk.java.net > > > Subject: RFR(xs): 8216559: [JFR] Native libraries not correctly > > > parsed from /proc/self/maps > > > > > > Hi, > > > > > > Could I please get a review of this tiny fix for native libraries > > > parsing from /proc//maps? It's a Linux-only change and seems to > > > surface only on certain systems (with 3-digit device ids in /proc). > > > The issue has been noticed since TestNativeLibrariesEvent.java was > > > failing for me. > > > > > > What's the problem? The Linux impl of os::get_loaded_modules_info > > > reads /proc/self/maps and scans each line into variables via > > > sscanf(). The format for the device pointer allowed up to 5 > > > characters. In my case it was 6 characters long. As a result, the > > > last digit of the device id got interpreted as inode value and the > > > inode value as name. Thus, the failing test was showing strange > > > library names like "25847051". > > > > > > According to the spec, major:minor device numbers may be up to 3- > > > digits in length. Hence, I propose to change the format from '%5s' to > > > '%7s': 3 digits for major/minor each, plus one for ':'. > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8216559 > > > webrev: > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8216559/webrev.01/ > > > > > > Testing: jdk_jfr tests on Linux x86_64 including > > > TestNativeLibrariesEvent.java > > > which now passes. > > > > > > Thoughts? 
> > >
> > > Thanks,
> > > Severin
> > >
> >

From stuart.marks at oracle.com Fri Jan 11 19:08:29 2019
From: stuart.marks at oracle.com (Stuart Marks)
Date: Fri, 11 Jan 2019 11:08:29 -0800
Subject: JDK 12 RFR of JDK-8213299: runtime/appcds/jigsaw/classpathtests/EmptyClassInBootClassPath.java failed with java.lang.NoSuchMethodException
In-Reply-To: <5e5ae7b3-af70-81d4-0bc1-c56fd2b20165@oracle.com>
References: <5e5ae7b3-af70-81d4-0bc1-c56fd2b20165@oracle.com>
Message-ID: <8175ebfe-2951-6c71-29ee-f09a6b4da4f3@oracle.com>

Drat, you pushed this already. But I wanted to mention a couple style points:

On 1/10/19 10:13 PM, Joe Darcy wrote:
> +            sb.append(Stream.of(argTypes).map(c -> {return (c == null) ? "null" : c.getName();}).
> +                      collect(Collectors.joining(",")));

Since argTypes is an array, I usually prefer Arrays.stream() over Stream.of(). The issue is that Stream.of() is varargs, and while this case isn't formally ambiguous, it can create a question in the reader's mind about whether the stream consists of the array elements or of just one element that's the array itself.

The statement lambda can probably be replaced with an expression lambda. I think it makes the ternary easier to read.

Also, indentation.

    sb.append(Arrays.stream(argTypes)
                    .map(c -> (c == null) ? "null" : c.getName())
                    .collect(Collectors.joining(",")));

I'm not sure it's worth tracking this, but I could file a bug if you'd like.
s'marks

From coleen.phillimore at oracle.com Fri Jan 11 23:33:03 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 11 Jan 2019 18:33:03 -0500
Subject: RFR (S) 8216308: StackTraceElement::fill_in can use injected Class source-file
In-Reply-To: <75513caf-5b58-7eed-f7c6-a147facba77a@oracle.com>
References: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com> <75513caf-5b58-7eed-f7c6-a147facba77a@oracle.com>
Message-ID: <10518795-18f3-b51b-5068-412eaec772e1@oracle.com>

On 1/10/19 7:22 PM, David Holmes wrote:
> Hi Aleksey,
>
> On 11/01/2019 1:21 am, Aleksey Shipilev wrote:
>> RFE:
>>    https://bugs.openjdk.java.net/browse/JDK-8216308
>>
>> Fix:
>>    http://cr.openjdk.java.net/~shade/8216308/webrev.01/
>>
>> This is another patch that removes the use of SymbolTable on hot path
>> in stack trace creation. We
>> can inject Class.source_file field to cache the source file name.
>> Some caution is needed to properly
>> handle invalidation when redefinition happens.
>
> I'm struggling a bit with the redefinition logic. IIRC redefinition
> can only happen at a safepoint so if there are concurrent calls to
> fillInStackTrace that involve a given class Foo, then they must all
> see the same version of Foo, and we can not have the case where one
> execution of the code is clearing the stale cache, while another is
> setting it to the new value - right?
>
> That said, IIRC Coleen stated that intern can lead to a safepoint,
> which would then invalidate the existing redefinition logic because we
> would get the line number after the intern and it may now be
> incorrect. So I think we have to reorder the code so the
> get_line_number occurs before the call to intern.

In either case, if you redefine the method while calling StringTable::intern, you'll get the line number from the old method. Either before or after the call to String.intern. It won't crash though, and actually the exception likely happened at the old method's line number.
The only reason we need to specially handle source_file_name is that the redefined class replaces it in the InstanceKlass and we don't have the old one available.

>
> I'm also very unclear about how the redefinition case is currently
> handled. It seems that we will normally intern NULL (and presumably
> get a NULL or empty-string oop?) unless ShowHiddenFrames is set, in
> which case we use the unknown_class_name() - regardless of whether the
> frame is actually hidden or not! This seems broken to me. (Separate
> bug to fix that is okay if it is indeed broken.)

This looks like a bug, but I'm not sure what ShowHiddenFrames is supposed to do here, or how it got there. I think if Aleksey removed that with this patch it would be fine with me.

Coleen

>
> A couple of comments on comments:
>
> 2616     if (source != NULL) {
> 2617       // Class was not redefined, can trust its cache.
> 2618       if (source_file == NULL) {
>
> Can you expand the comment as follows:
>
> // Class was not redefined. We can trust its cache if set,
> // else we have to initialize it.
>
> 2622     } else {
> 2623       // Dump the cache in case class had it: it must be have
> been redefined.
> 2624       if (source_file != NULL) {
>
> Can you change the comment to be more consistent with the previous one:
>
> // Class was redefined. Dump the cache if it was set.
>
> Thanks,
> David
> -----
>
>> This makes stack trace generation significantly faster, and finally
>> better than it used to even
>> before StackWalker and StringTable-related regressions in 9 and 11.
>>
>> Benchmark             (depth)  Mode  Cnt    Score   Error  Units
>>
>> # 8u
>> StackTraceBench.test        1  avgt   15   10.851 ± 0.075  us/op
>> StackTraceBench.test       10  avgt   15   15.325 ± 0.089  us/op
>> StackTraceBench.test      100  avgt   15   59.717 ± 0.449  us/op
>> StackTraceBench.test     1000  avgt   15  529.020 ± 3.654  us/op
>>
>> # jdk/jdk baseline
>> StackTraceBench.test        1  avgt   15   15.077 ±
0.065 us/op
>> StackTraceBench.test       10  avgt   15   21.153 ± 0.123  us/op
>> StackTraceBench.test      100  avgt   15   80.758 ± 0.363  us/op
>> StackTraceBench.test     1000  avgt   15  674.888 ± 4.985  us/op
>>
>> # jdk/jdk patched
>> StackTraceBench.test        1  avgt   15    8.892 ± 0.064  us/op
>> StackTraceBench.test       10  avgt   15   12.010 ± 0.079  us/op
>> StackTraceBench.test      100  avgt   15   43.091 ± 0.254  us/op
>> StackTraceBench.test     1000  avgt   15  353.194 ± 2.040  us/op
>>
>> Testing: hotspot tier1, jdk-submit, ad-hoc benchmarks
>>
>> Thanks,
>> -Aleksey
>>

From coleen.phillimore at oracle.com Fri Jan 11 23:45:18 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 11 Jan 2019 18:45:18 -0500
Subject: RFR (S) 8216308: StackTraceElement::fill_in can use injected Class source-file
In-Reply-To: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com>
References: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com>
Message-ID:

http://cr.openjdk.java.net/~shade/8216308/webrev.01/src/hotspot/share/classfile/javaClasses.cpp.frames.html

2629     if (ShowHiddenFrames) {
2630       source = vmSymbols::unknown_class_name();
2631       source_file = StringTable::intern(source, CHECK);
2632     }

I wonder if ShowHiddenFrames works at all. This seems okay though. Maybe we should file a bug report to figure out what ShowHiddenFrames is supposed to do.

I think this change looks good and handles the redefinition code as correctly as we can (this is sort of a corner case that is rarely found).

Thanks,
Coleen

On 1/10/19 10:21 AM, Aleksey Shipilev wrote:
> RFE:
> https://bugs.openjdk.java.net/browse/JDK-8216308
>
> Fix:
> http://cr.openjdk.java.net/~shade/8216308/webrev.01/
>
> This is another patch that removes the use of SymbolTable on hot path in stack trace creation. We
> can inject Class.source_file field to cache the source file name. Some caution is needed to properly
> handle invalidation when redefinition happens.
>
> This makes stack trace generation significantly faster, and finally better than it used to even
> before StackWalker and StringTable-related regressions in 9 and 11.
>
> Benchmark             (depth)  Mode  Cnt    Score   Error  Units
>
> # 8u
> StackTraceBench.test        1  avgt   15   10.851 ± 0.075  us/op
> StackTraceBench.test       10  avgt   15   15.325 ± 0.089  us/op
> StackTraceBench.test      100  avgt   15   59.717 ± 0.449  us/op
> StackTraceBench.test     1000  avgt   15  529.020 ± 3.654  us/op
>
> # jdk/jdk baseline
> StackTraceBench.test        1  avgt   15   15.077 ± 0.065  us/op
> StackTraceBench.test       10  avgt   15   21.153 ± 0.123  us/op
> StackTraceBench.test      100  avgt   15   80.758 ± 0.363  us/op
> StackTraceBench.test     1000  avgt   15  674.888 ± 4.985  us/op
>
> # jdk/jdk patched
> StackTraceBench.test        1  avgt   15    8.892 ± 0.064  us/op
> StackTraceBench.test       10  avgt   15   12.010 ± 0.079  us/op
> StackTraceBench.test      100  avgt   15   43.091 ± 0.254  us/op
> StackTraceBench.test     1000  avgt   15  353.194 ± 2.040  us/op
>
> Testing: hotspot tier1, jdk-submit, ad-hoc benchmarks
>
> Thanks,
> -Aleksey
>

From david.holmes at oracle.com Sat Jan 12 00:43:33 2019
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 12 Jan 2019 10:43:33 +1000
Subject: RFR (S) 8216308: StackTraceElement::fill_in can use injected Class source-file
In-Reply-To: <10518795-18f3-b51b-5068-412eaec772e1@oracle.com>
References: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com> <75513caf-5b58-7eed-f7c6-a147facba77a@oracle.com> <10518795-18f3-b51b-5068-412eaec772e1@oracle.com>
Message-ID: <32c5c310-e066-8e3b-6443-fe26ec174884@oracle.com>

On 12/01/2019 9:33 am, coleen.phillimore at oracle.com wrote:
>
>
> On 1/10/19 7:22 PM, David Holmes wrote:
>> Hi Aleksey,
>>
>> On 11/01/2019 1:21 am, Aleksey Shipilev wrote:
>>> RFE:
>>>    https://bugs.openjdk.java.net/browse/JDK-8216308
>>>
>>> Fix:
>>>    http://cr.openjdk.java.net/~shade/8216308/webrev.01/
>>>
>>> This is another patch that removes the use of SymbolTable on hot path
>>> in stack trace creation.
We
>>> can inject Class.source_file field to cache the source file name.
>>> Some caution is needed to properly
>>> handle invalidation when redefinition happens.
>>
>> I'm struggling a bit with the redefinition logic. IIRC redefinition
>> can only happen at a safepoint so if there are concurrent calls to
>> fillInStackTrace that involve a given class Foo, then they must all
>> see the same version of Foo, and we can not have the case where one
>> execution of the code is clearing the stale cache, while another is
>> setting it to the new value - right?
>>
>> That said, IIRC Coleen stated that intern can lead to a safepoint,
>> which would then invalidate the existing redefinition logic because we
>> would get the line number after the intern and it may now be
>> incorrect. So I think we have to reorder the code so the
>> get_line_number occurs before the call to intern.
>
> In either case, if you redefine the method while calling
> StringTable::intern, you'll get the line number from the old method.
> Either before or after the call to String.intern. It won't crash
> though, and actually the exception likely happened at the old method's
> line number. The only reason we need to specially handle
> source_file_name is that the redefined class replaces it in the
> InstanceKlass and we don't have the old one available.

But we already have special handling for the line number if redefinition of the method occurred. So we should ensure we don't allow redefinition to occur before actually capturing the line number.

>>
>> I'm also very unclear about how the redefinition case is currently
>> handled. It seems that we will normally intern NULL (and presumably
>> get a NULL or empty-string oop?) unless ShowHiddenFrames is set, in
>> which case we use the unknown_class_name() - regardless of whether the
>> frame is actually hidden or not! This seems broken to me. (Separate
>> bug to fix that is okay if it is indeed broken.)
>
> This looks like a bug, but I'm not sure what ShowHiddenFrames is
> supposed to do here, or how it got there. I think if Aleksey removed
> that with this patch it would be fine with me.

I think use of ShowHiddenFrames here is completely broken. But a separate bug and some suitable archaeology are needed to fix it the right way.

David

> Coleen
>>
>> A couple of comments on comments:
>>
>> 2616     if (source != NULL) {
>> 2617       // Class was not redefined, can trust its cache.
>> 2618       if (source_file == NULL) {
>>
>> Can you expand the comment as follows:
>>
>> // Class was not redefined. We can trust its cache if set,
>> // else we have to initialize it.
>>
>> 2622     } else {
>> 2623       // Dump the cache in case class had it: it must be have
>> been redefined.
>> 2624       if (source_file != NULL) {
>>
>> Can you change the comment to be more consistent with the previous one:
>>
>> // Class was redefined. Dump the cache if it was set.
>>
>> Thanks,
>> David
>> -----
>>
>>> This makes stack trace generation significantly faster, and finally
>>> better than it used to even
>>> before StackWalker and StringTable-related regressions in 9 and 11.
>>>
>>> Benchmark             (depth)  Mode  Cnt    Score   Error  Units
>>>
>>> # 8u
>>> StackTraceBench.test        1  avgt   15   10.851 ± 0.075  us/op
>>> StackTraceBench.test       10  avgt   15   15.325 ± 0.089  us/op
>>> StackTraceBench.test      100  avgt   15   59.717 ± 0.449  us/op
>>> StackTraceBench.test     1000  avgt   15  529.020 ± 3.654  us/op
>>>
>>> # jdk/jdk baseline
>>> StackTraceBench.test        1  avgt   15   15.077 ± 0.065  us/op
>>> StackTraceBench.test       10  avgt   15   21.153 ± 0.123  us/op
>>> StackTraceBench.test      100  avgt   15   80.758 ± 0.363  us/op
>>> StackTraceBench.test     1000  avgt   15  674.888 ± 4.985  us/op
>>>
>>> # jdk/jdk patched
>>> StackTraceBench.test        1  avgt   15    8.892 ± 0.064  us/op
>>> StackTraceBench.test       10  avgt   15   12.010 ± 0.079  us/op
>>> StackTraceBench.test      100
avgt   15   43.091 ± 0.254  us/op
>>> StackTraceBench.test     1000  avgt   15  353.194 ± 2.040  us/op
>>>
>>> Testing: hotspot tier1, jdk-submit, ad-hoc benchmarks
>>>
>>> Thanks,
>>> -Aleksey
>>>
>

From shade at redhat.com Sat Jan 12 12:28:05 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Sat, 12 Jan 2019 13:28:05 +0100
Subject: RFR (XS) 8216589: s390x build failures after JDK-8216167 (Update include guards to reflect correct directories)
Message-ID: <6f3b90a8-bd39-23a9-f412-383ab44f7f66@redhat.com>

Bug:
  https://bugs.openjdk.java.net/browse/JDK-8216589

Fix:

diff -r 2969ff55c29b src/hotspot/cpu/s390/codeBuffer_s390.hpp
--- a/src/hotspot/cpu/s390/codeBuffer_s390.hpp Fri Jan 11 14:24:23 2019 -0800
+++ b/src/hotspot/cpu/s390/codeBuffer_s390.hpp Sat Jan 12 13:22:19 2019 +0100
@@ -32,7 +32,6 @@
  public:
   void flush_bundle(bool start_new_bundle) {}

   void getCpuData(const CodeBuffer * const cb) {}

-#endif // CPU_S390_VM_CODEBUFFER_S390_HPP
 #endif // CPU_S390_CODEBUFFER_S390_HPP
diff -r 2969ff55c29b src/hotspot/os_cpu/linux_s390/globals_linux_s390.hpp
--- a/src/hotspot/os_cpu/linux_s390/globals_linux_s390.hpp Fri Jan 11 14:24:23 2019 -0800
+++ b/src/hotspot/os_cpu/linux_s390/globals_linux_s390.hpp Sat Jan 12 13:22:19 2019 +0100
@@ -47,7 +47,6 @@
 define_pd_global(size_t, JVMInvokeMethodSlack,    8192);

 // Only used on 64 bit platforms.
 define_pd_global(size_t, HeapBaseMinAddress,      2*G);

-#endif // OS_CPU_LINUX_S390_VM_GLOBALS_LINUX_S390_HPP
 #endif // OS_CPU_LINUX_S390_GLOBALS_LINUX_S390_HPP

Verified no other files like that exist by running:
  $ find src/hotspot -type f -exec grep -H HPP$ {} \; | grep endif | cut -d":" -f 1 | sort | uniq -d

Testing: s390x {fastdebug,release} cross-compile, x86_64 compile, grepping source

Thanks,
-Aleksey

From david.holmes at oracle.com Sat Jan 12 12:30:36 2019
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 12 Jan 2019 22:30:36 +1000
Subject: RFR (XS) 8216589: s390x build failures after JDK-8216167 (Update include guards to reflect correct directories)
In-Reply-To: <6f3b90a8-bd39-23a9-f412-383ab44f7f66@redhat.com>
References: <6f3b90a8-bd39-23a9-f412-383ab44f7f66@redhat.com>
Message-ID: <2b98e0f2-cae2-212b-ed70-1411dece2430@oracle.com>

Ship it! Trivial fix.

David

On 12/01/2019 10:28 pm, Aleksey Shipilev wrote:
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8216589
>
> Fix:
>
> diff -r 2969ff55c29b src/hotspot/cpu/s390/codeBuffer_s390.hpp
> --- a/src/hotspot/cpu/s390/codeBuffer_s390.hpp Fri Jan 11 14:24:23 2019 -0800
> +++ b/src/hotspot/cpu/s390/codeBuffer_s390.hpp Sat Jan 12 13:22:19 2019 +0100
> @@ -32,7 +32,6 @@
>  public:
>   void flush_bundle(bool start_new_bundle) {}
>
>   void getCpuData(const CodeBuffer * const cb) {}
>
> -#endif // CPU_S390_VM_CODEBUFFER_S390_HPP
>  #endif // CPU_S390_CODEBUFFER_S390_HPP
> diff -r 2969ff55c29b src/hotspot/os_cpu/linux_s390/globals_linux_s390.hpp
> --- a/src/hotspot/os_cpu/linux_s390/globals_linux_s390.hpp Fri Jan 11 14:24:23 2019 -0800
> +++ b/src/hotspot/os_cpu/linux_s390/globals_linux_s390.hpp Sat Jan 12 13:22:19 2019 +0100
> @@ -47,7 +47,6 @@
>  define_pd_global(size_t, JVMInvokeMethodSlack,    8192);
>
>  // Only used on 64 bit platforms.
> define_pd_global(size_t, HeapBaseMinAddress, 2*G); > > -#endif // OS_CPU_LINUX_S390_VM_GLOBALS_LINUX_S390_HPP > #endif // OS_CPU_LINUX_S390_GLOBALS_LINUX_S390_HPP > > > Verified no other files like that exist by running: > $ find src/hotspot -type f -exec grep -H HPP$ {} \; | grep endif | cut -d":" -f 1 | sort | uniq -d > > Testing: s390x {fastdebug,release} cross-compile, x86_64 compile, grepping source > > Thanks, > -Aleksey > From shade at redhat.com Sat Jan 12 12:37:42 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Sat, 12 Jan 2019 13:37:42 +0100 Subject: RFR (XS) 8216589: s390x build failures after JDK-8216167 (Update include guards to reflect correct directories) In-Reply-To: <2b98e0f2-cae2-212b-ed70-1411dece2430@oracle.com> References: <6f3b90a8-bd39-23a9-f412-383ab44f7f66@redhat.com> <2b98e0f2-cae2-212b-ed70-1411dece2430@oracle.com> Message-ID: <3722bf23-45e4-8ca3-2ea8-32a46dbf5c60@redhat.com> Aye. Pushed. -Aleksey On 1/12/19 1:30 PM, David Holmes wrote: > Ship it! Trivial fix. > > David > > On 12/01/2019 10:28 pm, Aleksey Shipilev wrote: >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8216589 >> >> Fix: >> >> diff -r 2969ff55c29b src/hotspot/cpu/s390/codeBuffer_s390.hpp >> --- a/src/hotspot/cpu/s390/codeBuffer_s390.hpp Fri Jan 11 14:24:23 2019 -0800 >> +++ b/src/hotspot/cpu/s390/codeBuffer_s390.hpp Sat Jan 12 13:22:19 2019 +0100 >> @@ -32,7 +32,6 @@ >> public: >> void flush_bundle(bool start_new_bundle) {} >> >> void getCpuData(const CodeBuffer * const cb) {} >> >> -#endif // CPU_S390_VM_CODEBUFFER_S390_HPP >> #endif // CPU_S390_CODEBUFFER_S390_HPP >> diff -r 2969ff55c29b src/hotspot/os_cpu/linux_s390/globals_linux_s390.hpp >> --- a/src/hotspot/os_cpu/linux_s390/globals_linux_s390.hpp Fri Jan 11 14:24:23 2019 -0800 >> +++ b/src/hotspot/os_cpu/linux_s390/globals_linux_s390.hpp Sat Jan 12 13:22:19 2019 +0100 >> @@ -47,7 +47,6 @@ >> define_pd_global(size_t, JVMInvokeMethodSlack, 8192); >> >>
// Only used on 64 bit platforms. >> define_pd_global(size_t, HeapBaseMinAddress, 2*G); >> >> -#endif // OS_CPU_LINUX_S390_VM_GLOBALS_LINUX_S390_HPP >> #endif // OS_CPU_LINUX_S390_GLOBALS_LINUX_S390_HPP >> >> >> Verified no other files like that exist by running: >> $ find src/hotspot -type f -exec grep -H HPP$ {} \; | grep endif | cut -d":" -f 1 | sort | uniq -d >> >> Testing: s390x {fastdebug,release} cross-compile, x86_64 compile, grepping source >> >> Thanks, >> -Aleksey >> From amith.pawar at gmail.com Sat Jan 12 17:27:44 2019 From: amith.pawar at gmail.com (amith pawar) Date: Sat, 12 Jan 2019 22:57:44 +0530 Subject: RFR (S/M): 8213827: NUMA heap allocation does not respect process membind/interleave settings [Was: Re: [PATCH] JDK NUMA Interleaving issue] In-Reply-To: <2bc2fce2403324739a030d929b847fece95b0e25.camel@oracle.com> References: <6e5b102d07b4ceded09115a649be020410240fe7.camel@oracle.com> <9bea7b0957bbfc2f0ac34306ee162f2d98e44bfe.camel@oracle.com> <99164b92f47f264978339ed327da9d41098a7e1d.camel@oracle.com> <10ecfa0f-eb78-869a-4d5a-991f55ec57ea@oracle.com> <3b8edd37-80cd-0f06-55ed-326972db98de@oracle.com> <2bc2fce2403324739a030d929b847fece95b0e25.camel@oracle.com> Message-ID: Hi Thomas, SPECJBB shows the following improvements with the latest patch. 1. max-jOPs around 7-9% 2. critical-jOPS around 4-50% In webrev.4, the following change suggested by Sangheon is missing: + ls.print("UseNUMA is enabled and invoked in '%s' mode." + " Heap will be configured using NUMA memory nodes:", numa_mode); There is one more space before " Heap.... ", please remove it. Also, os_linux.hpp has already been updated for the new copyright year, so the patch import fails. The attached patch contains these changes. Please do check.
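For anyone following the spacing question: adjacent C++ string literals are joined at compile time with nothing in between, so the separating space has to sit in one of the two literals. Below is a minimal standalone sketch of that behaviour. It assumes plain snprintf in place of LogStream::print, and format_numa_banner is a hypothetical helper for illustration, not a HotSpot function.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper for illustration only; HotSpot itself goes through
// LogStream::print rather than snprintf.
std::string format_numa_banner(const char* numa_mode) {
  assert(numa_mode != nullptr);
  char buf[256];
  // The two literals below are concatenated by the compiler. The trailing
  // space after the full stop keeps "mode." and "Heap" apart in the output;
  // without it the message would read "mode.Heap".
  std::snprintf(buf, sizeof(buf),
                "UseNUMA is enabled and invoked in '%s' mode. "
                "Heap will be configured using NUMA memory nodes:",
                numa_mode);
  return std::string(buf);
}
```

Whether the space is kept at the end of the first literal or at the start of the second is purely a layout choice; the resulting string is identical.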
Thanks, Amit On Fri, Jan 11, 2019 at 7:58 PM Thomas Schatzl wrote: > Hi all, > > I prepared new webrevs with the suggestions from Sangheon, and I > think the changes Amit did: > > http://cr.openjdk.java.net/~tschatzl/8213827/webrev.3_to_4/ (diff) > http://cr.openjdk.java.net/~tschatzl/8213827/webrev.4/ (full) > > There were additional fixes: > - crash at startup with -XX:+UseNUMA since numa_interleave_memory() is > called before NUMA support has been initialized due to some earlier > refactoring (the cause is wrong library initialization order at startup > imho, but is to be handled separately). > - also CamelCased the enum values. > - removed some too compliated code ("return (a == b) ? true : false;" > and one other). > - removed some more random spacing > > This patch passes hs-tier1-5 without additional failures now. Some > manual testing showed that it seems to do the right thing too now. > > @Amith: please check if the change still works on your applications and > still gives the expected performance improvements. > > Thanks, > Thomas > > On Thu, 2019-01-10 at 16:48 +0530, amith pawar wrote: > > Hi Sangheon, > > > > Thanks again. I have done the required changes and created webrev. > > Please use following link to download the same as gmail is not > > allowing to attach. > > > > https://drive.google.com/open?id=1QzmW6LdmKbBNHp4-hlcIr9DKY7anUMBy > > > > Thanks, > > Amit > > > > On Thu, Jan 10, 2019 at 1:03 AM wrote: > > > Hi Amith, > > > > > > On 1/9/19 3:30 AM, amith pawar wrote: > > > > Hi Sangheon, > > > > > > > > Thanks for reviewing and updated with suggested changes. please > > > > check. > > > Thank you for addressing my comments. > > > But I can't see below comments addressed: > > > > > - Looking at 'enum' at os.hpp, we use Camel style. > > > I meant to change from 'Numa_allocation_policy' to > > > 'NumaAllocationPolicy'. > > > > > - As usual, copyright year updates. I know it was correct when > > > > > you posted. 
:) > > > Looking at the latest source code, only os_linux.hpp needs a new > > > copyright year. > > > - * Copyright (c) 1999, 2018, Oracle and/or its affiliates. All > > > rights reserved. > > > + * Copyright (c) 1999, 2019, Oracle and/or its affiliates. All > > > rights reserved. > > > > > > Looking at the v5, > > > ??+ ls.print("UseNUMA is enabled and invoked in '%s' mode." > > > + " Heap will be configured using NUMA memory > > > nodes:", numa_mode); > > > There is one more space before " Heap.... ", please remove it. > > > > > > I see the latest version that Thomas posted is v3, but your > > > attached version is v5. :) > > > > > > In addition, it would be better to provide webrev instead of a > > > patch. ( http://openjdk.java.net/guide/codeReview.html ) > > > > > > Thanks, > > > Sangheon > > > > > > > Thanks, > > > > Amit Pawar > > > > > > > > On Wed, Jan 9, 2019 at 12:45 AM wrote: > > > > > Hi Thomas, > > > > > > > > > > On 12/13/18 2:33 AM, Thomas Schatzl wrote: > > > > > > Hi Amit, > > > > > > On Thu, 2018-12-13 at 15:11 +0530, amith pawar wrote: > > > > > > > Hi Thomas, > > > > > > > > > > > > > > Please find the attached patch updated as per your > > > > > > > suggestion. > > > > > > > If everything OK then can you please commit this to repo ? > > > > > > > > > > > > looks good. We will need a second reviewer though, I am > > > > > > going to ask > > > > > > around. > > > > > > > > > > > > Latest webrev: > > > > > > http://cr.openjdk.java.net/~tschatzl/8213827/webrev.3/ > > > > > Webrev.3 looks good to me. > > > > > > > > > > I have some minor nits: > > > > > ---------------------------------------- > > > > > src/hotspot/os/linux/os_linux.cpp > > > > > 5012 for (int node = 0; node < Linux::numa_max_node(); > > > > > node++) { > > > > > - Looks like 'node <= Linux::numa_max_node()' is the right one > > > > > to print the latest node? 
> > > > > > > > > > ---------------------------------------- > > > > > src/hotspot/os/linux/os_linux.hpp > > > > > 271 enum Numa_allocation_policy{ > > > > > - Looking at 'enum' at os.hpp, we use Camel style. > > > > > - There are missing space before '{'. > > > > > > > > > > - As usual, copyright year updates. I know it was correct when > > > > > you posted. :) > > > > > > > > > > Thanks, > > > > > Sangheon > > > > > > > > > > > > > > > > Thanks, > > > > > > Thomas > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > With best regards, > > > > amit pawar > > > > > > > > > > -- With best regards, amit pawar -------------- next part -------------- diff -r 5d7e4d832868 src/hotspot/os/linux/os_linux.cpp --- a/src/hotspot/os/linux/os_linux.cpp Sat Jan 12 13:33:18 2019 +0100 +++ b/src/hotspot/os/linux/os_linux.cpp Sat Jan 12 23:11:54 2019 +0530 @@ -33,6 +33,7 @@ #include "compiler/disassembler.hpp" #include "interpreter/interpreter.hpp" #include "logging/log.hpp" +#include "logging/logStream.hpp" #include "memory/allocation.inline.hpp" #include "memory/filemap.hpp" #include "oops/oop.inline.hpp" @@ -2780,7 +2781,7 @@ // Get the total number of nodes in the system including nodes without memory. for (node = 0; node <= highest_node_number; node++) { - if (isnode_in_existing_nodes(node)) { + if (is_node_in_existing_nodes(node)) { num_nodes++; } } @@ -2796,7 +2797,7 @@ // node number. If the nodes have been bound explicitly using numactl membind, // then allocate memory from those nodes only. 
for (int node = 0; node <= highest_node_number; node++) { - if (Linux::isnode_in_bound_nodes((unsigned int)node)) { + if (Linux::is_node_in_bound_nodes((unsigned int)node)) { ids[i++] = node; } } @@ -2899,11 +2900,15 @@ libnuma_dlsym(handle, "numa_distance"))); set_numa_get_membind(CAST_TO_FN_PTR(numa_get_membind_func_t, libnuma_v2_dlsym(handle, "numa_get_membind"))); + set_numa_get_interleave_mask(CAST_TO_FN_PTR(numa_get_interleave_mask_func_t, + libnuma_v2_dlsym(handle, "numa_get_interleave_mask"))); if (numa_available() != -1) { set_numa_all_nodes((unsigned long*)libnuma_dlsym(handle, "numa_all_nodes")); set_numa_all_nodes_ptr((struct bitmask **)libnuma_dlsym(handle, "numa_all_nodes_ptr")); set_numa_nodes_ptr((struct bitmask **)libnuma_dlsym(handle, "numa_nodes_ptr")); + set_numa_interleave_bitmask(_numa_get_interleave_mask()); + set_numa_membind_bitmask(_numa_get_membind()); // Create an index -> node mapping, since nodes are not always consecutive _nindex_to_node = new (ResourceObj::C_HEAP, mtInternal) GrowableArray(0, true); rebuild_nindex_to_node_map(); @@ -2929,7 +2934,7 @@ nindex_to_node()->clear(); for (int node = 0; node <= highest_node_number; node++) { - if (Linux::isnode_in_existing_nodes(node)) { + if (Linux::is_node_in_existing_nodes(node)) { nindex_to_node()->append(node); } } @@ -2966,16 +2971,16 @@ // the closest configured node. Check also if node is bound, i.e. it's allowed // to allocate memory from the node. If it's not allowed, map cpus in that node // to the closest node from which memory allocation is allowed. - if (!isnode_in_configured_nodes(nindex_to_node()->at(i)) || - !isnode_in_bound_nodes(nindex_to_node()->at(i))) { + if (!is_node_in_configured_nodes(nindex_to_node()->at(i)) || + !is_node_in_bound_nodes(nindex_to_node()->at(i))) { closest_distance = INT_MAX; // Check distance from all remaining nodes in the system. Ignore distance // from itself, from another non-configured node, and from another non-bound // node. 
for (size_t m = 0; m < node_num; m++) { if (m != i && - isnode_in_configured_nodes(nindex_to_node()->at(m)) && - isnode_in_bound_nodes(nindex_to_node()->at(m))) { + is_node_in_configured_nodes(nindex_to_node()->at(m)) && + is_node_in_bound_nodes(nindex_to_node()->at(m))) { distance = numa_distance(nindex_to_node()->at(i), nindex_to_node()->at(m)); // If a closest node is found, update. There is always at least one // configured and bound node in the system so there is always at least @@ -3030,9 +3035,13 @@ os::Linux::numa_bitmask_isbitset_func_t os::Linux::_numa_bitmask_isbitset; os::Linux::numa_distance_func_t os::Linux::_numa_distance; os::Linux::numa_get_membind_func_t os::Linux::_numa_get_membind; +os::Linux::numa_get_interleave_mask_func_t os::Linux::_numa_get_interleave_mask; +os::Linux::NumaAllocationPolicy os::Linux::_current_numa_policy; unsigned long* os::Linux::_numa_all_nodes; struct bitmask* os::Linux::_numa_all_nodes_ptr; struct bitmask* os::Linux::_numa_nodes_ptr; +struct bitmask* os::Linux::_numa_interleave_bitmask; +struct bitmask* os::Linux::_numa_membind_bitmask; bool os::pd_uncommit_memory(char* addr, size_t size) { uintptr_t res = (uintptr_t) ::mmap(addr, size, PROT_NONE, @@ -4944,6 +4953,74 @@ OSContainer::init(); } +void os::Linux::numa_init() { + + // Java can be invoked as + // 1. Without numactl and heap will be allocated/configured on all nodes as + // per the system policy. + // 2. With numactl --interleave: + // Use numa_get_interleave_mask(v2) API to get nodes bitmask. The same + // API for membind case bitmask is reset. + // Interleave is only hint and Kernel can fallback to other nodes if + // no memory is available on the target nodes. + // 3. With numactl --membind: + // Use numa_get_membind(v2) API to get nodes bitmask. The same API for + // interleave case returns bitmask of all nodes. + // numa_all_nodes_ptr holds bitmask of all nodes. 
+ // numa_get_interleave_mask(v2) and numa_get_membind(v2) APIs returns correct + // bitmask when externally configured to run on all or fewer nodes. + + if (!Linux::libnuma_init()) { + UseNUMA = false; + } else { + if ((Linux::numa_max_node() < 1) || Linux::is_bound_to_single_node()) { + // If there's only one node (they start from 0) or if the process + // is bound explicitly to a single node using membind, disable NUMA. + UseNUMA = false; + } else { + + LogTarget(Info,os) log; + LogStream ls(log); + + Linux::set_configured_numa_policy(Linux::identify_numa_policy()); + + struct bitmask* bmp = Linux::_numa_membind_bitmask; + const char* numa_mode = "membind"; + + if (Linux::is_running_in_interleave_mode()) { + bmp = Linux::_numa_interleave_bitmask; + numa_mode = "interleave"; + } + + ls.print("UseNUMA is enabled and invoked in '%s' mode." + "Heap will be configured using NUMA memory nodes:", numa_mode); + + for (int node = 0; node <= Linux::numa_max_node(); node++) { + if (Linux::_numa_bitmask_isbitset(bmp, node)) { + ls.print(" %d", node); + } + } + } + } + + if (UseParallelGC && UseNUMA && UseLargePages && !can_commit_large_page_memory()) { + // With SHM and HugeTLBFS large pages we cannot uncommit a page, so there's no way + // we can make the adaptive lgrp chunk resizing work. If the user specified both + // UseNUMA and UseLargePages (or UseSHM/UseHugeTLBFS) on the command line - warn + // and disable adaptive resizing. 
+ if (UseAdaptiveSizePolicy || UseAdaptiveNUMAChunkSizing) { + warning("UseNUMA is not fully compatible with SHM/HugeTLBFS large pages, " + "disabling adaptive resizing (-XX:-UseAdaptiveSizePolicy -XX:-UseAdaptiveNUMAChunkSizing)"); + UseAdaptiveSizePolicy = false; + UseAdaptiveNUMAChunkSizing = false; + } + } + + if (!UseNUMA && ForceNUMA) { + UseNUMA = true; + } +} + // this is called _after_ the global arguments have been parsed jint os::init_2(void) { @@ -4988,32 +5065,7 @@ Linux::glibc_version(), Linux::libpthread_version()); if (UseNUMA) { - if (!Linux::libnuma_init()) { - UseNUMA = false; - } else { - if ((Linux::numa_max_node() < 1) || Linux::isbound_to_single_node()) { - // If there's only one node (they start from 0) or if the process - // is bound explicitly to a single node using membind, disable NUMA. - UseNUMA = false; - } - } - - if (UseParallelGC && UseNUMA && UseLargePages && !can_commit_large_page_memory()) { - // With SHM and HugeTLBFS large pages we cannot uncommit a page, so there's no way - // we can make the adaptive lgrp chunk resizing work. If the user specified both - // UseNUMA and UseLargePages (or UseSHM/UseHugeTLBFS) on the command line - warn - // and disable adaptive resizing. 
- if (UseAdaptiveSizePolicy || UseAdaptiveNUMAChunkSizing) { - warning("UseNUMA is not fully compatible with SHM/HugeTLBFS large pages, " - "disabling adaptive resizing (-XX:-UseAdaptiveSizePolicy -XX:-UseAdaptiveNUMAChunkSizing)"); - UseAdaptiveSizePolicy = false; - UseAdaptiveNUMAChunkSizing = false; - } - } - - if (!UseNUMA && ForceNUMA) { - UseNUMA = true; - } + Linux::numa_init(); } if (MaxFDLimit) { diff -r 5d7e4d832868 src/hotspot/os/linux/os_linux.hpp --- a/src/hotspot/os/linux/os_linux.hpp Sat Jan 12 13:33:18 2019 +0100 +++ b/src/hotspot/os/linux/os_linux.hpp Sat Jan 12 23:11:54 2019 +0530 @@ -211,6 +211,7 @@ // none present private: + static void numa_init(); static void expand_stack_to(address bottom); typedef int (*sched_getcpu_func_t)(void); @@ -222,6 +223,7 @@ typedef void (*numa_interleave_memory_func_t)(void *start, size_t size, unsigned long *nodemask); typedef void (*numa_interleave_memory_v2_func_t)(void *start, size_t size, struct bitmask* mask); typedef struct bitmask* (*numa_get_membind_func_t)(void); + typedef struct bitmask* (*numa_get_interleave_mask_func_t)(void); typedef void (*numa_set_bind_policy_func_t)(int policy); typedef int (*numa_bitmask_isbitset_func_t)(struct bitmask *bmp, unsigned int n); @@ -239,9 +241,12 @@ static numa_bitmask_isbitset_func_t _numa_bitmask_isbitset; static numa_distance_func_t _numa_distance; static numa_get_membind_func_t _numa_get_membind; + static numa_get_interleave_mask_func_t _numa_get_interleave_mask; static unsigned long* _numa_all_nodes; static struct bitmask* _numa_all_nodes_ptr; static struct bitmask* _numa_nodes_ptr; + static struct bitmask* _numa_interleave_bitmask; + static struct bitmask* _numa_membind_bitmask; static void set_sched_getcpu(sched_getcpu_func_t func) { _sched_getcpu = func; } static void set_numa_node_to_cpus(numa_node_to_cpus_func_t func) { _numa_node_to_cpus = func; } @@ -255,10 +260,21 @@ static void set_numa_bitmask_isbitset(numa_bitmask_isbitset_func_t func) { 
_numa_bitmask_isbitset = func; } static void set_numa_distance(numa_distance_func_t func) { _numa_distance = func; } static void set_numa_get_membind(numa_get_membind_func_t func) { _numa_get_membind = func; } + static void set_numa_get_interleave_mask(numa_get_interleave_mask_func_t func) { _numa_get_interleave_mask = func; } static void set_numa_all_nodes(unsigned long* ptr) { _numa_all_nodes = ptr; } static void set_numa_all_nodes_ptr(struct bitmask **ptr) { _numa_all_nodes_ptr = (ptr == NULL ? NULL : *ptr); } static void set_numa_nodes_ptr(struct bitmask **ptr) { _numa_nodes_ptr = (ptr == NULL ? NULL : *ptr); } + static void set_numa_interleave_bitmask(struct bitmask* ptr) { _numa_interleave_bitmask = ptr ; } + static void set_numa_membind_bitmask(struct bitmask* ptr) { _numa_membind_bitmask = ptr ; } static int sched_getcpu_syscall(void); + + enum NumaAllocationPolicy { + NotInitialized, + Membind, + Interleave + }; + static NumaAllocationPolicy _current_numa_policy; + public: static int sched_getcpu() { return _sched_getcpu != NULL ? _sched_getcpu() : -1; } static int numa_node_to_cpus(int node, unsigned long *buffer, int bufferlen) { @@ -272,11 +288,33 @@ static int numa_tonode_memory(void *start, size_t size, int node) { return _numa_tonode_memory != NULL ? 
_numa_tonode_memory(start, size, node) : -1; } + + static bool is_running_in_interleave_mode() { + return _current_numa_policy == Interleave; + } + + static void set_configured_numa_policy(NumaAllocationPolicy numa_policy) { + _current_numa_policy = numa_policy ; + } + + static NumaAllocationPolicy identify_numa_policy() { + for (int node = 0; node <= Linux::numa_max_node(); node++) { + if (Linux::_numa_bitmask_isbitset(Linux::_numa_interleave_bitmask, node)) { + return Interleave; + } + } + return Membind; + } + static void numa_interleave_memory(void *start, size_t size) { - // Use v2 api if available - if (_numa_interleave_memory_v2 != NULL && _numa_all_nodes_ptr != NULL) { - _numa_interleave_memory_v2(start, size, _numa_all_nodes_ptr); - } else if (_numa_interleave_memory != NULL && _numa_all_nodes != NULL) { + // Prefer v2 API + if (_numa_interleave_memory_v2 != NULL) { + if (is_running_in_interleave_mode()) { + _numa_interleave_memory_v2(start, size, _numa_interleave_bitmask); + } else if (_numa_membind_bitmask != NULL) { + _numa_interleave_memory_v2(start, size, _numa_membind_bitmask); + } + } else if (_numa_interleave_memory != NULL) { _numa_interleave_memory(start, size, _numa_all_nodes); } } @@ -291,14 +329,14 @@ static int get_node_by_cpu(int cpu_id); static int get_existing_num_nodes(); // Check if numa node is configured (non-zero memory node). - static bool isnode_in_configured_nodes(unsigned int n) { + static bool is_node_in_configured_nodes(unsigned int n) { if (_numa_bitmask_isbitset != NULL && _numa_all_nodes_ptr != NULL) { return _numa_bitmask_isbitset(_numa_all_nodes_ptr, n); } else return false; } // Check if numa node exists in the system (including zero memory nodes). 
- static bool isnode_in_existing_nodes(unsigned int n) { + static bool is_node_in_existing_nodes(unsigned int n) { if (_numa_bitmask_isbitset != NULL && _numa_nodes_ptr != NULL) { return _numa_bitmask_isbitset(_numa_nodes_ptr, n); } else if (_numa_bitmask_isbitset != NULL && _numa_all_nodes_ptr != NULL) { @@ -317,16 +355,19 @@ return false; } // Check if node is in bound node set. - static bool isnode_in_bound_nodes(int node) { - if (_numa_get_membind != NULL && _numa_bitmask_isbitset != NULL) { - return _numa_bitmask_isbitset(_numa_get_membind(), node); - } else { - return false; + static bool is_node_in_bound_nodes(int node) { + if (_numa_bitmask_isbitset != NULL) { + if (is_running_in_interleave_mode()) { + return _numa_bitmask_isbitset(_numa_interleave_bitmask, node); + } else { + return _numa_membind_bitmask != NULL ? _numa_bitmask_isbitset(_numa_membind_bitmask, node) : false; + } } + return false; } // Check if bound to only one numa node. // Returns true if bound to a single numa node, otherwise returns false. - static bool isbound_to_single_node() { + static bool is_bound_to_single_node() { int nodes = 0; struct bitmask* bmp = NULL; unsigned int node = 0; From per.liden at oracle.com Sun Jan 13 16:19:33 2019 From: per.liden at oracle.com (Per Liden) Date: Sun, 13 Jan 2019 17:19:33 +0100 Subject: [URGENT] RFR: 8216595: Fix broken builds after JDK-8216424 Message-ID: JDK-8216424 removed the variable _total_visits, but left a use of it causing non-product builds to fail. The fix is pretty straight forward, so I propose we fix the issue rather than back out JDK-8216424. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8216595 Webrev: http://cr.openjdk.java.net/~pliden/8216595/webrev.0 Testing: tier1 /Per From claes.redestad at oracle.com Sun Jan 13 16:27:48 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Sun, 13 Jan 2019 17:27:48 +0100 Subject: [URGENT] RFR: 8216595: Fix broken builds after JDK-8216424 In-Reply-To: References: Message-ID: <48185377-B458-4796-8493-AD20F70FF97B@oracle.com> Looks good and trivial, thanks for fixing! /Claes Per Liden skrev: (13 januari 2019 17:19:33 CET) >JDK-8216424 removed the variable _total_visits, but left a use of it >causing non-product builds to fail. The fix is pretty straight forward, > >so I propose we fix the issue rather than back out JDK-8216424. > >Bug: https://bugs.openjdk.java.net/browse/JDK-8216595 >Webrev: http://cr.openjdk.java.net/~pliden/8216595/webrev.0 > >Testing: tier1 > >/Per From per.liden at oracle.com Sun Jan 13 16:30:14 2019 From: per.liden at oracle.com (Per Liden) Date: Sun, 13 Jan 2019 17:30:14 +0100 Subject: [URGENT] RFR: 8216595: Fix broken builds after JDK-8216424 In-Reply-To: <48185377-B458-4796-8493-AD20F70FF97B@oracle.com> References: <48185377-B458-4796-8493-AD20F70FF97B@oracle.com> Message-ID: <33a42242-f11c-7ccf-f484-84d93eaf4d80@oracle.com> Thanks for reviewing. Since it's urgent and trivial I will go ahead and push. /Per On 01/13/2019 05:27 PM, Claes Redestad wrote: > Looks good and trivial, thanks for fixing! > > /Claes > > Per Liden skrev: (13 januari 2019 17:19:33 CET) > > JDK-8216424 removed the variable _total_visits, but left a use of it > causing non-product builds to fail. The fix is pretty straight forward, > so I propose we fix the issue rather than back out JDK-8216424. 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8216595 > Webrev: http://cr.openjdk.java.net/~pliden/8216595/webrev.0 > > Testing: tier1 > > /Per > From Alan.Bateman at oracle.com Sun Jan 13 16:31:17 2019 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Sun, 13 Jan 2019 16:31:17 +0000 Subject: [URGENT] RFR: 8216595: Fix broken builds after JDK-8216424 In-Reply-To: References: Message-ID: On 13/01/2019 16:19, Per Liden wrote: > JDK-8216424 removed the variable _total_visits, but left a use of it > causing non-product builds to fail. The fix is pretty straight > forward, so I propose we fix the issue rather than back out JDK-8216424. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8216595 > Webrev: http://cr.openjdk.java.net/~pliden/8216595/webrev.0 > Just read the thread on JDK-8216424. It makes sense to just remove this usage rather than backing out the change. The patch looks okay to me (in case you need a second Reviewer to fix this build break). -Alan From per.liden at oracle.com Sun Jan 13 16:32:43 2019 From: per.liden at oracle.com (Per Liden) Date: Sun, 13 Jan 2019 17:32:43 +0100 Subject: [URGENT] RFR: 8216595: Fix broken builds after JDK-8216424 In-Reply-To: References: Message-ID: Thanks Alan! /Per On 01/13/2019 05:31 PM, Alan Bateman wrote: > On 13/01/2019 16:19, Per Liden wrote: >> JDK-8216424 removed the variable _total_visits, but left a use of it >> causing non-product builds to fail. The fix is pretty straight >> forward, so I propose we fix the issue rather than back out JDK-8216424. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8216595 >> Webrev: http://cr.openjdk.java.net/~pliden/8216595/webrev.0 >> > Just read the thread on JDK-8216424. It makes sense to just remove this > usage rather than backing out the change. The patch looks okay to me (in > case you need a second Reviewer to fix this build break).
> > -Alan From vladimir.kozlov at oracle.com Sun Jan 13 16:52:47 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Sun, 13 Jan 2019 08:52:47 -0800 Subject: [URGENT] RFR: 8216595: Fix broken builds after JDK-8216424 In-Reply-To: <48185377-B458-4796-8493-AD20F70FF97B@oracle.com> References: <48185377-B458-4796-8493-AD20F70FF97B@oracle.com> Message-ID: <101ac2ab-4d1d-678f-8c9d-d5984bf80711@oracle.com> +1 Vladimir On 1/13/19 8:27 AM, Claes Redestad wrote: > Looks good and trivial, thanks for fixing! > > /Claes > > Per Liden skrev: (13 januari 2019 17:19:33 CET) >> JDK-8216424 removed the variable _total_visits, but left a use of it >> causing non-product builds to fail. The fix is pretty straight forward, >> >> so I propose we fix the issue rather than back out JDK-8216424. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8216595 >> Webrev: http://cr.openjdk.java.net/~pliden/8216595/webrev.0 >> >> Testing: tier1 >> >> /Per From thomas.schatzl at oracle.com Mon Jan 14 09:34:24 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 14 Jan 2019 10:34:24 +0100 Subject: RFR (S/M): 8213827: NUMA heap allocation does not respect process membind/interleave settings [Was: Re: [PATCH] JDK NUMA Interleaving issue] In-Reply-To: References: <6e5b102d07b4ceded09115a649be020410240fe7.camel@oracle.com> <9bea7b0957bbfc2f0ac34306ee162f2d98e44bfe.camel@oracle.com> <99164b92f47f264978339ed327da9d41098a7e1d.camel@oracle.com> <10ecfa0f-eb78-869a-4d5a-991f55ec57ea@oracle.com> <3b8edd37-80cd-0f06-55ed-326972db98de@oracle.com> <2bc2fce2403324739a030d929b847fece95b0e25.camel@oracle.com> Message-ID: <7694bb5a63c02bc94c1380982cad2a5ddb27916f.camel@oracle.com> Hi Amith, On Sat, 2019-01-12 at 22:57 +0530, amith pawar wrote: > Hi Thomas, > > SPECJBB shows following improvements with latest patch. > 1. max-jOPs around 7-9% > 2. critical-jOPS around 4-50% thanks! 
> > In webrev.4, Sangheon suggested following change is missing > + ls.print("UseNUMA is enabled and invoked in '%s' mode." > + " Heap will be configured using NUMA memory nodes:", > numa_mode); > There is one more space before " Heap.... ", please remove it. The space before "Heap" is the space after the full stop in the preceding sentence, so it is needed; I moved the space to the previous line though. > > Also os_linux.hpp is already updated for new copyright year so patch > import fails. > > The attached patch contains these changes. Please do check. Regenerated the v4 webrevs. Thanks, Thomas From thomas.schatzl at oracle.com Mon Jan 14 09:49:02 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 14 Jan 2019 10:49:02 +0100 Subject: Assert poisoning initialization before os memory subsystem Message-ID: <8d1e736f3764f02ffd534c8d8fd6ff715271ea1a.camel@oracle.com> Hi Thomas, the change "JDK-8191101: Show register content in hs-err file on assert" added some code that initializes a memory page for retrieving the register context on assert failure. This reserves, commits and mprotects a single VM page. The problem is that this occurs before the os subsystem has completely initialized itself, causing issues if not guarded against in the commit code - i.e. if UseNUMA is enabled, the VM did not get a chance to initialize the NUMA subsystem (even if it just disables it). Obviously all os implementations already guard against this (because they do not crash; during review of JDK-8213827 we temporarily removed this guard and hence noticed this), and from a functionality point of view NUMA initialization is not needed for this feature. However I was wondering whether it is really necessary to do this assert poisoning setup before the os::init_2 call. Do you have any recollection why the poisoning (needs to?) occurs that early? It looks like a bit of an ugly wart to use the os component without having it completely initialized.
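As a rough illustration of the reserve, commit and mprotect sequence described above, here is a minimal POSIX sketch. It is not the HotSpot code: reserve_poison_page and protect_poison_page are hypothetical names, and plain mmap/mprotect are assumed in place of the os:: layer. None of these calls depends on NUMA state, which is the point made about the feature not needing NUMA initialization.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: reserve and commit one anonymous, private page.
// Returns nullptr on failure.
void* reserve_poison_page() {
  const size_t page_size = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  void* page = mmap(nullptr, page_size, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return page == MAP_FAILED ? nullptr : page;
}

// Hypothetical helper: revoke all access to the page, so that any later
// touch raises SIGSEGV; the signal context delivered to a handler then
// carries the register state at the faulting instruction.
bool protect_poison_page(void* page) {
  assert(page != nullptr);
  const size_t page_size = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  return mprotect(page, page_size, PROT_NONE) == 0;
}
```

The sketch deliberately stops before touching the protected page; installing the signal handler and reading the context is a separate step.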
Thanks, Thomas :P From shade at redhat.com Mon Jan 14 12:32:31 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 14 Jan 2019 13:32:31 +0100 Subject: RFR (S) 8216308: StackTraceElement::fill_in can use injected Class source-file In-Reply-To: <32c5c310-e066-8e3b-6443-fe26ec174884@oracle.com> References: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com> <75513caf-5b58-7eed-f7c6-a147facba77a@oracle.com> <10518795-18f3-b51b-5068-412eaec772e1@oracle.com> <32c5c310-e066-8e3b-6443-fe26ec174884@oracle.com> Message-ID: On 1/12/19 1:43 AM, David Holmes wrote: >>> I'm also very unclear about how the redefinition case is currently handled. It seems that we will >>> normally intern NULL (and presumably get a NULL or empty-string oop?) unless ShowHiddenFrames is >>> set, in which case we use the unknown_class_name() - regardless of whether the frame is actually >>> hidden or not! This seems broken to me. (Separate bug to fix that is okay if it is indeed broken.) >> >> This looks like a bug, but I'm not sure what ShowHiddenFrames is supposed to do here, or how it >> got there. I think if Aleksey removed that with this patch it would be fine with me. > > I think use of ShowHiddenFrames here is completely broken. But a separate bug and some suitable > archaeology is needed to fix it the right way. Okay, are we in agreement that current patch does not break anything new? If so, let's push the current patch in its current form, and then follow up on ShowHiddenFrames in a separate issue. This would also make current patch simply backportable to 11.
Current patch (no changes since last time): http://cr.openjdk.java.net/~shade/8216308/webrev.01/ -Aleksey From coleen.phillimore at oracle.com Mon Jan 14 12:47:16 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 14 Jan 2019 07:47:16 -0500 Subject: RFR (S) 8216308: StackTraceElement::fill_in can use injected Class source-file In-Reply-To: References: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com> <75513caf-5b58-7eed-f7c6-a147facba77a@oracle.com> <10518795-18f3-b51b-5068-412eaec772e1@oracle.com> <32c5c310-e066-8e3b-6443-fe26ec174884@oracle.com> Message-ID: I agree. https://bugs.openjdk.java.net/browse/JDK-8216977 Coleen On 1/14/19 7:32 AM, Aleksey Shipilev wrote: > On 1/12/19 1:43 AM, David Holmes wrote: >>>> I'm also very unclear about how the redefinition case is currently handled. It seems that we will >>>> normally intern NULL (and presumably get a NULL or empty-string oop?) unless ShowHiddenFrames is >>>> set, in which case we use the unknown_class_name() - regardless of whether the frame is actually >>>> hidden or not! This seems broken to me. (Separate bug to fix that is okay if it is indeed broken.) >>> This looks like a bug, but I'm not sure what ShowHiddenFrames is supposed to do here, or how it >>> got there. I think if Aleksey removed that with this patch it would be fine with me. >> I think use of ShowHiddenFrames here is completely broken. But a separate bug and some suitable >> archaeology is needed to fix it the right way. > Okay, are we in agreement that current patch does not break anything new? If so, let's push the > current patch in its current form, and then follow up on ShowHiddenFrames in a separate issue. This > would also make current patch simply backportable to 11.
> > Current patch (no changes since last time): > http://cr.openjdk.java.net/~shade/8216308/webrev.01/ > > -Aleksey > From david.holmes at oracle.com Mon Jan 14 13:06:53 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 14 Jan 2019 23:06:53 +1000 Subject: RFR (S) 8216308: StackTraceElement::fill_in can use injected Class source-file In-Reply-To: References: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com> <75513caf-5b58-7eed-f7c6-a147facba77a@oracle.com> <10518795-18f3-b51b-5068-412eaec772e1@oracle.com> <32c5c310-e066-8e3b-6443-fe26ec174884@oracle.com> Message-ID: <36682829-e2d5-c1aa-edbb-c49b8f1265cf@oracle.com> On 14/01/2019 10:32 pm, Aleksey Shipilev wrote: > On 1/12/19 1:43 AM, David Holmes wrote: >>>> I'm also very unclear about how the redefinition case is currently handled. It seems that we will >>>> normally intern NULL (and presumably get a NULL or empty-string oop?) unless ShowHiddenFrames is >>>> set, in which case we use the unknown_class_name() - regardless of whether the frame is actually >>>> hidden or not! This seems broken to me. (Separate bug to fix that is okay if it is indeed broken.) >>> >>> This looks like a bug, but I'm not sure what ShowHiddenFrames is supposed to do here, or how it >>> got there. I think if Aleksey removed that with this patch it would be fine with me. >> >> I think use of ShowHiddenFrames here is completely broken. But a separate bug and some suitable >> archaeology is needed to fix it the right way. > > Okay, are we in agreement that current patch does not break anything new? If so, let's push the Okay I agree it doesn't break anything new though I'd be happier if the Backtrace::get_line_number issue was fixed. Otherwise it needs a follow up bug too - I'm starting to think it makes no sense to allow redefinition to occur within a method like this! And this is a distinct issue from ShowHiddenFrames. David ----- > current patch in its current form, and then follow up on ShowHiddenFrames in a separate issue. 
This > would also make current patch simply backportable to 11. > > Current patch (no changes since last time): > http://cr.openjdk.java.net/~shade/8216308/webrev.01/ > > -Aleksey > From coleen.phillimore at oracle.com Mon Jan 14 13:23:12 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 14 Jan 2019 08:23:12 -0500 Subject: RFR (S) 8216308: StackTraceElement::fill_in can use injected Class source-file In-Reply-To: <36682829-e2d5-c1aa-edbb-c49b8f1265cf@oracle.com> References: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com> <75513caf-5b58-7eed-f7c6-a147facba77a@oracle.com> <10518795-18f3-b51b-5068-412eaec772e1@oracle.com> <32c5c310-e066-8e3b-6443-fe26ec174884@oracle.com> <36682829-e2d5-c1aa-edbb-c49b8f1265cf@oracle.com> Message-ID: <9d5df7a6-f374-6d6d-1ff1-a9e19814ce14@oracle.com> On 1/14/19 8:06 AM, David Holmes wrote: > On 14/01/2019 10:32 pm, Aleksey Shipilev wrote: >> On 1/12/19 1:43 AM, David Holmes wrote: >>>>> I'm also very unclear about how the redefinition case is currently >>>>> handled. It seems that we will >>>>> normally intern NULL (and presumably get a NULL or empty-string >>>>> oop?) unless ShowHiddenFrames is >>>>> set, in which case we use the unknown_class_name() - regardless of >>>>> whether the frame is actually >>>>> hidden or not! This seems broken to me. (Separate bug to fix that >>>>> is okay if it is indeed broken.) >>>> >>>> This looks like a bug, but I'm not sure what ShowHiddenFrames is >>>> supposed to do here, or how it >>>> got there. I think if Aleksey removed that with this patch it >>>> would be fine with me. >>> >>> I think use of ShowHiddenFrames here is completely broken. But a >>> separate bug and some suitable >>> archaeology is needed to fix it the right way. >> >> Okay, are we in agreement that current patch does not break anything >> new? If so, let's push the > > Okay I agree it doesn't break anything new though I'd be happier if > the Backtrace::get_line_number issue was fixed. 
Otherwise it needs a > follow up bug too - I'm starting to think it makes no sense to allow > redefinition to occur within a method like this! And this is a > distinct issue from ShowHiddenFrames. There's no practical way to disable redefinition here and it's not something that happens. You can file a bug for it if you like, but it's not something worth fixing. thanks, Coleen > > David > ----- > > >> current patch in its current form, and then follow up on >> ShowHiddenFrames in a separate issue. This >> would also make current patch simply backportable to 11. >> >> Current patch (no changes since last time): >> http://cr.openjdk.java.net/~shade/8216308/webrev.01/ >> >> -Aleksey >> From coleen.phillimore at oracle.com Mon Jan 14 13:49:19 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 14 Jan 2019 08:49:19 -0500 Subject: RFR (S) 8216308: StackTraceElement::fill_in can use injected Class source-file In-Reply-To: <9d5df7a6-f374-6d6d-1ff1-a9e19814ce14@oracle.com> References: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com> <75513caf-5b58-7eed-f7c6-a147facba77a@oracle.com> <10518795-18f3-b51b-5068-412eaec772e1@oracle.com> <32c5c310-e066-8e3b-6443-fe26ec174884@oracle.com> <36682829-e2d5-c1aa-edbb-c49b8f1265cf@oracle.com> <9d5df7a6-f374-6d6d-1ff1-a9e19814ce14@oracle.com> Message-ID: <4df73169-dec5-cad0-cca2-28ac9689cbc1@oracle.com> On 1/14/19 8:23 AM, coleen.phillimore at oracle.com wrote: > > > On 1/14/19 8:06 AM, David Holmes wrote: >> On 14/01/2019 10:32 pm, Aleksey Shipilev wrote: >>> On 1/12/19 1:43 AM, David Holmes wrote: >>>>>> I'm also very unclear about how the redefinition case is >>>>>> currently handled. It seems that we will >>>>>> normally intern NULL (and presumably get a NULL or empty-string >>>>>> oop?) unless ShowHiddenFrames is >>>>>> set, in which case we use the unknown_class_name() - regardless >>>>>> of whether the frame is actually >>>>>> hidden or not! This seems broken to me. 
(Separate bug to fix that >>>>>> is okay if it is indeed broken.) >>>>> >>>>> This looks like a bug, but I'm not sure what ShowHiddenFrames is >>>>> supposed to do here, or how it >>>>> got there. I think if Aleksey removed that with this patch it >>>>> would be fine with me. >>>> >>>> I think use of ShowHiddenFrames here is completely broken. But a >>>> separate bug and some suitable >>>> archaeology is needed to fix it the right way. >>> >>> Okay, are we in agreement that current patch does not break anything >>> new? If so, let's push the >> >> Okay I agree it doesn't break anything new though I'd be happier if >> the Backtrace::get_line_number issue was fixed. Otherwise it needs a >> follow up bug too - I'm starting to think it makes no sense to allow >> redefinition to occur within a method like this! And this is a >> distinct issue from ShowHiddenFrames. > > There's no practical way to disable redefinition here and it's not > something that happens. You can file a bug for it if you like, but > it's not something worth fixing. Looking again at the original code. If you have a redefinition at StringTable::intern() the line number from the methodHandle (Method*) is correct. In this case it's the "old" method. It's where the original method had the exception, which is what you want to print. Coleen > > thanks, > Coleen > >> >> David >> ----- >> >> >>> current patch in its current form, and then follow up on >>> ShowHiddenFrames in a separate issue. This >>> would also make current patch simply backportable to 11. 
>>> >>> Current patch (no changes since last time): >>> 
http://cr.openjdk.java.net/~shade/8216308/webrev.01/ >>> >>> -Aleksey >>> > From thomas.stuefe at gmail.com Mon Jan 14 14:29:25 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Mon, 14 Jan 2019 15:29:25 +0100 Subject: Assert poisoning initialization before os memory subsystem In-Reply-To: <8d1e736f3764f02ffd534c8d8fd6ff715271ea1a.camel@oracle.com> References: <8d1e736f3764f02ffd534c8d8fd6ff715271ea1a.camel@oracle.com> Message-ID: Hi Thomas yes, that is ugly and unnecessary. I think in the original version I used raw mmap/VirtualAlloc and later switched to os::reserve_memory() because of reuse-guilt, but forgot to think about init dependencies. Initialization can happen at a later point. If an assert happens before the poison page is initialized nothing bad happens, we just won't see register values which is no big deal. I opened https://bugs.openjdk.java.net/browse/JDK-8216982 Cheers, Thomas :) On Mon, Jan 14, 2019 at 10:49 AM Thomas Schatzl wrote: > > Hi Thomas, > > the change "JDK-8191101: Show register content in hs-err file on > assert" added some code that initializes some memory page for > retrieving the register context on assert failure. > > This reserves, commits and mprotects a single VM page. > > The problem is that this occurs before the os subsystem completely > initialized itself, causing issues if not guarded against in the commit > code - i.e. if UseNUMA is enabled, the VM did not get a chance to > initialize the NUMA subsystem (even if it just disables it). > > Obviously all os implementations guard already against this (because > they do not crash; during review of JDK-8213827 we temporarily removed > this guard and hence noticed this), and from a functionality pov NUMA > initialization is not needed for this feature. > > However I was wondering whether it is really necessary to do this > assert poisoning setup before the os::init_2 call. Do you have any > recollection why the poisoning (needs to?) occurs that early? 
> > It looks like a bit of an ugly wart to use the os component without > having it completely initialized. > > Thanks, > Thomas :P > From martin.doerr at sap.com Mon Jan 14 15:12:33 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 14 Jan 2019 15:12:33 +0000 Subject: RFR(M): 8216265: [testbug] Introduce Platform.sharedLibraryPathVariableName() and adapt all tests. In-Reply-To: <7cefd8a46ae647969894c43cab72bc88@sap.com> References: <9349eed214ce46ee81868840c0dbd54d@sap.com> <6277c580-0397-d4ab-7b03-7721544048ff@oracle.com> <129ed17946754b9c896fa41dd44d031f@sap.com> <7cefd8a46ae647969894c43cab72bc88@sap.com> Message-ID: Hi Götz, do we have an indentation rule for " .toAbsolutePath().toString();" in JliLaunchTest.java? Nice cleanup! Looks good. Best regards, Martin -----Original Message----- From: hotspot-dev On Behalf Of Lindenmaier, Goetz Sent: Mittwoch, 9. Januar 2019 13:52 To: David Holmes ; 'hotspot-dev at openjdk.java.net' ; gary.adams at oracle.com Subject: RE: RFR(M): 8216265: [testbug] Introduce Platform.sharedLibraryPathVariableName() and adapt all tests. Hi David, I fixed these locally. Best regards, Goetz. > -----Original Message----- > From: David Holmes > Sent: Mittwoch, 9. Januar 2019 13:28 > To: Lindenmaier, Goetz ; 'hotspot- > dev at openjdk.java.net' ; > gary.adams at oracle.com > Subject: Re: RFR(M): 8216265: [testbug] Introduce > Platform.sharedLibraryPathVariableName() and adapt all tests. > > Hi Goetz, > > On 9/01/2019 8:34 pm, Lindenmaier, Goetz wrote: > > Hi David, > > > > thanks for looking at my change. > > It was asked for by Gary when he reviewed > https://bugs.openjdk.java.net/browse/JDK-8215975 > > > > New webrev: > > http://cr.openjdk.java.net/~goetz/wr19/8216265-PathVar/02-incremental/ > > Looks good. Two further pre-existing nits spotted: > > test/hotspot/jtreg/gtest/GTestWrapper.java > > ! * Copyright (c) 2016, 2019 Oracle > > Need a comma after 2019. 
> > Ditto for: > > test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/Inherited > ChannelTest.java > > Actually I now see quite a number of files missing the comma so > I'll file a general bug to fix that. > > Thanks, > David > > > > http://cr.openjdk.java.net/~goetz/wr19/8216265-PathVar/02/ > > > > See my comments inline below. > > > > Best regards, > > Goetz. > > > >> test/hotspot/jtreg/gtest/GTestWrapper.java > >> > >> 75 env.put(pathVar, path + ":" + ldLibraryPath); > >> > >> Shouldn't ":" be File.pathSeparator? > > Fixed. > > > >> > test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/Inherited > >> ChannelTest.java > >> > >> Copyright year needs updating. > > Done. > > > >> > test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/Inherited > >> ChannelTest.java > >> > >> 70 private static final Path pathEnvVar > >> > >> The variable isn't an env var, it's just a path - I suggest libraryPath. > > A cleanup not directly related. But makes sense, done. > > > >> 101 > >> System.out.println(Platform.sharedLibraryPathVariableName() + "=" + > >> pathEnvVar); > >> ... > >> 114 env.put(Platform.sharedLibraryPathVariableName(), > >> pathEnvVar.toString()); > >> > >> I suggest storing the name in a local to avoid the second call. > > Done. > > > >> test/jdk/tools/launcher/JliLaunchTest.java > >> > >> 57 env.compute(pathEnvVar, (k, v) -> (v == null) ? libdir > >> : libdir + ":" + v); > >> > >> Shouldn't ":" be File.pathSeparator? > > This is because there is anyways a switch about the OS. > > Did some more cleaning up. > > > >> test/jdk/tools/launcher/Test7029048.java > >> > >> 39 import jdk.test.lib.Platform; > >> > >> Why do you need this? > > Removed. > > > >> test/jdk/vm/JniInvocationTest.java > >> > >> This is a Mac only test so no changes needed. > > I would like to change this anyways. I think this makes > > it look more consistent. 
> > > >> test/lib/jdk/test/lib/Platform.java > >> > >> The javadoc comments is unnecessary as we don't generate javadoc here. > I > >> see you copied the preceding sharedLibraryExt() style. The @return is > >> superfluous. > > Changed. Better? > > > > > >> > >> Thanks, > >> David > >> > >>> Best regards, > >>> Goetz. > >>> From goetz.lindenmaier at sap.com Mon Jan 14 15:24:21 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Mon, 14 Jan 2019 15:24:21 +0000 Subject: RFR(M): 8216265: [testbug] Introduce Platform.sharedLibraryPathVariableName() and adapt all tests. In-Reply-To: References: <9349eed214ce46ee81868840c0dbd54d@sap.com> <6277c580-0397-d4ab-7b03-7721544048ff@oracle.com> <129ed17946754b9c896fa41dd44d031f@sap.com> <7cefd8a46ae647969894c43cab72bc88@sap.com> Message-ID: <9ae6d6c231d24ba699b5785e06faebe9@sap.com> Hi Martin, thanks for reviewing. I'll fix the indentation to 4 before pushing. Best regards, Goetz. > -----Original Message----- > From: Doerr, Martin > Sent: Montag, 14. Januar 2019 16:13 > To: Lindenmaier, Goetz ; David Holmes > ; 'hotspot-dev at openjdk.java.net' dev at openjdk.java.net>; gary.adams at oracle.com > Subject: RE: RFR(M): 8216265: [testbug] Introduce > Platform.sharedLibraryPathVariableName() and adapt all tests. > > Hi Götz, > > do we have an indentation rule for " .toAbsolutePath().toString();" in > JliLaunchTest.java? > > Nice cleanup! Looks good. > > Best regards, > Martin > > > -----Original Message----- > From: hotspot-dev On Behalf Of > Lindenmaier, Goetz > Sent: Mittwoch, 9. Januar 2019 13:52 > To: David Holmes ; 'hotspot- > dev at openjdk.java.net' ; > gary.adams at oracle.com > Subject: RE: RFR(M): 8216265: [testbug] Introduce > Platform.sharedLibraryPathVariableName() and adapt all tests. > > Hi David, > > I fixed these locally. > > Best regards, > Goetz. > > > > > -----Original Message----- > > From: David Holmes > > Sent: Mittwoch, 9. 
Januar 2019 13:28 > > To: Lindenmaier, Goetz ; 'hotspot- > > dev at openjdk.java.net' ; > > gary.adams at oracle.com > > Subject: Re: RFR(M): 8216265: [testbug] Introduce > > Platform.sharedLibraryPathVariableName() and adapt all tests. > > > > Hi Goetz, > > > > On 9/01/2019 8:34 pm, Lindenmaier, Goetz wrote: > > > Hi David, > > > > > > thanks for looking at my change. > > > It was asked for by Gary when he reviewed > > https://bugs.openjdk.java.net/browse/JDK-8215975 > > > > > > New webrev: > > > http://cr.openjdk.java.net/~goetz/wr19/8216265-PathVar/02- > incremental/ > > > > Looks good. Two further pre-existing nits spotted: > > > > test/hotspot/jtreg/gtest/GTestWrapper.java > > > > ! * Copyright (c) 2016, 2019 Oracle > > > > Need a comma after 2019. > > > > Ditto for: > > > > > test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/Inherited > > ChannelTest.java > > > > Actually I now see quite a number of files missing the comma so > > I'll file a general bug to fix that. > > > > Thanks, > > David > > > > > > > http://cr.openjdk.java.net/~goetz/wr19/8216265-PathVar/02/ > > > > > > See my comments inline below. > > > > > > Best regards, > > > Goetz. > > > > > >> test/hotspot/jtreg/gtest/GTestWrapper.java > > >> > > >> 75 env.put(pathVar, path + ":" + ldLibraryPath); > > >> > > >> Shouldn't ":" be File.pathSeparator? > > > Fixed. > > > > > >> > > > test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/Inherited > > >> ChannelTest.java > > >> > > >> Copyright year needs updating. > > > Done. > > > > > >> > > > test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/Inherited > > >> ChannelTest.java > > >> > > >> 70 private static final Path pathEnvVar > > >> > > >> The variable isn't an env var, it's just a path - I suggest libraryPath. > > > A cleanup not directly related. But makes sense, done. > > > > > >> 101 > > >> System.out.println(Platform.sharedLibraryPathVariableName() + "=" + > > >> pathEnvVar); > > >> ... 
> > >> 114 env.put(Platform.sharedLibraryPathVariableName(), > > >> pathEnvVar.toString()); > > >> > > >> I suggest storing the name in a local to avoid the second call. > > > Done. > > > > > >> test/jdk/tools/launcher/JliLaunchTest.java > > >> > > >> 57 env.compute(pathEnvVar, (k, v) -> (v == null) ? libdir > > >> : libdir + ":" + v); > > >> > > >> Shouldn't ":" be File.pathSeparator? > > > This is because there is anyways a switch about the OS. > > > Did some more cleaning up. > > > > > >> test/jdk/tools/launcher/Test7029048.java > > >> > > >> 39 import jdk.test.lib.Platform; > > >> > > >> Why do you need this? > > > Removed. > > > > > >> test/jdk/vm/JniInvocationTest.java > > >> > > >> This is a Mac only test so no changes needed. > > > I would like to change this anyways. I think this makes > > > it look more consistent. > > > > > >> test/lib/jdk/test/lib/Platform.java > > >> > > >> The javadoc comments is unnecessary as we don't generate javadoc > here. > > I > > >> see you copied the preceding sharedLibraryExt() style. The @return is > > >> superfluous. > > > Changed. Better? > > > > > > > > >> > > >> Thanks, > > >> David > > >> > > >>> Best regards, > > >>> Goetz. > > >>> From harold.seigel at oracle.com Mon Jan 14 15:34:43 2019 From: harold.seigel at oracle.com (Harold David Seigel) Date: Mon, 14 Jan 2019 10:34:43 -0500 Subject: RFR 8216563: [TESTBUG] Change stressTime to default to 30 for nsk tests (part 2) Message-ID: <4ad385bf-0eb8-4545-1388-5285e87f9873@oracle.com> Hi, Please review this fix to change the default stress time for hotspot vmTestbase tests from 60 seconds to 30 seconds. The fix for JDK-8207964 intended to do this but was incomplete. This fix provides the additional needed changes. Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8216563/webrev/index.html JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8216563 The fix was tested by running Mach5 hotspot tiers 1-5 on Linux-x64, Windows, Solaris, and Mac OS X. 
Thanks, Harold From jcbeyler at google.com Mon Jan 14 15:43:48 2019 From: jcbeyler at google.com (JC Beyler) Date: Mon, 14 Jan 2019 07:43:48 -0800 Subject: RFR (L) 8213501 : Deploy ExceptionJniWrapper for a few tests In-Reply-To: References: <895ef766-9c96-7185-4222-178379629ce4@oracle.com> <04a464fa-c1c8-5d86-3633-0b532840561c@oracle.com> <7ef06464-a614-8941-bb51-ce1c467889b2@oracle.com> <45341168-e7e0-90d1-449f-210500882b8f@oracle.com> <55283958-de3d-07f2-51e3-ad34c5046a96@oracle.com> <31613f88-5f7d-938d-e9f6-69cdaf857268@oracle.com> <839301b7-c247-df3b-e485-283e8bb7388b@oracle.com> <95fe277d-ba6e-4fec-77aa-d1f1051751aa@oracle.com> <72bf2f4a-5bf7-98de-5f00-68485072923d@oracle.com> Message-ID: Hi all, Friendly ping on this one, I know that it has been a long process with back and forths, to which I apologize... But is there any way I could get a final LGTM for version 6? Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/ Bug: https://bugs.openjdk.java.net/browse/JDK-8213501 Thanks! Jc On Tue, Jan 8, 2019 at 10:05 AM JC Beyler wrote: > Happy new year all! > > Could I get a final LGTM for version 6? > > Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8213501 > > Thanks! > Jc > > On Mon, Dec 17, 2018 at 8:43 AM JC Beyler wrote: > >> Hi all, >> >> I don't believe I got actual LGTM for this version: >> >> >> Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8213501 >> >> >> It removed the namespaces and uses explicit static instead :) >> >> Thanks! 
>> Jc >> >> On Wed, Dec 12, 2018 at 8:06 PM JC Beyler wrote: >> >>> So did I Alexey but with David & Serguei preferring static, it seems >>> more reasonable to go down their route :-) >>> >>> So here is the latest webrev with static instead of an anonymous >>> namespace: >>> >>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8213501 >>> >>> Let me know what you think, can I get a webrev 06 review? >>> >>> Thanks! >>> Jc >>> >>> On Wed, Dec 12, 2018 at 3:10 PM Alex Menkov >>> wrote: >>> >>>> Hm.. >>>> I considered unnamed namespaces "C++ style" (and static globals as "C >>>> style"). >>>> Static globals were deprecated in C++ (but some time ago the >>>> deprecation >>>> was reverted). >>>> >>>> --alex >>>> >>>> On 12/12/2018 13:55, serguei.spitsyn at oracle.com wrote: >>>> > Agreed. >>>> > >>>> > Thanks, >>>> > Serguei >>>> > >>>> > >>>> > On 12/12/18 13:52, David Holmes wrote: >>>> >> FWIW I think namespaces are overkill in all of this test code and >>>> just >>>> >> obfuscates things - the declaration is easily missed. A static >>>> >> variable in a .cpp is clearly a global variable to the file. >>>> >> >>>> >> Cheers, >>>> >> David >>>> >> >>>> >> >>>> >> >>>> >> On 13/12/2018 5:37 am, serguei.spitsyn at oracle.com wrote: >>>> >>> Hi Jc, >>>> >>> >>>> >>> >>>> >>> On 12/11/18 21:16, JC Beyler wrote: >>>> >>>> Hi all, >>>> >>>> >>>> >>>> Here is the new webrev with the TEST.groups change. Serguei, let >>>> me >>>> >>>> know if I convinced you with the static vs anonymous namespaces or >>>> >>>> if you'd still rather have a "static" for now :-) >>>> >>> >>>> >>> >>>> >>> What do you think about this post? 
: >>>> >>> >>>> https://stackoverflow.com/questions/11623451/static-vs-non-static-variables-in-namespace >>>> >>> >>>> >>> >>>> >>>> >>>> >>>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.05/ >>>> >>>> >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8213501 >>>> >>> >>>> >>> The update looks fine. >>>> >>> >>>> >>> Thanks, >>>> >>> Serguei >>>> >>> >>>> >>> >>>> >>> Thanks, >>>> >>> Serguei >>>> >>> >>>> >>>> >>>> >>>> Thanks again for the reviews! >>>> >>>> Jc >>>> >>>> >>>> >>>> On Mon, Dec 10, 2018 at 3:10 PM JC Beyler >>> >>>> > wrote: >>>> >>>> >>>> >>>> Hi Serguei, >>>> >>>> >>>> >>>> Yes basically it is equivalent :) I can put them in but they >>>> are >>>> >>>> not required. The norm actually wanted to deprecate it but then >>>> >>>> remembered that C compatibility would require the static >>>> key-word >>>> >>>> for this case [1] >>>> >>>> >>>> >>>> So, really, they are not required here and will amount to the >>>> same >>>> >>>> thing: only that file can refer to them and you cannot get to >>>> them >>>> >>>> without a globally available method to return a pointer to them >>>> >>>> (ie same as a static variable in C). >>>> >>>> >>>> >>>> I can put static if it makes it easier to see but, by being in >>>> an >>>> >>>> anonymous namespace they are only available for the file's >>>> >>>> translation unit. For example: >>>> >>>> >>>> >>>> $ cat main.cpp >>>> >>>> >>>> >>>> int totally_global; >>>> >>>> static int explictly_static; >>>> >>>> >>>> >>>> namespace { >>>> >>>> int implicitly_static; >>>> >>>> } >>>> >>>> >>>> >>>> void foo(); >>>> >>>> int main() { >>>> >>>> foo(); >>>> >>>> } >>>> >>>> >>>> >>>> $ g++ -O3 main.cpp -c >>>> >>>> $ nm main.o >>>> >>>> U _GLOBAL_OFFSET_TABLE_ >>>> >>>> 0000000000000000 T main >>>> >>>> 0000000000000000 B totally_global >>>> >>>> U _Z3foov >>>> >>>> >>>> >>>> As you can see, the static and anonymous namespace variables >>>> are >>>> >>>> not in the file due to not being used. 
If you were to use them, >>>> >>>> you'd see them show up as something like: >>>> >>>> 0000000000000008 b _ZL17explicitly_static >>>> >>>> 0000000000000004 b _ZN12_GLOBAL__N_117implicitly_staticE >>>> >>>> >>>> >>>> Where again, it shows that it is mangling the names so that no >>>> >>>> external usage can happen without tinkering. >>>> >>>> >>>> >>>> Hopefully that helps :-), >>>> >>>> Jc >>>> >>>> >>>> >>>> [1] >>>> >>>> http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1012 >>>> >>>> >>>> >>>> >>>> >>>> On Mon, Dec 10, 2018 at 2:04 PM serguei.spitsyn at oracle.com >>>> >>>> < >>>> serguei.spitsyn at oracle.com >>>> >>>> > wrote: >>>> >>>> >>>> >>>> Hi Jc, >>>> >>>> >>>> >>>> I had little experience with the C++ namespaces. >>>> >>>> My understanding is that static in this context should mean >>>> >>>> internal linkage. >>>> >>>> >>>> >>>> Thanks, >>>> >>>> Serguei >>>> >>>> >>>> >>>> >>>> >>>> On 12/10/18 13:57, JC Beyler wrote: >>>> >>>>> Hi Serguei, >>>> >>>>> >>>> >>>>> The variables and functions are in a anonymous namespace; >>>> my >>>> >>>>> understanding of C++ is that this is equivalent to >>>> putting it >>>> >>>>> as static.Hence, I didn't add them there. Does that make >>>> >>>>> sense? >>>> >>>>> >>>> >>>>> Thanks! >>>> >>>>> Jc >>>> >>>>> >>>> >>>>> On Mon, Dec 10, 2018 at 1:33 PM >>>> serguei.spitsyn at oracle.com >>>> >>>>> >>>> >>>>> >>> >>>>> > wrote: >>>> >>>>> >>>> >>>>> Hi Jc, >>>> >>>>> >>>> >>>>> It looks good in general. >>>> >>>>> One question though. >>>> >>>>> >>>> >>>>> >>>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.03a_04/test/hotspot/jtreg/vmTestbase/nsk/share/ExceptionCheckingJniEnv/exceptionjni001/exceptionjni001.cpp.html >>>> >>>>> >>>> >>>>> >>>> >>>>> I wonder if the variables and functions have to be >>>> static. 
>>>> >>>>> >>>> >>>>> Thanks, >>>> >>>>> Serguei >>>> >>>>> >>>> >>>>> >>>> >>>>> On 12/5/18 11:36, JC Beyler wrote: >>>> >>>>>> Hi all, >>>> >>>>>> >>>> >>>>>> My apologies to having to come back for another >>>> review >>>> >>>>>> for this change: I ran into a snag when trying to >>>> pull >>>> >>>>>> the latest changes compared to the base I was working >>>> >>>>>> on. I basically forgot that there was an issue with >>>> >>>>>> snprintf and that I had solved it via JDK-8213622. >>>> >>>>>> >>>> >>>>>> Could I have a new review of this webrev: >>>> >>>>>> Webrev: >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.04/ >>>> >>>>>> >>>> >>>>>> Bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8213501 >>>> >>>>>> Incremental from the port of webrev.03 that got >>>> LGTMs: >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03a_04/ >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> A few comments on this because it took me a while to >>>> get >>>> >>>>>> things in a state I thought was good: >>>> >>>>>> - I had to implement an itoa method, do we have >>>> >>>>>> something like that in the test base (remember that >>>> >>>>>> JDK-8213622 could not use sprintf due to being in the >>>> >>>>>> test code)? >>>> >>>>>> >>>> >>>>>> - The differences here compared to the one you all >>>> >>>>>> reviewed are: >>>> >>>>>> - I found that adding to the strlen/memcpy >>>> error >>>> >>>>>> prone and thought that I would try to make it less >>>> so. 
>>>> >>>>>> If you want to compare, I extended the strlen/memcpy >>>> >>>>>> with the new format to show you if you prefer [1] >>>> >>>>>> - Note that the diff between the "old >>>> >>>>>> extended way from [1]" to the webrev.04 can be found >>>> >>>>>> in [2] >>>> >>>>>> >>>> >>>>>> - I added a test to test the exception wrapper >>>> in >>>> >>>>>> tests :); I'm not sure it is deemed useful or not but >>>> >>>>>> helped me assure myself that I was not doing things >>>> >>>>>> wrong; you can find the base test file here [3]; >>>> should >>>> >>>>>> we have this or not? (I know that normally we don't >>>> add >>>> >>>>>> tests to vmTestbase but thought this might be an >>>> >>>>>> exception) >>>> >>>>>> >>>> >>>>>> Thanks for your help and my apologies for the snag, >>>> >>>>>> Jc >>>> >>>>>> >>>> >>>>>> [1]: >>>> >>>>>> >>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03a/test/hotspot/jtreg/vmTestbase/nsk/share/jni/ExceptionCheckingJniEnv.cpp.udiff.html >>>> >>>>>> >>>> >>>>>> < >>>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.03a/test/hotspot/jtreg/vmTestbase/nsk/share/jni/ExceptionCheckingJniEnv.cpp.udiff.html> >>>> >>>> >>>>>> >>>> >>>>>> [2]: >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03a_04 >>>> >>>>>> >>>> >>>>>> [3] >>>> >>>>>> >>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.04/test/hotspot/jtreg/vmTestbase/nsk/share/ExceptionCheckingJniEnv/exceptionjni001/exceptionjni001.cpp.html >>>> >>>>>> >>>> >>>>>> < >>>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.04/test/hotspot/jtreg/vmTestbase/nsk/share/ExceptionCheckingJniEnv/exceptionjni001/exceptionjni001.cpp.html> >>>> >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> On Mon, Dec 3, 2018 at 11:29 PM David Holmes >>>> >>>>>> >>> >>>>>> > wrote: >>>> >>>>>> >>>> >>>>>> Looks fine to me. 
>>>> >>>>>> >>>> >>>>>> Thanks, >>>> >>>>>> David >>>> >>>>>> >>>> >>>>>> On 4/12/2018 4:04 pm, JC Beyler wrote: >>>> >>>>>> > Hi both, >>>> >>>>>> > >>>> >>>>>> > Thanks for the reviews! Since Serguei did not >>>> >>>>>> insist on get_basename, I >>>> >>>>>> > went for get_dirname since the method is a >>>> local >>>> >>>>>> static method and won't >>>> >>>>>> > have its name start spreading, I think it's ok >>>> too. >>>> >>>>>> > >>>> >>>>>> > For the naming of the local variable, the idea >>>> >>>>>> initially was to use the >>>> >>>>>> > same name as the local variable for JNIEnv >>>> already >>>> >>>>>> used to reduce the >>>> >>>>>> > code change. Since I'm now adding the line >>>> macro >>>> >>>>>> at the end anyway, this >>>> >>>>>> > does not matter anymore so I converged all >>>> local >>>> >>>>>> variables to "jni". >>>> >>>>>> > >>>> >>>>>> > So, without further ado, here is the new >>>> version: >>>> >>>>>> > Webrev: >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03/ >>>> >>>>>> >>>> >>>>>> > Bug: >>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8213501 >>>> >>>>>> > >>>> >>>>>> > This passes the various tests changed by the >>>> >>>>>> webrev on my dev machine. >>>> >>>>>> > >>>> >>>>>> > Let me know what you think, >>>> >>>>>> > Jc >>>> >>>>>> > >>>> >>>>>> > On Mon, Dec 3, 2018 at 8:40 PM >>>> >>>>>> serguei.spitsyn at oracle.com >>>> >>>>>> >>>> >>>>>> > >>> >>>>>> > >>>> >>>>>> >>> >>>>>> >>>> >>>>>> > >>> >>>>>> >> wrote: >>>> >>>>>> > >>>> >>>>>> > On 12/3/18 20:15, Chris Plummer wrote: >>>> >>>>>> > > Hi JC, >>>> >>>>>> > > >>>> >>>>>> > > Overall it looks good. A few naming nits >>>> >>>>>> thought: >>>> >>>>>> > > >>>> >>>>>> > > In bi01t001.cpp, why have you declared >>>> the >>>> >>>>>> > ExceptionCheckingJniEnvPtr >>>> >>>>>> > > using jni_env(jni). Elsewhere you use >>>> >>>>>> jni(jni_env) and rename the >>>> >>>>>> > > method argument passed in from jni to >>>> >>>>>> jni_env. 
>>>> >>>>>> > > >>>> >>>>>> > > Related to this, I also noticed in some >>>> >>>>>> files that already are using >>>> >>>>>> > > ExceptionCheckingJniEnvPtr, such as >>>> >>>>>> CharArrayCriticalLocker.cpp, you >>>> >>>>>> > > delcared it as env(jni_env). So that >>>> means >>>> >>>>>> there are 3 different >>>> >>>>>> > names >>>> >>>>>> > > you have used for the >>>> >>>>>> ExceptionCheckingJniEnvPtr local variable. >>>> >>>>>> > They >>>> >>>>>> > > should be consistent. >>>> >>>>>> > > >>>> >>>>>> > > Also, can you rename get_basename() to >>>> >>>>>> get_dirname()? I know Serguei >>>> >>>>>> > > suggested get_basename() a while back, >>>> but >>>> >>>>>> unless "basename" is >>>> >>>>>> > > commonly used for this purpose, I think >>>> >>>>>> "dirname" is more self >>>> >>>>>> > > explanatory. >>>> >>>>>> > >>>> >>>>>> > In general, I'm Okay with get_dirname(). >>>> >>>>>> > Just to mention dirname can be both short >>>> or >>>> >>>>>> full, so it is a little >>>> >>>>>> > confusing as well. >>>> >>>>>> > It is the reason why the get_basename() was >>>> >>>>>> suggested. >>>> >>>>>> > However, I do not insist on get_basename() >>>> nor >>>> >>>>>> get_full_dirname(). :) >>>> >>>>>> > >>>> >>>>>> > Thanks, >>>> >>>>>> > Serguei >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> > > thanks, >>>> >>>>>> > > >>>> >>>>>> > > Chris >>>> >>>>>> > > >>>> >>>>>> > > On 12/2/18 10:29 PM, David Holmes wrote: >>>> >>>>>> > >> Hi Jc, >>>> >>>>>> > >> >>>> >>>>>> > >> I've been lurking on this one and have >>>> had >>>> >>>>>> a look through. I'm okay >>>> >>>>>> > >> with the FatalError approach for the >>>> tests >>>> >>>>>> - we don't expect >>>> >>>>>> > anything >>>> >>>>>> > >> to go wrong in a well written test in a >>>> >>>>>> correctly functioning VM. 
>>>> >>>>>> > >> >>>> >>>>>> > >> Thanks, >>>> >>>>>> > >> David >>>> >>>>>> > >> >>>> >>>>>> > >> >>>> >>>>>> > >> >>>> >>>>>> > >> On 3/12/2018 3:24 pm, JC Beyler wrote: >>>> >>>>>> > >>> Hi all, >>>> >>>>>> > >>> >>>> >>>>>> > >>> Would someone on the GC or runtime >>>> team >>>> >>>>>> be motivated to give >>>> >>>>>> > this a >>>> >>>>>> > >>> review? :) >>>> >>>>>> > >>> >>>> >>>>>> > >>> It would be much appreciated! >>>> >>>>>> > >>> >>>> >>>>>> > >>> Webrev: >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.02/ >>>> >>>>>> >>>> >>>>>> > >>> Bug: >>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8213501 >>>> >>>>>> > >>> >>>> >>>>>> > >>> Thanks for your help, >>>> >>>>>> > >>> Jc >>>> >>>>>> > >>> >>>> >>>>>> > >>> On Tue, Nov 27, 2018 at 4:36 PM JC >>>> Beyler >>>> >>>>>> >>> > >>>> >>>>>> > >>> >>>>>> > >>>> >>>>>> > >>> >>> >>>>>> >>>> >>>>>> >>> >>>>>> >>> wrote: >>>> >>>>>> > >>> >>>> >>>>>> > >>> Hi Chris, >>>> >>>>>> > >>> >>>> >>>>>> > >>> Yes I was waiting for another >>>> review >>>> >>>>>> since you had explicitly >>>> >>>>>> > >>> asked :) >>>> >>>>>> > >>> >>>> >>>>>> > >>> And sounds good that when someone >>>> >>>>>> from GC or runtime gives a >>>> >>>>>> > >>> review, >>>> >>>>>> > >>> I'll wait for your full review on >>>> the >>>> >>>>>> webrev.02! >>>> >>>>>> > >>> >>>> >>>>>> > >>> Thanks again for your help, >>>> >>>>>> > >>> Jc >>>> >>>>>> > >>> >>>> >>>>>> > >>> >>>> >>>>>> > >>> On Tue, Nov 27, 2018 at 12:48 PM >>>> >>>>>> Chris Plummer >>>> >>>>>> > >>> >>> >>>>>> >>>> >>>>>> >>> >>>>>> > >>>> >>>>>> > >>> >>>>>> >>>> >>>>>> >>> >>>>>> >>> >>>> >>>>>> > wrote: >>>> >>>>>> > >>> >>>> >>>>>> > >>> Hi JC, >>>> >>>>>> > >>> >>>> >>>>>> > >>> I think it would be good to >>>> get a >>>> >>>>>> review from the gc or >>>> >>>>>> > runtime >>>> >>>>>> > >>> teams, since this also affects >>>> >>>>>> their tests. 
>>>> >>>>>> > >>> >>>> >>>>>> > >>> Also, once we are settled on >>>> this >>>> >>>>>> FatalError approach, >>>> >>>>>> > I still >>>> >>>>>> > >>> need to give your webrev-02 a >>>> >>>>>> full review. I only >>>> >>>>>> > skimmed over >>>> >>>>>> > >>> parts of it (I did look at all >>>> >>>>>> the changes in webrevo-01). >>>> >>>>>> > >>> >>>> >>>>>> > >>> thanks, >>>> >>>>>> > >>> >>>> >>>>>> > >>> Chris >>>> >>>>>> > >>> >>>> >>>>>> > >>> On 11/27/18 8:58 AM, >>>> >>>>>> serguei.spitsyn at oracle.com >>>> >>>>>> >>>> >>>>>> > >>> >>>>>> > >>>> >>>>>> > >>> >>> >>>>>> >>>> >>>>>> > >>> >>>>>> >> wrote: >>>> >>>>>> > >>>> Hi Jc, >>>> >>>>>> > >>>> >>>> >>>>>> > >>>> I've already reviewed this >>>> too. >>>> >>>>>> > >>>> >>>> >>>>>> > >>>> Thanks, >>>> >>>>>> > >>>> Serguei >>>> >>>>>> > >>>> >>>> >>>>>> > >>>> >>>> >>>>>> > >>>> On 11/27/18 06:56, JC Beyler >>>> >>>>>> wrote: >>>> >>>>>> > >>>>> Thanks Chris, >>>> >>>>>> > >>>>> >>>> >>>>>> > >>>>> Anybody else motivated to look at >>>> this >>>> >>>>>> and review it? :) >>>> >>>>>> > >>>>> Jc >>>> >>>>>> > >>>>> >>>> >>>>>> > >>>>> On Mon, Nov 26, 2018 at >>>> 1:26 PM >>>> >>>>>> Chris Plummer >>>> >>>>>> > >>>>> >>> >>>>>> >>>> >>>>>> > >>> >>>>>> > >>>> >>>>>> >>> >>>>>> >>>> >>>>>> > >>> >>>>>> >>> >>>> >>>>>> > >>>>> wrote: >>>> >>>>>> > >>>>> >>>> >>>>>> > >>>>> Hi JC, >>>> >>>>>> > >>>>> >>>> >>>>>> > >>>>> I'm ok with the FatalError approach, >>>> >>>>>> but would >>>> >>>>>> > like to >>>> >>>>>> > >>>>> hear opinions from others also. 
>>>> >>>>>> > >>>>> >>>> >>>>>> > >>>>> thanks, >>>> >>>>>> > >>>>> >>>> >>>>>> > >>>>> Chris >>>> >>>>>> > >>>>> >>>> >>>>>> > >>>>> On 11/21/18 8:19 AM, JC Beyler >>>> wrote: >>>> >>>>>> > >>>>>> Hi Chris, >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> Thanks for taking the >>>> time >>>> >>>>>> to look at it and yes you >>>> >>>>>> > >>>>>> have raised exactly why >>>> >>>>>> the webrev is between two >>>> >>>>>> > >>>>>> worlds: in cases where >>>> a >>>> >>>>>> fatal error on failure is >>>> >>>>>> > >>>>>> wanted, should we >>>> simplify >>>> >>>>>> the code to remove >>>> >>>>>> > the return >>>> >>>>>> > >>>>>> tests since we do them >>>> >>>>>> internally? Now that I've >>>> >>>>>> > looked >>>> >>>>>> > >>>>>> around for non-fatal >>>> >>>>>> cases, I think the answer >>>> >>>>>> > is yes, >>>> >>>>>> > >>>>>> it simplifies the code >>>> >>>>>> while maintaining the checks. >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> I looked a bit and it >>>> >>>>>> seems that I can't find >>>> >>>>>> > easily a >>>> >>>>>> > >>>>>> case where the test >>>> >>>>>> accepts a JNI failure to >>>> >>>>>> > then move >>>> >>>>>> > >>>>>> on. Therefore, perhaps, >>>> >>>>>> for now, the fail with a >>>> >>>>>> > Fatal >>>> >>>>>> > >>>>>> is enough and we can >>>> work >>>> >>>>>> on the tests to clean >>>> >>>>>> > them up? 
>>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> That means that this is >>>> >>>>>> the new webrev with only >>>> >>>>>> > Fatal >>>> >>>>>> > >>>>>> and cleans up the >>>> tests so >>>> >>>>>> that it is no longer in >>>> >>>>>> > >>>>>> between two worlds: >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> Webrev: >>>> >>>>>> > >>>>>> >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.02/ >>>> >>>>>> >>>> >>>>>> > >>>>>> >>>> >>>>>> >>>> >>>>>> > >>>>>> Bug: >>>> >>>>>> > >>>> https://bugs.openjdk.java.net/browse/JDK-8213501 >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> (This passes testing >>>> on my >>>> >>>>>> dev machine for all the >>>> >>>>>> > >>>>>> modified tests) >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> with the example you >>>> >>>>>> provided, it now looks like: >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html >>>> >>>>>> >>>> >>>>>> < >>>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html> >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>> >>>>>> < >>>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html> >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> Where it does, to me at >>>> >>>>>> least, seem cleaner and less >>>> >>>>>> > >>>>>> "noisy". 
>>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> Let me know what you >>>> think, >>>> >>>>>> > >>>>>> Jc >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> On Tue, Nov 20, 2018 at >>>> >>>>>> 9:33 PM Chris Plummer >>>> >>>>>> > >>>>>> < >>>> chris.plummer at oracle.com >>>> >>>>>> >>>> >>>>>> > >>> >>>>>> > >>>> >>>>>> > >>>>>> >>> >>>>>> >>>> >>>>>> > >>> >>>>>> >>> wrote: >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> Hi JC, >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> Sorry about the >>>> delay. >>>> >>>>>> I had to go back an >>>> >>>>>> > look at >>>> >>>>>> > >>>>>> the initial 8210842 >>>> >>>>>> webrev and RFR thread to see >>>> >>>>>> > >>>>>> what this was >>>> >>>>>> initially all about. >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> In general the >>>> changes >>>> >>>>>> look good. >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> I don't have a good >>>> >>>>>> answer to your >>>> >>>>>> > >>>>>> FatalError/NonFatalError question. It >>>> makes >>>> >>>>>> > the code >>>> >>>>>> > >>>>>> a lot cleaner to >>>> use >>>> >>>>>> FatalError, but then it >>>> >>>>>> > is a >>>> >>>>>> > >>>>>> behavior change, >>>> and >>>> >>>>>> you also need to deal with >>>> >>>>>> > >>>>>> tests that >>>> >>>>>> intentionally induce errors (do >>>> >>>>>> > you have >>>> >>>>>> > >>>>>> an example of >>>> that). >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> In any case, right >>>> now >>>> >>>>>> your webrev seems to be >>>> >>>>>> > >>>>>> between two worlds. >>>> >>>>>> You are producing >>>> >>>>>> > FatalError, >>>> >>>>>> > >>>>>> but still checking >>>> >>>>>> results. 
Here's a good >>>> >>>>>> > example: >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html >>>> >>>>>> >>>> >>>>>> < >>>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html> >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>> >>>>>> < >>>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html> >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> I'm not sure if >>>> this >>>> >>>>>> is just a temporary >>>> >>>>>> > state until >>>> >>>>>> > >>>>>> it was decided >>>> which >>>> >>>>>> approach to take. >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> thanks, >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> Chris >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> On 11/20/18 2:14 >>>> PM, >>>> >>>>>> JC Beyler wrote: >>>> >>>>>> > >>>>>>> Hi all, >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> Chris thought it >>>> made >>>> >>>>>> sense to have more >>>> >>>>>> > eyes on >>>> >>>>>> > >>>>>>> this change than >>>> just >>>> >>>>>> serviceability as it will >>>> >>>>>> > >>>>>>> modify to tests >>>> that >>>> >>>>>> are not only >>>> >>>>>> > serviceability >>>> >>>>>> > >>>>>>> tests so I've >>>> moved >>>> >>>>>> this to conversation >>>> >>>>>> > here :) >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> For convenience, >>>> I've >>>> >>>>>> copy-pasted the >>>> >>>>>> > initial RFR: >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> Could I have a >>>> review >>>> >>>>>> for the extension and >>>> >>>>>> > usage >>>> >>>>>> > >>>>>>> of the >>>> >>>>>> ExceptionJniWrapper. 
This adds lines and >>>> >>>>>> > >>>>>>> filenames to the >>>> end >>>> >>>>>> of the wrapper JNI >>>> >>>>>> > methods, >>>> >>>>>> > >>>>>>> adds tracing, and >>>> >>>>>> throws an error if need >>>> >>>>>> > be. I've >>>> >>>>>> > >>>>>>> ported the gc/lock >>>> >>>>>> files to use the new >>>> >>>>>> > >>>>>>> TRACE_JNI_CALL >>>> add-on >>>> >>>>>> and I've ported a few >>>> >>>>>> > of the >>>> >>>>>> > >>>>>>> tests that were >>>> >>>>>> already changed for the >>>> >>>>>> > assignment >>>> >>>>>> > >>>>>>> webrev for >>>> >>>>>> JDK-8212884. >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> Webrev: >>>> >>>>>> > >>>>>>> >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.01 >>>> >>>>>> >>>> >>>>>> > >>>>>>> >>>> >>>>>> >>>> >>>>>> > >>>>>>> Bug: >>>> >>>>>> > >>>>>>> >>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8213501 >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> For illustration, >>>> if >>>> >>>>>> I force an error to the >>>> >>>>>> > >>>>>>> AP04/ap04t03 test >>>> and >>>> >>>>>> set the verbosity on, >>>> >>>>>> > I get >>>> >>>>>> > >>>>>>> something like: >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> >> Calling JNI >>>> method >>>> >>>>>> FindClass from >>>> >>>>>> > >>>>>>> ap04t003.cpp:343 >>>> >>>>>> > >>>>>>> >> Calling with >>>> these >>>> >>>>>> parameter(s): >>>> >>>>>> > >>>>>>> java/lang/Threadd >>>> >>>>>> > >>>>>>> Wait for thread >>>> to >>>> >>>>>> finish >>>> >>>>>> > >>>>>>> << Called JNI >>>> method >>>> >>>>>> FindClass from >>>> >>>>>> > >>>>>>> ap04t003.cpp:343 >>>> >>>>>> > >>>>>>> Exception in >>>> thread >>>> >>>>>> "Thread-0" >>>> >>>>>> > >>>>>>> java.lang.NoClassDefFoundError: >>>> >>>>>> > java/lang/Threadd >>>> >>>>>> > >>>>>>> at >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native >>>> >>>>>> >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> Method) >>>> >>>>>> > >>>>>>> at >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> 
nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140) >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> at >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201) >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> Caused by: >>>> >>>>>> java.lang.ClassNotFoundException: >>>> >>>>>> > >>>>>>> java.lang.Threadd >>>> >>>>>> > >>>>>>> at >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583) >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> at >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> at >>>> >>>>>> > >>>>>>> >>>> >>>>>> java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) >>>> >>>>>> > >>>>>>> ... 
3 more >>>> >>>>>> > >>>>>>> FATAL ERROR in >>>> native >>>> >>>>>> method: JNI method >>>> >>>>>> > FindClass >>>> >>>>>> > >>>>>>> : internal error >>>> from >>>> >>>>>> ap04t003.cpp:343 >>>> >>>>>> > >>>>>>> at >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native >>>> >>>>>> >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> Method) >>>> >>>>>> > >>>>>>> at >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140) >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> at >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201) >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> >>>> Questions/comments I >>>> >>>>>> have about this are: >>>> >>>>>> > >>>>>>> - Do we want to >>>> >>>>>> force fatal errors when a JNI >>>> >>>>>> > >>>>>>> call fails in >>>> >>>>>> general? Most of these tests >>>> >>>>>> > do the >>>> >>>>>> > >>>>>>> right thing and >>>> test >>>> >>>>>> the return of the JNI >>>> >>>>>> > calls, >>>> >>>>>> > >>>>>>> for example: >>>> >>>>>> > >>>>>>> thrClass = >>>> >>>>>> > jni->FindClass("java/lang/Threadd", >>>> >>>>>> > >>>>>>> TRACE_JNI_CALL); >>>> >>>>>> > >>>>>>> if (thrClass >>>> == >>>> >>>>>> NULL) { >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> but now the >>>> wrapper >>>> >>>>>> actually would do a >>>> >>>>>> > fatal if >>>> >>>>>> > >>>>>>> the FindClass call >>>> >>>>>> would return a nullptr, >>>> >>>>>> > so we >>>> >>>>>> > >>>>>>> could remove that >>>> >>>>>> test altogether. What do you >>>> >>>>>> > >>>>>>> think? 
>>>> >>>>>> > >>>>>>> - I prefer to >>>> >>>>>> leave them as the tests then >>>> >>>>>> > >>>>>>> become closer to >>>> what >>>> >>>>>> real users would have in >>>> >>>>>> > >>>>>>> their code and is >>>> the >>>> >>>>>> "recommended" way of >>>> >>>>>> > doing it >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> - The >>>> alternative >>>> >>>>>> is to use the >>>> >>>>>> > NonFatalError I >>>> >>>>>> > >>>>>>> added which then >>>> just >>>> >>>>>> prints out that something >>>> >>>>>> > >>>>>>> went wrong, >>>> letting >>>> >>>>>> the test continue. Question >>>> >>>>>> > >>>>>>> will be what >>>> should >>>> >>>>>> be the default? The >>>> >>>>>> > fatal or >>>> >>>>>> > >>>>>>> the non-fatal >>>> error >>>> >>>>>> handling? >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> On a different >>>> >>>>>> subject: >>>> >>>>>> > >>>>>>> - On the new >>>> tests, >>>> >>>>>> I've removed the >>>> >>>>>> > >>>>>>> NSK_JNI_VERIFY >>>> since >>>> >>>>>> the JNI wrapper >>>> >>>>>> > handles the >>>> >>>>>> > >>>>>>> tracing and the >>>> >>>>>> verify in almost the same >>>> >>>>>> > way; only >>>> >>>>>> > >>>>>>> difference I can >>>> >>>>>> really tell is that the >>>> >>>>>> > complain >>>> >>>>>> > >>>>>>> method from NSK >>>> has a >>>> >>>>>> max complain before >>>> >>>>>> > stopping >>>> >>>>>> > >>>>>>> to "complain"; I >>>> have >>>> >>>>>> not added that part >>>> >>>>>> > of the >>>> >>>>>> > >>>>>>> code in this >>>> webrev >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> Once we decide on >>>> >>>>>> these, I can continue on the >>>> >>>>>> > >>>>>>> files from >>>> >>>>>> JDK-8212884 and then do both the >>>> >>>>>> > >>>>>>> assignment in an >>>> if >>>> >>>>>> extraction followed-by this >>>> >>>>>> > >>>>>>> type of webrev in >>>> an >>>> >>>>>> easier fashion. >>>> >>>>>> > Depending on >>>> >>>>>> > >>>>>>> decisions here, >>>> >>>>>> NSK*VERIFY can be deprecated as >>>> >>>>>> > >>>>>>> well as we go >>>> forward. >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> Thanks! 
>>>> >>>>>> > >>>>>>> Jc >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> On Mon, Nov 19, >>>> 2018 >>>> >>>>>> at 11:34 AM Chris Plummer >>>> >>>>>> > >>>>>>> < >>>> chris.plummer at oracle.com >>>> >>>>>> >>>> >>>>>> > >>> >>>>>> > >>>> >>>>>> > >>>>>>> >>> >>>>>> >>>> >>>>>> > >>> >>>>>> >>> wrote: >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> On 11/19/18 >>>> 10:07 >>>> >>>>>> AM, JC Beyler wrote: >>>> >>>>>> > >>>>>>>> Hi all, >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> @David/Chris: >>>> >>>>>> should I then push this >>>> >>>>>> > RFR to >>>> >>>>>> > >>>>>>>> the hotspot >>>> >>>>>> mailing or the runtime >>>> >>>>>> > one? For >>>> >>>>>> > >>>>>>>> what it's >>>> worth, >>>> >>>>>> a lot of the tests >>>> >>>>>> > under the >>>> >>>>>> > >>>>>>>> vmTestbase >>>> are >>>> >>>>>> jvmti so the review also >>>> >>>>>> > >>>>>>>> affects >>>> >>>>>> serviceability; it just turns >>>> >>>>>> > out I >>>> >>>>>> > >>>>>>>> started with >>>> the >>>> >>>>>> GC originally and >>>> >>>>>> > then hit >>>> >>>>>> > >>>>>>>> some other >>>> tests >>>> >>>>>> I had touched via the >>>> >>>>>> > >>>>>>>> assignment >>>> >>>>>> extraction. >>>> >>>>>> > >>>>>>> I think >>>> hotspot >>>> >>>>>> would be best. >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> Chris >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> @Serguei: >>>> Done >>>> >>>>>> for the method >>>> >>>>>> > renaming, for >>>> >>>>>> > >>>>>>>> the indent, >>>> are >>>> >>>>>> you talking about >>>> >>>>>> > going from >>>> >>>>>> > >>>>>>>> the 8-indent >>>> to >>>> >>>>>> 4-indent? If so, would >>>> >>>>>> > it not >>>> >>>>>> > >>>>>>>> just be >>>> better >>>> >>>>>> to do a new JBS bug and >>>> >>>>>> > do the >>>> >>>>>> > >>>>>>>> whole files >>>> in >>>> >>>>>> one go? I ask because >>>> >>>>>> > >>>>>>>> otherwise, it >>>> >>>>>> will look a bit weird to >>>> >>>>>> > have >>>> >>>>>> > >>>>>>>> parts of the >>>> >>>>>> file as 8-indent and others >>>> >>>>>> > >>>>>>>> 4-indent? 
>>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> Thanks for >>>> >>>>>> looking at it! >>>> >>>>>> > >>>>>>>> Jc >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> On Mon, Nov >>>> 19, >>>> >>>>>> 2018 at 1:25 AM >>>> >>>>>> > >>>>>>>> serguei.spitsyn at oracle.com >>>> >>>>>> >>>> >>>>>> >>> >>>>>> > >>>> >>>>>> > >>>>>>>> >>> serguei.spitsyn at oracle.com >>>> >>>>>> >>>> >>>>>> > >>> >>>>>> >> >>>> >>>>>> > >>>>>>>> >>> >>>>>> >>>> >>>>>> > >>> >>>>>> > >>>> >>>>>> > >>>>>>>> >>> serguei.spitsyn at oracle.com >>>> >>>>>> >>>> >>>>>> > >>> >>>>>> >>> wrote: >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> Hi Jc, >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> We have >>>> to >>>> >>>>>> start this review >>>> >>>>>> > anyway. :) >>>> >>>>>> > >>>>>>>> It looks >>>> >>>>>> good to me in general. >>>> >>>>>> > >>>>>>>> Thank you >>>> >>>>>> for your consistency in this >>>> >>>>>> > >>>>>>>> >>>> refactoring! >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> Some >>>> minor >>>> >>>>>> comments. >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.00/test/hotspot/jtreg/vmTestbase/nsk/share/jni/ExceptionCheckingJniEnv.cpp.udiff.html >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> +static >>>> >>>>>> const char* >>>> >>>>>> > remove_folders(const >>>> >>>>>> > >>>>>>>> char* >>>> >>>>>> fullname) { I'd suggest to >>>> >>>>>> > rename >>>> >>>>>> > >>>>>>>> the >>>> function >>>> >>>>>> name to something >>>> >>>>>> > traditional >>>> >>>>>> > >>>>>>>> like >>>> >>>>>> get_basename. Otherwise, it >>>> >>>>>> > sounds >>>> >>>>>> > >>>>>>>> like this >>>> >>>>>> function has to really >>>> >>>>>> > remove >>>> >>>>>> > >>>>>>>> folders. >>>> :) >>>> >>>>>> Also, all *Locker.cpp have >>>> >>>>>> > >>>>>>>> wrong >>>> indent >>>> >>>>>> in the bodies of if >>>> >>>>>> > and while >>>> >>>>>> > >>>>>>>> >>>> statements. 
>>>> >>>>>> Could this be fixed >>>> >>>>>> > with the >>>> >>>>>> > >>>>>>>> >>>> refactoring? >>>> >>>>>> I did not look on how >>>> >>>>>> > this >>>> >>>>>> > >>>>>>>> impacts >>>> the >>>> >>>>>> tests other than >>>> >>>>>> > >>>>>>>> serviceability. >>>> Thanks, >>>> >>>>>> Serguei >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> On >>>> 11/16/18 >>>> >>>>>> 19:43, JC Beyler wrote: >>>> >>>>>> > >>>>>>>>> Hi all, >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Anybody >>>> >>>>>> motivated to review this? :) >>>> >>>>>> > >>>>>>>>> Jc >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> On Wed, Nov >>>> 7, >>>> >>>>>> 2018 at 9:53 PM JC >>>> >>>>>> > Beyler >>>> >>>>>> > >>>>>>>>> >>> >>>>>> >>>> >>>>>> > >>> >>>>>> > >>>> >>>>>> > >>>>>>>>> >>> >>>>>> >>>> >>>>>> > >>> >>>>>> >>> wrote: >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Hi all, >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Could I >>>> have >>>> >>>>>> a review for the >>>> >>>>>> > >>>>>>>>> extension >>>> >>>>>> and usage of the >>>> >>>>>> > >>>>>>>>> ExceptionJniWrapper. This >>>> >>>>>> > adds lines >>>> >>>>>> > >>>>>>>>> and >>>> >>>>>> filenames to the end of the >>>> >>>>>> > >>>>>>>>> wrapper >>>> JNI >>>> >>>>>> methods, adds >>>> >>>>>> > tracing, >>>> >>>>>> > >>>>>>>>> and >>>> throws >>>> >>>>>> an error if need >>>> >>>>>> > be. I've >>>> >>>>>> > >>>>>>>>> ported >>>> the >>>> >>>>>> gc/lock files to >>>> >>>>>> > use the >>>> >>>>>> > >>>>>>>>> new >>>> >>>>>> TRACE_JNI_CALL add-on and >>>> >>>>>> > I've >>>> >>>>>> > >>>>>>>>> ported a >>>> few >>>> >>>>>> of the tests >>>> >>>>>> > that were >>>> >>>>>> > >>>>>>>>> already >>>> >>>>>> changed for the >>>> >>>>>> > assignment >>>> >>>>>> > >>>>>>>>> webrev >>>> for >>>> >>>>>> JDK-8212884. 
>>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Webrev: >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.00/ >>>> >>>>>> >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> >>>> >>>>>> > >>>>>>>>> Bug: >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8213501 >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> For >>>> >>>>>> illustration, if I force >>>> >>>>>> > an error >>>> >>>>>> > >>>>>>>>> to the >>>> >>>>>> AP04/ap04t03 test and >>>> >>>>>> > set the >>>> >>>>>> > >>>>>>>>> verbosity >>>> >>>>>> on, I get something >>>> >>>>>> > like: >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> >> >>>> Calling >>>> >>>>>> JNI method >>>> >>>>>> > FindClass from >>>> >>>>>> > >>>>>>>>> ap04t003.cpp:343 >>>> >>>>>> > >>>>>>>>> >> >>>> Calling >>>> >>>>>> with these >>>> >>>>>> > parameter(s): >>>> >>>>>> > >>>>>>>>> java/lang/Threadd >>>> >>>>>> > >>>>>>>>> Wait for >>>> >>>>>> thread to finish >>>> >>>>>> > >>>>>>>>> << Called >>>> >>>>>> JNI method >>>> >>>>>> > FindClass from >>>> >>>>>> > >>>>>>>>> ap04t003.cpp:343 >>>> >>>>>> > >>>>>>>>> >>>> Exception in >>>> >>>>>> thread "Thread-0" >>>> >>>>>> > >>>>>>>>> java.lang.NoClassDefFoundError: >>>> >>>>>> > >>>>>>>>> java/lang/Threadd >>>> >>>>>> > >>>>>>>>> at >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native >>>> >>>>>> >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Method) >>>> >>>>>> > >>>>>>>>> at >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140) >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> at >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201) >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Caused >>>> by: >>>> >>>>>> > >>>>>>>>> java.lang.ClassNotFoundException: >>>> 
>>>>>> > >>>>>>>>> java.lang.Threadd >>>> >>>>>> > >>>>>>>>> at >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583) >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> at >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> at >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>> >>>>>> java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) >>>> >>>>>> > >>>>>>>>> ... 3 >>>> more >>>> >>>>>> > >>>>>>>>> FATAL >>>> ERROR >>>> >>>>>> in native method: JNI >>>> >>>>>> > >>>>>>>>> method >>>> >>>>>> FindClass : internal error >>>> >>>>>> > >>>>>>>>> from >>>> >>>>>> ap04t003.cpp:343 >>>> >>>>>> > >>>>>>>>> at >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native >>>> >>>>>> >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Method) >>>> >>>>>> > >>>>>>>>> at >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140) >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> at >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201) >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Questions/comments I have about >>>> >>>>>> > >>>>>>>>> this are: >>>> >>>>>> > >>>>>>>>> - Do we >>>> >>>>>> want to force fatal >>>> >>>>>> > errors >>>> >>>>>> > >>>>>>>>> when a >>>> JNI >>>> >>>>>> call fails in general? 
>>>> >>>>>> > >>>>>>>>> Most of >>>> >>>>>> these tests do the right >>>> >>>>>> > >>>>>>>>> thing and >>>> >>>>>> test the return of >>>> >>>>>> > the JNI >>>> >>>>>> > >>>>>>>>> calls, >>>> for >>>> >>>>>> example: >>>> >>>>>> > >>>>>>>>> thrClass >>>> = >>>> >>>>>> > >>>>>>>>> jni->FindClass("java/lang/Threadd", >>>> >>>>>> > >>>>>>>>> TRACE_JNI_CALL); >>>> >>>>>> > >>>>>>>>> if >>>> >>>>>> (thrClass == NULL) { >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> but now >>>> the >>>> >>>>>> wrapper actually >>>> >>>>>> > would do >>>> >>>>>> > >>>>>>>>> a fatal >>>> if >>>> >>>>>> the FindClass call >>>> >>>>>> > would >>>> >>>>>> > >>>>>>>>> return a >>>> >>>>>> nullptr, so we could >>>> >>>>>> > remove >>>> >>>>>> > >>>>>>>>> that test >>>> >>>>>> altogether. What do >>>> >>>>>> > you >>>> >>>>>> > >>>>>>>>> think? >>>> >>>>>> > >>>>>>>>> - I >>>> >>>>>> prefer to leave them >>>> >>>>>> > as the >>>> >>>>>> > >>>>>>>>> tests >>>> then >>>> >>>>>> become closer to >>>> >>>>>> > what real >>>> >>>>>> > >>>>>>>>> users >>>> would >>>> >>>>>> have in their >>>> >>>>>> > code and is >>>> >>>>>> > >>>>>>>>> the >>>> >>>>>> "recommended" way of doing it >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> - The >>>> >>>>>> alternative is to >>>> >>>>>> > use the >>>> >>>>>> > >>>>>>>>> NonFatalError I >>>> added >>>> >>>>>> which >>>> >>>>>> > then just >>>> >>>>>> > >>>>>>>>> prints >>>> out >>>> >>>>>> that something >>>> >>>>>> > went wrong, >>>> >>>>>> > >>>>>>>>> letting >>>> the >>>> >>>>>> test continue. >>>> >>>>>> > Question >>>> >>>>>> > >>>>>>>>> will be >>>> what >>>> >>>>>> should be the >>>> >>>>>> > default? >>>> >>>>>> > >>>>>>>>> The >>>> fatal or >>>> >>>>>> the non-fatal error >>>> >>>>>> > >>>>>>>>> handling? 
>>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> On a >>>> >>>>>> different subject: >>>> >>>>>> > >>>>>>>>> - On >>>> the >>>> >>>>>> new tests, I've >>>> >>>>>> > removed >>>> >>>>>> > >>>>>>>>> the >>>> >>>>>> NSK_JNI_VERIFY since the JNI >>>> >>>>>> > >>>>>>>>> wrapper >>>> >>>>>> handles the tracing >>>> >>>>>> > and the >>>> >>>>>> > >>>>>>>>> verify in >>>> >>>>>> almost the same >>>> >>>>>> > way; only >>>> >>>>>> > >>>>>>>>> >>>> difference I >>>> >>>>>> can really tell >>>> >>>>>> > is that >>>> >>>>>> > >>>>>>>>> the >>>> complain >>>> >>>>>> method from NSK >>>> >>>>>> > has a >>>> >>>>>> > >>>>>>>>> max >>>> complain >>>> >>>>>> before stopping to >>>> >>>>>> > >>>>>>>>> >>>> "complain"; >>>> >>>>>> I have not added that >>>> >>>>>> > >>>>>>>>> part of >>>> the >>>> >>>>>> code in this webrev >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Once we >>>> >>>>>> decide on these, I can >>>> >>>>>> > >>>>>>>>> continue >>>> on >>>> >>>>>> the files from >>>> >>>>>> > >>>>>>>>> >>>> JDK-8212884 >>>> >>>>>> and then do both the >>>> >>>>>> > >>>>>>>>> >>>> assignment >>>> >>>>>> in an if extraction >>>> >>>>>> > >>>>>>>>> >>>> followed-by >>>> >>>>>> this type of >>>> >>>>>> > webrev in an >>>> >>>>>> > >>>>>>>>> easier >>>> >>>>>> fashion. Depending on >>>> >>>>>> > >>>>>>>>> decisions >>>> >>>>>> here, NSK*VERIFY can be >>>> >>>>>> > >>>>>>>>> >>>> deprecated >>>> >>>>>> as well as we go >>>> >>>>>> > forward. 
>>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Thank you for the reviews/comments :) >>>> >>>>>> > >>>>>>>>> Jc >>>> >>>>>> > >>>>>>>>> -- Thanks, Jc From shade at redhat.com Mon Jan 14 16:23:56 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 14 Jan 2019 17:23:56 +0100 Subject: RFR (S) 8216308:
StackTraceElement::fill_in can use injected Class source-file In-Reply-To: <4df73169-dec5-cad0-cca2-28ac9689cbc1@oracle.com> References: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com> <75513caf-5b58-7eed-f7c6-a147facba77a@oracle.com> <10518795-18f3-b51b-5068-412eaec772e1@oracle.com> <32c5c310-e066-8e3b-6443-fe26ec174884@oracle.com> <36682829-e2d5-c1aa-edbb-c49b8f1265cf@oracle.com> <9d5df7a6-f374-6d6d-1ff1-a9e19814ce14@oracle.com> <4df73169-dec5-cad0-cca2-28ac9689cbc1@oracle.com> Message-ID: On 1/14/19 2:49 PM, coleen.phillimore at oracle.com wrote: >>> Okay I agree it doesn't break anything new though I'd be happier if the >>> Backtrace::get_line_number issue was fixed. Otherwise it needs a follow up bug too - I'm starting >>> to think it makes no sense to allow redefinition to occur within a method like this! And this is >>> a distinct issue from ShowHiddenFrames. >> >> There's no practical way to disable redefinition here and it's not something that happens. You >> can file a bug for it if you like, but it's not something worth fixing. > > Looking again at the original code. If you have a redefinition at StringTable::intern() the line > number from the methodHandle (Method*) is correct. In this case it's the "old" method. It's where > the original method had the exception, which is what you want to print. +1. I am pushing the change shortly. -Aleksey From joe.darcy at oracle.com Mon Jan 14 19:04:12 2019 From: joe.darcy at oracle.com (Joe Darcy) Date: Mon, 14 Jan 2019 11:04:12 -0800 Subject: JDK 12 RFR of JDK-8213299: runtime/appcds/jigsaw/classpathtests/EmptyClassInBootClassPath.java failed with java.lang.NoSuchMethodException In-Reply-To: <8175ebfe-2951-6c71-29ee-f09a6b4da4f3@oracle.com> References: <5e5ae7b3-af70-81d4-0bc1-c56fd2b20165@oracle.com> <8175ebfe-2951-6c71-29ee-f09a6b4da4f3@oracle.com> Message-ID: Hi Stuart, On 1/11/2019 11:08 AM, Stuart Marks wrote: > Drat, you pushed this already.
But I wanted to mention a couple style > points: > > On 1/10/19 10:13 PM, Joe Darcy wrote: >> + sb.append(Stream.of(argTypes).map(c -> {return (c == null) ? "null" >> : c.getName();}). >> + collect(Collectors.joining(","))); > > Since argTypes is an array, I usually prefer Arrays.stream() over > Stream.of(). The issue is that Stream.of() is varargs, and while this > case isn't formally ambiguous, it can create a question in the > reader's mind about whether the stream consists of the array elements > or of just one element that's the array itself. > > The statement lambda can probably be replaced with an expression > lambda. I think it makes the ternary easier to read. Also, indentation. > > sb.append(Arrays.stream(argTypes) > .map(c -> (c == null) ? "null" : c.getName()) > .collect(Collectors.joining(","))); > > I'm not sure it's worth tracking this, but I could file a bug if you'd > like. Once JDK-8213299 hits the JDK 13 repo, I'd be open to doing a refactoring. I've filed JDK-8217000: Refactor Class::methodToString. 
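For readers following the Arrays.stream() versus Stream.of() point above, here is a minimal, self-contained sketch. The class name ArgTypeNames and the helper joinNames are made up for illustration and are not part of the JDK change; only the stream pipeline itself mirrors the suggestion quoted above.

```java
import java.util.Arrays;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; not taken from the JDK-8213299 patch.
public class ArgTypeNames {

    // Joins the names of the given types with commas, printing "null" for
    // null entries. Uses Arrays.stream() and an expression lambda.
    static String joinNames(Class<?>[] argTypes) {
        return Arrays.stream(argTypes)
                     .map(c -> (c == null) ? "null" : c.getName())
                     .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        Class<?>[] argTypes = { String.class, null, int.class };
        System.out.println(joinNames(argTypes));

        // The readability concern: Stream.of() has both a single-element and
        // a varargs overload, so the static type of the argument decides
        // whether the array is spread into elements or treated as one item.
        Object sameArrayAsObject = argTypes;
        System.out.println(Stream.of(argTypes).count());          // array elements
        System.out.println(Stream.of(sameArrayAsObject).count()); // one element
    }
}
```

With Arrays.stream() there is no such overload to think about, which is the reason given above for preferring it when the source is known to be an array.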
Thanks, -Joe From sangheon.kim at oracle.com Mon Jan 14 20:53:55 2019 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Mon, 14 Jan 2019 12:53:55 -0800 Subject: RFR (S/M): 8213827: NUMA heap allocation does not respect process membind/interleave settings [Was: Re: [PATCH] JDK NUMA Interleaving issue] In-Reply-To: <7694bb5a63c02bc94c1380982cad2a5ddb27916f.camel@oracle.com> References: <9bea7b0957bbfc2f0ac34306ee162f2d98e44bfe.camel@oracle.com> <99164b92f47f264978339ed327da9d41098a7e1d.camel@oracle.com> <10ecfa0f-eb78-869a-4d5a-991f55ec57ea@oracle.com> <3b8edd37-80cd-0f06-55ed-326972db98de@oracle.com> <2bc2fce2403324739a030d929b847fece95b0e25.camel@oracle.com> <7694bb5a63c02bc94c1380982cad2a5ddb27916f.camel@oracle.com> Message-ID: <697ef8a5-d84a-ccf4-ef86-24a73820e73c@oracle.com> Hi Amith and Thomas, On 1/14/19 1:34 AM, Thomas Schatzl wrote: > Hi Amith, > > On Sat, 2019-01-12 at 22:57 +0530, amith pawar wrote: >> Hi Thomas, >> >> SPECJBB shows following improvements with latest patch. >> 1. max-jOPs around 7-9% >> 2. critical-jOPS around 4-50% > thanks! > >> In webrev.4, Sangheon suggested following change is missing >> + ls.print("UseNUMA is enabled and invoked in '%s' mode." >> + " Heap will be configured using NUMA memory nodes:", >> numa_mode); >> There is one more space before " Heap.... ", please remove it. > The space before "Heap" is the space after the full stop in the > preceding sentence so needed; I moved the space to the previous line > though. > >> Also os_linux.hpp is already updated for new copyright year so patch >> import fails. >> >> The attached patch contains these changes. Please do check. > Regenerated the v4 webrevs. V4 webrevs looks good. 
Thanks, Sangheon > > Thanks, > Thomas > > From david.holmes at oracle.com Mon Jan 14 21:40:57 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 15 Jan 2019 07:40:57 +1000 Subject: RFR(M): 8216265: [testbug] Introduce Platform.sharedLibraryPathVariableName() and adapt all tests. In-Reply-To: <9ae6d6c231d24ba699b5785e06faebe9@sap.com> References: <9349eed214ce46ee81868840c0dbd54d@sap.com> <6277c580-0397-d4ab-7b03-7721544048ff@oracle.com> <129ed17946754b9c896fa41dd44d031f@sap.com> <7cefd8a46ae647969894c43cab72bc88@sap.com> <9ae6d6c231d24ba699b5785e06faebe9@sap.com> Message-ID: <63cd6761-53ff-ec49-87d9-789690ad0678@oracle.com> Hi Goetz, We are seeing two test failures after this push: Test: sun/security/krb5/auto/LoginProc.java C:\\ade\\mesos\\work_dir\\jib-master\\install\\jdk13-jdk.134\\src.full\\open\\test\\jdk\\sun\\security\\krb5\\auto\\KDC.java:24: error: package jdk.test.lib does not exist import jdk.test.lib.Platform; ^ Test: java/nio/channels/spi/SelectorProvider/inheritedChannel/InheritedChannelTest.java /scratch/opt/mach5/mesos/work_dir/jib-master/install/jdk13-jdk.134/src.full/open/test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/InheritedChannelTest.java:101: error: cannot find symbol String pathVar = Platform.sharedLibraryPathVariableName(); ^ symbol: variable Platform location: class InheritedChannelTest --- https://bugs.openjdk.java.net/browse/JDK-8217017 If you can't fix promptly I will do so in next couple of hours. Thanks, David On 15/01/2019 1:24 am, Lindenmaier, Goetz wrote: > Hi Martin, > > thanks for reveiwing. > > I'll fix the indentation to 4 before pushing. > > Best regards, > Goetz. > >> -----Original Message----- >> From: Doerr, Martin >> Sent: Montag, 14. 
Januar 2019 16:13 >> To: Lindenmaier, Goetz ; David Holmes >> ; 'hotspot-dev at openjdk.java.net' > dev at openjdk.java.net>; gary.adams at oracle.com >> Subject: RE: RFR(M): 8216265: [testbug] Introduce >> Platform.sharedLibraryPathVariableName() and adapt all tests. >> >> Hi Götz, >> >> do we have an indentation rule for " .toAbsolutePath().toString();" in >> JliLaunchTest.java? >> >> Nice cleanup! Looks good. >> >> Best regards, >> Martin >> >> >> -----Original Message----- >> From: hotspot-dev On Behalf Of >> Lindenmaier, Goetz >> Sent: Mittwoch, 9. Januar 2019 13:52 >> To: David Holmes ; 'hotspot- >> dev at openjdk.java.net' ; >> gary.adams at oracle.com >> Subject: RE: RFR(M): 8216265: [testbug] Introduce >> Platform.sharedLibraryPathVariableName() and adapt all tests. >> >> Hi David, >> >> I fixed these locally. >> >> Best regards, >> Goetz. >> >> >> >>> -----Original Message----- >>> From: David Holmes >>> Sent: Mittwoch, 9. Januar 2019 13:28 >>> To: Lindenmaier, Goetz ; 'hotspot- >>> dev at openjdk.java.net' ; >>> gary.adams at oracle.com >>> Subject: Re: RFR(M): 8216265: [testbug] Introduce >>> Platform.sharedLibraryPathVariableName() and adapt all tests. >>> >>> Hi Goetz, >>> >>> On 9/01/2019 8:34 pm, Lindenmaier, Goetz wrote: >>>> Hi David, >>>> >>>> thanks for looking at my change. >>>> It was asked for by Gary when he reviewed >>> https://bugs.openjdk.java.net/browse/JDK-8215975 >>>> >>>> New webrev: >>>> http://cr.openjdk.java.net/~goetz/wr19/8216265-PathVar/02- >> incremental/ >>> >>> Looks good. Two further pre-existing nits spotted: >>> >>> test/hotspot/jtreg/gtest/GTestWrapper.java >>> >>> ! * Copyright (c) 2016, 2019 Oracle >>> >>> Need a comma after 2019. >>> >>> Ditto for: >>> >>> >> test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/Inherited >>> ChannelTest.java >>> >>> Actually I now see quite a number of files missing the comma so >>> I'll file a general bug to fix that. 
>>> >>> Thanks, >>> David >>> >>> >>>> http://cr.openjdk.java.net/~goetz/wr19/8216265-PathVar/02/ >>>> >>>> See my comments inline below. >>>> >>>> Best regards, >>>> Goetz. >>>> >>>>> test/hotspot/jtreg/gtest/GTestWrapper.java >>>>> >>>>> 75 env.put(pathVar, path + ":" + ldLibraryPath); >>>>> >>>>> Shouldn't ":" be File.pathSeparator? >>>> Fixed. >>>> >>>>> >>> >> test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/Inherited >>>>> ChannelTest.java >>>>> >>>>> Copyright year needs updating. >>>> Done. >>>> >>>>> >>> >> test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/Inherited >>>>> ChannelTest.java >>>>> >>>>> 70 private static final Path pathEnvVar >>>>> >>>>> The variable isn't an env var, it's just a path - I suggest libraryPath. >>>> A cleanup not directly related. But makes sense, done. >>>> >>>>> 101 >>>>> System.out.println(Platform.sharedLibraryPathVariableName() + "=" + >>>>> pathEnvVar); >>>>> ... >>>>> 114 env.put(Platform.sharedLibraryPathVariableName(), >>>>> pathEnvVar.toString()); >>>>> >>>>> I suggest storing the name in a local to avoid the second call. >>>> Done. >>>> >>>>> test/jdk/tools/launcher/JliLaunchTest.java >>>>> >>>>> 57 env.compute(pathEnvVar, (k, v) -> (v == null) ? libdir >>>>> : libdir + ":" + v); >>>>> >>>>> Shouldn't ":" be File.pathSeparator? >>>> This is because there is anyways a switch about the OS. >>>> Did some more cleaning up. >>>> >>>>> test/jdk/tools/launcher/Test7029048.java >>>>> >>>>> 39 import jdk.test.lib.Platform; >>>>> >>>>> Why do you need this? >>>> Removed. >>>> >>>>> test/jdk/vm/JniInvocationTest.java >>>>> >>>>> This is a Mac only test so no changes needed. >>>> I would like to change this anyways. I think this makes >>>> it look more consistent. >>>> >>>>> test/lib/jdk/test/lib/Platform.java >>>>> >>>>> The javadoc comments is unnecessary as we don't generate javadoc >> here. >>> I >>>>> see you copied the preceding sharedLibraryExt() style. The @return is >>>>> superfluous. 
>>>> Changed. Better? >>>> >>>> >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Best regards, >>>>>> Goetz. >>>>>> From david.holmes at oracle.com Mon Jan 14 22:19:51 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 15 Jan 2019 08:19:51 +1000 Subject: RFR (S) 8216308: StackTraceElement::fill_in can use injected Class source-file In-Reply-To: <4df73169-dec5-cad0-cca2-28ac9689cbc1@oracle.com> References: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com> <75513caf-5b58-7eed-f7c6-a147facba77a@oracle.com> <10518795-18f3-b51b-5068-412eaec772e1@oracle.com> <32c5c310-e066-8e3b-6443-fe26ec174884@oracle.com> <36682829-e2d5-c1aa-edbb-c49b8f1265cf@oracle.com> <9d5df7a6-f374-6d6d-1ff1-a9e19814ce14@oracle.com> <4df73169-dec5-cad0-cca2-28ac9689cbc1@oracle.com> Message-ID: On 14/01/2019 11:49 pm, coleen.phillimore at oracle.com wrote: > On 1/14/19 8:23 AM, coleen.phillimore at oracle.com wrote: >> On 1/14/19 8:06 AM, David Holmes wrote: >>> On 14/01/2019 10:32 pm, Aleksey Shipilev wrote: >>>> On 1/12/19 1:43 AM, David Holmes wrote: >>>>>>> I'm also very unclear about how the redefinition case is >>>>>>> currently handled. It seems that we will >>>>>>> normally intern NULL (and presumably get a NULL or empty-string >>>>>>> oop?) unless ShowHiddenFrames is >>>>>>> set, in which case we use the unknown_class_name() - regardless >>>>>>> of whether the frame is actually >>>>>>> hidden or not! This seems broken to me. (Separate bug to fix that >>>>>>> is okay if it is indeed broken.) >>>>>> >>>>>> This looks like a bug, but I'm not sure what ShowHiddenFrames is >>>>>> supposed to do here, or how it >>>>>> got there.? I think if Aleksey removed that with this patch it >>>>>> would be fine with me. >>>>> >>>>> I think use of ShowHiddenFrames here is completely broken. But a >>>>> seperate bug and some suitable >>>>> archaeology is needed to fix it the right way. >>>> >>>> Okay, are we in agreement that current patch does not break anything >>>> new? 
If so, let's push the >>> >>> Okay I agree it doesn't break anything new though I'd be happier if >>> the Backtrace::get_line_number issue was fixed. Otherwise it needs a >>> follow up bug too - I'm starting to think it makes no sense to allow >>> redefinition to occur within a method like this! And this is a >>> distinct issue from ShowHiddenFrames. >> >> There's no practical way to disable redefinition here and it's not >> something that happens.? You can file a bug for it if you like, but >> it's not something worth fixing. > > Looking again at the original code.? If you have a redefinition at > StringTable::intern() the line number from the methodHandle (Method*) is > correct.? In this case it's the "old" method.? It's where the original > method had the exception, which is what you want to print. Then why do we have the -1 handling for the initial redefinition case? It doesn't make sense to me that the logic is basically: if method was redefined set line number = -1 < do other stuff that might lead to method redefinition > set line number with value read from possibly redefined method --- Surely the line number has to be the same, in terms of being valid or not, regardless of whether the redefinition occurred before this code or during it? Thanks, David > > Coleen > >> >> thanks, >> Coleen >> >>> >>> David >>> ----- >>> >>> >>>> current patch in its current form, and then follow up on >>>> ShowHiddenFrames in a separate issue. This >>>> would also make current patch simply backportable to 11. >>>> >>>> Current patch (no changes since last time): >>>> ?? 
http://cr.openjdk.java.net/~shade/8216308/webrev.01/ >>>> >>>> -Aleksey >>>> >> > From coleen.phillimore at oracle.com Mon Jan 14 23:23:08 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 14 Jan 2019 18:23:08 -0500 Subject: RFR (S) 8216308: StackTraceElement::fill_in can use injected Class source-file In-Reply-To: References: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com> <75513caf-5b58-7eed-f7c6-a147facba77a@oracle.com> <10518795-18f3-b51b-5068-412eaec772e1@oracle.com> <32c5c310-e066-8e3b-6443-fe26ec174884@oracle.com> <36682829-e2d5-c1aa-edbb-c49b8f1265cf@oracle.com> <9d5df7a6-f374-6d6d-1ff1-a9e19814ce14@oracle.com> <4df73169-dec5-cad0-cca2-28ac9689cbc1@oracle.com> Message-ID: On 1/14/19 5:19 PM, David Holmes wrote: > On 14/01/2019 11:49 pm, coleen.phillimore at oracle.com wrote: >> On 1/14/19 8:23 AM, coleen.phillimore at oracle.com wrote: >>> On 1/14/19 8:06 AM, David Holmes wrote: >>>> On 14/01/2019 10:32 pm, Aleksey Shipilev wrote: >>>>> On 1/12/19 1:43 AM, David Holmes wrote: >>>>>>>> I'm also very unclear about how the redefinition case is >>>>>>>> currently handled. It seems that we will >>>>>>>> normally intern NULL (and presumably get a NULL or empty-string >>>>>>>> oop?) unless ShowHiddenFrames is >>>>>>>> set, in which case we use the unknown_class_name() - regardless >>>>>>>> of whether the frame is actually >>>>>>>> hidden or not! This seems broken to me. (Separate bug to fix >>>>>>>> that is okay if it is indeed broken.) >>>>>>> >>>>>>> This looks like a bug, but I'm not sure what ShowHiddenFrames is >>>>>>> supposed to do here, or how it >>>>>>> got there.? I think if Aleksey removed that with this patch it >>>>>>> would be fine with me. >>>>>> >>>>>> I think use of ShowHiddenFrames here is completely broken. But a >>>>>> seperate bug and some suitable >>>>>> archaeology is needed to fix it the right way. >>>>> >>>>> Okay, are we in agreement that current patch does not break >>>>> anything new? 
If so, let's push the >>>> >>>> Okay I agree it doesn't break anything new though I'd be happier if >>>> the Backtrace::get_line_number issue was fixed. Otherwise it needs >>>> a follow up bug too - I'm starting to think it makes no sense to >>>> allow redefinition to occur within a method like this! And this is >>>> a distinct issue from ShowHiddenFrames. >>> >>> There's no practical way to disable redefinition here and it's not >>> something that happens.? You can file a bug for it if you like, but >>> it's not something worth fixing. >> >> Looking again at the original code.? If you have a redefinition at >> StringTable::intern() the line number from the methodHandle (Method*) >> is correct.? In this case it's the "old" method.? It's where the >> original method had the exception, which is what you want to print. > > Then why do we have the -1 handling for the initial redefinition case? > It doesn't make sense to me that the logic is basically: > > if method was redefined > ? set line number = -1 > > < do other stuff that might lead to method redefinition > > > set line number with value read from possibly redefined method > > --- > > Surely the line number has to be the same, in terms of being valid or > not, regardless of whether the redefinition occurred before this code > or during it? ? if (method() == NULL || !version_matches(method(), version)) { ??? // The method was redefined, accurate line number information isn't available ??? java_lang_StackTraceElement::set_fileName(element(), NULL); ??? java_lang_StackTraceElement::set_lineNumber(element(), -1); ? } else { This case handles both the case that the Method* that we got from holder->method_idnum() is null (deleted) or doesn't match the version that we saved in the backtrace.? method_idnum() will return the new Method* which might have different line numbers if there was redefinition. 
When you get to the 'else' case, if a redefinition happens, the Method* is the old method and its line numbers will match the one that was saved in the backtrace, because we checked the version. I don't think the 'else' case needs to worry about redefinition. The comment in Backtrace::get_source_file_name doesn't make sense anymore and we've already checked the version in the 'if' above, so we only need to worry about NULL for some other reason. Again, if a redefinition happens during the StringTable::intern, that's ok. We already have a Symbol with the source file name matching the version. Does this help? Coleen > > Thanks, > David > > >> >> Coleen >> >>> >>> thanks, >>> Coleen >>> >>>> >>>> David >>>> ----- >>>> >>>> >>>>> current patch in its current form, and then follow up on >>>>> ShowHiddenFrames in a separate issue. This >>>>> would also make current patch simply backportable to 11. >>>>> >>>>> Current patch (no changes since last time): >>>>> http://cr.openjdk.java.net/~shade/8216308/webrev.01/ >>>>> >>>>> -Aleksey >>>>> >>> >> From david.holmes at oracle.com Tue Jan 15 01:42:13 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 15 Jan 2019 11:42:13 +1000 Subject: Trivial RFR: 8217017 [TESTBUG] Tests fail to compile after JDK-8216265 Message-ID: <8d23d3c6-a8aa-ebd4-117f-764ba390a35b@oracle.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8217017 There's a missing import statement. Patch below. 
Thanks, David ----- diff -r de5564099c01 test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/InheritedChannelTest.java --- a/test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/InheritedChannelTest.java +++ b/test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/InheritedChannelTest.java @@ -48,6 +48,7 @@ import jdk.test.lib.JDKToolFinder; import jdk.test.lib.Utils; import jdk.test.lib.process.ProcessTools; +import jdk.test.lib.Platform; import org.testng.annotations.DataProvider; import org.testng.annotations.Test; From vladimir.kozlov at oracle.com Tue Jan 15 01:49:39 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 14 Jan 2019 17:49:39 -0800 Subject: Trivial RFR: 8217017 [TESTBUG] Tests fail to compile after JDK-8216265 In-Reply-To: <8d23d3c6-a8aa-ebd4-117f-764ba390a35b@oracle.com> References: <8d23d3c6-a8aa-ebd4-117f-764ba390a35b@oracle.com> Message-ID: Good. Vladimir K. On 1/14/19 5:42 PM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8217017 > > There's a missing import statement. Patch below. 
> > Thanks, > David > ----- > > diff -r de5564099c01 test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/InheritedChannelTest.java > --- a/test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/InheritedChannelTest.java > +++ b/test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/InheritedChannelTest.java > @@ -48,6 +48,7 @@ > ?import jdk.test.lib.JDKToolFinder; > ?import jdk.test.lib.Utils; > ?import jdk.test.lib.process.ProcessTools; > +import jdk.test.lib.Platform; > > ?import org.testng.annotations.DataProvider; > ?import org.testng.annotations.Test; From david.holmes at oracle.com Tue Jan 15 01:57:19 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 15 Jan 2019 11:57:19 +1000 Subject: Trivial RFR: 8217017 [TESTBUG] Tests fail to compile after JDK-8216265 In-Reply-To: References: <8d23d3c6-a8aa-ebd4-117f-764ba390a35b@oracle.com> Message-ID: Thanks Vladimir! Changes pushed. David On 15/01/2019 11:49 am, Vladimir Kozlov wrote: > Good. > > Vladimir K. > > On 1/14/19 5:42 PM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8217017 >> >> There's a missing import statement. Patch below. 
>> >> Thanks, >> David >> ----- >> >> diff -r de5564099c01 >> test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/InheritedChannelTest.java >> >> --- >> a/test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/InheritedChannelTest.java >> >> +++ >> b/test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/InheritedChannelTest.java >> >> @@ -48,6 +48,7 @@ >> ??import jdk.test.lib.JDKToolFinder; >> ??import jdk.test.lib.Utils; >> ??import jdk.test.lib.process.ProcessTools; >> +import jdk.test.lib.Platform; >> >> ??import org.testng.annotations.DataProvider; >> ??import org.testng.annotations.Test; From david.holmes at oracle.com Tue Jan 15 02:03:08 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 15 Jan 2019 12:03:08 +1000 Subject: RFR (S) 8216308: StackTraceElement::fill_in can use injected Class source-file In-Reply-To: References: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com> <75513caf-5b58-7eed-f7c6-a147facba77a@oracle.com> <10518795-18f3-b51b-5068-412eaec772e1@oracle.com> <32c5c310-e066-8e3b-6443-fe26ec174884@oracle.com> <36682829-e2d5-c1aa-edbb-c49b8f1265cf@oracle.com> <9d5df7a6-f374-6d6d-1ff1-a9e19814ce14@oracle.com> <4df73169-dec5-cad0-cca2-28ac9689cbc1@oracle.com> Message-ID: Hi Coleen, On 15/01/2019 9:23 am, coleen.phillimore at oracle.com wrote: > On 1/14/19 5:19 PM, David Holmes wrote: >> On 14/01/2019 11:49 pm, coleen.phillimore at oracle.com wrote: >>> On 1/14/19 8:23 AM, coleen.phillimore at oracle.com wrote: >>>> On 1/14/19 8:06 AM, David Holmes wrote: >>>>> On 14/01/2019 10:32 pm, Aleksey Shipilev wrote: >>>>>> On 1/12/19 1:43 AM, David Holmes wrote: >>>>>>>>> I'm also very unclear about how the redefinition case is >>>>>>>>> currently handled. It seems that we will >>>>>>>>> normally intern NULL (and presumably get a NULL or empty-string >>>>>>>>> oop?) 
unless ShowHiddenFrames is >>>>>>>>> set, in which case we use the unknown_class_name() - regardless >>>>>>>>> of whether the frame is actually >>>>>>>>> hidden or not! This seems broken to me. (Separate bug to fix >>>>>>>>> that is okay if it is indeed broken.) >>>>>>>> >>>>>>>> This looks like a bug, but I'm not sure what ShowHiddenFrames is >>>>>>>> supposed to do here, or how it >>>>>>>> got there.? I think if Aleksey removed that with this patch it >>>>>>>> would be fine with me. >>>>>>> >>>>>>> I think use of ShowHiddenFrames here is completely broken. But a >>>>>>> seperate bug and some suitable >>>>>>> archaeology is needed to fix it the right way. >>>>>> >>>>>> Okay, are we in agreement that current patch does not break >>>>>> anything new? If so, let's push the >>>>> >>>>> Okay I agree it doesn't break anything new though I'd be happier if >>>>> the Backtrace::get_line_number issue was fixed. Otherwise it needs >>>>> a follow up bug too - I'm starting to think it makes no sense to >>>>> allow redefinition to occur within a method like this! And this is >>>>> a distinct issue from ShowHiddenFrames. >>>> >>>> There's no practical way to disable redefinition here and it's not >>>> something that happens.? You can file a bug for it if you like, but >>>> it's not something worth fixing. >>> >>> Looking again at the original code.? If you have a redefinition at >>> StringTable::intern() the line number from the methodHandle (Method*) >>> is correct.? In this case it's the "old" method.? It's where the >>> original method had the exception, which is what you want to print. >> >> Then why do we have the -1 handling for the initial redefinition case? >> It doesn't make sense to me that the logic is basically: >> >> if method was redefined >> ? 
set line number = -1 >> >> < do other stuff that might lead to method redefinition > >> >> set line number with value read from possibly redefined method >> >> --- >> >> Surely the line number has to be the same, in terms of being valid or >> not, regardless of whether the redefinition occurred before this code >> or during it? > > ? if (method() == NULL || !version_matches(method(), version)) { > ??? // The method was redefined, accurate line number information isn't > available > ??? java_lang_StackTraceElement::set_fileName(element(), NULL); > ??? java_lang_StackTraceElement::set_lineNumber(element(), -1); > ? } else { > > This case handles both the case that the Method* that we got from > holder->method_idnum() is null (deleted) or doesn't match the version > that we saved in the backtrace.? method_idnum() will return the new > Method* which might have different line numbers if there was redefinition. > > When you get to the 'else' case, if a redefinition happens, the Method* > is the old method and its line numbers will match the one that was saved > in the backtrace, because we checked the version. > > I don't think the 'else' case needs to worry about redefinition. The > comment in Backtrace::get_source_file_name doesn't make sense anymore > and we've already checked the version in the 'if' above, so we only need > to worry about NULL for some other reason.? Again, if a redefinition > happens during the StringTable::intern, that's ok.? We already have a > Symbol with the source file name matching the version. > > Does this help? Yes - many thanks - I get it now. David ----- > Coleen >> >> Thanks, >> David >> >> >>> >>> Coleen >>> >>>> >>>> thanks, >>>> Coleen >>>> >>>>> >>>>> David >>>>> ----- >>>>> >>>>> >>>>>> current patch in its current form, and then follow up on >>>>>> ShowHiddenFrames in a separate issue. This >>>>>> would also make current patch simply backportable to 11. >>>>>> >>>>>> Current patch (no changes since last time): >>>>>> ?? 
http://cr.openjdk.java.net/~shade/8216308/webrev.01/ >>>>>> >>>>>> -Aleksey >>>>>> >>>> >>> > From goetz.lindenmaier at sap.com Tue Jan 15 07:59:22 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 15 Jan 2019 07:59:22 +0000 Subject: Trivial RFR: 8217017 [TESTBUG] Tests fail to compile after JDK-8216265 In-Reply-To: <8d23d3c6-a8aa-ebd4-117f-764ba390a35b@oracle.com> References: <8d23d3c6-a8aa-ebd4-117f-764ba390a35b@oracle.com> Message-ID: Hi, Sorry for breaking this ... but hard to catch for me. Best regards, Goetz. > -----Original Message----- > From: hotspot-dev On Behalf Of > David Holmes > Sent: Dienstag, 15. Januar 2019 02:42 > To: hotspot-dev developers > Subject: Trivial RFR: 8217017 [TESTBUG] Tests fail to compile after JDK- > 8216265 > > Bug: https://bugs.openjdk.java.net/browse/JDK-8217017 > > There's a missing import statement. Patch below. > > Thanks, > David > ----- > > diff -r de5564099c01 > test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/Inherited > ChannelTest.java > --- > a/test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/Inherit > edChannelTest.java > +++ > b/test/jdk/java/nio/channels/spi/SelectorProvider/inheritedChannel/Inherit > edChannelTest.java > @@ -48,6 +48,7 @@ > import jdk.test.lib.JDKToolFinder; > import jdk.test.lib.Utils; > import jdk.test.lib.process.ProcessTools; > +import jdk.test.lib.Platform; > > import org.testng.annotations.DataProvider; > import org.testng.annotations.Test; From robbin.ehn at oracle.com Tue Jan 15 10:39:24 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 15 Jan 2019 11:39:24 +0100 Subject: RFR(XL): 8203469: Faster safepoints Message-ID: Hi all, please review. Bug: https://bugs.openjdk.java.net/browse/JDK-8203469 Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/ Thanks to Dan for pre-reviewing a lot! Background: ZGC often does very short safepoint operations. 
For perspective, in a specJBB2015 run, G1 can have young collection stops lasting about 170 ms, while in the same setup ZGC does 0.2 ms to 1.5 ms operations, depending on which operation it is. The time it takes to stop and start the JavaThreads is very large relative to a ZGC safepoint. With an operation that takes just 0.2 ms, the overhead of stopping and starting JavaThreads is several times the operation itself. High-level functionality change: Serializing the starting over Threads_lock takes time. - Don't wait on Threads_lock; use the WaitBarrier. Serializing the stopping over Safepoint_lock takes time. - Let threads stop in parallel; remove Safepoint_lock. Details: JavaThreads have 2 abstract logical states: unsafe or safe. - Safe means the JavaThread will not touch the Java heap or VM internal structures without doing a transition and blocking before doing so. - The safe states are: - When polls are armed: _thread_in_native and _thread_blocked. - When Threads_lock is held: the externally suspended flag is set. - The VM thread has the polls armed and holds the Threads_lock during a safepoint. - Unsafe means that either the Java heap or VM internal structures can be accessed by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. - All combinations that are not safe are unsafe. We cannot start a safepoint until all unsafe threads have transitioned to a safe state. To make them safe, we arm polls in compiled code and make sure any transition to another unsafe state will be blocked. A JavaThread that is unsafe with state _thread_in_Java may transition to _thread_in_native without being blocked, since it has just become a safe thread and we can proceed. Any safe thread may try to transition at any time to an unsafe state, thus coming into the safepoint blocking code at any moment, e.g., after the safepoint is over, or even at the beginning of the next safepoint. 
The VMThread cannot tolerate false positives from the JavaThread thread state because that would mean starting the safepoint without all JavaThreads being safe. The two locks (Threads_lock and Safepoint_lock) make sure we never observe false positives from the safepoint blocking code. If we remove them, how do we handle false positives? A JavaThread first publishes, as its safepoint id, the barrier tag (safepoint counter) that it will call WaitBarrier.wait() with, and only then changes its state to _thread_blocked. The VMThread can therefore ignore such JavaThreads by doing a stable load of the state. A stable load of the thread state is successful if the thread's safepoint id is the same both before and after the load of the state, and the safepoint id is either the current one or InactiveSafepointCounter. If the stable load fails, the thread is considered safepoint unsafe. It is no longer enough that a thread has state _thread_blocked; it must also have the correct safepoint id before and after we read the state. Performance: The result of faster safepoints is that the average CPU time for JavaThreads between safepoints is higher, thus increasing the allocation rate. The thread that stops first waits a shorter time until it gets started. Even the thread that stops last has a shorter stop, since we start the threads faster. If your application is using a concurrent GC it may need re-tuning, since each Java worker thread has an increased CPU time/allocation rate. Often this means max performance is achieved using slightly fewer Java worker threads than before. Also, the increased allocation rate means a shorter time between GC safepoints. - If you are using a non-concurrent GC, you should see improved latency and throughput. - After re-tuning with a concurrent GC, throughput should be equal or better, with better latency. But bear in mind this is a latency patch, not a throughput one. 
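As an illustration of the stable load described under Details, here is a rough sketch in Java. It is only a toy model under stated assumptions, not the HotSpot C++ code from the webrev: the class StableLoadSketch, the ThreadModel type, the State enum and the value chosen for the inactive counter are all made up for this example, and the real check has more cases (for instance around _thread_in_native).

```java
// Toy model of the "stable load" idea; all names here are hypothetical.
public class StableLoadSketch {

    // Stand-in for InactiveSafepointCounter; the real value is an assumption.
    static final long INACTIVE_SAFEPOINT_COUNTER = 0L;

    enum State { IN_JAVA, IN_VM, IN_NATIVE, BLOCKED }

    static final class ThreadModel {
        volatile long safepointId = INACTIVE_SAFEPOINT_COUNTER;
        volatile State state = State.IN_JAVA;

        // Mirrors the ordering in the text: publish the safepoint id first,
        // then change the state to blocked.
        void blockForSafepoint(long currentSafepointId) {
            safepointId = currentSafepointId;
            state = State.BLOCKED;
        }
    }

    // VM-thread side: read id, read state, read id again. The load is stable
    // only if the id did not change and is either current or inactive.
    static boolean isSafe(ThreadModel t, long currentSafepointId) {
        long before = t.safepointId;
        State observed = t.state;
        long after = t.safepointId;

        boolean stable = (before == after)
                && (before == currentSafepointId
                    || before == INACTIVE_SAFEPOINT_COUNTER);
        if (!stable) {
            return false; // a failed stable load means "treat as unsafe"
        }
        return observed == State.BLOCKED || observed == State.IN_NATIVE;
    }

    public static void main(String[] args) {
        ThreadModel t = new ThreadModel();
        System.out.println(isSafe(t, 5L)); // running in Java: unsafe
        t.blockForSafepoint(5L);
        System.out.println(isSafe(t, 5L)); // blocked with the current id: safe
        System.out.println(isSafe(t, 6L)); // blocked with a stale id: unsafe
    }
}
```

The last call shows the false positive the scheme is meant to reject: the state alone says blocked, but the id belongs to an earlier safepoint, so the thread may be about to leave the blocking code.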
With the current code a Java thread is not guaranteed to run between safepoints (in theory a Java thread can be starved indefinitely), since the VM thread may re-grab the Threads_lock before the Java thread has woken up from the previous safepoint. This can happen if the GC/VM doesn't respect MMU (minimum mutator utilization) or if your machine is very over-provisioned. The current scheme thus re-safepoints quickly if the Java threads have not started yet, at the cost of latency. Since the new code uses the WaitBarrier with the safepoint counter, all threads must roll forward to the next safepoint by getting at least some CPU time between two safepoints, meaning MMU violations are more obvious. Some example numbers: - On a 16 strand machine synchronization and un-synchronization/starting is at least 3x faster (in a non-trivial test). Synchronization ~600 -> ~100 us and starting ~400 -> ~100 us. (The semaphore path is a bit slower than futex in the WaitBarrier on Linux.) - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster synchronization time on 16 strands and a ~5% score increase. In this case the GC op is 1 ms, so we reduce the overhead of synchronization from 100% to 10%. - specJBB2015 ParGC: ~9% increase in critical-jOPS. Thanks, Robbin From robbin.ehn at oracle.com Tue Jan 15 10:44:10 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 15 Jan 2019 11:44:10 +0100 Subject: RFR(XL): 8203469: Faster safepoints In-Reply-To: References: Message-ID: Hi again, I forgot to say I have tested this :) I tested the latest version of the patch with t1-4 + Kitchensink24H and Kitchensink24HStress. Earlier versions have been through t1-7, a couple of t1-5 and targeted stress testing. Thanks, Robbin On 2019-01-15 11:39, Robbin Ehn wrote: > Hi all, please review. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8203469 > Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/ > > Thanks to Dan for pre-reviewing a lot! > > Background: > ZGC often does very short safepoint operations. 
For a perspective, in a
> specJBB2015 run, G1 can have young collection stops lasting about 170 ms. While
> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which
> operation it is. The time it takes to stop and start the JavaThreads is relative
> very large to a ZGC safepoint. With an operation that just takes 0.2ms the
> overhead of stopping and starting JavaThreads is several times the operation.
>
> High-level functionality change:
> Serializing the starting over Threads_lock takes time.
> - Don't wait on Threads_lock use the WaitBarrier.
> Serializing the stopping over Safepoint_lock takes time.
> - Let threads stop in parallel, remove Safepoint_lock.
>
> Details:
> JavaThreads have 2 abstract logical states: unsafe or safe.
> - Safe means the JavaThread will not touch Java heap or VM internal structures
>   without doing a transition and block before doing so.
>         - The safe states are:
>                 - When polls armed: _thread_in_native and _thread_blocked.
>                 - When Threads_lock is held: externally suspended flag is set.
>         - VM Thread have polls armed and holds the Threads_lock during a
>           safepoint.
> - Unsafe means that either Java heap or VM internal structures can be accessed
>   by the JavaThread, e.g., _thread_in_Java, _thread_in_vm.
>         - All combination that are not safe are unsafe.
>
> We cannot start a safepoint until all unsafe threads have transitioned to a safe
> state. To make them safe, we arm polls in compiled code and make sure any
> transition to another unsafe state will be blocked. JavaThreads which are unsafe
> with state _thread_in_Java may transition to _thread_in_native without being
> blocked, since it just became a safe thread and we can proceed. Any safe thread
> may try to transition at any time to an unsafe state, thus coming into the
> safepoint blocking code at any moment, e.g., after the safepoint is over, or
> even at the beginning of next safepoint.
>
> The VMThread cannot tolerate false positives from the JavaThread thread state
> because that would mean starting the safepoint without all JavaThreads being
> safe. The two locks (Threads_lock and Safepoint_lock) make sure we never observe
> false positives from the safepoint blocking code, if we remove them, how do we
> handle false positives?
>
> By first publishing which barrier tag (safepoint counter) we will call
> WaitBarrier.wait() with as the threads safepoint id and then change the state to
> _thread_blocked, the VMThread can ignore JavaThreads by doing a stable load of
> the state. A stable load of the thread state is successful if the thread
> safepoint id is the same both before and after the load of the state and
> safepoint id is current or InactiveSafepointCounter. If the stable load fails,
> the thread is considered safepoint unsafe. It's no longer enough that thread is
> have state _thread_blocked it must also have correct safepoint id before and
> after we read the state.
>
> Performance:
> The result of faster safepoints is that the average CPU time for JavaThreads
> between safepoints is higher, thus increasing the allocation rate. The thread
> that stops first waits shorter time until it gets started. Even the thread that
> stops last also have shorter stop since we start them faster. If your
> application is using a concurrent GC it may need re-tunning since each java
> worker thread have an increased CPU time/allocation rate. Often this means max
> performance is achieved using slightly less java worker threads than before.
> Also the increase allocation rate means shorter time between GC safepoints.
> - If you are using a non-concurrent GC, you should see improved latency and
>   throughput.
> - After re-tunning with a concurrent GC throughput should be equal or better but
>   with better latency. But bear in mind this is a latency patch, not a
>   throughput one.
> With current code a java thread is not to guarantee to run between safepoint (in
> theory a java thread can be starved indefinitely), since the VM thread may
> re-grab the Threads_locks before it woke up from previous safepoint. If the
> GC/VM don't respect MMU (minimum mutator utilization) or if your machine is very
> over-provisioned this can happen.
> The current schema thus re-safepoint quickly if the java threads have not
> started yet at the cost of latency. Since the new code uses the WaitBarrier with
> the safepoint counter, all threads must roll forward to next safepoint by
> getting at least some CPU time between two safepoints. Meaning MMU violations
> are more obvious.
>
> Some examples on numbers:
> - On a 16 strand machine synchronization and un-synchronization/starting is at
>   least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and
>   starting ~400->~100us.
>   (Semaphore path is a bit slower than futex in the WaitBarrier on Linux).
> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster
>   synchronization time on 16 strands and ~5% score increase. In this case the GC
>   op is 1ms, so we reduce the overhead of synchronization from 100% to 10%.
> - specJBB2015 ParGC ~9% increase in critical-jops.
>
> Thanks, Robbin

From aph at redhat.com Tue Jan 15 10:48:09 2019
From: aph at redhat.com (Andrew Haley)
Date: Tue, 15 Jan 2019 10:48:09 +0000
Subject: RFR(XL): 8203469: Faster safepoints
In-Reply-To: 
References: 
Message-ID: <0df0ddca-0831-b56c-9263-35d37862c798@redhat.com>

On 1/15/19 10:39 AM, Robbin Ehn wrote:
> - On a 16 strand machine synchronization and un-synchronization/starting is at
> least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and
> starting ~400->~100us.

Thanks. Could you share that benchmark? It'd be interesting to try it on
other architectures.

--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd.
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From robbin.ehn at oracle.com Tue Jan 15 11:33:59 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Tue, 15 Jan 2019 12:33:59 +0100
Subject: RFR(XL): 8203469: Faster safepoints
In-Reply-To: <0df0ddca-0831-b56c-9263-35d37862c798@redhat.com>
References: <0df0ddca-0831-b56c-9263-35d37862c798@redhat.com>
Message-ID: <10fa154b-4899-ab03-f083-9825df85e5d0@oracle.com>

On 2019-01-15 11:48, Andrew Haley wrote:
> On 1/15/19 10:39 AM, Robbin Ehn wrote:
>> - On a 16 strand machine synchronization and un-synchronization/starting is at
>> least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and
>> starting ~400->~100us.
>
> Thanks. Could you share that benchmark? It'd be interesting to try it on
> other architectures.
>

Those numbers are calculated from Linux perf data; I couldn't remember what I ran. But apparently it is also specJVM2008 serial with 16 benchmark threads. The reason for serial is that Aleksey tested a couple of previews with Shenandoah and indicated that serial was a good test. (Aleksey, if you have time please test again.) And from my testing he seems to be correct.

Note that the 16 strands are one package. So we have no logging or time measurement for 'unsynchronization', since it would need some extra synchronization overhead.

Thanks for testing it!
/Robbin From volker.simonis at gmail.com Tue Jan 15 14:07:38 2019 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 15 Jan 2019 15:07:38 +0100 Subject: RFR (S) 8216308: StackTraceElement::fill_in can use injected Class source-file In-Reply-To: <0c2b1d41-8e34-2011-d630-1534bf0823f0@redhat.com> References: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com> <75513caf-5b58-7eed-f7c6-a147facba77a@oracle.com> <0c2b1d41-8e34-2011-d630-1534bf0823f0@redhat.com> Message-ID: On Fri, Jan 11, 2019 at 10:11 AM Aleksey Shipilev wrote: > > On 1/11/19 1:22 AM, David Holmes wrote: > > Hi Aleksey, > > > > On 11/01/2019 1:21 am, Aleksey Shipilev wrote: > >> RFE: > >> https://bugs.openjdk.java.net/browse/JDK-8216308 > >> > >> Fix: > >> http://cr.openjdk.java.net/~shade/8216308/webrev.01/ > >> > >> This is another patch that removes the use of SymbolTable on hot path in stack trace creation. We > >> can inject Class.source_file field to cache the source file name. Some caution is needed to properly > >> handle invalidation when redefinition happens. > > > > I'm struggling a bit with the redefinition logic. IIRC redefinition can only happen at a safepoint > > so if there are concurrent calls to fillInStackTrace that involve a given class Foo, then they must > > all see the same version of Foo, and we can not have the case where one execution of the code is > > clearing the stale cache, while another is setting it to the new value - right? > > Mmm. I *hope* so. But, since we are reading the source_file into local, NULL-checking it, and only > then using it, whatever happens with the class cache should not have immediate effect, and current > (racy) caller would use the non-NULL value even if cache is being concurrently cleared. (There are > silly C/C++ memory model issues that may still expose us to NULL even after NULL-check, e.g. by > re-reading the memory instead of using the local, but that would break lots of other places too, I > think) But this scenario (i.e. 
re-reading a field from memory instead of using the value cached in a local) is not related to the memory model. It is at the compiler's sole discretion and is perfectly legal C/C++. We've faced such issues several times with XLC on AIX. I think the only thing that helps is to declare the field "volatile". This tells the compiler that the value of the field can change at any time, so it won't try to re-read it a second time (and will instead spill its value to the stack if it runs out of registers).

>
> > That said, IIRC Coleen stated that intern can lead to a safepoint, which would then invalidate the
> > existing redefinition logic because we would get the line number after the intern and it may now be
> > incorrect. So I think we have to reorder the code so the get_line_number occurs before the call to
> > intern.
>
> Yeah, looks like it. Well, if that is so, we need to do that move in a separate bug and backport it.
> But I'd like someone more savvy in whole redefinition deal to see what is up. This patch can wait
> that fix, and apply the caching on top.
>
> > I'm also very unclear about how the redefinition case is currently handled. It seems that we will
> > normally intern NULL (and presumably get a NULL or empty-string oop?) unless ShowHiddenFrames is
> > set, in which case we use the unknown_class_name() - regardless of whether the frame is actually
> > hidden or not! This seems broken to me. (Separate bug to fix that is okay if it is indeed broken.)
>
> I *guess* that was the tradeoff for returning the nulls transiently while class was momentarily
> redefined... This patch tried to maintain whatever current behavior there is.
>
> > A couple of comments on comments:
>
> Thanks, these are fixed in-place.
> > -Aleksey > From shade at redhat.com Tue Jan 15 14:42:46 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 15 Jan 2019 15:42:46 +0100 Subject: RFR (S) 8216308: StackTraceElement::fill_in can use injected Class source-file In-Reply-To: References: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com> <75513caf-5b58-7eed-f7c6-a147facba77a@oracle.com> <0c2b1d41-8e34-2011-d630-1534bf0823f0@redhat.com> Message-ID: On 1/15/19 3:07 PM, Volker Simonis wrote: >> Mmm. I *hope* so. But, since we are reading the source_file into local, NULL-checking it, and >> only then using it, whatever happens with the class cache should not have immediate effect, and >> current (racy) caller would use the non-NULL value even if cache is being concurrently cleared. >> (There are silly C/C++ memory model issues that may still expose us to NULL even after >> NULL-check, e.g. by re-reading the memory instead of using the local, but that would break lots >> of other places too, I think) > > But this scenario (i.e. re-reading a field from memory instead of using the value cached in a > locale) is not related to the memory model. That depends on the compilers sole discretion and is > perfectly legal C/C++. Technically, it is related to (deliberately underspecified) memory model, which C/C++ can exploit. > We've faced such issues several time with XLC on AIX. I think the only > thing that helps is to declare the field "volatile". This tells the compiler that the value of > the field can change at any time so he won't try to re-read it a second time (and instead spill > its value to the stack if he runs out of registers). That's a valid concern. However, here we are dealing with *Java* field in Class from the VM code, and so it should definitely abide by Java semantics. If it does not, then javaClasses are broken too (that's what I meant by "break lots of other places"). 
Putting "volatile" over Java field would not help correctness here, because javaClass accesses ignore declaration-site volatility. We can model it with use-site volatility by doing obj_field_acquire, but that does not give us C-style volatile access, I think. Putting "volatile" in C++ is impossible, because there is nowhere to place it. I say if we ever see compilers wreck up javaClasses ordering, we fix it there? In this case, this wreckage would not be catastrophic, and would "only" result in transient "null" where source file would be, under class redefinition race. -Aleksey From volker.simonis at gmail.com Tue Jan 15 15:04:14 2019 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 15 Jan 2019 16:04:14 +0100 Subject: RFR (S) 8216308: StackTraceElement::fill_in can use injected Class source-file In-Reply-To: References: <75e1bbcc-e807-6546-97ff-380e3f9d408f@redhat.com> <75513caf-5b58-7eed-f7c6-a147facba77a@oracle.com> <0c2b1d41-8e34-2011-d630-1534bf0823f0@redhat.com> Message-ID: On Tue, Jan 15, 2019 at 3:42 PM Aleksey Shipilev wrote: > > On 1/15/19 3:07 PM, Volker Simonis wrote: > >> Mmm. I *hope* so. But, since we are reading the source_file into local, NULL-checking it, and > >> only then using it, whatever happens with the class cache should not have immediate effect, and > >> current (racy) caller would use the non-NULL value even if cache is being concurrently cleared. > >> (There are silly C/C++ memory model issues that may still expose us to NULL even after > >> NULL-check, e.g. by re-reading the memory instead of using the local, but that would break lots > >> of other places too, I think) > > > > But this scenario (i.e. re-reading a field from memory instead of using the value cached in a > > locale) is not related to the memory model. That depends on the compilers sole discretion and is > > perfectly legal C/C++. > > Technically, it is related to (deliberately underspecified) memory model, which C/C++ can exploit. 
> > > We've faced such issues several time with XLC on AIX. I think the only > > thing that helps is to declare the field "volatile". This tells the compiler that the value of > > the field can change at any time so he won't try to re-read it a second time (and instead spill > > its value to the stack if he runs out of registers). > > That's a valid concern. However, here we are dealing with *Java* field in Class from the VM code, > and so it should definitely abide by Java semantics. If it does not, then javaClasses are broken too > (that's what I meant by "break lots of other places"). Putting "volatile" over Java field would not > help correctness here, because javaClass accesses ignore declaration-site volatility. We can model > it with use-site volatility by doing obj_field_acquire, but that does not give us C-style volatile > access, I think. Putting "volatile" in C++ is impossible, because there is nowhere to place it. > I see. You're right. > I say if we ever see compilers wreck up javaClasses ordering, we fix it there? In this case, this > wreckage would not be catastrophic, and would "only" result in transient "null" where source file > would be, under class redefinition race. > Agreed! > -Aleksey > From harold.seigel at oracle.com Tue Jan 15 19:38:34 2019 From: harold.seigel at oracle.com (Harold David Seigel) Date: Tue, 15 Jan 2019 14:38:34 -0500 Subject: RFR 8216563: [TESTBUG] Change stressTime to default to 30 for nsk tests (part 2) In-Reply-To: References: <4ad385bf-0eb8-4545-1388-5285e87f9873@oracle.com> Message-ID: Thanks Misha! Harold On 1/15/2019 2:37 PM, mikhailo.seledtsov at oracle.com wrote: > Changes look good to me, > > Misha > > > On 1/14/19 7:34 AM, Harold David Seigel wrote: >> Hi, >> >> Please review this fix to change the default stress time for hotspot >> vmTestbase tests from 60 seconds to 30 seconds.? The fix for >> JDK-8207964 >> intended to do this but was incomplete.? This fix provides the >> additional needed changes. 
>> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8216563/webrev/index.html >> >> JBS Bug:? https://bugs.openjdk.java.net/browse/JDK-8216563 >> >> The fix was tested by running Mach5 hotspot tiers 1-5 on Linux-x64, >> Windows, Solaris, and Mac OS X. >> >> Thanks, Harold >> > From mikhailo.seledtsov at oracle.com Tue Jan 15 19:37:09 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Tue, 15 Jan 2019 11:37:09 -0800 Subject: RFR 8216563: [TESTBUG] Change stressTime to default to 30 for nsk tests (part 2) In-Reply-To: <4ad385bf-0eb8-4545-1388-5285e87f9873@oracle.com> References: <4ad385bf-0eb8-4545-1388-5285e87f9873@oracle.com> Message-ID: Changes look good to me, Misha On 1/14/19 7:34 AM, Harold David Seigel wrote: > Hi, > > Please review this fix to change the default stress time for hotspot > vmTestbase tests from 60 seconds to 30 seconds.? The fix for > JDK-8207964 > intended to do this but was incomplete.? This fix provides the > additional needed changes. > > Open Webrev: > http://cr.openjdk.java.net/~hseigel/bug_8216563/webrev/index.html > > JBS Bug:? https://bugs.openjdk.java.net/browse/JDK-8216563 > > The fix was tested by running Mach5 hotspot tiers 1-5 on Linux-x64, > Windows, Solaris, and Mac OS X. > > Thanks, Harold > From coleen.phillimore at oracle.com Tue Jan 15 19:40:34 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 15 Jan 2019 14:40:34 -0500 Subject: RFR 8216563: [TESTBUG] Change stressTime to default to 30 for nsk tests (part 2) In-Reply-To: References: <4ad385bf-0eb8-4545-1388-5285e87f9873@oracle.com> Message-ID: <084917a2-e6c4-fd24-a96b-96000fe3ee95@oracle.com> +1 Coleen On 1/15/19 2:37 PM, mikhailo.seledtsov at oracle.com wrote: > Changes look good to me, > > Misha > > > On 1/14/19 7:34 AM, Harold David Seigel wrote: >> Hi, >> >> Please review this fix to change the default stress time for hotspot >> vmTestbase tests from 60 seconds to 30 seconds.? 
The fix for >> JDK-8207964 >> intended to do this but was incomplete.? This fix provides the >> additional needed changes. >> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8216563/webrev/index.html >> >> JBS Bug:? https://bugs.openjdk.java.net/browse/JDK-8216563 >> >> The fix was tested by running Mach5 hotspot tiers 1-5 on Linux-x64, >> Windows, Solaris, and Mac OS X. >> >> Thanks, Harold >> > From harold.seigel at oracle.com Tue Jan 15 19:50:16 2019 From: harold.seigel at oracle.com (Harold David Seigel) Date: Tue, 15 Jan 2019 14:50:16 -0500 Subject: RFR 8216563: [TESTBUG] Change stressTime to default to 30 for nsk tests (part 2) In-Reply-To: <084917a2-e6c4-fd24-a96b-96000fe3ee95@oracle.com> References: <4ad385bf-0eb8-4545-1388-5285e87f9873@oracle.com> <084917a2-e6c4-fd24-a96b-96000fe3ee95@oracle.com> Message-ID: <437ba435-4acb-f892-f41a-8e2e4676f289@oracle.com> Thanks Coleen! Harold On 1/15/2019 2:40 PM, coleen.phillimore at oracle.com wrote: > +1 > Coleen > > On 1/15/19 2:37 PM, mikhailo.seledtsov at oracle.com wrote: >> Changes look good to me, >> >> Misha >> >> >> On 1/14/19 7:34 AM, Harold David Seigel wrote: >>> Hi, >>> >>> Please review this fix to change the default stress time for hotspot >>> vmTestbase tests from 60 seconds to 30 seconds.? The fix for >>> JDK-8207964 >>> intended to do this but was incomplete.? This fix provides the >>> additional needed changes. >>> >>> Open Webrev: >>> http://cr.openjdk.java.net/~hseigel/bug_8216563/webrev/index.html >>> >>> JBS Bug:? https://bugs.openjdk.java.net/browse/JDK-8216563 >>> >>> The fix was tested by running Mach5 hotspot tiers 1-5 on Linux-x64, >>> Windows, Solaris, and Mac OS X. 
>>>
>>> Thanks, Harold
>>>
>>
>

From shade at redhat.com Tue Jan 15 20:34:08 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 15 Jan 2019 21:34:08 +0100
Subject: RFR(XL): 8203469: Faster safepoints
In-Reply-To: <10fa154b-4899-ab03-f083-9825df85e5d0@oracle.com>
References: <0df0ddca-0831-b56c-9263-35d37862c798@redhat.com> <10fa154b-4899-ab03-f083-9825df85e5d0@oracle.com>
Message-ID: <59dc07d3-0e80-b168-ce5a-57960f699433@redhat.com>

On 1/15/19 12:33 PM, Robbin Ehn wrote:
> On 2019-01-15 11:48, Andrew Haley wrote:
>> On 1/15/19 10:39 AM, Robbin Ehn wrote:
>>> - On a 16 strand machine synchronization and un-synchronization/starting is at
>>>     least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and
>>>     starting ~400->~100us.
>>
>> Thanks. Could you share that benchmark? It'd be interesting to try it on
>> other architectures.
>>
>
> Those numbers are calculate from linux perf data, I couldn't remember what I run.
> But apparently it is also specJVM2008 serial with benchmark thread 16.
> The reason for serial that is that Aleksey tested a couple of previews with
> Shenandoah and indicated that serial was a good test.
> (Aleksey, if you have time please test again)

Hey. I indeed remember testing the early version of the patch, discovered some Serial throughput hit, and Robbin had fixed that then. Unfortunately, SPECjvm does not run cleanly with current jdk/jdk. We have Serial-like workload driven by JMH. Re-running it with the current jdk/jdk first to get the bearings on possible throughput loss (i7-7820X, 16 threads, -Xms4g -Xmx4g):

  ZGC, baseline:               19779 ± 79 ops/s
  ZGC, fast safepoints:        19858 ± 89 ops/s ; no loss!
  Shenandoah, baseline:        19929 ± 57 ops/s
  Shenandoah, fast safepoints: 19932 ± 49 ops/s ; no loss!

Excellent, both collectors indicate no throughput loss.

Now, there are two ways to measure the safepointing overhead.
First, Shenandoah measures both "net" pause time (VM operation time, basically), and "gross" pause time (which includes both entering and leaving the safepoint, as well as VM operation time). With a longer Serial run to get more GC cycles going:

# Shenandoah, baseline
[info][gc,stats] Total Pauses (G) = 0.47 s (a = 1139 us) (n = 416)
    (lvls, us = 203, 381, 426, 2324, 4705)
[info][gc,stats] Total Pauses (N) = 0.09 s (a = 216 us) (n = 416)
    (lvls, us = 27, 182, 203, 223, 1258)

# Shenandoah, fast safepoints
[info][gc,stats] Total Pauses (G) = 0.21 s (a = 500 us) (n = 414)
    (lvls, us = 191, 398, 418, 443, 8992)
[info][gc,stats] Total Pauses (N) = 0.10 s (a = 229 us) (n = 414)
    (lvls, us = 26, 184, 217, 244, 2327)

Ta-da! Gross pause times dropped more than twofold. Which means that before we spent ~3/4 ms in safepoint infra, and now we only spend ~1/4 ms.

The second way is to parse safepoint logs. JMH has -prof safepoints that does it automatically. (It also prints percentiles and event counts, omitted here for brevity):

Benchmark                           Mode   Cnt      Score     Error  Units

# ZGC, baseline
Serial.test                        thrpt    10  19710.077 ±  96.073  ops/s
Serial.test:·safepoints.pause      thrpt  1260   1310.022               ms
Serial.test:·safepoints.pause.avg  thrpt            1.040               ms
Serial.test:·safepoints.ttsp       thrpt  1260   1053.717               ms
Serial.test:·safepoints.ttsp.avg   thrpt            0.836               ms

# ZGC, fast safepoints
Serial.test                        thrpt    10  19904.687 ±  39.679  ops/s
Serial.test:·safepoints.pause      thrpt  1275    291.520               ms
Serial.test:·safepoints.pause.avg  thrpt            0.229               ms
Serial.test:·safepoints.ttsp       thrpt  1275     92.065               ms
Serial.test:·safepoints.ttsp.avg   thrpt            0.072               ms ; 11.6x lower!

# Shenandoah, baseline
Serial.test                        thrpt    10  19874.508 ±  47.175  ops/s
Serial.test:·safepoints.pause      thrpt   389    425.723               ms
Serial.test:·safepoints.pause.avg  thrpt            1.094               ms
Serial.test:·safepoints.ttsp       thrpt   389    312.413               ms
Serial.test:·safepoints.ttsp.avg   thrpt            0.803               ms

# Shenandoah, fast safepoints
Serial.test                        thrpt    10  19918.094 ±  48.058  ops/s
Serial.test:·safepoints.pause      thrpt   387    159.440               ms
Serial.test:·safepoints.pause.avg  thrpt            0.412               ms
Serial.test:·safepoints.ttsp       thrpt   387     41.941               ms
Serial.test:·safepoints.ttsp.avg   thrpt            0.108               ms ; 7.4x lower!

So, TTSP (time to safepoint) had improved majestically. The rest of "gross" pause time measured by Shenandoah is quite probably the safepoint leaving latency.

Also did spot build checks: the patch builds in fastdebug on aarch64, s390x, x86_32.

-Aleksey

From harold.seigel at oracle.com Tue Jan 15 21:53:12 2019
From: harold.seigel at oracle.com (Harold David Seigel)
Date: Tue, 15 Jan 2019 16:53:12 -0500
Subject: RFR 8216563: [TESTBUG] Change stressTime to default to 30 for nsk tests (part 2)
In-Reply-To: <87f7b762-8a18-e005-7c7c-3c9c0e2e094f@oracle.com>
References: <4ad385bf-0eb8-4545-1388-5285e87f9873@oracle.com> <87f7b762-8a18-e005-7c7c-3c9c0e2e094f@oracle.com>
Message-ID: <1c54fd3b-7063-2c48-9736-f5512fc7ca59@oracle.com>

Thanks David! Some of the different places were just comments.

Harold

On 1/15/2019 4:51 PM, David Holmes wrote:
> Looks good.
>
> Pity that default value appears in so many different places.
>
> Thanks,
> David
>
> On 15/01/2019 1:34 am, Harold David Seigel wrote:
>> Hi,
>>
>> Please review this fix to change the default stress time for hotspot
>> vmTestbase tests from 60 seconds to 30 seconds. The fix for
>> JDK-8207964
>> intended to do this but was incomplete. This fix provides the
>> additional needed changes.
>>
>> Open Webrev:
>> http://cr.openjdk.java.net/~hseigel/bug_8216563/webrev/index.html
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8216563
>>
>> The fix was tested by running Mach5 hotspot tiers 1-5 on Linux-x64,
>> Windows, Solaris, and Mac OS X.
>> >> Thanks, Harold >> From david.holmes at oracle.com Tue Jan 15 21:51:33 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 16 Jan 2019 07:51:33 +1000 Subject: RFR 8216563: [TESTBUG] Change stressTime to default to 30 for nsk tests (part 2) In-Reply-To: <4ad385bf-0eb8-4545-1388-5285e87f9873@oracle.com> References: <4ad385bf-0eb8-4545-1388-5285e87f9873@oracle.com> Message-ID: <87f7b762-8a18-e005-7c7c-3c9c0e2e094f@oracle.com> Looks good. Pity that default value appears in so many different places. Thanks, David On 15/01/2019 1:34 am, Harold David Seigel wrote: > Hi, > > Please review this fix to change the default stress time for hotspot > vmTestbase tests from 60 seconds to 30 seconds.? The fix for JDK-8207964 > intended to do this > but was incomplete.? This fix provides the additional needed changes. > > Open Webrev: > http://cr.openjdk.java.net/~hseigel/bug_8216563/webrev/index.html > > JBS Bug:? https://bugs.openjdk.java.net/browse/JDK-8216563 > > The fix was tested by running Mach5 hotspot tiers 1-5 on Linux-x64, > Windows, Solaris, and Mac OS X. > > Thanks, Harold > From manc at google.com Wed Jan 16 02:41:45 2019 From: manc at google.com (Man Cao) Date: Tue, 15 Jan 2019 18:41:45 -0800 Subject: RFR (M): 8212206: Refactor AdaptiveSizePolicy to separate out code related to GC overhead In-Reply-To: <7e0c775d-86c1-b80c-b1a6-373ca21206ba@oracle.com> References: <6b1e59ec7f4746e8e071fd44ec91ca966fac8d78.camel@oracle.com> <7e0c775d-86c1-b80c-b1a6-373ca21206ba@oracle.com> Message-ID: Hi, I rebased the patch to tip and updated year in some headers to 2019, without making any real change: http://cr.openjdk.java.net/~manc/8212206/webrev.02/ I don't foresee that this will be implemented, or even makes sense, for > ZGC. As I see it, this is only a thing STW collectors. For that reason, > I don't think it belongs in CollectedHeap. Keeping it as a separate > utility class for collectors that want to use it sounds better. 
> Sounds good to keep this patch in the current state, without further changing the CollectedHeap class. I haven't looked very closely at the patch, but couldn't help to notice > that the option is called "GCOverheapLimitThreshold" (and > "AdaptiveSizePolicyGCTimeLimitThreshold" before that), which is a > tautology and a not very good description of what it is. > How about we take the opportunity to clean this up and completely ditch > the "gc_overhead_limit_count" thing and get rid of this option? It's a > "develop" option, so it's not available to normal users anyway. Has > anyone of you ever used this option and actually find it valuable? I didn't find any users inside Google that require changing this option. That said, some users did complain that UseGCOverheadLimit for ParallelGC or CMS is too difficult to get triggered, because of the requirement for 5 consecutive full GCs, which is set by this option. I think if it were a normal "product" option, there will definitely be users setting it. I never understand why it is a "develop" option. I think we could either remove it, or make it an "experimental" option. I'm leaning towards not removing it for now, as I'm not sure if 5 is still a reasonable default value for UseGCOverheadLimit for G1. How about we decide whether to keep or remove this option after JDK-8212084 (UseGCOverheadLimit for G1) is fixed? Also for the hsperfdata counter change, I created https://bugs.openjdk.java.net/browse/JDK-8217221. I will draft a CSR for it later. 
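To make the "5 consecutive full GCs" requirement mentioned above concrete, here is a minimal C++ sketch of a consecutive-count gate. It is illustrative only: OverheadLimitCounter and record_full_gc are hypothetical names, the threshold is simply passed in by the caller, and how a collector decides that a single collection exceeded the overhead limit is not modelled here.

```cpp
// Hypothetical illustration of a "N consecutive full GCs" gate.
// This is not the AdaptiveSizePolicy code under review.
class OverheadLimitCounter {
 public:
  explicit OverheadLimitCounter(unsigned threshold)
      : _threshold(threshold), _count(0) {}

  // Call once per full GC. `limit_exceeded` says whether this collection
  // exceeded the overhead limit (decided elsewhere). Returns true only
  // once the limit has been exceeded for `threshold` collections in a row.
  bool record_full_gc(bool limit_exceeded) {
    if (!limit_exceeded) {
      _count = 0;  // any collection under the limit breaks the streak
      return false;
    }
    if (_count < _threshold) {
      ++_count;
    }
    return _count >= _threshold;
  }

 private:
  unsigned _threshold;
  unsigned _count;
};
```

With a threshold of 5, four bad collections followed by one good one reset the streak, which is the behaviour the users quoted above found hard to trigger.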
-Man

From robbin.ehn at oracle.com Wed Jan 16 09:45:50 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 16 Jan 2019 10:45:50 +0100
Subject: RFR(XL): 8203469: Faster safepoints
In-Reply-To: <59dc07d3-0e80-b168-ce5a-57960f699433@redhat.com>
References: <0df0ddca-0831-b56c-9263-35d37862c798@redhat.com> <10fa154b-4899-ab03-f083-9825df85e5d0@oracle.com> <59dc07d3-0e80-b168-ce5a-57960f699433@redhat.com>
Message-ID: <6a6df770-7cf7-0ec8-f8cc-13366bbe21ce@oracle.com>

Thanks Aleksey! Great that you are seeing such improvement, and thanks for the build test.

I added a new logging line in this patch, e.g.:

[93.734s][info][safepoint,stats] Deoptimize: Synchronization: 4899 Operation: 457865 - Total: 462764 Application: 8270801 (ns)

This might be useful (it is for me); let me know if it needs reformatting, is missing information, or is useless.

We also don't measure the early prolog with:

  Universe::heap()->safepoint_synchronize_begin();

  // By getting the Threads_lock, we assure that no threads are about to start or
  // exit. It is released again in SafepointSynchronize::end().
  Threads_lock->lock();

I had a measurement there, but there was never any time spent there; something to keep in mind at least.

/Robbin

On 2019-01-15 21:34, Aleksey Shipilev wrote:
> On 1/15/19 12:33 PM, Robbin Ehn wrote:
>> On 2019-01-15 11:48, Andrew Haley wrote:
>>> On 1/15/19 10:39 AM, Robbin Ehn wrote:
>>>> - On a 16 strand machine synchronization and un-synchronization/starting is at
>>>>     least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and
>>>>     starting ~400->~100us.
>>>
>>> Thanks. Could you share that benchmark? It'd be interesting to try it on
>>> other architectures.
>>>
>>
>> Those numbers are calculate from linux perf data, I couldn't remember what I run.
>> But apparently it is also specJVM2008 serial with benchmark thread 16.
>> The reason for serial that is that Aleksey tested a couple of previews with >> Shenandoah and indicated that serial was a good test. >> (Aleksey, if you have time please test again) > > Hey. I indeed remember testing the early version of the patch, discovered some Serial throughput > hit, and Robbin had fixed that then. Unfortunately, SPECjvm does not run cleanly with current > jdk/jdk. We have Serial-like workload driven by JMH. Re-running it with the current jdk/jdk first to > get the bearings on possible throughput loss (i7-7820X, 16 threads, -Xms4g -Xmx4g): > > ZGC, baseline: 19779 ? 79 ops/s > ZGC, fast safepoints: 19858 ? 89 ops/s ; no loss! > Shenandoah, baseline: 19929 ? 57 ops/s > Shenandoah, fast safepoints: 19932 ? 49 ops/s ; no loss! > > Excellent, both collectors indicate no throughput loss. > > Now, there are two ways to measure the safepointing overhead. First, Shenandoah measures both "net" > pause time (VM operation time, basically), and "gross" pause time (which includes both entering and > leaving the safepoint, as well as VM operation time). With longer Serial run to get more GC cycles > going: > > # Shenandoah, baseline > [info][gc,stats] Total Pauses (G) = 0.47 s (a = 1139 us) (n = 416) > (lvls, us = 203, 381, 426, 2324, 4705) > [info][gc,stats] Total Pauses (N) = 0.09 s (a = 216 us) (n = 416) > (lvls, us = 27, 182, 203, 223, 1258) > > # Shenandoah, fast safepoints > [info][gc,stats] Total Pauses (G) = 0.21 s (a = 500 us) (n = 414) > (lvls, us = 191, 398, 418, 443, 8992) > [info][gc,stats] Total Pauses (N) = 0.10 s (a = 229 us) (n = 414) > (lvls, us = 26, 184, 217, 244, 2327) > > Ta-da! Gross pause times dropped twice. Which means that before we spent ~3/4 ms in safepoint infra, > and now we only spend ~1/4 ms. > > The second way is to parse safepoint logs. JMH has -prof safepoints that does it automatically. 
(It > also prints percentiles and event counts, omitted here for brevity): > > Benchmark Mode Cnt Score Error Units > > # ZGC, baseline > Serial.test thrpt 10 19710.077 ± 96.073 ops/s > Serial.test:·safepoints.pause thrpt 1260 1310.022 ms > Serial.test:·safepoints.pause.avg thrpt 1.040 ms > Serial.test:·safepoints.ttsp thrpt 1260 1053.717 ms > Serial.test:·safepoints.ttsp.avg thrpt 0.836 ms > > # ZGC, fast safepoints > Serial.test thrpt 10 19904.687 ± 39.679 ops/s > Serial.test:·safepoints.pause thrpt 1275 291.520 ms > Serial.test:·safepoints.pause.avg thrpt 0.229 ms > Serial.test:·safepoints.ttsp thrpt 1275 92.065 ms > Serial.test:·safepoints.ttsp.avg thrpt 0.072 ms ; 11.6x lower! > > # Shenandoah, baseline > Serial.test thrpt 10 19874.508 ± 47.175 ops/s > Serial.test:·safepoints.pause thrpt 389 425.723 ms > Serial.test:·safepoints.pause.avg thrpt 1.094 ms > Serial.test:·safepoints.ttsp thrpt 389 312.413 ms > Serial.test:·safepoints.ttsp.avg thrpt 0.803 ms > > # Shenandoah, fast safepoints > Serial.test thrpt 10 19918.094 ± 48.058 ops/s > Serial.test:·safepoints.pause thrpt 387 159.440 ms > Serial.test:·safepoints.pause.avg thrpt 0.412 ms > Serial.test:·safepoints.ttsp thrpt 387 41.941 ms > Serial.test:·safepoints.ttsp.avg thrpt 0.108 ms ; 7.4x lower! > > So, TTSP (time to safepoint) had improved majestically. The rest of "gross" pause time measured by > Shenandoah is quite probably the safepoint leaving latency. > > Also did a spot build checks: the patch builds in fastdebug on aarch64, s390x, x86_32.
> > -Aleksey > From shade at redhat.com Wed Jan 16 10:31:48 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 16 Jan 2019 11:31:48 +0100 Subject: RFR(XL): 8203469: Faster safepoints In-Reply-To: <6a6df770-7cf7-0ec8-f8cc-13366bbe21ce@oracle.com> References: <0df0ddca-0831-b56c-9263-35d37862c798@redhat.com> <10fa154b-4899-ab03-f083-9825df85e5d0@oracle.com> <59dc07d3-0e80-b168-ce5a-57960f699433@redhat.com> <6a6df770-7cf7-0ec8-f8cc-13366bbe21ce@oracle.com> Message-ID: <5e204a3e-a471-d9c5-aaa4-d2ca25d2f0e4@redhat.com> On 1/16/19 10:45 AM, Robbin Ehn wrote: > I added a new logging line in this patch, e.g. : > [93.734s][info][safepoint,stats] Deoptimize: Synchronization: 4899 Operation: 457865 - Total: 462764 > Application: 8270801 (ns) > > Which might be useful (it is for me), let me know if it's needs reformatting or is missing > information etc.. or useless. Um. I think this is inconsistent with the regular PrintSafepointStatistics safepoint logging that logs "sync", "block", "vm op", and "cleanup". The new message seems to tell "Synchronization" = "sync" + "block", and "Operation" = "vm op" + "cleanup"? I say we call it what it is: [93.734s][info][safepoint,stats] Safepoint "Deoptimize", Time since last: 8270801 ns; Reaching safepoint: 4899 ns; At safepoint: 457865 ns; Total: 462764 ns Also note the temporal ordering: first the timestamp for application stopped time, then time intervals for stopping/stopped, then totals. > We also don't measure the early prolog with: >  380   Universe::heap()->safepoint_synchronize_begin(); >  381 >  382   // By getting the Threads_lock, we assure that no threads are about to start or >  383   // exit. It is released again in SafepointSynchronize::end(). >  384   Threads_lock->lock(); > > I had a measurement there but it was never any time spent there, something to keep in mind at least. Oh. Can we (should we) move the call to RuntimeService::record_safepoint_begin() before these?
I'd imagine CollectedHeap::safepoint_synchronize_begin() might take a while in some implementations, and Threads_lock acquisition can be contended as well. It's fine to make it in a separate issue to avoid tainting this one with artificial "regression". Cheers, -Aleksey From david.griffiths at gmail.com Wed Jan 16 11:08:27 2019 From: david.griffiths at gmail.com (David Griffiths) Date: Wed, 16 Jan 2019 11:08:27 +0000 Subject: getPCDescNearDbg returns incorrect PCDesc Message-ID: Hi, I'd like some help please understanding what appears to be a bug in getPCDescNearDbg. The problem is caused by the fact that the _pc_offset value stored in a PcDesc is actually the offset of the code contained in the _next_ PcDesc rather than the current one. I assume it's done like this so that obtaining stack traces works correctly. At least on x86, the last instruction in a PcDesc chunk is the callq which means that the return address pc points to the next PcDesc. Therefore associating the PcDesc containing the callq with the address of the next PcDesc chunk means that the matching works in getPCDescAt. But this causes "off by one" errors in getPCDescNearDbg which appears to expect the PcDesc getRealPC address to be that of the PcDesc itself rather than the following one. So you sometimes see incorrect top of stack line numbers when debugging. (And this would presumably also affect profilers). I can fix the top of stack issue by changing distance to: long distance = pcDesc.getRealPC(this).minus(pc) - 1 but this then messes up the line numbers further down the stack because they are trying to match against return pcs. Anybody come across this before, is my analysis correct? 
Cheers, David From robbin.ehn at oracle.com Wed Jan 16 11:43:52 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 16 Jan 2019 12:43:52 +0100 Subject: RFR(XL): 8203469: Faster safepoints In-Reply-To: <5e204a3e-a471-d9c5-aaa4-d2ca25d2f0e4@redhat.com> References: <0df0ddca-0831-b56c-9263-35d37862c798@redhat.com> <10fa154b-4899-ab03-f083-9825df85e5d0@oracle.com> <59dc07d3-0e80-b168-ce5a-57960f699433@redhat.com> <6a6df770-7cf7-0ec8-f8cc-13366bbe21ce@oracle.com> <5e204a3e-a471-d9c5-aaa4-d2ca25d2f0e4@redhat.com> Message-ID: <4aba7695-8b32-c223-d666-7736a8405dc4@oracle.com> On 2019-01-16 11:31, Aleksey Shipilev wrote: > I say we call it what it is: > > [93.734s][info][safepoint,stats] Safepoint "Deoptimize", Time since last: 8270801 ns; Reaching > safepoint: 4899 ns; At safepoint: 457865 ns; Total: 462764 ns > > Also note the temporal ordering: first the timestamp for application stopped time, then time > intervals for stopping/stopped, then totals. I'll update to this, thanks. > >> We also don't measure the early prolog with: >>  380   Universe::heap()->safepoint_synchronize_begin(); >>  381 >>  382   // By getting the Threads_lock, we assure that no threads are about to start or >>  383   // exit. It is released again in SafepointSynchronize::end(). >>  384   Threads_lock->lock(); >> >> I had a measurement there but it was never any time spent there, something to keep in mind at least. > > Oh. Can we (should we) move the call to RuntimeService::record_safepoint_begin() before these? I'd > imagine CollectedHeap::safepoint_synchronize_begin() might take a while in some implementations, and > Threads_lock acquisition can be contended as well. It's fine to make it in a separate issue to avoid > tainting this one with artificial "regression".
Yes, https://bugs.openjdk.java.net/browse/JDK-8217244 /Robbin > > Cheers, > -Aleksey > > From martin.doerr at sap.com Wed Jan 16 15:57:59 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 16 Jan 2019 15:57:59 +0000 Subject: getPCDescNearDbg returns incorrect PCDesc In-Reply-To: References: Message-ID: <1ed2e1d3cc6943e3be9bcc9551af5fe3@sap.com> Hi David, re-posting to serviceability mailing list since you're referring to a method from https://docs.oracle.com/javase/jp/8/docs/serviceabilityagent/sun/jvm/hotspot/code/NMethod.html Best regards, Martin -----Original Message----- From: hotspot-dev On Behalf Of David Griffiths Sent: Mittwoch, 16. Januar 2019 12:08 To: hotspot-dev at openjdk.java.net Subject: getPCDescNearDbg returns incorrect PCDesc Hi, I'd like some help please understanding what appears to be a bug in getPCDescNearDbg. The problem is caused by the fact that the _pc_offset value stored in a PcDesc is actually the offset of the code contained in the _next_ PcDesc rather than the current one. I assume it's done like this so that obtaining stack traces works correctly. At least on x86, the last instruction in a PcDesc chunk is the callq which means that the return address pc points to the next PcDesc. Therefore associating the PcDesc containing the callq with the address of the next PcDesc chunk means that the matching works in getPCDescAt. But this causes "off by one" errors in getPCDescNearDbg which appears to expect the PcDesc getRealPC address to be that of the PcDesc itself rather than the following one. So you sometimes see incorrect top of stack line numbers when debugging. (And this would presumably also affect profilers). I can fix the top of stack issue by changing distance to: long distance = pcDesc.getRealPC(this).minus(pc) - 1 but this then messes up the line numbers further down the stack because they are trying to match against return pcs. Anybody come across this before, is my analysis correct? 
Cheers, David From coleen.phillimore at oracle.com Wed Jan 16 16:43:29 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 16 Jan 2019 11:43:29 -0500 Subject: RFR (S) 8216136: Don't take Compile_lock for SystemDictionary::_modification_counter Message-ID: <48a69ecb-1d3c-817e-7e2b-4e55a68a66b8@oracle.com> Summary: make SystemDictionary::modification_counter atomic so not to require Compile_lock. I moved updating the modification counter when the class is defined and added to the hierarchy. I didn't remove the Compile_lock completely because there may be other code currently under the lock that needs it (flush_dependencies). Can someone from the compiler area also review this? Made Compile_lock an always safepointing lock. Tested with mach5 tier1-6. open webrev at http://cr.openjdk.java.net/~coleenp/8216136.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8216136 Thanks, Coleen From jcbeyler at google.com Wed Jan 16 17:29:54 2019 From: jcbeyler at google.com (JC Beyler) Date: Wed, 16 Jan 2019 09:29:54 -0800 Subject: getPCDescNearDbg returns incorrect PCDesc In-Reply-To: <1ed2e1d3cc6943e3be9bcc9551af5fe3@sap.com> References: <1ed2e1d3cc6943e3be9bcc9551af5fe3@sap.com> Message-ID: Hi David, The explanation you are providing is clear to me, though I'm not sure at all what the right fix would be in this case. I would agree that there might be a bug here but it would be easier to see if you could provide an easy reproducer that shows how the initial line is off by 1 and then how it messes up higher in the stack if you try to fix it by your -1. My best guess is that there is a difference between code paths as you are saying and we might have to differentiate top frame and other frames for this calculation but without a reproducer to see it in action, it is hard to tell.
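To make the top-frame versus caller-frame distinction a bit more concrete, here is a minimal C++ sketch of the lookup as I understand David's description. It is not the SA code: PcDescSketch, find_for_frame and the sorted-vector layout are hypothetical stand-ins, and it assumes each descriptor's pc_offset marks the end of the chunk it describes (i.e. the return address after a trailing call).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a PcDesc. Following David's description,
// pc_offset is the offset just past the chunk this descriptor covers,
// i.e. the return address of a call that ends the chunk.
struct PcDescSketch {
    int pc_offset;
    int line;
};

// descs must be sorted by ascending pc_offset.
// Top frame: pc is the next instruction to execute, so it belongs to the
// first descriptor whose pc_offset is strictly greater than pc.
// Caller frame: pc is a return address, so it belongs to the first
// descriptor whose pc_offset is greater than or equal to pc.
inline const PcDescSketch* find_for_frame(const std::vector<PcDescSketch>& descs,
                                          int pc, bool is_top_frame) {
    assert(pc >= 0);  // offsets are relative to the start of the code
    for (std::size_t i = 0; i < descs.size(); ++i) {
        const bool covers = is_top_frame ? (descs[i].pc_offset > pc)
                                         : (descs[i].pc_offset >= pc);
        if (covers) {
            return &descs[i];
        }
    }
    return nullptr;  // pc lies past the last described chunk
}
```

With descriptors ending at offsets 10, 20 and 30, a pc of 10 would resolve to the first descriptor when treated as a return address and to the second when treated as a top-of-stack pc, which is the "off by one" under discussion. Whether the real fix should look like this is exactly what a reproducer would help decide.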
Thanks, Jc On Wed, Jan 16, 2019 at 7:58 AM Doerr, Martin wrote: > Hi David, > > re-posting to serviceability mailing list since you're referring to a > method from > > https://docs.oracle.com/javase/jp/8/docs/serviceabilityagent/sun/jvm/hotspot/code/NMethod.html > > Best regards, > Martin > > > -----Original Message----- > From: hotspot-dev On Behalf Of > David Griffiths > Sent: Mittwoch, 16. Januar 2019 12:08 > To: hotspot-dev at openjdk.java.net > Subject: getPCDescNearDbg returns incorrect PCDesc > > Hi, I'd like some help please understanding what appears to be a bug in > getPCDescNearDbg. The problem is caused by the fact that the _pc_offset > value stored in a PcDesc is actually the offset of the code contained in > the _next_ PcDesc rather than the current one. I assume it's done like this > so that obtaining stack traces works correctly. At least on x86, the last > instruction in a PcDesc chunk is the callq which means that the return > address pc points to the next PcDesc. Therefore associating the PcDesc > containing the callq with the address of the next PcDesc chunk means that > the matching works in getPCDescAt. > > But this causes "off by one" errors in getPCDescNearDbg which appears to > expect the PcDesc getRealPC address to be that of the PcDesc itself rather > than the following one. So you sometimes see incorrect top of stack line > numbers when debugging. (And this would presumably also affect profilers). > > I can fix the top of stack issue by changing distance to: > > long distance = pcDesc.getRealPC(this).minus(pc) - 1 > > but this then messes up the line numbers further down the stack because > they are trying to match against return pcs. > > Anybody come across this before, is my analysis correct? 
> > Cheers, > > David > -- Thanks, Jc From dean.long at oracle.com Thu Jan 17 03:53:49 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Wed, 16 Jan 2019 19:53:49 -0800 Subject: RFR (S) 8216136: Don't take Compile_lock for SystemDictionary::_modification_counter In-Reply-To: <48a69ecb-1d3c-817e-7e2b-4e55a68a66b8@oracle.com> References: <48a69ecb-1d3c-817e-7e2b-4e55a68a66b8@oracle.com> Message-ID: <081db1f4-1a52-ef60-7934-a4314e7c2c80@oracle.com> Hi Coleen. You still can't safely call notice_modification() outside of Compile_lock, (at least not without other changes), so this: - static inline void notice_modification() { assert_locked_or_safepoint(Compile_lock); ++_number_of_modifications; } + static inline void notice_modification() { Atomic::inc(&_number_of_modifications); } should be: static inline void notice_modification() { assert_locked_or_safepoint(Compile_lock); Atomic::inc(&_number_of_modifications); } Are you trying to eventually remove Compile_lock completely? If so, then notice_modification() would have to be called *before* the class hierarchy is changed, not after, and probably other changes would be needed as well. dl On 1/16/19 8:43 AM, coleen.phillimore at oracle.com wrote: > Summary: make SystemDictionary::modification_counter atomic so not to > require Compile_lock. > > I moved updating the modification counter when the class is defined > and added to the hierarchy. I didn't remove the Compile_lock > completely because there may be other code currently under the lock > that needs it (flush_dependencies). Can someone from the compiler > area also review this? > > Made Compile_lock an always safepointing lock. > > Tested with mach5 tier1-6.
> > open webrev at http://cr.openjdk.java.net/~coleenp/8216136.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8216136 > > Thanks, > Coleen From coleen.phillimore at oracle.com Thu Jan 17 12:15:56 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 17 Jan 2019 07:15:56 -0500 Subject: RFR (S) 8216136: Don't take Compile_lock for SystemDictionary::_modification_counter In-Reply-To: <081db1f4-1a52-ef60-7934-a4314e7c2c80@oracle.com> References: <48a69ecb-1d3c-817e-7e2b-4e55a68a66b8@oracle.com> <081db1f4-1a52-ef60-7934-a4314e7c2c80@oracle.com> Message-ID: <182d84fc-4a14-e0c2-c374-a50538525f26@oracle.com> On 1/16/19 10:53 PM, dean.long at oracle.com wrote: > Hi Coleen. You still can't safely call notice_modification() outside > of Compile_lock, (at least not without other changes), so this: > > - static inline void notice_modification() { > assert_locked_or_safepoint(Compile_lock); ++_number_of_modifications; } > + static inline void notice_modification() { > Atomic::inc(&_number_of_modifications); } > > should be: > > static inline void notice_modification() { > assert_locked_or_safepoint(Compile_lock); > Atomic::inc(&_number_of_modifications); } > > > Are you trying to eventually remove Compile_lock completely? If so, > then notice_modification() would have to be called *before* the > class hierarchy is changed, not after, and probably other changes > would be needed as well. Dean, Thank you for looking at this and your comments. No, I'm not trying to remove Compile_lock entirely and I can assert that notice_modification has the Compile_lock as above. The class hierarchy code has been changed to be lock free rather than requiring the Compile_lock, although I think the Compile_lock still protects some of this code. There are also some Compile_lock free ways of getting to dependencies, because putting notice_modification after flush_dependencies caused bugs that I'll ask to you offline about. Thanks for your help.
I was just trying to peel off one place where Compile_lock seemed wrong. Thanks, Coleen > > dl > > > On 1/16/19 8:43 AM, coleen.phillimore at oracle.com wrote: >> Summary: make SystemDictionary::modification_counter atomic so not to >> require Compile_lock. >> >> I moved updating the modification counter when the class is defined >> and added to the hierarchy. I didn't remove the Compile_lock >> completely because there may be other code currently under the lock >> that needs it (flush_dependencies). Can someone from the compiler >> area also review this? >> >> Made Compile_lock an always safepointing lock. >> >> Tested with mach5 tier1-6. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8216136.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8216136 >> >> Thanks, >> Coleen > From shade at redhat.com Thu Jan 17 14:37:58 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 17 Jan 2019 15:37:58 +0100 Subject: RFR (S) 8217315: Proper units should print more significant digits Message-ID: <3fa392c2-5c73-4bbe-0272-accb7105b55e@redhat.com> RFE: https://bugs.openjdk.java.net/browse/JDK-8217315 Fix: http://cr.openjdk.java.net/~shade/8217315/webrev.01/ This bothered me for quite some time. Our current rounding code in "proper units" cuts off to next suffix after "10" units, which makes the values too coarse all of the sudden. With current two-significant-digits code, the rounding error can be as bad as 10% (for example, both 10.0G vs 10.(9)G would round down to 10G). I suggest we allow at least three significant digits, dropping that max error to 1%, and making the logs more precise. The difference in e.g.
Epsilon logs is clearly visible: Now: [25.186s][info][gc] Heap: 100G reserved, 15G (15.09%) committed, 15G (15.00%) used After the patch: [23.315s][info][gc] Heap: 100G reserved, 15449M (15.09%) committed, 15361M (15.00%) used I'd like to push it to jdk/jdk, see if there are higher-tier tests that expect something else, and then backport it to 12u, 11u, 8u. Testing: new gtest, hotspot tier1, jdk-submit Thanks, -Aleksey From thomas.schatzl at oracle.com Thu Jan 17 14:53:17 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 17 Jan 2019 15:53:17 +0100 Subject: RFR (S) 8217315: Proper units should print more significant digits In-Reply-To: <3fa392c2-5c73-4bbe-0272-accb7105b55e@redhat.com> References: <3fa392c2-5c73-4bbe-0272-accb7105b55e@redhat.com> Message-ID: <49444d02bce5f62c69b6667ca39010543db7defa.camel@oracle.com> Hi, On Thu, 2019-01-17 at 15:37 +0100, Aleksey Shipilev wrote: > RFE: > https://bugs.openjdk.java.net/browse/JDK-8217315 > > Fix: > http://cr.openjdk.java.net/~shade/8217315/webrev.01/ looks good to me. Thanks, Thomas From sgehwolf at redhat.com Thu Jan 17 14:57:23 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Thu, 17 Jan 2019 15:57:23 +0100 Subject: [Proposal] Better systemd slice memory limit support for OpenJDK Message-ID: Hi, Current container awareness for OpenJDK seems to work for systemd slices too, on some systems. To be precise, this works for older Kernels e.g. 3.10. However, we've noticed that this breaks for newer Kernel versions[1] such as the one in F28, currently 4.19.14-200. If the container support would also look at hierarchical memory limits exposed via memory.stat in the cgroup file system, it would again work[2]. A proof of concept implementation is here: http://cr.openjdk.java.net/~sgehwolf/webrevs/container-systemd-slice-01/webrev/ This enhancement wouldn't change any existing container memory limit detection as it only kicks in when all other look-ups determined that there is no limit in place. 
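As a rough illustration of that fallback (this is not the code in the webrev), the C++ sketch below parses memory.stat-style text, one "<key> <value>" pair per line, and only consults hierarchical_memory_limit when no direct limit was found. The function names read_stat_value and effective_limit, and the has_direct flag standing in for "all other look-ups found no limit", are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Looks up `key` in cgroup-v1 memory.stat style text ("<key> <value>" per
// line). Returns true and stores the value in *out when the key is found.
inline bool read_stat_value(std::istream& in, const std::string& key,
                            std::uint64_t* out) {
    assert(out != nullptr && !key.empty());
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string name;
        std::uint64_t value = 0;
        if ((fields >> name >> value) && name == key) {
            *out = value;
            return true;
        }
    }
    return false;
}

// Hypothetical fallback order: keep the direct limit when one was detected,
// otherwise try the hierarchical limit from memory.stat.
inline bool effective_limit(bool has_direct, std::uint64_t direct,
                            std::istream& memory_stat, std::uint64_t* out) {
    if (has_direct) {
        *out = direct;
        return true;
    }
    return read_stat_value(memory_stat, "hierarchical_memory_limit", out);
}
```

The same helper could be pointed at hierarchical_memsw_limit for the memory-plus-swap case; deciding which values count as "unlimited" is left out here on purpose.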
I've verified this by running Docker container tests. The idea is to look for hierarchical_memory_limit and hierarchical_memsw_limit lines in the memory.stat file of the cgroup tree. Would it be possible to consider such an enhancement upstream? If so, I'll file a bug and propose it for review. This issue has been originally raised here: https://bugzilla.redhat.com/show_bug.cgi?id=1509371 Thanks, Severin [1] Java process gets killed by oom killer. See: http://cr.openjdk.java.net/~sgehwolf/webrevs/container-systemd-slice-01/before.txt [2] Java process throws OutOfMemoryError as expected. See: http://cr.openjdk.java.net/~sgehwolf/webrevs/container-systemd-slice-01/after.txt From shade at redhat.com Thu Jan 17 15:00:16 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 17 Jan 2019 16:00:16 +0100 Subject: RFR (XS) 8217321: [TESTBUG] utilities/test_globalDefinitions.cpp should use _LP64, not LP64 Message-ID: <133fb5f2-1e5b-af8d-dd45-a61bdfd042d0@redhat.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8217321 Not sure if we should push it to jdk/jdk12, or just to jdk/jdk, and then backport. I am leaning to just jdk/jdk. 
Fix: diff -r 495edb72707a test/hotspot/gtest/utilities/test_globalDefinitions.cpp --- a/test/hotspot/gtest/utilities/test_globalDefinitions.cpp Thu Jan 17 15:25:11 2019 +0100 +++ b/test/hotspot/gtest/utilities/test_globalDefinitions.cpp Thu Jan 17 15:56:06 2019 +0100 @@ -103,7 +103,7 @@ EXPECT_STREQ("M", exact_unit_for_byte_size(M)); EXPECT_STREQ("B", exact_unit_for_byte_size(M + 1)); EXPECT_STREQ("K", exact_unit_for_byte_size(M + K)); -#ifdef LP64 +#ifdef _LP64 EXPECT_STREQ("B", exact_unit_for_byte_size(G - 1)); EXPECT_STREQ("G", exact_unit_for_byte_size(G)); EXPECT_STREQ("B", exact_unit_for_byte_size(G + 1)); @@ -123,7 +123,7 @@ EXPECT_EQ(1u, byte_size_in_exact_unit(M)); EXPECT_EQ(M + 1, byte_size_in_exact_unit(M + 1)); EXPECT_EQ(K + 1, byte_size_in_exact_unit(M + K)); -#ifdef LP64 +#ifdef _LP64 EXPECT_EQ(G - 1, byte_size_in_exact_unit(G - 1)); EXPECT_EQ(1u, byte_size_in_exact_unit(G)); EXPECT_EQ(G + 1, byte_size_in_exact_unit(G + 1)); Testing: gtest on Linux x86_64, jdk-submit (running) Thanks, -Aleksey From shade at redhat.com Thu Jan 17 15:05:15 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 17 Jan 2019 16:05:15 +0100 Subject: RFR (S) 8217315: Proper units should print more significant digits In-Reply-To: <49444d02bce5f62c69b6667ca39010543db7defa.camel@oracle.com> References: <3fa392c2-5c73-4bbe-0272-accb7105b55e@redhat.com> <49444d02bce5f62c69b6667ca39010543db7defa.camel@oracle.com> Message-ID: <2d85f752-40e5-6977-0c91-62dcb354b726@redhat.com> On 1/17/19 3:53 PM, Thomas Schatzl wrote: > On Thu, 2019-01-17 at 15:37 +0100, Aleksey Shipilev wrote: >> RFE: >> https://bugs.openjdk.java.net/browse/JDK-8217315 >> >> Fix: >> http://cr.openjdk.java.net/~shade/8217315/webrev.01/ > > looks good to me. Thanks Thomas. I stared at the patch a bit and realized there is a test bug, which was hidden by another existing test bug. 
Fixed here by: diff -r f552d25a803f test/hotspot/gtest/utilities/test_globalDefinitions.cpp --- a/test/hotspot/gtest/utilities/test_globalDefinitions.cpp Thu Jan 17 16:02:30 2019 +0100 +++ b/test/hotspot/gtest/utilities/test_globalDefinitions.cpp Thu Jan 17 16:02:50 2019 +0100 @@ -125,12 +125,12 @@ EXPECT_STREQ("K", proper_unit_for_byte_size(50*M)); -#ifdef LP64 - EXPECT_EQ(1024u, byte_size_in_proper_unit(G - 1)); +#ifdef _LP64 + EXPECT_EQ(1023u, byte_size_in_proper_unit(G - 1)); EXPECT_STREQ("M", proper_unit_for_byte_size(G - 1)); New webrev: http://cr.openjdk.java.net/~shade/8217315/webrev.02/ The fix for original test is here: https://bugs.openjdk.java.net/browse/JDK-8217321 Thanks, -Aleksey From thomas.stuefe at gmail.com Thu Jan 17 15:08:23 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 17 Jan 2019 16:08:23 +0100 Subject: RFR (S) 8217315: Proper units should print more significant digits In-Reply-To: <3fa392c2-5c73-4bbe-0272-accb7105b55e@redhat.com> References: <3fa392c2-5c73-4bbe-0272-accb7105b55e@redhat.com> Message-ID: Looks good to me too. A better solution may be printing as float, see e.g. "print_human_readable_size" in metaspaceCommon.cpp . But that is outside the scope of your change. As for the test, note that you could include all sizes up to the last one (50G) in 32bit too. ..Thomas On Thu, Jan 17, 2019 at 3:39 PM Aleksey Shipilev wrote: > RFE: > https://bugs.openjdk.java.net/browse/JDK-8217315 > > Fix: > http://cr.openjdk.java.net/~shade/8217315/webrev.01/ > > This bothered me for quite some time. Our current rounding code in "proper > units" cuts off to next > suffix after "10" units, which makes the values too coarse all of the > sudden. With current > two-significant-digits code, the rounding error can be as bad as 10% (for > example, both 10.0G vs > 10.(9)G would round down to 10G). I suggest we allow at least three > significant digits, dropping > that max error to 1%, and making the logs more precise. 
The difference in > e.g. Epsilon logs is > clearly visible: > > Now: > [25.186s][info][gc] Heap: 100G reserved, 15G (15.09%) committed, 15G > (15.00%) used > > After the patch: > [23.315s][info][gc] Heap: 100G reserved, 15449M (15.09%) committed, > 15361M (15.00%) used > > I'd like to push it to jdk/jdk, see if there are higher-tier tests that > expect something else, and > then backport it to 12u, 11u, 8u. > > Testing: new gtest, hotspot tier1, jdk-submit > > Thanks, > -Aleksey > > From thomas.schatzl at oracle.com Thu Jan 17 15:19:58 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 17 Jan 2019 16:19:58 +0100 Subject: RFR (S) 8217315: Proper units should print more significant digits In-Reply-To: <2d85f752-40e5-6977-0c91-62dcb354b726@redhat.com> References: <3fa392c2-5c73-4bbe-0272-accb7105b55e@redhat.com> <49444d02bce5f62c69b6667ca39010543db7defa.camel@oracle.com> <2d85f752-40e5-6977-0c91-62dcb354b726@redhat.com> Message-ID: <0d149246975fe12e2365e851e12acdc5716cb041.camel@oracle.com> Hi, On Thu, 2019-01-17 at 16:05 +0100, Aleksey Shipilev wrote: > On 1/17/19 3:53 PM, Thomas Schatzl wrote: > > On Thu, 2019-01-17 at 15:37 +0100, Aleksey Shipilev wrote: > > > RFE: > > > https://bugs.openjdk.java.net/browse/JDK-8217315 > > > > > > Fix: > > > http://cr.openjdk.java.net/~shade/8217315/webrev.01/ > > > > looks good to me. > > Thanks Thomas. I stared at the patch a bit and realized there is a > test bug, which was hidden by > another existing test bug. 
Fixed here by: > > diff -r f552d25a803f > test/hotspot/gtest/utilities/test_globalDefinitions.cpp > --- a/test/hotspot/gtest/utilities/test_globalDefinitions.cpp Thu > Jan 17 16:02:30 2019 +0100 > +++ b/test/hotspot/gtest/utilities/test_globalDefinitions.cpp Thu > Jan 17 16:02:50 2019 +0100 > @@ -125,12 +125,12 @@ > EXPECT_STREQ("K", proper_unit_for_byte_size(50*M)); > > -#ifdef LP64 > - EXPECT_EQ(1024u, byte_size_in_proper_unit(G - 1)); > +#ifdef _LP64 > + EXPECT_EQ(1023u, byte_size_in_proper_unit(G - 1)); > EXPECT_STREQ("M", proper_unit_for_byte_size(G - 1)); > > New webrev: > http://cr.openjdk.java.net/~shade/8217315/webrev.02/ > > The fix for original test is here: > https://bugs.openjdk.java.net/browse/JDK-8217321 > > Thanks, > -Aleksey > still looks good. Thanks, Thomas From thomas.schatzl at oracle.com Thu Jan 17 15:24:05 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 17 Jan 2019 16:24:05 +0100 Subject: RFR (XS) 8217321: [TESTBUG] utilities/test_globalDefinitions.cpp should use _LP64, not LP64 In-Reply-To: <133fb5f2-1e5b-af8d-dd45-a61bdfd042d0@redhat.com> References: <133fb5f2-1e5b-af8d-dd45-a61bdfd042d0@redhat.com> Message-ID: <28c13cd766e362ef77e7acddd307f71651cda114.camel@oracle.com> Hi, On Thu, 2019-01-17 at 16:00 +0100, Aleksey Shipilev wrote: > Bug: > https://bugs.openjdk.java.net/browse/JDK-8217321 > > Not sure if we should push it to jdk/jdk12, or just to jdk/jdk, and > then backport. I am leaning to just jdk/jdk. since this is a test fix it is fine to push to jdk/jdk12. Either is fine with me. Looks good. Thanks, Thomas From shade at redhat.com Thu Jan 17 15:23:33 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 17 Jan 2019 16:23:33 +0100 Subject: RFR (S) 8217315: Proper units should print more significant digits In-Reply-To: References: <3fa392c2-5c73-4bbe-0272-accb7105b55e@redhat.com> Message-ID: On 1/17/19 4:08 PM, Thomas Stüfe wrote: > Looks good to me too. Thanks!
> A better solution may be printing as float, see e.g. "print_human_readable_size" > in metaspaceCommon.cpp . But that is outside the scope of your change. Yes. > As for the test, note that you could include all sizes up to the last one (50G) in 32bit too. I could, but I would like not to, and follow what other tests in the same file do with "G". "G" is very close to SIZE_MAX on 32/31 bit platforms, it seems risky to expose it too much. -Aleksey From thomas.stuefe at gmail.com Thu Jan 17 15:42:08 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 17 Jan 2019 16:42:08 +0100 Subject: RFR (S) 8217315: Proper units should print more significant digits In-Reply-To: References: <3fa392c2-5c73-4bbe-0272-accb7105b55e@redhat.com> Message-ID: Latest iteration is still fine. ..Thomas On Thu, Jan 17, 2019 at 4:23 PM Aleksey Shipilev wrote: > On 1/17/19 4:08 PM, Thomas Stüfe wrote: > > Looks good to me too. > > Thanks! > > > A better solution may be printing as float, see e.g. > "print_human_readable_size" > > in metaspaceCommon.cpp . But that is outside the scope of your change. > > Yes. > > > As for the test, note that you could include all sizes up to the last > one (50G) in 32bit too. > > I could, but I would like not to, and follow what other tests in the same > file do with "G". "G" is > very close to SIZE_MAX on 32/31 bit platforms, it seems risky to expose it > too much. > > -Aleksey > > From coleen.phillimore at oracle.com Thu Jan 17 15:58:07 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 17 Jan 2019 10:58:07 -0500 Subject: RFR (XS) 8217321: [TESTBUG] utilities/test_globalDefinitions.cpp should use _LP64, not LP64 In-Reply-To: <28c13cd766e362ef77e7acddd307f71651cda114.camel@oracle.com> References: <133fb5f2-1e5b-af8d-dd45-a61bdfd042d0@redhat.com> <28c13cd766e362ef77e7acddd307f71651cda114.camel@oracle.com> Message-ID: <4a6a6b6e-ae02-a92b-dd78-4433f796f994@oracle.com> Yes, looks good and trivial.
It doesn't fail in jdk12 though, does it? It just doesn't run those tests? If so, just push to jdk 13. It doesn't seem like it needs to be fixed in 12 at this point. Thanks, Coleen On 1/17/19 10:24 AM, Thomas Schatzl wrote: > Hi, > > On Thu, 2019-01-17 at 16:00 +0100, Aleksey Shipilev wrote: >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8217321 >> >> Not sure if we should push it to jdk/jdk12, or just to jdk/jdk, and >> then backport. I am leaning to just jdk/jdk. > since this is a test fix it is fine to push to jdk/jdk12. Either is > fine with me. > > Looks good. > > Thanks, > Thomas > > From bob.vandette at oracle.com Thu Jan 17 15:59:43 2019 From: bob.vandette at oracle.com (Bob Vandette) Date: Thu, 17 Jan 2019 10:59:43 -0500 Subject: [Proposal] Better systemd slice memory limit support for OpenJDK In-Reply-To: References: Message-ID: <4F03B58D-FD8A-4ED1-B14A-C3769C573647@oracle.com> I checked a few systems I have access to and they all have use_hierarchy enabled. When I set a memory limit the memory.stat hierarchical_memory_limit is identical to the memory.limit_in_bytes contents. Do you have any idea why the kernel behavior changed? Is this documented behavior? I wouldn't want to add a work-around for a transient kernel bug. Bob. > On Jan 17, 2019, at 9:57 AM, Severin Gehwolf wrote: > > Hi, > > Current container awareness for OpenJDK seems to work for systemd > slices too, on some systems. To be precise, this works for older > Kernels e.g. 3.10. However, we've noticed that this breaks for newer > Kernel versions[1] such as the one in F28, currently 4.19.14-200. If > the container support would also look at hierarchical memory limits > exposed via memory.stat in the cgroup file system, it would again > work[2].
A proof of concept implementation is here: > > http://cr.openjdk.java.net/~sgehwolf/webrevs/container-systemd-slice-01/webrev/ > > This enhancement wouldn't change any existing container memory limit > detection as it only kicks in when all other look-ups determined that > there is no limit in place. I've verified this by running Docker > container tests. The idea is to look for hierarchical_memory_limit and > hierarchical_memsw_limit lines in the memory.stat file of the cgroup > tree. > > Would it be possible to consider such an enhancement upstream? If so, > I'll file a bug and propose it for review. > > This issue has been originally raised here: > https://bugzilla.redhat.com/show_bug.cgi?id=1509371 > > Thanks, > Severin > > [1] Java process gets killed by oom killer. > See: http://cr.openjdk.java.net/~sgehwolf/webrevs/container-systemd-slice-01/before.txt > [2] Java process throws OutOfMemoryError as expected. > See: http://cr.openjdk.java.net/~sgehwolf/webrevs/container-systemd-slice-01/after.txt > From shade at redhat.com Thu Jan 17 16:13:29 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 17 Jan 2019 17:13:29 +0100 Subject: RFR (XS) 8217321: [TESTBUG] utilities/test_globalDefinitions.cpp should use _LP64, not LP64 In-Reply-To: <4a6a6b6e-ae02-a92b-dd78-4433f796f994@oracle.com> References: <133fb5f2-1e5b-af8d-dd45-a61bdfd042d0@redhat.com> <28c13cd766e362ef77e7acddd307f71651cda114.camel@oracle.com> <4a6a6b6e-ae02-a92b-dd78-4433f796f994@oracle.com> Message-ID: On 1/17/19 4:58 PM, coleen.phillimore at oracle.com wrote: > Yes, looks good and trivial. Thanks. > It doesn't fail in jdk12 though, does it? It just doesn't run those > tests? If so, just push to jdk 13. It doesn't seem like it needs to be fixed in 12 at this point. In both jdk/jdk and jdk/jdk12, it does not fail the tests themselves, it just does not run them. Indeed, there is no rush. I am going to push it to jdk/jdk.
-Aleksey From sgehwolf at redhat.com Thu Jan 17 16:16:24 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Thu, 17 Jan 2019 17:16:24 +0100 Subject: [Proposal] Better systemd slice memory limit support for OpenJDK In-Reply-To: <4F03B58D-FD8A-4ED1-B14A-C3769C573647@oracle.com> References: <4F03B58D-FD8A-4ED1-B14A-C3769C573647@oracle.com> Message-ID: Hi Bob, On Thu, 2019-01-17 at 10:59 -0500, Bob Vandette wrote: > I checked a few systems I have access to and they all have use_hierarchy enabled. When I set a memory > limit the memory.stat hierarchical_memory_limit is identical to the memory.limit_in_bytes contents. This seems to suggest they're running on older kernels. I know F29, F28 are affected. > Do you have any idea why the kernel behavior changed? No, not really. It could be with the "unified control group hierarchy" work done in 3.16, but that's speculation: https://lwn.net/Articles/601840/ Some more discussion with systemd/kernel folk is here: https://bugzilla.redhat.com/show_bug.cgi?id=1599387 > Is this documented behavior? Yes. Though, in the self-proclaimed very outdated document: https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt Section 6, "Hierarchy support". > I wouldn't want to add a work-around for a transient kernel bug. FWIW, cgroup-v1 is a legacy interface, so I doubt anything in that code has a great chance of getting "fixed". So far there is no sign of evidence what the expected behaviour should be. Thanks, Severin > Bob. > > > > On Jan 17, 2019, at 9:57 AM, Severin Gehwolf wrote: > > > > Hi, > > > > Current container awareness for OpenJDK seems to work for systemd > > slices too, on some systems. To be precise, this works for older > > Kernels e.g. 3.10. However, we've noticed that this breaks for newer > > Kernel versions[1] such as the one in F28, currently 4.19.14-200. If > > the container support would also look at hierarchical memory limits > > exposed via memory.stat in the cgroup file system, it would again > > work[2].
A proof of concept implementation is here: > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/container-systemd-slice-01/webrev/ > > > > This enhancement wouldn't change any existing container memory limit > > detection as it only kicks in when all other look-ups determined that > > there is no limit in place. I've verified this by running Docker > > container tests. The idea is to look for hierarchical_memory_limit and > > hierarchical_memsw_limit lines in the memory.stat file of the cgroup > > tree. > > > > Would it be possible to consider such an enhancement upstream? If so, > > I'll file a bug and propose it for review. > > > > This issue has been originally raised here: > > https://bugzilla.redhat.com/show_bug.cgi?id=1509371 > > > > Thanks, > > Severin > > > > [1] Java process gets killed by oom killer. > > See: http://cr.openjdk.java.net/~sgehwolf/webrevs/container-systemd-slice-01/before.txt > > [2] Java process throws OutOfMemoryError as expected. > > See: http://cr.openjdk.java.net/~sgehwolf/webrevs/container-systemd-slice-01/after.txt > > > > From bob.vandette at oracle.com Thu Jan 17 18:11:10 2019 From: bob.vandette at oracle.com (Bob Vandette) Date: Thu, 17 Jan 2019 13:11:10 -0500 Subject: [Proposal] Better systemd slice memory limit support for OpenJDK In-Reply-To: References: <4F03B58D-FD8A-4ED1-B14A-C3769C573647@oracle.com> Message-ID: <9D22CF42-A701-48E2-9B27-3D862F187B6C@oracle.com> Do you happen to know if this issue impacts Ubuntu 19.04? Looking at our official OS support list, we're only supporting Ubuntu 18.04 with JDK 12 which is using kernel version 4.15. Even if this gets bumped to 18.10, this is still only using 4.18. Go ahead and file a bug while we look into this issue a bit more. I'd like to know the exact kernel cutoff and which distros are impacted. If it's wide-spread on distros getting released this year, we should fix it in JDK 13. Bob.
> On Jan 17, 2019, at 11:16 AM, Severin Gehwolf wrote: > > Hi Bob, > > On Thu, 2019-01-17 at 10:59 -0500, Bob Vandette wrote: >> I checked a few systems I have access to and they all have use_hierarchy enabled. When I set a memory >> limit the memory.stat hierarchical_memory_limit is identical to the memory.limit_in_bytes contents. > > This seems to suggest they're running on older kernels. I know F29, F28 > are affected. > >> Do you have any idea why the kernel behavior changed? > > No, not really. It could be with the "unified control group hierarchy" > work done in 3.16, but that's speculation: > https://lwn.net/Articles/601840/ > > Some more discussion with systemd/kernel folk is here: > https://bugzilla.redhat.com/show_bug.cgi?id=1599387 > >> Is this documented behavior? > > Yes. Though, in the self-proclaimed very outdated document: > https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt > > Section 6, "Hierarchy support". > >> I wouldn't want to add a work-around for a transient kernel bug. > > FWIW, cgroup-v1 is a legacy interface, so I doubt anything in that code > has a great chance of getting "fixed". So far there is no sign of > evidence what the expected behaviour should be. > > Thanks, > Severin > >> Bob. >> >> >>> On Jan 17, 2019, at 9:57 AM, Severin Gehwolf wrote: >>> >>> Hi, >>> >>> Current container awareness for OpenJDK seems to work for systemd >>> slices too, on some systems. To be precise, this works for older >>> Kernels e.g. 3.10. However, we've noticed that this breaks for newer >>> Kernel versions[1] such as the one in F28, currently 4.19.14-200. If >>> the container support would also look at hierarchical memory limits >>> exposed via memory.stat in the cgroup file system, it would again >>> work[2].
A proof of concept implementation is here: >>> >>> http://cr.openjdk.java.net/~sgehwolf/webrevs/container-systemd-slice-01/webrev/ >>> >>> This enhancement wouldn't change any existing container memory limit >>> detection as it only kicks in when all other look-ups determined that >>> there is no limit in place. I've verified this by running Docker >>> container tests. The idea is to look for hierarchical_memory_limit and >>> hierarchical_memsw_limit lines in the memory.stat file of the cgroup >>> tree. >>> >>> Would it be possible to consider such an enhancement upstream? If so, >>> I'll file a bug and propose it for review. >>> >>> This issue has been originally raised here: >>> https://bugzilla.redhat.com/show_bug.cgi?id=1509371 >>> >>> Thanks, >>> Severin >>> >>> [1] Java process gets killed by oom killer. >>> See: http://cr.openjdk.java.net/~sgehwolf/webrevs/container-systemd-slice-01/before.txt >>> [2] Java process throws OutOfMemoryError as expected. >>> See: http://cr.openjdk.java.net/~sgehwolf/webrevs/container-systemd-slice-01/after.txt >>> >> >> > From sgehwolf at redhat.com Thu Jan 17 19:24:13 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Thu, 17 Jan 2019 20:24:13 +0100 Subject: [Proposal] Better systemd slice memory limit support for OpenJDK In-Reply-To: <9D22CF42-A701-48E2-9B27-3D862F187B6C@oracle.com> References: <4F03B58D-FD8A-4ED1-B14A-C3769C573647@oracle.com> <9D22CF42-A701-48E2-9B27-3D862F187B6C@oracle.com> Message-ID: <7c1eb1cc2345ecf35dc3a5ddc4f69546f1d33df4.camel@redhat.com> On Thu, 2019-01-17 at 13:11 -0500, Bob Vandette wrote: > Do you happen to know if this issue impacts Ubuntu 19.04? I don't, sorry. > Looking at our official OS support list, we're only supporting Ubuntu 18.04 with JDK 12 which > is using kernel version 4.15. Even if this gets bumped to 18.10, this is still only using 4.18. I can confirm Ubuntu 18.04 doesn't seem to be affected with Kernel 4.15. RHEL-8 Beta, on the other hand, has 4.18 which is affected.
I'm also trying Ubuntu 18.10 now, which, in theory, should be affected. > Go ahead and file a bug while we look into this issue a bit more. I'd like to know the exact > kernel cutoff and which distros are impacted. If it's wide-spread on distros getting released > this year, we should fix it in JDK 13. OK. Here you go: https://bugs.openjdk.java.net/browse/JDK-8217338 Thanks, Severin > Bob. > > > > On Jan 17, 2019, at 11:16 AM, Severin Gehwolf > > wrote: > > > > Hi Bob, > > > > On Thu, 2019-01-17 at 10:59 -0500, Bob Vandette wrote: > > > I checked a few systems I have access to and they all have > > > use_hierarchy enabled. When I set a memory > > > limit the memory.stat hierarchical_memory_limit is identical to > > > the memory.limit_in_bytes contents. > > > > This seems to suggest they're running on older kernels. I know F29, > > F28 > > are affected. > > > > > Do you have any idea why the kernel behavior changed? > > > > No, not really. It could be with the "unified control group > > hierarchy" > > work done in 3.16, but that's speculation: > > https://lwn.net/Articles/601840/ > > > > Some more discussion with systemd/kernel folk is here: > > https://bugzilla.redhat.com/show_bug.cgi?id=1599387 > > > > > Is this documented behavior? > > > > Yes. Though, in the self-proclaimed very outdated document: > > https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt > > > > Section 6, "Hierarchy support". > > > > > I wouldn't want to add a work-around for a transient kernel bug. > > > > FWIW, cgroup-v1 is a legacy interface, so I doubt anything in that > > code > > has a great chance of getting "fixed". So far there is no sign of > > evidence what the expected behaviour should be. > > > > Thanks, > > Severin > > > > > Bob.
> > > > > > > > > > On Jan 17, 2019, at 9:57 AM, Severin Gehwolf < > > > > sgehwolf at redhat.com> wrote: > > > > > > > > Hi, > > > > > > > > Current container awareness for OpenJDK seems to work for > > > > systemd > > > > slices too, on some systems. To be precise, this works for > > > > older > > > > Kernels e.g. 3.10. However, we've noticed that this breaks for > > > > newer > > > > Kernel versions[1] such as the one in F28, currently 4.19.14- > > > > 200. If > > > > the container support would also look at hierarchical memory > > > > limits > > > > exposed via memory.stat in the cgroup file system, it would > > > > again > > > > work[2]. A proof of concept implementation is here: > > > > > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/container-systemd-slice-01/webrev/ > > > > > > > > This enhancement wouldn't change any existing container memory > > > > limit > > > > detection as it only kicks in when all other look-ups > > > > determined that > > > > there is no limit in place. I've verified this by running > > > > Docker > > > > container tests. The idea is to look for > > > > hierarchical_memory_limit and > > > > hierarchical_memsw_limit lines in the memory.stat file of the > > > > cgroup > > > > tree. > > > > > > > > Would it be possible to consider such an enhancement upstream? > > > > If so, > > > > I'll file a bug and propose it for review. > > > > > > > > This issue has been originally raised here: > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1509371 > > > > > > > > Thanks, > > > > Severin > > > > > > > > [1] Java process gets killed by oom killer. > > > > See: > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/container-systemd-slice-01/before.txt > > > > [2] Java process throws OutOfMemoryError as expected. 
> > > > See: > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/container-systemd-slice-01/after.txt > > > > > > > > > > > > From david.holmes at oracle.com Thu Jan 17 22:12:43 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 18 Jan 2019 08:12:43 +1000 Subject: RFR (XS) 8217321: [TESTBUG] utilities/test_globalDefinitions.cpp should use _LP64, not LP64 In-Reply-To: <133fb5f2-1e5b-af8d-dd45-a61bdfd042d0@redhat.com> References: <133fb5f2-1e5b-af8d-dd45-a61bdfd042d0@redhat.com> Message-ID: Hi Aleksey, When we build hotspot we set LP64 (whereas _LP64 presumably comes from the compiler). Shouldn't the gtests get compiled with the same flags as the product code? Thanks, David On 18/01/2019 1:00 am, Aleksey Shipilev wrote: > Bug: > https://bugs.openjdk.java.net/browse/JDK-8217321 > > Not sure if we should push it to jdk/jdk12, or just to jdk/jdk, and then backport. I am leaning to > just jdk/jdk. > > Fix: > > diff -r 495edb72707a test/hotspot/gtest/utilities/test_globalDefinitions.cpp > --- a/test/hotspot/gtest/utilities/test_globalDefinitions.cpp Thu Jan 17 15:25:11 2019 +0100 > +++ b/test/hotspot/gtest/utilities/test_globalDefinitions.cpp Thu Jan 17 15:56:06 2019 +0100 > @@ -103,7 +103,7 @@ > EXPECT_STREQ("M", exact_unit_for_byte_size(M)); > EXPECT_STREQ("B", exact_unit_for_byte_size(M + 1)); > EXPECT_STREQ("K", exact_unit_for_byte_size(M + K)); > -#ifdef LP64 > +#ifdef _LP64 > EXPECT_STREQ("B", exact_unit_for_byte_size(G - 1)); > EXPECT_STREQ("G", exact_unit_for_byte_size(G)); > EXPECT_STREQ("B", exact_unit_for_byte_size(G + 1)); > @@ -123,7 +123,7 @@ > EXPECT_EQ(1u, byte_size_in_exact_unit(M)); > EXPECT_EQ(M + 1, byte_size_in_exact_unit(M + 1)); > EXPECT_EQ(K + 1, byte_size_in_exact_unit(M + K)); > -#ifdef LP64 > +#ifdef _LP64 > EXPECT_EQ(G - 1, byte_size_in_exact_unit(G - 1)); > EXPECT_EQ(1u, byte_size_in_exact_unit(G)); > EXPECT_EQ(G + 1, byte_size_in_exact_unit(G + 1)); > > Testing: gtest on Linux x86_64, jdk-submit (running) > > Thanks, > -Aleksey 
> From shade at redhat.com Thu Jan 17 23:20:12 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 18 Jan 2019 00:20:12 +0100 Subject: RFR (XS) 8217321: [TESTBUG] utilities/test_globalDefinitions.cpp should use _LP64, not LP64 In-Reply-To: References: <133fb5f2-1e5b-af8d-dd45-a61bdfd042d0@redhat.com> Message-ID: <8f55602b-7831-b952-62d1-0d36f9731fb8@redhat.com> On 1/17/19 11:12 PM, David Holmes wrote: > When we build hotspot we set LP64 (whereas _LP64 presumably comes from the compiler). Shouldn't the > gtests get compiled with the same flags as the product code? Yes, I think so, and that is what happens here? Everywhere else in Hotspot we use "#ifdef _LP64", and this patch makes use of the same in gtests as well. I already pushed the patch under triviality rule , and this is current jdk/jdk: $ ack "ifdef LP64" src/hotspot/ | wc -l 0 $ ack "ifdef _LP64" src/hotspot/ | wc -l 523 $ ack "ifdef LP64" test/hotspot/ | wc -l 0 $ ack "ifdef _LP64" test/hotspot/ | wc -l 3 Anyhow, if you revert the check to old (broken) thing, then gtest would not run the block on my Linux x86_64, while it should. This is the same head-scratcher I had when debugging the test in JDK-8217315. Look: $ vi ... :wq $ hg diff diff -r 61b6e5a0b321 test/hotspot/gtest/utilities/test_globalDefinitions.cpp --- a/test/hotspot/gtest/utilities/test_globalDefinitions.cpp Thu Jan 17 20:35:43 2019 +0100 +++ b/test/hotspot/gtest/utilities/test_globalDefinitions.cpp Fri Jan 18 00:17:16 2019 +0100 @@ -124,6 +124,7 @@ EXPECT_EQ(M + 1, byte_size_in_exact_unit(M + 1)); EXPECT_EQ(K + 1, byte_size_in_exact_unit(M + K)); #ifdef _LP64 + ShouldNotReachHere(); EXPECT_EQ(G - 1, byte_size_in_exact_unit(G - 1)); EXPECT_EQ(1u, byte_size_in_exact_unit(G)); EXPECT_EQ(G + 1, byte_size_in_exact_unit(G + 1)); $ CONF=linux-x86_64-server-fastdebug make images run-test TEST=gtest:globalDefinitions $ vi ... 
:wq $ hg diff diff -r 61b6e5a0b321 test/hotspot/gtest/utilities/test_globalDefinitions.cpp --- a/test/hotspot/gtest/utilities/test_globalDefinitions.cpp Thu Jan 17 20:35:43 2019 +0100 +++ b/test/hotspot/gtest/utilities/test_globalDefinitions.cpp Fri Jan 18 00:16:05 2019 +0100 @@ -123,7 +123,8 @@ EXPECT_EQ(1u, byte_size_in_exact_unit(M)); EXPECT_EQ(M + 1, byte_size_in_exact_unit(M + 1)); EXPECT_EQ(K + 1, byte_size_in_exact_unit(M + K)); -#ifdef _LP64 +#ifdef LP64 + ShouldNotReachHere(); EXPECT_EQ(G - 1, byte_size_in_exact_unit(G - 1)); EXPECT_EQ(1u, byte_size_in_exact_unit(G)); EXPECT_EQ(G + 1, byte_size_in_exact_unit(G + 1)); $ CONF=linux-x86_64-server-fastdebug make images run-test TEST=gtest:globalDefinitions <--- Say what. -Aleksey From david.holmes at oracle.com Thu Jan 17 23:42:50 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 18 Jan 2019 09:42:50 +1000 Subject: RFR (XS) 8217321: [TESTBUG] utilities/test_globalDefinitions.cpp should use _LP64, not LP64 In-Reply-To: <8f55602b-7831-b952-62d1-0d36f9731fb8@redhat.com> References: <133fb5f2-1e5b-af8d-dd45-a61bdfd042d0@redhat.com> <8f55602b-7831-b952-62d1-0d36f9731fb8@redhat.com> Message-ID: <59186248-0e5b-2ee5-35e1-f8a6c20d65f6@oracle.com> On 18/01/2019 9:20 am, Aleksey Shipilev wrote: > On 1/17/19 11:12 PM, David Holmes wrote: >> When we build hotspot we set LP64 (whereas _LP64 presumably comes from the compiler). Shouldn't the >> gtests get compiled with the same flags as the product code? > > Yes, I think so, and that is what happens here? Everywhere else in Hotspot we use "#ifdef _LP64", > and this patch makes use of the same in gtests as well. I've no issue with the change, but if the same flags are used then they tests should have been executed anyway. If they weren't then the same flags are not being used and perhaps they should? 
> I already pushed the patch under triviality rule , and this is current jdk/jdk: > > $ ack "ifdef LP64" src/hotspot/ | wc -l > 0 Try ifndef LP64 - there are a few of those in ./cpu/x86/x86.ad Cheers, David ----- > $ ack "ifdef _LP64" src/hotspot/ | wc -l > 523 > > $ ack "ifdef LP64" test/hotspot/ | wc -l > 0 > > $ ack "ifdef _LP64" test/hotspot/ | wc -l > 3 > > Anyhow, if you revert the check to old (broken) thing, then gtest would not run the block on my > Linux x86_64, while it should. This is the same head-scratcher I had when debugging the test in > JDK-8217315. Look: > > $ vi ... > :wq > > $ hg diff > diff -r 61b6e5a0b321 test/hotspot/gtest/utilities/test_globalDefinitions.cpp > --- a/test/hotspot/gtest/utilities/test_globalDefinitions.cpp Thu Jan 17 20:35:43 2019 +0100 > +++ b/test/hotspot/gtest/utilities/test_globalDefinitions.cpp Fri Jan 18 00:17:16 2019 +0100 > @@ -124,6 +124,7 @@ > EXPECT_EQ(M + 1, byte_size_in_exact_unit(M + 1)); > EXPECT_EQ(K + 1, byte_size_in_exact_unit(M + K)); > #ifdef _LP64 > + ShouldNotReachHere(); > EXPECT_EQ(G - 1, byte_size_in_exact_unit(G - 1)); > EXPECT_EQ(1u, byte_size_in_exact_unit(G)); > EXPECT_EQ(G + 1, byte_size_in_exact_unit(G + 1)); > > $ CONF=linux-x86_64-server-fastdebug make images run-test TEST=gtest:globalDefinitions > > > $ vi ... 
> :wq
>
> $ hg diff
> diff -r 61b6e5a0b321 test/hotspot/gtest/utilities/test_globalDefinitions.cpp
> --- a/test/hotspot/gtest/utilities/test_globalDefinitions.cpp Thu Jan 17 20:35:43 2019 +0100
> +++ b/test/hotspot/gtest/utilities/test_globalDefinitions.cpp Fri Jan 18 00:16:05 2019 +0100
> @@ -123,7 +123,8 @@
> EXPECT_EQ(1u, byte_size_in_exact_unit(M));
> EXPECT_EQ(M + 1, byte_size_in_exact_unit(M + 1));
> EXPECT_EQ(K + 1, byte_size_in_exact_unit(M + K));
> -#ifdef _LP64
> +#ifdef LP64
> + ShouldNotReachHere();
> EXPECT_EQ(G - 1, byte_size_in_exact_unit(G - 1));
> EXPECT_EQ(1u, byte_size_in_exact_unit(G));
> EXPECT_EQ(G + 1, byte_size_in_exact_unit(G + 1));
>
> $ CONF=linux-x86_64-server-fastdebug make images run-test TEST=gtest:globalDefinitions
> <--- Say what.
>
> -Aleksey
>

From patricio.chilano.mateo at oracle.com Fri Jan 18 01:00:55 2019
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Thu, 17 Jan 2019 20:00:55 -0500
Subject: RFR(XL): 8203469: Faster safepoints
In-Reply-To:
References:
Message-ID:

Hi Robbin,

Nice work! Some minor comments:

1) In method SafepointSynchronize::decrement_waiting_to_block(), what's the argument post_if_needed for?

2) In method SafepointSynchronize::try_stable_load_state(), why do you have to load the safepoint_id again after loading the thread state? The safepoint_id could change right after you read it the second time so it doesn't seem to be a question of correctness. Since the state is only returned back to the caller when you read a sid equal to InactiveSafepointCounter or to the safepoint_count, couldn't you do something more simple like:

  uint64_t sid = thread->safepoint_state()->get_safepoint_id();  // Load acquire
  if (sid == InactiveSafepointCounter || sid == safepoint_count) {
    *state = thread->thread_state();
    return true;
  }
  return false;

Or in other words, which problematic scenario are you covering by reading it twice as it is now?
3) In method SafepointSynchronize::end, it seems the if-else conditional based on "SafepointMechanism::uses_thread_local_poll()" is executing almost the same code in both cases, except for two asserts in the "else" branch which seem to apply to the "if" one too, a storestore barrier against a full fence(is it needed?) and the actual disarm_local_poll(current) for the "if" case which maybe could be replaced by a if(_disarm_local_poll_needed) disarm_local_poll(current) statement. (I see that it is like that too in the current safepoint code though). Also that whole if-else conditional is inside a {} block which was needed because of "MutexLocker mu(Safepoint_lock);" but not anymore.

Thanks!
Patricio

On 1/15/19 5:39 AM, Robbin Ehn wrote:
> Hi all, please review.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469
> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/
>
> Thanks to Dan for pre-reviewing a lot!
>
> Background:
> ZGC often does very short safepoint operations. For a perspective, in a
> specJBB2015 run, G1 can have young collection stops lasting about 170
> ms. While
> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which
> operation it is. The time it takes to stop and start the JavaThreads
> is relative
> very large to a ZGC safepoint. With an operation that just takes 0.2ms
> the
> overhead of stopping and starting JavaThreads is several times the
> operation.
>
> High-level functionality change:
> Serializing the starting over Threads_lock takes time.
> - Don't wait on Threads_lock use the WaitBarrier.
> Serializing the stopping over Safepoint_lock takes time.
> - Let threads stop in parallel, remove Safepoint_lock.
>
> Details:
> JavaThreads have 2 abstract logical states: unsafe or safe.
> - Safe means the JavaThread will not touch Java heap or VM internal
> structures
>   without doing a transition and block before doing so.
>         - The safe states are:
>                 - When polls armed: _thread_in_native and
> _thread_blocked.
>                 - When Threads_lock is held: externally suspended flag
> is set.
>         - VM Thread have polls armed and holds the Threads_lock during a
>           safepoint.
> - Unsafe means that either Java heap or VM internal structures can be
> accessed
>   by the JavaThread, e.g., _thread_in_Java, _thread_in_vm.
>         - All combination that are not safe are unsafe.
>
> We cannot start a safepoint until all unsafe threads have transitioned
> to a safe
> state. To make them safe, we arm polls in compiled code and make sure any
> transition to another unsafe state will be blocked. JavaThreads which
> are unsafe
> with state _thread_in_Java may transition to _thread_in_native without
> being
> blocked, since it just became a safe thread and we can proceed. Any
> safe thread
> may try to transition at any time to an unsafe state, thus coming into
> the
> safepoint blocking code at any moment, e.g., after the safepoint is
> over, or
> even at the beginning of next safepoint.
>
> The VMThread cannot tolerate false positives from the JavaThread
> thread state
> because that would mean starting the safepoint without all JavaThreads
> being
> safe. The two locks (Threads_lock and Safepoint_lock) make sure we
> never observe
> false positives from the safepoint blocking code, if we remove them,
> how do we
> handle false positives?
>
> By first publishing which barrier tag (safepoint counter) we will call
> WaitBarrier.wait() with as the threads safepoint id and then change
> the state to
> _thread_blocked, the VMThread can ignore JavaThreads by doing a stable
> load of
> the state. A stable load of the thread state is successful if the thread
> safepoint id is the same both before and after the load of the state and
> safepoint id is current or InactiveSafepointCounter. If the stable
> load fails,
> the thread is considered safepoint unsafe.
> It's no longer enough that
> thread is
> have state _thread_blocked it must also have correct safepoint id
> before and
> after we read the state.
>
> Performance:
> The result of faster safepoints is that the average CPU time for
> JavaThreads
> between safepoints is higher, thus increasing the allocation rate. The
> thread
> that stops first waits shorter time until it gets started. Even the
> thread that
> stops last also have shorter stop since we start them faster. If your
> application is using a concurrent GC it may need re-tunning since each
> java
> worker thread have an increased CPU time/allocation rate. Often this
> means max
> performance is achieved using slightly less java worker threads than
> before.
> Also the increase allocation rate means shorter time between GC
> safepoints.
> - If you are using a non-concurrent GC, you should see improved
> latency and
>   throughput.
> - After re-tunning with a concurrent GC throughput should be equal or
> better but
>   with better latency. But bear in mind this is a latency patch, not a
>   throughput one.
> With current code a java thread is not to guarantee to run between
> safepoint (in
> theory a java thread can be starved indefinitely), since the VM thread
> may
> re-grab the Threads_locks before it woke up from previous safepoint.
> If the
> GC/VM don't respect MMU (minimum mutator utilization) or if your
> machine is very
> over-provisioned this can happen.
> The current schema thus re-safepoint quickly if the java threads have not
> started yet at the cost of latency. Since the new code uses the
> WaitBarrier with
> the safepoint counter, all threads must roll forward to next safepoint by
> getting at least some CPU time between two safepoints. Meaning MMU
> violations
> are more obvious.
>
> Some examples on numbers:
> - On a 16 strand machine synchronization and
> un-synchronization/starting is at
>   least 3x faster (in non-trivial test). Synchronization ~600 ->
> ~100us and
>   starting ~400->~100us.
>   (Semaphore path is a bit slower than futex in the WaitBarrier on
> Linux).
> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster
>   synchronization time on 16 strands and ~5% score increase. In this
> case the GC
>   op is 1ms, so we reduce the overhead of synchronization from 100% to
> 10%.
> - specJBB2015 ParGC ~9% increase in critical-jops.
>
> Thanks, Robbin

From vladimir.x.ivanov at oracle.com Fri Jan 18 01:39:05 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 17 Jan 2019 17:39:05 -0800
Subject: [13] RFR (S): 8217358: Optimized build is broken by Shenandoah changes
Message-ID: <77ba7a2b-63c0-cbed-ac9f-e4637c28f4ee@oracle.com>

http://cr.openjdk.java.net/~vlivanov/8217358/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8217358

There's an inconsistency in how assert helpers are declared in Shenandoah code: PRODUCT_RETURN is used in headers, but implementations are guarded by ASSERT.

Proposed fix uses NOT_DEBUG_RETURN instead.

Best regards,
Vladimir Ivanov

From shade at redhat.com Fri Jan 18 08:48:02 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 18 Jan 2019 09:48:02 +0100
Subject: [13] RFR (S): 8217358: Optimized build is broken by Shenandoah changes
In-Reply-To: <77ba7a2b-63c0-cbed-ac9f-e4637c28f4ee@oracle.com>
References: <77ba7a2b-63c0-cbed-ac9f-e4637c28f4ee@oracle.com>
Message-ID: <20f538bd-6d91-d0c7-31ec-a06dd80049fd@redhat.com>

On 1/18/19 2:39 AM, Vladimir Ivanov wrote:
> http://cr.openjdk.java.net/~vlivanov/8217358/webrev.00/

This looks good and trivial, thanks for fixing. I verified Shenandoah builds in {fastdebug,release,optimized} with this patch.

Aside: does anyone use "optimized", and for what?
-Aleksey From shade at redhat.com Fri Jan 18 09:31:55 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 18 Jan 2019 10:31:55 +0100 Subject: RFR (XS) 8217321: [TESTBUG] utilities/test_globalDefinitions.cpp should use _LP64, not LP64 In-Reply-To: <59186248-0e5b-2ee5-35e1-f8a6c20d65f6@oracle.com> References: <133fb5f2-1e5b-af8d-dd45-a61bdfd042d0@redhat.com> <8f55602b-7831-b952-62d1-0d36f9731fb8@redhat.com> <59186248-0e5b-2ee5-35e1-f8a6c20d65f6@oracle.com> Message-ID: <2cbb8285-a634-2018-e54e-4ee4a7a87211@redhat.com> On 1/18/19 12:42 AM, David Holmes wrote: > On 18/01/2019 9:20 am, Aleksey Shipilev wrote: >> On 1/17/19 11:12 PM, David Holmes wrote: >>> When we build hotspot we set LP64 (whereas _LP64 presumably comes from the compiler). Shouldn't the >>> gtests get compiled with the same flags as the product code? >> >> Yes, I think so, and that is what happens here? Everywhere else in Hotspot we use "#ifdef _LP64", >> and this patch makes use of the same in gtests as well. > > I've no issue with the change, but if the same flags are used then they tests should have been > executed anyway. If they weren't then the same flags are not being used and perhaps they should? I think the basic premise "LP64 is set" is incorrect. _LP64 is defined, not LP64, in both hotspot and gtest compilation on Linux x86_64. I don't see the problem: flags are consistent between hotspot and gtest, and gtests cannot just use LP64 on their own. Look: $ hg diff -U 2 diff -r e1da82072c79 src/hotspot/share/opto/library_call.cpp --- a/src/hotspot/share/opto/library_call.cpp Fri Jan 18 09:04:09 2019 +0100 +++ b/src/hotspot/share/opto/library_call.cpp Fri Jan 18 10:22:01 2019 +0100 @@ -341,4 +341,11 @@ //---------------------------make_vm_intrinsic---------------------------- CallGenerator* Compile::make_vm_intrinsic(ciMethod* m, bool is_virtual) { +#ifdef LP64 + Non-compilable cruft. +#endif +#ifndef _LP64 + Non-compilable cruft. 
+#endif + vmIntrinsics::ID id = m->intrinsic_id(); assert(id != vmIntrinsics::_none, "must be a VM intrinsic"); diff -r e1da82072c79 test/hotspot/gtest/utilities/test_globalDefinitions.cpp --- a/test/hotspot/gtest/utilities/test_globalDefinitions.cpp Fri Jan 18 09:04:09 2019 +0100 +++ b/test/hotspot/gtest/utilities/test_globalDefinitions.cpp Fri Jan 18 10:22:01 2019 +0100 @@ -95,4 +95,11 @@ TEST(globalDefinitions, exact_unit_for_byte_size) { +#ifdef LP64 + Non-compilable cruft. +#endif +#ifndef _LP64 + Non-compilable cruft. +#endif + EXPECT_STREQ("B", exact_unit_for_byte_size(0)); EXPECT_STREQ("B", exact_unit_for_byte_size(1)); $ CONF=linux-x86_64-server-fastdebug make clean images $ CONF=linux-x86_64-server-fastdebug make clean run-test TEST=gtest:globalDefinitions >> $ ack "ifdef LP64" src/hotspot/ | wc -l >> 0 > > Try ifndef LP64 - there are a few of those in ./cpu/x86/x86.ad Oh, .ad is compiled with ADLC, which is yet another build config. I think the use of "ifndef LP64" there is incorrect, and it should use _LP64 as well. In fact, the rest of x86.ad uses _LP64, and that LP64 looks like a recent addition. And that code even fails to compile when ifndef is fixed! See: https://bugs.openjdk.java.net/browse/JDK-8217371 -Aleksey From sgehwolf at redhat.com Fri Jan 18 09:41:59 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Fri, 18 Jan 2019 10:41:59 +0100 Subject: [13] RFR (S): 8217358: Optimized build is broken by Shenandoah changes In-Reply-To: <20f538bd-6d91-d0c7-31ec-a06dd80049fd@redhat.com> References: <77ba7a2b-63c0-cbed-ac9f-e4637c28f4ee@oracle.com> <20f538bd-6d91-d0c7-31ec-a06dd80049fd@redhat.com> Message-ID: <7ef74f3bff98d1b73141ebbc5ace01abb2a4378f.camel@redhat.com> On Fri, 2019-01-18 at 09:48 +0100, Aleksey Shipilev wrote: > Aside: does anyone use "optimized", and for what? +1. It's the first I've heard of it. How does one even build that config? 
Thanks, Severin From shade at redhat.com Fri Jan 18 11:53:14 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 18 Jan 2019 12:53:14 +0100 Subject: RFR (S) 8217378: UseCriticalCMSThreadPriority is broken Message-ID: <44e2ae70-4023-8e20-06a7-e387cb667f7c@redhat.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8217378 Fix: http://cr.openjdk.java.net/~shade/8217378/webrev.01/ While this concerns the experimental CMS option, the fix itself also unblocks Shenandoah RFE (JDK-8217343). There might be a larger discussion if CriticalPriority thingie is useful, but I would first push the simple fix that makes it work, at least to have the up-to-date performance data. It should make the most benefit on Solaris where critical prio can map to FX priorities, but non-Solaris builds can also enjoy the (control) GC threads priority elevated to MaxPriority, above the VMThread. Testing: new test, hotspot tier1, jdk-submit (running) Thanks, -Aleksey From rkennke at redhat.com Fri Jan 18 12:41:27 2019 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 18 Jan 2019 13:41:27 +0100 Subject: RFR (S) 8217378: UseCriticalCMSThreadPriority is broken In-Reply-To: <44e2ae70-4023-8e20-06a7-e387cb667f7c@redhat.com> References: <44e2ae70-4023-8e20-06a7-e387cb667f7c@redhat.com> Message-ID: <088d6858-2adb-b385-2bb4-a5ad1dc67263@redhat.com> Looks good to me. Thanks, Roman > Bug: > https://bugs.openjdk.java.net/browse/JDK-8217378 > > Fix: > http://cr.openjdk.java.net/~shade/8217378/webrev.01/ > > While this concerns the experimental CMS option, the fix itself also unblocks Shenandoah RFE > (JDK-8217343). There might be a larger discussion if CriticalPriority thingie is useful, but I would > first push the simple fix that makes it work, at least to have the up-to-date performance data. 
It > should make the most benefit on Solaris where critical prio can map to FX priorities, but > non-Solaris builds can also enjoy the (control) GC threads priority elevated to MaxPriority, above > the VMThread. > > Testing: new test, hotspot tier1, jdk-submit (running) > > Thanks, > -Aleksey > From robbin.ehn at oracle.com Fri Jan 18 13:45:45 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 18 Jan 2019 14:45:45 +0100 Subject: RFR(XL): 8203469: Faster safepoints In-Reply-To: References: Message-ID: <89ad7b46-899d-54c5-04ba-2e8ab589ffe6@oracle.com> Hi Patricio, On 1/18/19 2:00 AM, Patricio Chilano wrote: > Hi Robbin, > > Nice work! Some minor comments: > > 1) In method SafepointSynchronize::decrement_waiting_to_block(), what's the > argument post_if_needed for? Left-over, removed. > > 2) In method SafepointSynchronize::try_stable_load_state(), why do you have to > load the safepoint_id again after loading the thread state? The safepoint_id > could change right after you read it the second time so it doesn't seem to be a > question of correctness. Since the state is only returned back to the caller > when you read a sid equal to InactiveSafepointCounter or to the safepoint_count, > couldn't you do something simpler like: > >   uint64_t sid = thread->safepoint_state()->get_safepoint_id();  // Load acquire > >   if (sid == InactiveSafepointCounter || sid == safepoint_count) { >     *state = thread->thread_state(); >     return true; >   } >   return false; > > Or in other words, which problematic scenario are you covering by reading it > twice as it is now? The WaitBarrier is armed for the current safepoint id; if the thread is in the correct safepoint, the safepoint id loaded in SS::block() is the current one. It then cannot change, since the WaitBarrier is armed for that id.
To separate threads blocked in SS::block() from other blocked threads (we do not have a safepoint check when leaving SS::block()), the Java thread must never publish _thread_blocked with a zero thread safepoint id in the SS::block() code. So we set the safepoint id before going to blocked, and leave blocked before resetting (zeroing) the thread safepoint id. We could do it the other way around, but that would just create another type of false positive and we would still need to do a 'stable' load.

Normally the stores are seen as:
- JavaThread has non-blocked state + safepoint id 0.
- Store thread safepoint id (next).
- Store thread state (blocked).
-> waitbarrier

Meaning we read them in reverse order:
- Load state
- Load safepoint id

Since a new safepoint can be started directly after _wait_barrier->wait(), we can see a thread leaving the previous safepoint, in which case the stores are seen as:
- Leaving previous waitbarrier.
- JavaThread has blocked state + previous safepoint id.
- Store thread state (non-blocked).
- Store thread safepoint id (0).
- Store thread safepoint id (next).
- Store thread state (blocked). (here it is safe)
-> waitbarrier

Thus the loads can observe this as:
- Leaving previous waitbarrier.
- JavaThread has blocked state + previous safepoint id. <--- Load state: blocked
- Store thread state (non-blocked).
- Store thread safepoint id (0). <--- Load thread safepoint id: 0
- Store thread safepoint id (next).
- Store thread state (blocked). (here it is safe)
-> waitbarrier

This is a false positive: the result is blocked with safepoint id 0, which is not good. By loading the thread safepoint id both before and after the state we can notice this. We would thus load:
- Load safepoint id => previous safepoint id
- Load state => blocked
- Load safepoint id => previous safepoint id / 0 / next safepoint id

The stable load says that not only must the two ids be the same, they must also be 0 or _current_. Now we can say this thread is still unsafe!
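The stable load described above can be sketched in standalone C++ roughly as follows. This is only an illustration under stated assumptions: `MockJavaThread`, `ThreadState`, `kInactiveSafepointId` and the free function `try_stable_load_state` are hypothetical stand-ins built on `std::atomic`. They are not the types or signatures from the webrev, and the memory ordering HotSpot actually needs may differ.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; not HotSpot's real types.
enum class ThreadState { InJava, InVM, InNative, Blocked };

// Assumed sentinel meaning "not inside any safepoint" (the mail's id 0).
constexpr std::uint64_t kInactiveSafepointId = 0;

struct MockJavaThread {
  std::atomic<std::uint64_t> safepoint_id{kInactiveSafepointId};
  std::atomic<ThreadState> state{ThreadState::InJava};
};

// Stable load: read the safepoint id, then the state, then the id again.
// The state is trusted only if both id reads agree AND the id is either
// inactive or the current safepoint. Otherwise the caller must treat the
// thread as still unsafe.
bool try_stable_load_state(const MockJavaThread& thread,
                           std::uint64_t current_id,
                           ThreadState* out_state) {
  assert(current_id != kInactiveSafepointId);
  const std::uint64_t before = thread.safepoint_id.load(std::memory_order_acquire);
  const ThreadState observed = thread.state.load(std::memory_order_acquire);
  const std::uint64_t after = thread.safepoint_id.load(std::memory_order_acquire);

  if (before != after) {
    return false;  // id changed while we were reading the state
  }
  if (before != kInactiveSafepointId && before != current_id) {
    return false;  // stale id: thread is still leaving an earlier safepoint
  }
  *out_state = observed;
  return true;
}
```

With this shape, the false positive from the interleaving above (state read as blocked while the id is being zeroed) is rejected because the two id reads differ, and a thread still carrying the previous safepoint id is rejected by the second check.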
> > 3) In method SafepointSynchronize::end, it seems the if-else conditional based > on "SafepointMechanism::uses_thread_local_poll()" is executing almost the same > code in both cases, except for two asserts in the "else" branch which seem to > apply to the "if" one too, a storestore barrier against a full fence(is it > needed?) and the actual disarm_local_poll(current) for the "if" case which maybe > could be replaced by a if(_disarm_local_poll_needed) disarm_local_poll(current) > statement. (I see that it is like that too in the current safepoint code though). > Also that whole if-else conditional is inside a {} block which was needed > because of "MutexLocker mu(Safepoint_lock);" but not anymore. Re-factored this to a disarm method. I'll post v01 to initial RFR mail, just need some more testing. Thanks, Robbin > > > Thanks! > Patricio > > On 1/15/19 5:39 AM, Robbin Ehn wrote: >> Hi all, please review. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469 >> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/ >> >> Thanks to Dan for pre-reviewing a lot! >> >> Background: >> ZGC often does very short safepoint operations. For a perspective, in a >> specJBB2015 run, G1 can have young collection stops lasting about 170 ms. While >> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which >> operation it is. The time it takes to stop and start the JavaThreads is relative >> very large to a ZGC safepoint. With an operation that just takes 0.2ms the >> overhead of stopping and starting JavaThreads is several times the operation. >> >> High-level functionality change: >> Serializing the starting over Threads_lock takes time. >> - Don't wait on Threads_lock use the WaitBarrier. >> Serializing the stopping over Safepoint_lock takes time. >> - Let threads stop in parallel, remove Safepoint_lock. >> >> Details: >> JavaThreads have 2 abstract logical states: unsafe or safe. 
>> - Safe means the JavaThread will not touch Java heap or VM internal structures >>   without doing a transition and block before doing so. >>         - The safe states are: >>                 - When polls armed: _thread_in_native and _thread_blocked. >>                 - When Threads_lock is held: externally suspended flag is set. >>         - The VM Thread has polls armed and holds the Threads_lock during a >>           safepoint. >> - Unsafe means that either Java heap or VM internal structures can be accessed >>   by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. >>         - All combinations that are not safe are unsafe. >> >> We cannot start a safepoint until all unsafe threads have transitioned to a safe >> state. To make them safe, we arm polls in compiled code and make sure any >> transition to another unsafe state will be blocked. JavaThreads which are unsafe >> with state _thread_in_Java may transition to _thread_in_native without being >> blocked, since it just became a safe thread and we can proceed. Any safe thread >> may try to transition at any time to an unsafe state, thus coming into the >> safepoint blocking code at any moment, e.g., after the safepoint is over, or >> even at the beginning of next safepoint. >> >> The VMThread cannot tolerate false positives from the JavaThread thread state >> because that would mean starting the safepoint without all JavaThreads being >> safe. The two locks (Threads_lock and Safepoint_lock) make sure we never observe >> false positives from the safepoint blocking code. If we remove them, how do we >> handle false positives? >> >> By first publishing which barrier tag (safepoint counter) we will call >> WaitBarrier.wait() with as the thread's safepoint id and then change the state to >> _thread_blocked, the VMThread can ignore JavaThreads by doing a stable load of >> the state.
A stable load of the thread state is successful if the thread >> safepoint id is the same both before and after the load of the state and >> safepoint id is current or InactiveSafepointCounter. If the stable load fails, >> the thread is considered safepoint unsafe. It's no longer enough that the thread >> has state _thread_blocked; it must also have the correct safepoint id before and >> after we read the state. >> >> Performance: >> The result of faster safepoints is that the average CPU time for JavaThreads >> between safepoints is higher, thus increasing the allocation rate. The thread >> that stops first waits a shorter time until it gets started. Even the thread that >> stops last also has a shorter stop since we start them faster. If your >> application is using a concurrent GC it may need re-tuning since each java >> worker thread has an increased CPU time/allocation rate. Often this means max >> performance is achieved using slightly fewer java worker threads than before. >> Also the increased allocation rate means shorter time between GC safepoints. >> - If you are using a non-concurrent GC, you should see improved latency and >>   throughput. >> - After re-tuning with a concurrent GC throughput should be equal or better but >>   with better latency. But bear in mind this is a latency patch, not a >>   throughput one. >> With current code a java thread is not guaranteed to run between safepoints (in >> theory a java thread can be starved indefinitely), since the VM thread may >> re-grab the Threads_lock before it woke up from previous safepoint. If the >> GC/VM don't respect MMU (minimum mutator utilization) or if your machine is very >> over-provisioned this can happen. >> The current scheme thus re-safepoints quickly if the java threads have not >> started yet at the cost of latency. Since the new code uses the WaitBarrier with >> the safepoint counter, all threads must roll forward to next safepoint by >> getting at least some CPU time between two safepoints.
Meaning MMU violations >> are more obvious. >> >> Some examples on numbers: >> - On a 16 strand machine synchronization and un-synchronization/starting is at >>   least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and >>   starting ~400->~100us. >>   (Semaphore path is a bit slower than futex in the WaitBarrier on Linux). >> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster >>   synchronization time on 16 strands and ~5% score increase. In this case the GC >>   op is 1ms, so we reduce the overhead of synchronization from 100% to 10%. >> - specJBB2015 ParGC ~9% increase in critical-jops. >> >> Thanks, Robbin > From zgu at redhat.com Fri Jan 18 15:18:54 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Fri, 18 Jan 2019 10:18:54 -0500 Subject: RFR 8217342: Build failed with excluding JFR Message-ID: Please review this patch that fixes build failures when excluding JFR (configure build with --with-jvm-features=-jfr) Bug: https://bugs.openjdk.java.net/browse/JDK-8217342 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8217342/webrev.00/index.html Test: PCH and non-PCH builds with/without JFR on Linux x64 (fastdebug and release) Thanks, -Zhengyu From shade at redhat.com Fri Jan 18 15:36:17 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 18 Jan 2019 16:36:17 +0100 Subject: RFR 8217342: Build failed with excluding JFR In-Reply-To: References: Message-ID: <908f459e-b083-6ac9-8067-08f3c75ca169@redhat.com> On 1/18/19 4:18 PM, Zhengyu Gu wrote: > Please review this patch that fixes build failures when excluding JFR (configure build with > --with-jvm-features=-jfr) > > Bug: https://bugs.openjdk.java.net/browse/JDK-8217342 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8217342/webrev.00/index.html Shenandoah parts look good. I think zBarrierSet.hpp includes gc/shared/barrierSet.hpp already, so no need for it in zBarrierSetC2.cpp?
-Aleksey From per.liden at oracle.com Fri Jan 18 15:46:23 2019 From: per.liden at oracle.com (Per Liden) Date: Fri, 18 Jan 2019 16:46:23 +0100 Subject: RFR 8217342: Build failed with excluding JFR In-Reply-To: <908f459e-b083-6ac9-8067-08f3c75ca169@redhat.com> References: <908f459e-b083-6ac9-8067-08f3c75ca169@redhat.com> Message-ID: <46cd48ff-4d6f-2f2c-d074-9851feda8602@oracle.com> Hi, On 1/18/19 4:36 PM, Aleksey Shipilev wrote: > On 1/18/19 4:18 PM, Zhengyu Gu wrote: >> Please review this patch that fixes build failures when excluding JFR (configure build with >> --with-jvm-features=-jfr) >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8217342 >> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8217342/webrev.00/index.html > > Shenandoah parts look good. > > I think zBarrierSet.hpp includes gc/shared/barrierSet.hpp already, so no need for it in > zBarrierSetC2.cpp? Agree, only including zBarrierSet.hpp should be enough. Other than that, the ZGC parts look good. /Per From zgu at redhat.com Fri Jan 18 16:03:02 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Fri, 18 Jan 2019 11:03:02 -0500 Subject: RFR 8217342: Build failed with excluding JFR In-Reply-To: <46cd48ff-4d6f-2f2c-d074-9851feda8602@oracle.com> References: <908f459e-b083-6ac9-8067-08f3c75ca169@redhat.com> <46cd48ff-4d6f-2f2c-d074-9851feda8602@oracle.com> Message-ID: <9ca5fb48-398c-f677-9b12-9584cfbb0a52@redhat.com> Thanks for reviewing, Aleksey and Per. >> >> I think zBarrierSet.hpp includes gc/shared/barrierSet.hpp already, so >> no need for it in >> zBarrierSetC2.cpp? > > Agree, only including zBarrierSet.hpp should be enough. Other than that, > the ZGC parts look good. Verified. Updated webrev: http://cr.openjdk.java.net/~zgu/JDK-8217342/webrev.01/ Thanks, -Zhengyu > > /Per From daniel.daugherty at oracle.com Fri Jan 18 16:33:28 2019 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 18 Jan 2019 11:33:28 -0500 Subject: RFR (S) 8217378: UseCriticalCMSThreadPriority is broken In-Reply-To: <44e2ae70-4023-8e20-06a7-e387cb667f7c@redhat.com> References: <44e2ae70-4023-8e20-06a7-e387cb667f7c@redhat.com> Message-ID: <7921a507-68ce-437e-30b4-9e7556d06ba5@oracle.com> On 1/18/19 6:53 AM, Aleksey Shipilev wrote: > Bug: > https://bugs.openjdk.java.net/browse/JDK-8217378 > > Fix: > http://cr.openjdk.java.net/~shade/8217378/webrev.01/ src/hotspot/share/runtime/os.cpp     L220:   if ((p >= MinPriority && p <= MaxPriority) ||     L221:        (p == CriticalPriority && thread->is_ConcurrentGC_thread())) {         nit - please delete one indent space from L221.         Copyright year needs to be updated. test/hotspot/jtreg/gc/cms/TestCriticalPriority.java     No comments. Thumbs up. No need to see another webrev if you fix the nit on L221. Dan > > While this concerns the experimental CMS option, the fix itself also unblocks Shenandoah RFE > (JDK-8217343). There might be a larger discussion if CriticalPriority thingie is useful, but I would > first push the simple fix that makes it work, at least to have the up-to-date performance data. It > should make the most benefit on Solaris where critical prio can map to FX priorities, but > non-Solaris builds can also enjoy the (control) GC threads priority elevated to MaxPriority, above > the VMThread. > > Testing: new test, hotspot tier1, jdk-submit (running) > > Thanks, > -Aleksey > From shade at redhat.com Fri Jan 18 16:58:02 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 18 Jan 2019 17:58:02 +0100 Subject: RFR (S) 8217378: UseCriticalCMSThreadPriority is broken In-Reply-To: <7921a507-68ce-437e-30b4-9e7556d06ba5@oracle.com> References: <44e2ae70-4023-8e20-06a7-e387cb667f7c@redhat.com> <7921a507-68ce-437e-30b4-9e7556d06ba5@oracle.com> Message-ID: On 1/18/19 5:33 PM, Daniel D. Daugherty wrote: > src/hotspot/share/runtime/os.cpp >     L220:
if ((p >= MinPriority && p <= MaxPriority) || >     L221:        (p == CriticalPriority && thread->is_ConcurrentGC_thread())) { >         nit - please delete one indent space from L221. > >         Copyright year needs to be updated. Fixed both. > test/hotspot/jtreg/gc/cms/TestCriticalPriority.java >     No comments. > > Thumbs up. No need to see another webrev if you fix the nit on L221. There is the updated webrev anyway: http://cr.openjdk.java.net/~shade/8217378/webrev.02/ -Aleksey From vladimir.x.ivanov at oracle.com Fri Jan 18 18:43:51 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Fri, 18 Jan 2019 10:43:51 -0800 Subject: [13] RFR (S): 8217358: Optimized build is broken by Shenandoah changes In-Reply-To: <7ef74f3bff98d1b73141ebbc5ace01abb2a4378f.camel@redhat.com> References: <77ba7a2b-63c0-cbed-ac9f-e4637c28f4ee@oracle.com> <20f538bd-6d91-d0c7-31ec-a06dd80049fd@redhat.com> <7ef74f3bff98d1b73141ebbc5ace01abb2a4378f.camel@redhat.com> Message-ID: <5a680bf4-0ede-5a96-5770-3f50c0b82d53@oracle.com> Thanks for review, Aleksey. >> Aside: does anyone use "optimized", and for what? > > +1. It's the first I've heard of it. How does one even build that > config? Optimized build is something in-between product and fastdebug builds. There's some diagnostic code in HotSpot which isn't available in product builds. It may be useful to enable it, but without incurring all the costs of fastdebug. That's the difference between notproduct & develop flags and !PRODUCT & ASSERT #ifdefs. As of now, it's enough to specify --with-debug-level=optimized to get it built. The plan is to eliminate optimized JDK build [1] and replace it with a JVM feature. As for me, I use optimized to access C2-specific diagnostic features/output when I need them in release binaries (e.g., IdealGraphVisualizer, PrintEscapeAnalysis et al.).
Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8202283 From lois.foltan at oracle.com Fri Jan 18 18:50:59 2019 From: lois.foltan at oracle.com (Lois Foltan) Date: Fri, 18 Jan 2019 13:50:59 -0500 Subject: RFR (S) JDK-8216970: condy causes JVM crash Message-ID: Please review this change that allows escape analysis to correctly handle a dynamic constant whose return type is an array. open webrev at: http://cr.openjdk.java.net/~lfoltan/bug_jdk8216970.1/webrev/ bug link: https://bugs.openjdk.java.net/browse/JDK-8216970 Testing: hs-tier1-3, jdk-tier1-3 (all platforms). hs-tier4-5 (linux only) Thanks, Lois From vladimir.x.ivanov at oracle.com Fri Jan 18 21:45:15 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Fri, 18 Jan 2019 13:45:15 -0800 Subject: [13] RFR (T): 8217399: Backout 8217358 Message-ID: https://bugs.openjdk.java.net/browse/JDK-8217399 http://cr.openjdk.java.net/~vlivanov/8217399/webrev.00 Sorry, pushed wrong patch under JDK-8217358. Backing out the changeset. Original patch will be pushed under JDK-8217400. Best regards, Vladimir Ivanov From vladimir.kozlov at oracle.com Fri Jan 18 22:01:36 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 18 Jan 2019 14:01:36 -0800 Subject: [13] RFR (T): 8217399: Backout 8217358 In-Reply-To: References: Message-ID: Good and trivial. thanks, Vladimir On 1/18/19 1:45 PM, Vladimir Ivanov wrote: > https://bugs.openjdk.java.net/browse/JDK-8217399 > http://cr.openjdk.java.net/~vlivanov/8217399/webrev.00 > > Sorry, pushed wrong patch under JDK-8217358. > Backing out the changeset. > Original patch will be pushed under JDK-8217400.
> > Best regards, > Vladimir Ivanov From david.holmes at oracle.com Fri Jan 18 22:13:45 2019 From: david.holmes at oracle.com (David Holmes) Date: Sat, 19 Jan 2019 08:13:45 +1000 Subject: RFR (XS) 8217321: [TESTBUG] utilities/test_globalDefinitions.cpp should use _LP64, not LP64 In-Reply-To: <2cbb8285-a634-2018-e54e-4ee4a7a87211@redhat.com> References: <133fb5f2-1e5b-af8d-dd45-a61bdfd042d0@redhat.com> <8f55602b-7831-b952-62d1-0d36f9731fb8@redhat.com> <59186248-0e5b-2ee5-35e1-f8a6c20d65f6@oracle.com> <2cbb8285-a634-2018-e54e-4ee4a7a87211@redhat.com> Message-ID: On 18/01/2019 7:31 pm, Aleksey Shipilev wrote: > On 1/18/19 12:42 AM, David Holmes wrote: >> On 18/01/2019 9:20 am, Aleksey Shipilev wrote: >>> On 1/17/19 11:12 PM, David Holmes wrote: >>>> When we build hotspot we set LP64 (whereas _LP64 presumably comes from the compiler). Shouldn't the >>>> gtests get compiled with the same flags as the product code? >>> >>> Yes, I think so, and that is what happens here? Everywhere else in Hotspot we use "#ifdef _LP64", >>> and this patch makes use of the same in gtests as well. >> >> I've no issue with the change, but if the same flags are used then they tests should have been >> executed anyway. If they weren't then the same flags are not being used and perhaps they should? > > I think the basic premise "LP64 is set" is incorrect. You're correct - must be some institutional memory coming to the fore. The build system no longer sets it. Not even for adlc. >> Try ifndef LP64 - there are a few of those in ./cpu/x86/x86.ad > > Oh, .ad is compiled with ADLC, which is yet another build config. I think the use of "ifndef LP64" > there is incorrect, and it should use _LP64 as well. In fact, the rest of x86.ad uses _LP64, and > that LP64 looks like a recent addition. And that code even fails to compile when ifndef is fixed! See: > https://bugs.openjdk.java.net/browse/JDK-8217371 Thanks for filing that! So glad this little discussion led to a real problem being discovered. 
:) Cheers, David > -Aleksey > From david.holmes at oracle.com Fri Jan 18 22:30:53 2019 From: david.holmes at oracle.com (David Holmes) Date: Sat, 19 Jan 2019 08:30:53 +1000 Subject: RFR (S) 8217378: UseCriticalCMSThreadPriority is broken In-Reply-To: <44e2ae70-4023-8e20-06a7-e387cb667f7c@redhat.com> References: <44e2ae70-4023-8e20-06a7-e387cb667f7c@redhat.com> Message-ID: <8d3863ba-6c41-08eb-d0dd-e5e9888de314@oracle.com> Hi Aleksey, On 18/01/2019 9:53 pm, Aleksey Shipilev wrote: > Bug: > https://bugs.openjdk.java.net/browse/JDK-8217378 > > Fix: > http://cr.openjdk.java.net/~shade/8217378/webrev.01/ > > While this concerns the experimental CMS option, the fix itself also unblocks Shenandoah RFE > (JDK-8217343). There might be a larger discussion if CriticalPriority thingie is useful, but I would > first push the simple fix that makes it work, at least to have the up-to-date performance data. It > should make the most benefit on Solaris where critical prio can map to FX priorities, but > non-Solaris builds can also enjoy the (control) GC threads priority elevated to MaxPriority, above > the VMThread. Well the fix does restore previous behaviour so reviewed. But its a sharp tool you just repaired. Cheers, David > Testing: new test, hotspot tier1, jdk-submit (running) > > Thanks, > -Aleksey > From vladimir.x.ivanov at oracle.com Fri Jan 18 23:05:21 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Fri, 18 Jan 2019 15:05:21 -0800 Subject: [13] RFR (S): 8213234: Move LambdaForm.Hidden to jdk.internal.vm.annotation Message-ID: <3aba3976-60a0-f36f-92b6-de035b954dd4@oracle.com> http://cr.openjdk.java.net/~vlivanov/8213234/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8213234 Move LambdaForm.Hidden to jdk.internal.vm.annotation, so it can be shared across JDK until a standard solution is provided [1]. 
Testing: tier1-2 Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8212620 From mandy.chung at oracle.com Fri Jan 18 23:24:39 2019 From: mandy.chung at oracle.com (Mandy Chung) Date: Fri, 18 Jan 2019 15:24:39 -0800 Subject: [13] RFR (S): 8213234: Move LambdaForm.Hidden to jdk.internal.vm.annotation In-Reply-To: <3aba3976-60a0-f36f-92b6-de035b954dd4@oracle.com> References: <3aba3976-60a0-f36f-92b6-de035b954dd4@oracle.com> Message-ID: <09d7f573-86d1-49af-8bd5-e75bf377be02@oracle.com> On 1/18/19 3:05 PM, Vladimir Ivanov wrote: > http://cr.openjdk.java.net/~vlivanov/8213234/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8213234 > > Move LambdaForm.Hidden to jdk.internal.vm.annotation, so it can be > shared across JDK until a standard solution is provided [1]. > Looks good. The new Hidden.java should have 2019 copyright start year. Mandy From vladimir.x.ivanov at oracle.com Fri Jan 18 23:33:31 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Fri, 18 Jan 2019 15:33:31 -0800 Subject: [13] RFR (S): 8217404: --with-jvm-features doesn't work when multiple features are explicitly disabled Message-ID: <5a505bea-ca69-3dd0-6748-e2391005b574@oracle.com> http://cr.openjdk.java.net/~vlivanov/8217404/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8217404 --with-jvm-features doesn't work properly when multiple features are explicitly disabled: $ bash configure --with-jvm-features="-aot -jvmci -graal" ... checking if jvmci module jdk.internal.vm.ci should be built... yes checking if graal module jdk.internal.vm.compiler should be built... yes checking if aot should be enabled... yes ... The problem is in the following code: DISABLE_AOT=`$ECHO $DISABLED_JVM_FEATURES | $GREP aot` if test "x$DISABLE_AOT" = "xaot"; then ENABLE_AOT="false" fi Since DISABLED_JVM_FEATURES ("aot jvmci graal") contains the list of explicitly disabled features, grep over it returns the whole list when there's a match.
The subsequent check fails because there's no exact match, though DISABLE_AOT contains "aot" . Proposed fix is to check there's no match instead. After the fix it works as expected: $ bash configure --with-jvm-features="-aot -jvmci -graal" ... checking if jvmci module jdk.internal.vm.ci should be built... no, forced checking if graal module jdk.internal.vm.compiler should be built... no, forced checking if aot should be enabled... no, forced ... (The fix doesn't address the case when one feature has a name which is a proper substring of another feature, but there are no such cases at the moment.) Best regards, Vladimir Ivanov From vladimir.x.ivanov at oracle.com Fri Jan 18 23:43:52 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Fri, 18 Jan 2019 15:43:52 -0800 Subject: [13] RFR (T): 8217407: StackValue::print_on() crashes on NULL handle Message-ID: <35496a16-7cf7-62a6-4406-193d9015be58@oracle.com> http://cr.openjdk.java.net/~vlivanov/8217407/webrev.00 https://bugs.openjdk.java.net/browse/JDK-8217407 JDK-8217407 [1] removed NULL check from oopDesc::print_value_on(), but didn't add explicit check in StackValue::print_on(). Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8202171 From vladimir.kozlov at oracle.com Fri Jan 18 23:45:01 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 18 Jan 2019 15:45:01 -0800 Subject: [13] RFR (T): 8217407: StackValue::print_on() crashes on NULL handle In-Reply-To: <35496a16-7cf7-62a6-4406-193d9015be58@oracle.com> References: <35496a16-7cf7-62a6-4406-193d9015be58@oracle.com> Message-ID: <57a7d2d7-fcef-b7c8-85dc-ecb976e56b92@oracle.com> Good. Thanks, Vladimir On 1/18/19 3:43 PM, Vladimir Ivanov wrote: > http://cr.openjdk.java.net/~vlivanov/8217407/webrev.00 > https://bugs.openjdk.java.net/browse/JDK-8217407 > > JDK-8217407 [1] removed NULL check from oopDesc::print_value_on(), but didn't add explicit check in StackValue::print_on(). 
> > Best regards, > Vladimir Ivanov > > [1] https://bugs.openjdk.java.net/browse/JDK-8202171 From vladimir.kozlov at oracle.com Fri Jan 18 23:44:19 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 18 Jan 2019 15:44:19 -0800 Subject: [13] RFR (S): 8217404: --with-jvm-features doesn't work when multiple features are explicitly disabled In-Reply-To: <5a505bea-ca69-3dd0-6748-e2391005b574@oracle.com> References: <5a505bea-ca69-3dd0-6748-e2391005b574@oracle.com> Message-ID: I usually used --with-jvm-features=-aot,-jvmci,-graal Did not work in this case too? Anyway your fix is good. Thanks, Vladimir On 1/18/19 3:33 PM, Vladimir Ivanov wrote: > http://cr.openjdk.java.net/~vlivanov/8217404/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8217404 > > --with-jvm-features doesn't work properly when multiple features are explicitly disabled: > > $ bash configure --with-jvm-features="-aot -jvmci -graal" > ... > checking if jvmci module jdk.internal.vm.ci should be built... yes > checking if graal module jdk.internal.vm.compiler should be built... yes > checking if aot should be enabled... yes > ... > > The problem in the following code: > >   DISABLE_AOT=`$ECHO $DISABLED_JVM_FEATURES | $GREP aot` >   if test "x$DISABLE_AOT" = "xaot"; then >     ENABLE_AOT="false" >   fi > > Since DISABLED_JVM_FEATURES ("aot jvmci graal") contains the list of explicitly disabled features, grep over it returns > the whole list when there's a match. The subsequent check fails because there's no exact match, though DISABLE_AOT > contains "aot" . > > Proposed fix is to check there's no match instead. > > After the fix it works as expected: > > $ bash configure --with-jvm-features="-aot -jvmci -graal" > ... > checking if jvmci module jdk.internal.vm.ci should be built... no, forced > checking if graal module jdk.internal.vm.compiler should be built... no, forced > checking if aot should be enabled... no, forced > ...
> > (The fix doesn't address the case when one feature has a name which is a proper substring of another feature, but there > are no such cases at the moment.) > > Best regards, > Vladimir Ivanov From igor.ignatyev at oracle.com Fri Jan 18 23:45:52 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Fri, 18 Jan 2019 15:45:52 -0800 Subject: [13] RFR (S): 8217404: --with-jvm-features doesn't work when multiple features are explicitly disabled In-Reply-To: <5a505bea-ca69-3dd0-6748-e2391005b574@oracle.com> References: <5a505bea-ca69-3dd0-6748-e2391005b574@oracle.com> Message-ID: Hi Vladimir, overall your fix looks reasonable, but w/ it we can get unintentionally disabled features (b/c grep doesn't do full word match). although this problem wasn't really introduced by your fix, I think it's be better to fix it as a part of your patch. I see two possible solutions: - add "-w" to grep, but I am not sure if "-w" is supported by all grep implementations - use $XARGS instead of $ECHO when we get DISABLE_X. in this case you will need to revert your changes in 'if test ...' lines Thanks, -- Igor > On Jan 18, 2019, at 3:33 PM, Vladimir Ivanov wrote: > > http://cr.openjdk.java.net/~vlivanov/8217404/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8217404 > > --with-jvm-features doesn't work properly when multiple features are explicitly disabled: > > $ bash configure --with-jvm-features="-aot -jvmci -graal" > ... > checking if jvmci module jdk.internal.vm.ci should be built... yes > checking if graal module jdk.internal.vm.compiler should be built... yes > checking if aot should be enabled... yes > ... > > The problem in the following code: > > DISABLE_AOT=`$ECHO $DISABLED_JVM_FEATURES | $GREP aot` > if test "x$DISABLE_AOT" = "xaot"; then > ENABLE_AOT="false" > fi > > Since DISABLED_JVM_FEATURES ("aot jvmci graal") contains the list of explicitly disabled features, grep over it returns the whole list when there's a match. 
The subsequent check fails because there's no exact match, though DISABLE_AOT contains "aot" . > > Proposed fix is to check there's no match instead. > > After the fix it works as expected: > > $ bash configure --with-jvm-features="-aot -jvmci -graal" > ... > checking if jvmci module jdk.internal.vm.ci should be built... no, forced > checking if graal module jdk.internal.vm.compiler should be built... no, forced > checking if aot should be enabled... no, forced > ... > > (The fix doesn't address the case when one feature has a name which is a proper substring of another feature, but there are no such cases at the moment.) > > Best regards, > Vladimir Ivanov From vladimir.x.ivanov at oracle.com Fri Jan 18 23:52:39 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Fri, 18 Jan 2019 15:52:39 -0800 Subject: [13] RFR (T): 8217407: StackValue::print_on() crashes on NULL handle In-Reply-To: <57a7d2d7-fcef-b7c8-85dc-ecb976e56b92@oracle.com> References: <35496a16-7cf7-62a6-4406-193d9015be58@oracle.com> <57a7d2d7-fcef-b7c8-85dc-ecb976e56b92@oracle.com> Message-ID: <0647deb5-d69d-5be3-b647-45340bb9dc00@oracle.com> Thanks, Vladimir. Best regards, Vladimir Ivanov On 18/01/2019 15:45, Vladimir Kozlov wrote: > Good. > > Thanks, > Vladimir > > On 1/18/19 3:43 PM, Vladimir Ivanov wrote: >> http://cr.openjdk.java.net/~vlivanov/8217407/webrev.00 >> https://bugs.openjdk.java.net/browse/JDK-8217407 >> >> JDK-8217407 [1] removed NULL check from oopDesc::print_value_on(), but >> didn't add explicit check in StackValue::print_on(). >> >> Best regards, >> Vladimir Ivanov >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8202171 From vladimir.x.ivanov at oracle.com Fri Jan 18 23:52:30 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Fri, 18 Jan 2019 15:52:30 -0800 Subject: [13] RFR (T): 8217399: Backout 8217358 In-Reply-To: References: Message-ID: <20bb0d78-e584-e246-b0ad-f68539696b45@oracle.com> Thanks, Vladimir. 
Best regards,
Vladimir Ivanov

On 18/01/2019 14:01, Vladimir Kozlov wrote:
> Good and trivial.
>
> thanks,
> Vladimir
>
> On 1/18/19 1:45 PM, Vladimir Ivanov wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8217399
>> http://cr.openjdk.java.net/~vlivanov/8217399/webrev.00
>>
>> Sorry, pushed wrong patch under JDK-8217358.
>> Backing out the changeset.
>> Original patch will be pushed under JDK-8217400.
>>
>> Best regards,
>> Vladimir Ivanov

From dean.long at oracle.com Sat Jan 19 00:18:01 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Fri, 18 Jan 2019 16:18:01 -0800
Subject: [13] RFR (S): 8213234: Move LambdaForm.Hidden to jdk.internal.vm.annotation
In-Reply-To: <3aba3976-60a0-f36f-92b6-de035b954dd4@oracle.com>
References: <3aba3976-60a0-f36f-92b6-de035b954dd4@oracle.com>
Message-ID: <54b6dd72-b1e7-868b-a4d9-2bd49f4dff1e@oracle.com>

Thanks for fixing this.
Some copyright dates weren't updated. Comment for Hidden.java still says TODO.

dl

On 1/18/19 3:05 PM, Vladimir Ivanov wrote:
> http://cr.openjdk.java.net/~vlivanov/8213234/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8213234
>
> Move LambdaForm.Hidden to jdk.internal.vm.annotation, so it can be
> shared across JDK until a standard solution is provided [1].
>
> Testing: tier1-2
>
> Best regards,
> Vladimir Ivanov
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8212620

From vladimir.x.ivanov at oracle.com Sat Jan 19 00:31:03 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Fri, 18 Jan 2019 16:31:03 -0800
Subject: [13] RFR (S): 8217404: --with-jvm-features doesn't work when multiple features are explicitly disabled
In-Reply-To:
References: <5a505bea-ca69-3dd0-6748-e2391005b574@oracle.com>
Message-ID: <095dc325-6e6b-7112-56dc-8514c58ad0b7@oracle.com>

Thanks, Vladimir.

> I usually used --with-jvm-features=-aot,-jvmci,-graal
>
> Did not work in this case too?
I didn't know it supports comma-separated list, but it doesn't work as well:

$ bash configure --with-jvm-features="-aot,-jvmci,-graal"

checking if jvmci module jdk.internal.vm.ci should be built... yes
checking if graal module jdk.internal.vm.compiler should be built... yes
checking if aot should be enabled... yes

Best regards,
Vladimir Ivanov

> On 1/18/19 3:33 PM, Vladimir Ivanov wrote:
>> http://cr.openjdk.java.net/~vlivanov/8217404/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8217404
>>
>> --with-jvm-features doesn't work properly when multiple features are
>> explicitly disabled:
>>
>> $ bash configure --with-jvm-features="-aot -jvmci -graal"
>> ...
>> checking if jvmci module jdk.internal.vm.ci should be built... yes
>> checking if graal module jdk.internal.vm.compiler should be built... yes
>> checking if aot should be enabled... yes
>> ...
>>
>> The problem in the following code:
>>
>>    DISABLE_AOT=`$ECHO $DISABLED_JVM_FEATURES | $GREP aot`
>>    if test "x$DISABLE_AOT" = "xaot"; then
>>      ENABLE_AOT="false"
>>    fi
>>
>> Since DISABLED_JVM_FEATURES ("aot jvmci graal") contains the list of
>> explicitly disabled features, grep over it returns the whole list when
>> there's a match. The subsequent check fails because there's no exact
>> match, though DISABLE_AOT contains "aot" .
>>
>> Proposed fix is to check there's no match instead.
>>
>> After the fix it works as expected:
>>
>> $ bash configure --with-jvm-features="-aot -jvmci -graal"
>> ...
>> checking if jvmci module jdk.internal.vm.ci should be built... no, forced
>> checking if graal module jdk.internal.vm.compiler should be built...
>> no, forced
>> checking if aot should be enabled... no, forced
>> ...
>>
>> (The fix doesn't address the case when one feature has a name which is
>> a proper substring of another feature, but there are no such cases at
>> the moment.)
>> >> Best regards, >> Vladimir Ivanov From vladimir.x.ivanov at oracle.com Sat Jan 19 00:39:45 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Fri, 18 Jan 2019 16:39:45 -0800 Subject: [13] RFR (S): 8217404: --with-jvm-features doesn't work when multiple features are explicitly disabled In-Reply-To: References: <5a505bea-ca69-3dd0-6748-e2391005b574@oracle.com> Message-ID: <93c9b305-292d-5779-7afe-bb21d7cd9672@oracle.com> Thanks, Igor. > overall your fix looks reasonable, but w/ it we can get unintentionally disabled features (b/c grep doesn't do full word match). although this problem wasn't really introduced by your fix, I think it's be better to fix it as a part of your patch. I see two possible solutions: I was aware of such drawback, but decided to leave it as is, since it doesn't affect existing features. > - add "-w" to grep, but I am not sure if "-w" is supported by all grep implementations > - use $XARGS instead of $ECHO when we get DISABLE_X. in this case you will need to revert your changes in 'if test ...' lines I'm in favor of using "-w" and I see different grep flags being used already, but would like somebody from Build team confirm they are OK with such solution. Best regards, Vladimir Ivanov >> On Jan 18, 2019, at 3:33 PM, Vladimir Ivanov wrote: >> >> http://cr.openjdk.java.net/~vlivanov/8217404/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8217404 >> >> --with-jvm-features doesn't work properly when multiple features are explicitly disabled: >> >> $ bash configure --with-jvm-features="-aot -jvmci -graal" >> ... >> checking if jvmci module jdk.internal.vm.ci should be built... yes >> checking if graal module jdk.internal.vm.compiler should be built... yes >> checking if aot should be enabled... yes >> ... 
>> >> The problem in the following code: >> >> DISABLE_AOT=`$ECHO $DISABLED_JVM_FEATURES | $GREP aot` >> if test "x$DISABLE_AOT" = "xaot"; then >> ENABLE_AOT="false" >> fi >> >> Since DISABLED_JVM_FEATURES ("aot jvmci graal") contains the list of explicitly disabled features, grep over it returns the whole list when there's a match. The subsequent check fails because there's no exact match, though DISABLE_AOT contains "aot" . >> >> Proposed fix is to check there's no match instead. >> >> After the fix it works as expected: >> >> $ bash configure --with-jvm-features="-aot -jvmci -graal" >> ... >> checking if jvmci module jdk.internal.vm.ci should be built... no, forced >> checking if graal module jdk.internal.vm.compiler should be built... no, forced >> checking if aot should be enabled... no, forced >> ... >> >> (The fix doesn't address the case when one feature has a name which is a proper substring of another feature, but there are no such cases at the moment.) >> >> Best regards, >> Vladimir Ivanov > From vladimir.x.ivanov at oracle.com Sat Jan 19 01:26:03 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Fri, 18 Jan 2019 17:26:03 -0800 Subject: [13] RFR (S): 8217404: --with-jvm-features doesn't work when multiple features are explicitly disabled In-Reply-To: <93c9b305-292d-5779-7afe-bb21d7cd9672@oracle.com> References: <5a505bea-ca69-3dd0-6748-e2391005b574@oracle.com> <93c9b305-292d-5779-7afe-bb21d7cd9672@oracle.com> Message-ID: <0b40d58e-8267-5a89-5ce5-684b7bb67b1d@oracle.com> Updated webrev: http://cr.openjdk.java.net/~vlivanov/8217404/webrev.01 Verified that it works as expected on Linux, Windows, MacOS, and Solaris. Best regards, Vladimir Ivanov On 18/01/2019 16:39, Vladimir Ivanov wrote: > Thanks, Igor. > >> overall your fix looks reasonable, but w/ it we can get >> unintentionally disabled features (b/c grep doesn't do full word >> match). 
although this problem wasn't really introduced by your fix, I
>> think it's be better to fix it as a part of your patch. I see two
>> possible solutions:
>
> I was aware of such drawback, but decided to leave it as is, since it
> doesn't affect existing features.
>
>>   - add "-w" to grep, but I am not sure if "-w" is supported by all
>> grep implementations
>>   - use $XARGS instead of $ECHO when we get DISABLE_X. in this case
>> you will need to revert your changes in 'if test ...' lines
>
> I'm in favor of using "-w" and I see different grep flags being used
> already, but would like somebody from Build team confirm they are OK
> with such solution.
>
> Best regards,
> Vladimir Ivanov
>
>>> On Jan 18, 2019, at 3:33 PM, Vladimir Ivanov
>>> wrote:
>>>
>>> http://cr.openjdk.java.net/~vlivanov/8217404/webrev.00/
>>> https://bugs.openjdk.java.net/browse/JDK-8217404
>>>
>>> --with-jvm-features doesn't work properly when multiple features are
>>> explicitly disabled:
>>>
>>> $ bash configure --with-jvm-features="-aot -jvmci -graal"
>>> ...
>>> checking if jvmci module jdk.internal.vm.ci should be built... yes
>>> checking if graal module jdk.internal.vm.compiler should be built... yes
>>> checking if aot should be enabled... yes
>>> ...
>>>
>>> The problem in the following code:
>>>
>>>   DISABLE_AOT=`$ECHO $DISABLED_JVM_FEATURES | $GREP aot`
>>>   if test "x$DISABLE_AOT" = "xaot"; then
>>>     ENABLE_AOT="false"
>>>   fi
>>>
>>> Since DISABLED_JVM_FEATURES ("aot jvmci graal") contains the list of
>>> explicitly disabled features, grep over it returns the whole list
>>> when there's a match. The subsequent check fails because there's no
>>> exact match, though DISABLE_AOT contains "aot" .
>>>
>>> Proposed fix is to check there's no match instead.
>>>
>>> After the fix it works as expected:
>>>
>>> $ bash configure --with-jvm-features="-aot -jvmci -graal"
>>> ...
>>> checking if jvmci module jdk.internal.vm.ci should be built...
no,
>>> forced
>>> checking if graal module jdk.internal.vm.compiler should be built...
>>> no, forced
>>> checking if aot should be enabled... no, forced
>>> ...
>>>
>>> (The fix doesn't address the case when one feature has a name which
>>> is a proper substring of another feature, but there are no such cases
>>> at the moment.)
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>

From igor.ignatyev at oracle.com Sat Jan 19 01:29:05 2019
From: igor.ignatyev at oracle.com (Igor Ignatev)
Date: Fri, 18 Jan 2019 17:29:05 -0800
Subject: [13] RFR (S): 8217404: --with-jvm-features doesn't work when multiple features are explicitly disabled
In-Reply-To: <0b40d58e-8267-5a89-5ce5-684b7bb67b1d@oracle.com>
References: <5a505bea-ca69-3dd0-6748-e2391005b574@oracle.com> <93c9b305-292d-5779-7afe-bb21d7cd9672@oracle.com> <0b40d58e-8267-5a89-5ce5-684b7bb67b1d@oracle.com>
Message-ID: <431EEFF8-A81A-4D2E-812A-AE47CD01C098@oracle.com>

Still looks good to me.

-- Igor

> On Jan 18, 2019, at 5:26 PM, Vladimir Ivanov wrote:
>
> Updated webrev:
> http://cr.openjdk.java.net/~vlivanov/8217404/webrev.01
>
> Verified that it works as expected on Linux, Windows, MacOS, and Solaris.
>
> Best regards,
> Vladimir Ivanov
>
>> On 18/01/2019 16:39, Vladimir Ivanov wrote:
>> Thanks, Igor.
>>> overall your fix looks reasonable, but w/ it we can get unintentionally disabled features (b/c grep doesn't do full word match). although this problem wasn't really introduced by your fix, I think it's be better to fix it as a part of your patch. I see two possible solutions:
>> I was aware of such drawback, but decided to leave it as is, since it doesn't affect existing features.
>>> - add "-w" to grep, but I am not sure if "-w" is supported by all grep implementations
>>> - use $XARGS instead of $ECHO when we get DISABLE_X. in this case you will need to revert your changes in 'if test ...'
lines >> I'm in favor of using "-w" and I see different grep flags being used already, but would like somebody from Build team confirm they are OK with such solution. >> Best regards, >> Vladimir Ivanov >>>> On Jan 18, 2019, at 3:33 PM, Vladimir Ivanov wrote: >>>> >>>> http://cr.openjdk.java.net/~vlivanov/8217404/webrev.00/ >>>> https://bugs.openjdk.java.net/browse/JDK-8217404 >>>> >>>> --with-jvm-features doesn't work properly when multiple features are explicitly disabled: >>>> >>>> $ bash configure --with-jvm-features="-aot -jvmci -graal" >>>> ... >>>> checking if jvmci module jdk.internal.vm.ci should be built... yes >>>> checking if graal module jdk.internal.vm.compiler should be built... yes >>>> checking if aot should be enabled... yes >>>> ... >>>> >>>> The problem in the following code: >>>> >>>> DISABLE_AOT=`$ECHO $DISABLED_JVM_FEATURES | $GREP aot` >>>> if test "x$DISABLE_AOT" = "xaot"; then >>>> ENABLE_AOT="false" >>>> fi >>>> >>>> Since DISABLED_JVM_FEATURES ("aot jvmci graal") contains the list of explicitly disabled features, grep over it returns the whole list when there's a match. The subsequent check fails because there's no exact match, though DISABLE_AOT contains "aot" . >>>> >>>> Proposed fix is to check there's no match instead. >>>> >>>> After the fix it works as expected: >>>> >>>> $ bash configure --with-jvm-features="-aot -jvmci -graal" >>>> ... >>>> checking if jvmci module jdk.internal.vm.ci should be built... no, forced >>>> checking if graal module jdk.internal.vm.compiler should be built... no, forced >>>> checking if aot should be enabled... no, forced >>>> ... >>>> >>>> (The fix doesn't address the case when one feature has a name which is a proper substring of another feature, but there are no such cases at the moment.) 
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>

From vladimir.kozlov at oracle.com Sat Jan 19 01:45:48 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 18 Jan 2019 17:45:48 -0800
Subject: [13] RFR (S): 8217404: --with-jvm-features doesn't work when multiple features are explicitly disabled
In-Reply-To: <431EEFF8-A81A-4D2E-812A-AE47CD01C098@oracle.com>
References: <5a505bea-ca69-3dd0-6748-e2391005b574@oracle.com> <93c9b305-292d-5779-7afe-bb21d7cd9672@oracle.com> <0b40d58e-8267-5a89-5ce5-684b7bb67b1d@oracle.com> <431EEFF8-A81A-4D2E-812A-AE47CD01C098@oracle.com>
Message-ID: <96E54E87-7E1C-44FC-8623-330483A66CC9@oracle.com>

+1

Thanks
Vladimir

> On Jan 18, 2019, at 5:29 PM, Igor Ignatev wrote:
>
> Still looks good to me.
>
> -- Igor
>
>> On Jan 18, 2019, at 5:26 PM, Vladimir Ivanov wrote:
>>
>> Updated webrev:
>> http://cr.openjdk.java.net/~vlivanov/8217404/webrev.01
>>
>> Verified that it works as expected on Linux, Windows, MacOS, and Solaris.
>>
>> Best regards,
>> Vladimir Ivanov
>>
>>>> On 18/01/2019 16:39, Vladimir Ivanov wrote:
>>>> Thanks, Igor.
>>>> overall your fix looks reasonable, but w/ it we can get unintentionally disabled features (b/c grep doesn't do full word match). although this problem wasn't really introduced by your fix, I think it's be better to fix it as a part of your patch. I see two possible solutions:
>>> I was aware of such drawback, but decided to leave it as is, since it doesn't affect existing features.
>>>> - add "-w" to grep, but I am not sure if "-w" is supported by all grep implementations
>>>> - use $XARGS instead of $ECHO when we get DISABLE_X. in this case you will need to revert your changes in 'if test ...' lines
>>> I'm in favor of using "-w" and I see different grep flags being used already, but would like somebody from Build team confirm they are OK with such solution.
>>> Best regards, >>> Vladimir Ivanov >>>>> On Jan 18, 2019, at 3:33 PM, Vladimir Ivanov wrote: >>>>> >>>>> http://cr.openjdk.java.net/~vlivanov/8217404/webrev.00/ >>>>> https://bugs.openjdk.java.net/browse/JDK-8217404 >>>>> >>>>> --with-jvm-features doesn't work properly when multiple features are explicitly disabled: >>>>> >>>>> $ bash configure --with-jvm-features="-aot -jvmci -graal" >>>>> ... >>>>> checking if jvmci module jdk.internal.vm.ci should be built... yes >>>>> checking if graal module jdk.internal.vm.compiler should be built... yes >>>>> checking if aot should be enabled... yes >>>>> ... >>>>> >>>>> The problem in the following code: >>>>> >>>>> DISABLE_AOT=`$ECHO $DISABLED_JVM_FEATURES | $GREP aot` >>>>> if test "x$DISABLE_AOT" = "xaot"; then >>>>> ENABLE_AOT="false" >>>>> fi >>>>> >>>>> Since DISABLED_JVM_FEATURES ("aot jvmci graal") contains the list of explicitly disabled features, grep over it returns the whole list when there's a match. The subsequent check fails because there's no exact match, though DISABLE_AOT contains "aot" . >>>>> >>>>> Proposed fix is to check there's no match instead. >>>>> >>>>> After the fix it works as expected: >>>>> >>>>> $ bash configure --with-jvm-features="-aot -jvmci -graal" >>>>> ... >>>>> checking if jvmci module jdk.internal.vm.ci should be built... no, forced >>>>> checking if graal module jdk.internal.vm.compiler should be built... no, forced >>>>> checking if aot should be enabled... no, forced >>>>> ... >>>>> >>>>> (The fix doesn't address the case when one feature has a name which is a proper substring of another feature, but there are no such cases at the moment.) 
>>>>>
>>>>> Best regards,
>>>>> Vladimir Ivanov
>>>>
>

From vladimir.x.ivanov at oracle.com Sat Jan 19 02:16:21 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Fri, 18 Jan 2019 18:16:21 -0800
Subject: [13] RFR (S): 8213234: Move LambdaForm.Hidden to jdk.internal.vm.annotation
In-Reply-To: <54b6dd72-b1e7-868b-a4d9-2bd49f4dff1e@oracle.com>
References: <3aba3976-60a0-f36f-92b6-de035b954dd4@oracle.com> <54b6dd72-b1e7-868b-a4d9-2bd49f4dff1e@oracle.com>
Message-ID: <7351f8e2-b812-83fd-8268-912544f90689@oracle.com>

Thanks, Mandy & Dean.

Updated in-place:
http://cr.openjdk.java.net/~vlivanov/8213234/webrev.00/

Javadoc for Hidden now says the following:

  /**
   * A method or constructor may be annotated as "hidden" to hint it is desirable
   * to omit it from stack traces.
   *
   * @implNote
   * This annotation only takes effect for methods or constructors of classes
   * loaded by the boot loader. Annotations on methods or constructors of classes
   * loaded outside of the boot loader are ignored.
   *
   * <p>HotSpot JVM provides diagnostic option {@code -XX:+ShowHiddenFrames} to
   * always show "hidden" frames.
   */

Best regards,
Vladimir Ivanov

On 18/01/2019 16:18, dean.long at oracle.com wrote:
> Thanks for fixing this.
> Some copyright dates weren't updated. Comment for Hidden.java still
> says TODO.
>
> dl
>
> On 1/18/19 3:05 PM, Vladimir Ivanov wrote:
>> http://cr.openjdk.java.net/~vlivanov/8213234/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8213234
>>
>> Move LambdaForm.Hidden to jdk.internal.vm.annotation, so it can be
>> shared across JDK until a standard solution is provided [1].
>>
>> Testing: tier1-2
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8212620
>

From kim.barrett at oracle.com Sun Jan 20 08:45:46 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Sun, 20 Jan 2019 03:45:46 -0500
Subject: [13] RFR (S): 8217404: --with-jvm-features doesn't work when multiple features are explicitly disabled
In-Reply-To: <095dc325-6e6b-7112-56dc-8514c58ad0b7@oracle.com>
References: <5a505bea-ca69-3dd0-6748-e2391005b574@oracle.com> <095dc325-6e6b-7112-56dc-8514c58ad0b7@oracle.com>
Message-ID:

> On Jan 18, 2019, at 7:31 PM, Vladimir Ivanov wrote:
>
> Thanks, Vladimir.
>
>> I usually used --with-jvm-features=-aot,-jvmci,-graal
>> Did not work in this case too?
>
> I didn't know it supports comma-separated list, but it doesn't work as well:
>
> $ bash configure --with-jvm-features="-aot,-jvmci,-graal"
>
> checking if jvmci module jdk.internal.vm.ci should be built... yes
> checking if graal module jdk.internal.vm.compiler should be built... yes
> checking if aot should be enabled... yes

Isn't the problem here simply incorrect syntax in that command line?

Drop the quotes around the --with-jvm-features argument and I think it should work fine.
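
The quoting question and the exact-match check quoted earlier in this thread can both be tried in isolation. The following is only a minimal sketch, not the configure code: plain echo and grep stand in for $ECHO and $GREP, the helper names is_disabled_exact and is_disabled_word are hypothetical, and the feature lists are simply the ones that appear in the thread. The -w variant is the option Igor suggested, so his portability concern applies to it.

```shell
#!/bin/sh
# Sketch only: plain echo/grep stand in for configure's $ECHO/$GREP, and the
# helper names below are hypothetical, not taken from the build system.

# Quoted and unquoted comma lists are the same word after quote removal.
quoted=--with-jvm-features="-aot,-jvmci,-graal"
unquoted=--with-jvm-features=-aot,-jvmci,-graal

# Exact-match test in the style of the snippet quoted in this thread.
is_disabled_exact() {
  # $1 = disabled-feature list, $2 = feature name
  match=`echo "$1" | grep "$2"`
  test "x$match" = "x$2"
}

# Alternative: only ask whether grep found the name as a whole word.
is_disabled_word() {
  echo "$1" | grep -w "$2" >/dev/null
}

for list in "aot" "aot jvmci graal" "aot,jvmci,graal"; do
  if is_disabled_exact "$list" aot; then exact=yes; else exact=no; fi
  if is_disabled_word "$list" aot; then word=yes; else word=no; fi
  echo "list='$list' exact-match=$exact word-match=$word"
done
```

Because grep prints the whole matching line, the exact-match comparison succeeds only when the list holds a single entry, whichever separator is used; the shell-level quoting of a comma-separated list makes no difference to the word that configure receives.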
From volker.simonis at gmail.com Mon Jan 21 09:19:15 2019 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 21 Jan 2019 10:19:15 +0100 Subject: Best mailing list for JVM embedding In-Reply-To: References: Message-ID: -- Moved to hotspot-dev -- Hi Robert, You can use "-XX:+PrintFlagsFinal" and compare the output for the two variants to see if for some reason there are differing option settings. Regards, Volker On Sat, Jan 19, 2019 at 6:23 PM Robert Marcano wrote: > > Greetings, which is the best mailing list for discussions about > embedding the JVM (via JNI_CreateJavaVM)? > > The JVM is being embedded for desktop integration issues, for example to > show the appropriate application name on the process list instead of > java/java.exe, among many other things. > > I am experiencing what looks like higher memory usage and/or failure to > garbage collect correctly when OpenJDK 11 is the embedded JVM. Starting > a test application using the java launcher, I get a GC log like this: > > > [0.007s][info][gc] Using G1 > > [0.389s][info][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 10M->8M(124M) 8.661ms > > [0.705s][info][gc] GC(1) Pause Young (Normal) (G1 Evacuation Pause) 13M->10M(124M) 6.148ms > > Jan 19, 2019 1:04:26 PM test.Test init > > FINE: Starting application > > [1.376s][info][gc] GC(3) Pause Young (Normal) (G1 Evacuation Pause) 18M->10M(40M) 4.763ms > > [2.288s][info][gc] GC(4) Pause Young (Normal) (G1 Evacuation Pause) 23M->12M(40M) 6.382ms > > [2.444s][info][gc] GC(5) Pause Young (Concurrent Start) (Metadata GC Threshold) 18M->12M(48M) 7.579ms > > [2.444s][info][gc] GC(6) Concurrent Cycle > > [2.481s][info][gc] GC(6) Pause Remark 13M->13M(48M) 5.255ms > > [2.498s][info][gc] GC(6) Pause Cleanup 13M->13M(48M) 0.090ms > > [2.499s][info][gc] GC(6) Concurrent Cycle 54.811ms > > [2.905s][info][gc] GC(7) Pause Young (Normal) (G1 Evacuation Pause) 26M->13M(48M) 12.726ms > > [3.204s][info][gc] GC(8) Pause Young (Normal) (GCLocker Initiated GC) 
29M->15M(48M) 11.216ms > > [3.462s][info][gc] GC(9) Pause Young (Normal) (G1 Evacuation Pause) 30M->17M(48M) 18.043ms > > [3.679s][info][gc] GC(10) Pause Young (Normal) (G1 Evacuation Pause) 31M->18M(64M) 15.195ms > > [3.933s][info][gc] GC(11) Pause Young (Normal) (G1 Evacuation Pause) 38M->20M(64M) 9.412ms > > [4.230s][info][gc] GC(12) Pause Young (Normal) (G1 Evacuation Pause) 40M->21M(64M) 16.319ms > > [4.536s][info][gc] GC(13) Pause Young (Normal) (G1 Evacuation Pause) 41M->23M(64M) 23.897ms > > [4.750s][info][gc] GC(14) Pause Young (Normal) (G1 Evacuation Pause) 43M->24M(94M) 8.776ms > > [5.180s][info][gc] GC(15) Pause Young (Normal) (G1 Evacuation Pause) 58M->26M(94M) 15.610ms > > [5.546s][info][gc] GC(16) Pause Young (Normal) (G1 Evacuation Pause) 67M->27M(94M) 18.075ms > > [6.058s][info][gc] GC(17) Pause Young (Normal) (G1 Evacuation Pause) 69M->30M(94M) 32.625ms > > [7.268s][info][gc] GC(18) Pause Young (Normal) (G1 Evacuation Pause) 71M->31M(156M) 18.999ms > > [7.458s][info][gc] GC(19) Pause Young (Concurrent Start) (Metadata GC Threshold) 40M->31M(156M) 20.217ms > > [7.459s][info][gc] GC(20) Concurrent Cycle > > [7.676s][info][gc] GC(20) Pause Remark 35M->35M(156M) 19.304ms > > [7.748s][info][gc] GC(20) Pause Cleanup 36M->36M(156M) 0.183ms > > [7.782s][info][gc] GC(20) Concurrent Cycle 323.765ms > > [8.899s][info][gc] GC(21) Pause Young (Concurrent Start) (G1 Evacuation Pause) 84M->40M(156M) 69.976ms > > [8.899s][info][gc] GC(22) Concurrent Cycle > > [9.152s][info][gc] GC(22) Pause Remark 47M->47M(156M) 21.133ms > > [9.244s][info][gc] GC(22) Pause Cleanup 49M->49M(156M) 0.127ms > > [9.247s][info][gc] GC(22) Concurrent Cycle 348.256ms > > [10.203s][info][gc] GC(23) Pause Young (Normal) (G1 Evacuation Pause) 97M->55M(156M) 74.572ms > > [11.102s][info][gc] GC(24) Pause Full (System.gc()) 100M->21M(77M) 115.166ms > > [15.382s][info][gc] GC(25) Pause Young (Normal) (G1 Evacuation Pause) 48M->22M(154M) 6.371ms > > When the same Java 11 JVM is loaded via JNI, 
with the same VM arguments > than using the java launcher, the log look like this (to reach the same > point at startup) > > > [0.015s][info][gc] Using G1 > > [0.501s][info][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 6M->1M(124M) 8.550ms > > [0.785s][info][gc] GC(1) Pause Young (Normal) (G1 Evacuation Pause) 8M->3M(124M) 6.533ms > > [0.940s][info][gc] GC(2) Pause Young (Normal) (G1 Evacuation Pause) 14M->7M(124M) 17.822ms > > [1.212s][info][gc] GC(3) Pause Young (Normal) (G1 Evacuation Pause) 22M->10M(124M) 11.145ms > > [1.462s][info][gc] GC(4) Pause Young (Normal) (G1 Evacuation Pause) 29M->15M(180M) 16.451ms > > [1.695s][info][gc] GC(5) Pause Young (Normal) (G1 Evacuation Pause) 47M->25M(180M) 38.828ms > > [2.041s][info][gc] GC(6) Pause Young (Normal) (G1 Evacuation Pause) 55M->30M(180M) 22.151ms > > [2.346s][info][gc] GC(7) Pause Young (Normal) (G1 Evacuation Pause) 80M->49M(180M) 53.093ms > > Jan 19, 2019 1:01:05 PM test.Test init > > FINE: Starting application > > [4.094s][info][gc] GC(9) Pause Young (Concurrent Start) (Metadata GC Threshold) 64M->40M(258M) 50.663ms > > [4.094s][info][gc] GC(10) Concurrent Cycle > > [4.480s][info][gc] GC(10) Pause Remark 48M->48M(258M) 7.442ms > > [4.652s][info][gc] GC(10) Pause Cleanup 55M->55M(258M) 0.186ms > > [4.656s][info][gc] GC(10) Concurrent Cycle 562.009ms > > [5.174s][info][gc] GC(11) Pause Young (Concurrent Start) (G1 Evacuation Pause) 82M->46M(258M) 24.726ms > > [5.174s][info][gc] GC(12) Concurrent Cycle > > [5.513s][info][gc] GC(12) Pause Remark 56M->56M(258M) 11.695ms > > [5.657s][info][gc] GC(12) Pause Cleanup 62M->62M(258M) 0.176ms > > [5.660s][info][gc] GC(12) Concurrent Cycle 486.466ms > > [6.430s][info][gc] GC(13) Pause Young (Normal) (G1 Evacuation Pause) 107M->58M(258M) 55.665ms > > [7.538s][info][gc] GC(14) Pause Young (Normal) (G1 Evacuation Pause) 107M->63M(258M) 58.642ms > > [8.724s][info][gc] GC(15) Pause Young (Concurrent Start) (Metadata GC Threshold) 91M->68M(496M) 47.374ms > > 
[8.724s][info][gc] GC(16) Concurrent Cycle > > [9.417s][info][gc] GC(16) Pause Remark 79M->79M(496M) 19.350ms > > [9.557s][info][gc] GC(16) Pause Cleanup 81M->81M(496M) 0.340ms > > [9.575s][info][gc] GC(16) Concurrent Cycle 850.788ms > > [10.954s][info][gc] GC(17) Pause Young (Concurrent Start) (G1 Evacuation Pause) 134M->85M(496M) 162.779ms > > [10.954s][info][gc] GC(18) Concurrent Cycle > > [11.389s][info][gc] GC(19) Pause Young (Normal) (G1 Evacuation Pause) 99M->90M(496M) 79.954ms > > [11.519s][info][gc] GC(18) Pause Remark 93M->92M(496M) 24.636ms > > [11.896s][info][gc] GC(18) Pause Cleanup 102M->102M(496M) 0.326ms > > [11.903s][info][gc] GC(18) Concurrent Cycle 949.231ms > > [17.356s][info][gc] GC(22) Pause Young (Concurrent Start) (G1 Humongous Allocation) 131M->81M(365M) 56.745ms > > [17.357s][info][gc] GC(23) Concurrent Cycle > > [17.706s][info][gc] GC(23) Pause Remark 116M->113M(365M) 31.351ms > > [18.008s][info][gc] GC(23) Pause Cleanup 117M->117M(365M) 0.250ms > > [18.018s][info][gc] GC(23) Concurrent Cycle 661.284ms > > [18.888s][info][gc] GC(24) Pause Young (Normal) (G1 Evacuation Pause) 205M->91M(365M) 93.737ms > > [20.355s][info][gc] GC(25) Pause Young (Normal) (G1 Evacuation Pause) 324M->136M(378M) 125.037ms > > [21.091s][info][gc] GC(26) Pause Young (Normal) (G1 Evacuation Pause) 295M->150M(378M) 93.607ms > > [21.631s][info][gc] GC(27) Pause Young (Concurrent Start) (G1 Humongous Allocation) 261M->164M(516M) 77.013ms > > [21.631s][info][gc] GC(28) Concurrent Cycle > > [21.981s][info][gc] GC(28) Pause Remark 212M->169M(516M) 43.679ms > > [22.243s][info][gc] GC(28) Pause Cleanup 214M->214M(516M) 0.257ms > > [22.250s][info][gc] GC(28) Concurrent Cycle 619.296ms > > [22.827s][info][gc] GC(29) Pause Young (Normal) (G1 Evacuation Pause) 321M->155M(516M) 106.789ms > > [23.758s][info][gc] GC(30) Pause Young (Normal) (G1 Evacuation Pause) 363M->187M(516M) 121.025ms > > [24.957s][info][gc] GC(31) Pause Young (Normal) (GCLocker Initiated GC) 386M->207M(516M) 
111.118ms > > [25.697s][info][gc] GC(32) Pause Young (Concurrent Start) (G1 Humongous Allocation) 346M->222M(656M) 104.765ms > > [25.697s][info][gc] GC(33) Concurrent Cycle > > [26.128s][info][gc] GC(33) Pause Remark 270M->192M(656M) 55.681ms > > [26.342s][info][gc] GC(33) Pause Cleanup 203M->203M(656M) 0.371ms > > [26.349s][info][gc] GC(33) Concurrent Cycle 651.801ms > > [27.421s][info][gc] GC(34) Pause Young (Prepare Mixed) (G1 Evacuation Pause) 380M->180M(656M) 195.335ms > > [27.543s][info][gc] GC(35) Pause Young (Mixed) (G1 Evacuation Pause) 195M->172M(656M) 38.254ms > > [28.694s][info][gc] GC(36) Pause Young (Normal) (G1 Evacuation Pause) 427M->214M(656M) 107.817ms > > [31.109s][info][gc] GC(37) Pause Young (Normal) (G1 Evacuation Pause) 486M->239M(656M) 117.275ms > > Notice, the higher memory usage. If a more complex applications is > started, the heap continue to grow indefinitely, with long GC pauses and > a growing heap. It is my understanding that the JVM should setup default > GC and or memory related configuration by itself, be it from the java > launcher or JNI_CreateJavaVM. > > The same custom launcher does not experience this when the embedded JVM > is version 8. > > Note: the launcher is a simple Rust program, not a complex thing, just > locate libjvm.so/jvm.dll, build the VM options, Use JNI_CreateJavaVM, > locate the main class and invoke the main method. These logs are for the > Linux version. > > Any help is appreciated, or a pointer to the correct mailing list. 
>
>

From magnus.ihse.bursie at oracle.com Mon Jan 21 10:00:59 2019
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Mon, 21 Jan 2019 11:00:59 +0100
Subject: [13] RFR (S): 8217404: --with-jvm-features doesn't work when multiple features are explicitly disabled
In-Reply-To:
References: <5a505bea-ca69-3dd0-6748-e2391005b574@oracle.com> <095dc325-6e6b-7112-56dc-8514c58ad0b7@oracle.com>
Message-ID: <287fdbad-1daa-e2ef-7618-c63924b77363@oracle.com>

On 2019-01-20 09:45, Kim Barrett wrote:
>> On Jan 18, 2019, at 7:31 PM, Vladimir Ivanov wrote:
>>
>> Thanks, Vladimir.
>>
>>> I usually used --with-jvm-features=-aot,-jvmci,-graal
>>> Did not work in this case too?
>> I didn't know it supports comma-separated list, but it doesn't work as well:
>>
>> $ bash configure --with-jvm-features="-aot,-jvmci,-graal"
>>
>> checking if jvmci module jdk.internal.vm.ci should be built... yes
>> checking if graal module jdk.internal.vm.compiler should be built... yes
>> checking if aot should be enabled... yes
> Isn't the problem here simply incorrect syntax in that command line?
>
> Drop the quotes around the --with-jvm-features argument and I think it should work fine.

Let me bring some clarity in the syntax here.

For --with-jvm-features, if you want to list multiple features, you can separate them by either space or comma. Both are valid and officially supported.

My recommendation, though, is to use comma, to avoid the problem with spaces in command line options. If you want to use spaces, you *must* use quotes. A command line like:

bash configure --with-jvm-features=-aot -jvmci

would be interpreted as "-jvmci" was a flag for configure, which it is not.

There are multiple ways of quoting, you could use ' or ", and include the flag name like "--with-jvm-features=aot graal", or just the argument list. My preference, if I need to use quotes, is to use the style Vladimir uses in his example; I believe that maximizes readability.
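
The splitting described above happens in the shell before configure sees anything, so it can be checked without running configure at all. This is a minimal sketch under that assumption; count_args is a hypothetical helper that only reports how many words the shell passed to it, and the option strings are the ones used in this thread.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
  echo $#
}

# Unquoted space: the shell splits the list, so "-jvmci" arrives as a
# separate argument rather than as part of the feature list.
n_unquoted=`count_args --with-jvm-features=-aot -jvmci`

# Either quoting style keeps the whole list inside one argument.
n_quoted_value=`count_args --with-jvm-features="-aot -jvmci"`
n_quoted_whole=`count_args "--with-jvm-features=-aot -jvmci"`

# A comma-separated list contains no whitespace, so no quoting is needed.
n_comma=`count_args --with-jvm-features=-aot,-jvmci`

echo "unquoted spaces:   $n_unquoted argument(s)"
echo "quoted value:      $n_quoted_value argument(s)"
echo "quoted whole flag: $n_quoted_whole argument(s)"
echo "comma-separated:   $n_comma argument(s)"
```

Only the unquoted, space-separated form reaches the program as more than one argument; the other three forms arrive as a single word.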
But, as I said, for --with-jvm-features, I recommend using comma instead, to avoid the quoting issue completely.

Internally, the lists of enabled/disabled features are treated as lists of space-separated words; but that is an implementation detail and not part of the command-line interface.

/Magnus

From magnus.ihse.bursie at oracle.com Mon Jan 21 10:06:40 2019
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Mon, 21 Jan 2019 11:06:40 +0100
Subject: [13] RFR (S): 8217404: --with-jvm-features doesn't work when multiple features are explicitly disabled
In-Reply-To: <93c9b305-292d-5779-7afe-bb21d7cd9672@oracle.com>
References: <5a505bea-ca69-3dd0-6748-e2391005b574@oracle.com> <93c9b305-292d-5779-7afe-bb21d7cd9672@oracle.com>
Message-ID:

On 2019-01-19 01:39, Vladimir Ivanov wrote:
> Thanks, Igor.
>
>> overall your fix looks reasonable, but w/ it we can get
>> unintentionally disabled features (b/c grep doesn't do full word
>> match). although this problem wasn't really introduced by your fix, I
>> think it's be better to fix it as a part of your patch. I see two
>> possible solutions:
>
> I was aware of such drawback, but decided to leave it as is, since it
> doesn't affect existing features.
>
>>   - add "-w" to grep, but I am not sure if "-w" is supported by all
>> grep implementations
>>   - use $XARGS instead of $ECHO when we get DISABLE_X. in this case
>> you will need to revert your changes in 'if test ...' lines
>
> I'm in favor of using "-w" and I see different grep flags being used
> already, but would like somebody from Build team confirm they are OK
> with such solution.

I think an even better solution is to use the pattern of HOTSPOT_CHECK_JVM_FEATURE. This should solve all potential problems, and is moving the abstraction level up slightly.

I've been working for some time now on better structure for handling the JVM feature testing.
While we are using the feature testing as I intended it, the underlying support for doing this in a good way has never been put into place. Unfortunately, this fix has been on low priority and been mostly idling on my disk, half done, for several months now.

So we need to have an interim solution to this problem. But I'd like to see that the fix takes at least a step towards a better abstraction.

Vladimir, if you're okay with it I'd like to propose this as a patch to the problem instead:

http://cr.openjdk.java.net/~ihse/JDK-8217404-fix-multiple-disabled-jvm-features/webrev.01

/Magnus

>
> Best regards,
> Vladimir Ivanov
>
>>> On Jan 18, 2019, at 3:33 PM, Vladimir Ivanov
>>> wrote:
>>>
>>> http://cr.openjdk.java.net/~vlivanov/8217404/webrev.00/
>>> https://bugs.openjdk.java.net/browse/JDK-8217404
>>>
>>> --with-jvm-features doesn't work properly when multiple features are
>>> explicitly disabled:
>>>
>>> $ bash configure --with-jvm-features="-aot -jvmci -graal"
>>> ...
>>> checking if jvmci module jdk.internal.vm.ci should be built... yes
>>> checking if graal module jdk.internal.vm.compiler should be built...
>>> yes
>>> checking if aot should be enabled... yes
>>> ...
>>>
>>> The problem in the following code:
>>>
>>>   DISABLE_AOT=`$ECHO $DISABLED_JVM_FEATURES | $GREP aot`
>>>   if test "x$DISABLE_AOT" = "xaot"; then
>>>     ENABLE_AOT="false"
>>>   fi
>>>
>>> Since DISABLED_JVM_FEATURES ("aot jvmci graal") contains the list of
>>> explicitly disabled features, grep over it returns the whole list
>>> when there's a match. The subsequent check fails because there's no
>>> exact match, though DISABLE_AOT contains "aot" .
>>>
>>> Proposed fix is to check there's no match instead.
>>>
>>> After the fix it works as expected:
>>>
>>> $ bash configure --with-jvm-features="-aot -jvmci -graal"
>>> ...
>>> checking if jvmci module jdk.internal.vm.ci should be built... no,
>>> forced
>>> checking if graal module jdk.internal.vm.compiler should be built...
>>> no, forced
>>> checking if aot should be enabled... no, forced
>>> ...
>>>
>>> (The fix doesn't address the case when one feature has a name which
>>> is a proper substring of another feature, but there are no such
>>> cases at the moment.)
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>

From robert at marcanoonline.com Mon Jan 21 12:25:16 2019
From: robert at marcanoonline.com (Robert Marcano)
Date: Mon, 21 Jan 2019 08:25:16 -0400
Subject: Best mailing list for JVM embedding
In-Reply-To:
References:
Message-ID: <54df1401-857f-35c4-9973-ab0ac7818194@marcanoonline.com>

On 1/21/19 5:19 AM, Volker Simonis wrote:
> -- Moved to hotspot-dev --
>
> Hi Robert,
>
> You can use "-XX:+PrintFlagsFinal" and compare the output for the two
> variants to see if for some reason there are differing option
> settings.

Thanks. I compared the -XX:+PrintFlagsFinal output of a run with the java launcher against a run with my launcher and got the same flag values, so different defaults do not look like the problem.

>
> Regards,
> Volker
>
> On Sat, Jan 19, 2019 at 6:23 PM Robert Marcano wrote:
>>
>> Greetings, which is the best mailing list for discussions about
>> embedding the JVM (via JNI_CreateJavaVM)?
>>
>> The JVM is being embedded for desktop integration issues, for example to
>> show the appropriate application name on the process list instead of
>> java/java.exe, among many other things.
>>
>> I am experiencing what looks like higher memory usage and/or failure to
>> garbage collect correctly when OpenJDK 11 is the embedded JVM.
Starting >> a test application using the java launcher, I get a GC log like this: >> >>> [0.007s][info][gc] Using G1 >>> [0.389s][info][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 10M->8M(124M) 8.661ms >>> [0.705s][info][gc] GC(1) Pause Young (Normal) (G1 Evacuation Pause) 13M->10M(124M) 6.148ms >>> Jan 19, 2019 1:04:26 PM test.Test init >>> FINE: Starting application >>> [1.376s][info][gc] GC(3) Pause Young (Normal) (G1 Evacuation Pause) 18M->10M(40M) 4.763ms >>> [2.288s][info][gc] GC(4) Pause Young (Normal) (G1 Evacuation Pause) 23M->12M(40M) 6.382ms >>> [2.444s][info][gc] GC(5) Pause Young (Concurrent Start) (Metadata GC Threshold) 18M->12M(48M) 7.579ms >>> [2.444s][info][gc] GC(6) Concurrent Cycle >>> [2.481s][info][gc] GC(6) Pause Remark 13M->13M(48M) 5.255ms >>> [2.498s][info][gc] GC(6) Pause Cleanup 13M->13M(48M) 0.090ms >>> [2.499s][info][gc] GC(6) Concurrent Cycle 54.811ms >>> [2.905s][info][gc] GC(7) Pause Young (Normal) (G1 Evacuation Pause) 26M->13M(48M) 12.726ms >>> [3.204s][info][gc] GC(8) Pause Young (Normal) (GCLocker Initiated GC) 29M->15M(48M) 11.216ms >>> [3.462s][info][gc] GC(9) Pause Young (Normal) (G1 Evacuation Pause) 30M->17M(48M) 18.043ms >>> [3.679s][info][gc] GC(10) Pause Young (Normal) (G1 Evacuation Pause) 31M->18M(64M) 15.195ms >>> [3.933s][info][gc] GC(11) Pause Young (Normal) (G1 Evacuation Pause) 38M->20M(64M) 9.412ms >>> [4.230s][info][gc] GC(12) Pause Young (Normal) (G1 Evacuation Pause) 40M->21M(64M) 16.319ms >>> [4.536s][info][gc] GC(13) Pause Young (Normal) (G1 Evacuation Pause) 41M->23M(64M) 23.897ms >>> [4.750s][info][gc] GC(14) Pause Young (Normal) (G1 Evacuation Pause) 43M->24M(94M) 8.776ms >>> [5.180s][info][gc] GC(15) Pause Young (Normal) (G1 Evacuation Pause) 58M->26M(94M) 15.610ms >>> [5.546s][info][gc] GC(16) Pause Young (Normal) (G1 Evacuation Pause) 67M->27M(94M) 18.075ms >>> [6.058s][info][gc] GC(17) Pause Young (Normal) (G1 Evacuation Pause) 69M->30M(94M) 32.625ms >>> [7.268s][info][gc] GC(18) Pause Young 
(Normal) (G1 Evacuation Pause) 71M->31M(156M) 18.999ms >>> [7.458s][info][gc] GC(19) Pause Young (Concurrent Start) (Metadata GC Threshold) 40M->31M(156M) 20.217ms >>> [7.459s][info][gc] GC(20) Concurrent Cycle >>> [7.676s][info][gc] GC(20) Pause Remark 35M->35M(156M) 19.304ms >>> [7.748s][info][gc] GC(20) Pause Cleanup 36M->36M(156M) 0.183ms >>> [7.782s][info][gc] GC(20) Concurrent Cycle 323.765ms >>> [8.899s][info][gc] GC(21) Pause Young (Concurrent Start) (G1 Evacuation Pause) 84M->40M(156M) 69.976ms >>> [8.899s][info][gc] GC(22) Concurrent Cycle >>> [9.152s][info][gc] GC(22) Pause Remark 47M->47M(156M) 21.133ms >>> [9.244s][info][gc] GC(22) Pause Cleanup 49M->49M(156M) 0.127ms >>> [9.247s][info][gc] GC(22) Concurrent Cycle 348.256ms >>> [10.203s][info][gc] GC(23) Pause Young (Normal) (G1 Evacuation Pause) 97M->55M(156M) 74.572ms >>> [11.102s][info][gc] GC(24) Pause Full (System.gc()) 100M->21M(77M) 115.166ms >>> [15.382s][info][gc] GC(25) Pause Young (Normal) (G1 Evacuation Pause) 48M->22M(154M) 6.371ms >> >> When the same Java 11 JVM is loaded via JNI, with the same VM arguments >> than using the java launcher, the log look like this (to reach the same >> point at startup) >> >>> [0.015s][info][gc] Using G1 >>> [0.501s][info][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 6M->1M(124M) 8.550ms >>> [0.785s][info][gc] GC(1) Pause Young (Normal) (G1 Evacuation Pause) 8M->3M(124M) 6.533ms >>> [0.940s][info][gc] GC(2) Pause Young (Normal) (G1 Evacuation Pause) 14M->7M(124M) 17.822ms >>> [1.212s][info][gc] GC(3) Pause Young (Normal) (G1 Evacuation Pause) 22M->10M(124M) 11.145ms >>> [1.462s][info][gc] GC(4) Pause Young (Normal) (G1 Evacuation Pause) 29M->15M(180M) 16.451ms >>> [1.695s][info][gc] GC(5) Pause Young (Normal) (G1 Evacuation Pause) 47M->25M(180M) 38.828ms >>> [2.041s][info][gc] GC(6) Pause Young (Normal) (G1 Evacuation Pause) 55M->30M(180M) 22.151ms >>> [2.346s][info][gc] GC(7) Pause Young (Normal) (G1 Evacuation Pause) 80M->49M(180M) 53.093ms >>> 
Jan 19, 2019 1:01:05 PM test.Test init >>> FINE: Starting application >>> [4.094s][info][gc] GC(9) Pause Young (Concurrent Start) (Metadata GC Threshold) 64M->40M(258M) 50.663ms >>> [4.094s][info][gc] GC(10) Concurrent Cycle >>> [4.480s][info][gc] GC(10) Pause Remark 48M->48M(258M) 7.442ms >>> [4.652s][info][gc] GC(10) Pause Cleanup 55M->55M(258M) 0.186ms >>> [4.656s][info][gc] GC(10) Concurrent Cycle 562.009ms >>> [5.174s][info][gc] GC(11) Pause Young (Concurrent Start) (G1 Evacuation Pause) 82M->46M(258M) 24.726ms >>> [5.174s][info][gc] GC(12) Concurrent Cycle >>> [5.513s][info][gc] GC(12) Pause Remark 56M->56M(258M) 11.695ms >>> [5.657s][info][gc] GC(12) Pause Cleanup 62M->62M(258M) 0.176ms >>> [5.660s][info][gc] GC(12) Concurrent Cycle 486.466ms >>> [6.430s][info][gc] GC(13) Pause Young (Normal) (G1 Evacuation Pause) 107M->58M(258M) 55.665ms >>> [7.538s][info][gc] GC(14) Pause Young (Normal) (G1 Evacuation Pause) 107M->63M(258M) 58.642ms >>> [8.724s][info][gc] GC(15) Pause Young (Concurrent Start) (Metadata GC Threshold) 91M->68M(496M) 47.374ms >>> [8.724s][info][gc] GC(16) Concurrent Cycle >>> [9.417s][info][gc] GC(16) Pause Remark 79M->79M(496M) 19.350ms >>> [9.557s][info][gc] GC(16) Pause Cleanup 81M->81M(496M) 0.340ms >>> [9.575s][info][gc] GC(16) Concurrent Cycle 850.788ms >>> [10.954s][info][gc] GC(17) Pause Young (Concurrent Start) (G1 Evacuation Pause) 134M->85M(496M) 162.779ms >>> [10.954s][info][gc] GC(18) Concurrent Cycle >>> [11.389s][info][gc] GC(19) Pause Young (Normal) (G1 Evacuation Pause) 99M->90M(496M) 79.954ms >>> [11.519s][info][gc] GC(18) Pause Remark 93M->92M(496M) 24.636ms >>> [11.896s][info][gc] GC(18) Pause Cleanup 102M->102M(496M) 0.326ms >>> [11.903s][info][gc] GC(18) Concurrent Cycle 949.231ms >>> [17.356s][info][gc] GC(22) Pause Young (Concurrent Start) (G1 Humongous Allocation) 131M->81M(365M) 56.745ms >>> [17.357s][info][gc] GC(23) Concurrent Cycle >>> [17.706s][info][gc] GC(23) Pause Remark 116M->113M(365M) 31.351ms >>> 
[18.008s][info][gc] GC(23) Pause Cleanup 117M->117M(365M) 0.250ms >>> [18.018s][info][gc] GC(23) Concurrent Cycle 661.284ms >>> [18.888s][info][gc] GC(24) Pause Young (Normal) (G1 Evacuation Pause) 205M->91M(365M) 93.737ms >>> [20.355s][info][gc] GC(25) Pause Young (Normal) (G1 Evacuation Pause) 324M->136M(378M) 125.037ms >>> [21.091s][info][gc] GC(26) Pause Young (Normal) (G1 Evacuation Pause) 295M->150M(378M) 93.607ms >>> [21.631s][info][gc] GC(27) Pause Young (Concurrent Start) (G1 Humongous Allocation) 261M->164M(516M) 77.013ms >>> [21.631s][info][gc] GC(28) Concurrent Cycle >>> [21.981s][info][gc] GC(28) Pause Remark 212M->169M(516M) 43.679ms >>> [22.243s][info][gc] GC(28) Pause Cleanup 214M->214M(516M) 0.257ms >>> [22.250s][info][gc] GC(28) Concurrent Cycle 619.296ms >>> [22.827s][info][gc] GC(29) Pause Young (Normal) (G1 Evacuation Pause) 321M->155M(516M) 106.789ms >>> [23.758s][info][gc] GC(30) Pause Young (Normal) (G1 Evacuation Pause) 363M->187M(516M) 121.025ms >>> [24.957s][info][gc] GC(31) Pause Young (Normal) (GCLocker Initiated GC) 386M->207M(516M) 111.118ms >>> [25.697s][info][gc] GC(32) Pause Young (Concurrent Start) (G1 Humongous Allocation) 346M->222M(656M) 104.765ms >>> [25.697s][info][gc] GC(33) Concurrent Cycle >>> [26.128s][info][gc] GC(33) Pause Remark 270M->192M(656M) 55.681ms >>> [26.342s][info][gc] GC(33) Pause Cleanup 203M->203M(656M) 0.371ms >>> [26.349s][info][gc] GC(33) Concurrent Cycle 651.801ms >>> [27.421s][info][gc] GC(34) Pause Young (Prepare Mixed) (G1 Evacuation Pause) 380M->180M(656M) 195.335ms >>> [27.543s][info][gc] GC(35) Pause Young (Mixed) (G1 Evacuation Pause) 195M->172M(656M) 38.254ms >>> [28.694s][info][gc] GC(36) Pause Young (Normal) (G1 Evacuation Pause) 427M->214M(656M) 107.817ms >>> [31.109s][info][gc] GC(37) Pause Young (Normal) (G1 Evacuation Pause) 486M->239M(656M) 117.275ms >> >> Notice, the higher memory usage. 
If a more complex applications is >> started, the heap continue to grow indefinitely, with long GC pauses and >> a growing heap. It is my understanding that the JVM should setup default >> GC and or memory related configuration by itself, be it from the java >> launcher or JNI_CreateJavaVM. >> >> The same custom launcher does not experience this when the embedded JVM >> is version 8. >> >> Note: the launcher is a simple Rust program, not a complex thing, just >> locate libjvm.so/jvm.dll, build the VM options, Use JNI_CreateJavaVM, >> locate the main class and invoke the main method. These logs are for the >> Linux version. >> >> Any help is appreciated, or a pointer to the correct mailing list. >> >> From vladimir.x.ivanov at oracle.com Mon Jan 21 17:35:16 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Mon, 21 Jan 2019 09:35:16 -0800 Subject: [13] RFR (S): 8217404: --with-jvm-features doesn't work when multiple features are explicitly disabled In-Reply-To: References: <5a505bea-ca69-3dd0-6748-e2391005b574@oracle.com> <93c9b305-292d-5779-7afe-bb21d7cd9672@oracle.com> Message-ID: <7ac843a0-ddfd-df85-588b-753701fe3b93@oracle.com> > Vladimir, if you're okay with it I'd like to propose this as a patch to > the problem instead: > > http://cr.openjdk.java.net/~ihse/JDK-8217404-fix-multiple-disabled-jvm-features/webrev.01 Looks good! I verified that it fixes the bug. Best regards, Vladimir Ivanov >>>> On Jan 18, 2019, at 3:33 PM, Vladimir Ivanov >>>> wrote: >>>> >>>> http://cr.openjdk.java.net/~vlivanov/8217404/webrev.00/ >>>> https://bugs.openjdk.java.net/browse/JDK-8217404 >>>> >>>> --with-jvm-features doesn't work properly when multiple features are >>>> explicitly disabled: >>>> >>>> $ bash configure --with-jvm-features="-aot -jvmci -graal" >>>> ... >>>> checking if jvmci module jdk.internal.vm.ci should be built... yes >>>> checking if graal module jdk.internal.vm.compiler should be built... >>>> yes >>>> checking if aot should be enabled... 
yes
>>>> ...
>>>>
>>>> The problem in the following code:
>>>>
>>>>   DISABLE_AOT=`$ECHO $DISABLED_JVM_FEATURES | $GREP aot`
>>>>   if test "x$DISABLE_AOT" = "xaot"; then
>>>>     ENABLE_AOT="false"
>>>>   fi
>>>>
>>>> Since DISABLED_JVM_FEATURES ("aot jvmci graal") contains the list of
>>>> explicitly disabled features, grep over it returns the whole list
>>>> when there's a match. The subsequent check fails because there's no
>>>> exact match, though DISABLE_AOT contains "aot" .
>>>>
>>>> Proposed fix is to check there's no match instead.
>>>>
>>>> After the fix it works as expected:
>>>>
>>>> $ bash configure --with-jvm-features="-aot -jvmci -graal"
>>>> ...
>>>> checking if jvmci module jdk.internal.vm.ci should be built... no,
>>>> forced
>>>> checking if graal module jdk.internal.vm.compiler should be built...
>>>> no, forced
>>>> checking if aot should be enabled... no, forced
>>>> ...
>>>>
>>>> (The fix doesn't address the case when one feature has a name which
>>>> is a proper substring of another feature, but there are no such
>>>> cases at the moment.)
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>
>

From robert at marcanoonline.com Mon Jan 21 18:48:37 2019
From: robert at marcanoonline.com (Robert Marcano)
Date: Mon, 21 Jan 2019 14:48:37 -0400
Subject: High memory usage / leaks was: Best mailing list for JVM embedding
In-Reply-To: <54df1401-857f-35c4-9973-ab0ac7818194@marcanoonline.com>
References: <54df1401-857f-35c4-9973-ab0ac7818194@marcanoonline.com>
Message-ID:

On 1/21/19 8:25 AM, Robert Marcano wrote:
> On 1/21/19 5:19 AM, Volker Simonis wrote:
>> -- Moved to hotspot-dev --
>>
>> Hi Robert,
>>
>> You can use "-XX:+PrintFlagsFinal" and compare the output for the two
>> variants to see if for some reason there are differing option
>> settings.
>
> Thanks, compared the output of that on a java launcher call and my
> launcher and get the same flagsa values, so it doesn't look like
> different defaults isn't the problem

While testing this, and trying to rule out some weird Rust / OpenJDK 11 interaction, I wrote a simpler test case using the JVM invocation API from plain C. It showed the same pattern of high memory usage, but it allowed me to detect that there was a difference between running with the provided java launcher and running with our custom launcher.

Every VM option was the same (as strings), including the classpath. Both have something like -Djava.class.path=../lib/a.jar:../lib/b.jar

But due to an error in the configuration of our test environment, ../lib pointed to a different directory for each launcher. The two ../lib directories contain the same JARs; the difference is that the ones used by the java launcher are unsigned and the ones used by the custom launcher are signed. These JARs are signed because they come from a JNLP application; the new launcher is part of our migration away from JNLP.

So I managed to replicate the high memory usage using the standard java launcher.

So the question now is: why would signed JARs affect the memory usage of an application (we aren't doing JAR verification in our custom launcher, yet), just by being on the java.class.path? IIRC the initial application classpath JARs were never verified previously (by the java launcher alone, without JNLP around).

Note: Tested with JARs signed with a self-signed certificate and with a certificate signed by a private CA. At most, signing the JARs could slow down startup if they are now expected to be verified by the java launcher (is that true?), but it should not cause higher memory usage, with no reduction after a GC cycle and constant heap size increases.

>
>>
>> Regards,
>> Volker
>>
>> On Sat, Jan 19, 2019 at 6:23 PM Robert Marcano
>> wrote:
>>>
>>> Greetings, which is the best mailing list for discussions about
>>> embedding the JVM (via JNI_CreateJavaVM)?
>>> >>> The JVM is being embedded for desktop integration issues, for example to >>> show the appropriate application name on the process list instead of >>> java/java.exe, among many other things. >>> >>> I am experiencing what looks like higher memory usage and/or failure to >>> garbage collect correctly when OpenJDK 11 is the embedded JVM. Starting >>> a test application using the java launcher, I get a GC log like this: >>> >>>> [0.007s][info][gc] Using G1 >>>> [0.389s][info][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) >>>> 10M->8M(124M) 8.661ms >>>> [0.705s][info][gc] GC(1) Pause Young (Normal) (G1 Evacuation Pause) >>>> 13M->10M(124M) 6.148ms >>>> Jan 19, 2019 1:04:26 PM test.Test init >>>> FINE: Starting application >>>> [1.376s][info][gc] GC(3) Pause Young (Normal) (G1 Evacuation Pause) >>>> 18M->10M(40M) 4.763ms >>>> [2.288s][info][gc] GC(4) Pause Young (Normal) (G1 Evacuation Pause) >>>> 23M->12M(40M) 6.382ms >>>> [2.444s][info][gc] GC(5) Pause Young (Concurrent Start) (Metadata GC >>>> Threshold) 18M->12M(48M) 7.579ms >>>> [2.444s][info][gc] GC(6) Concurrent Cycle >>>> [2.481s][info][gc] GC(6) Pause Remark 13M->13M(48M) 5.255ms >>>> [2.498s][info][gc] GC(6) Pause Cleanup 13M->13M(48M) 0.090ms >>>> [2.499s][info][gc] GC(6) Concurrent Cycle 54.811ms >>>> [2.905s][info][gc] GC(7) Pause Young (Normal) (G1 Evacuation Pause) >>>> 26M->13M(48M) 12.726ms >>>> [3.204s][info][gc] GC(8) Pause Young (Normal) (GCLocker Initiated >>>> GC) 29M->15M(48M) 11.216ms >>>> [3.462s][info][gc] GC(9) Pause Young (Normal) (G1 Evacuation Pause) >>>> 30M->17M(48M) 18.043ms >>>> [3.679s][info][gc] GC(10) Pause Young (Normal) (G1 Evacuation Pause) >>>> 31M->18M(64M) 15.195ms >>>> [3.933s][info][gc] GC(11) Pause Young (Normal) (G1 Evacuation Pause) >>>> 38M->20M(64M) 9.412ms >>>> [4.230s][info][gc] GC(12) Pause Young (Normal) (G1 Evacuation Pause) >>>> 40M->21M(64M) 16.319ms >>>> [4.536s][info][gc] GC(13) Pause Young (Normal) (G1 Evacuation Pause) >>>> 41M->23M(64M) 23.897ms 
>>>> [4.750s][info][gc] GC(14) Pause Young (Normal) (G1 Evacuation Pause) >>>> 43M->24M(94M) 8.776ms >>>> [5.180s][info][gc] GC(15) Pause Young (Normal) (G1 Evacuation Pause) >>>> 58M->26M(94M) 15.610ms >>>> [5.546s][info][gc] GC(16) Pause Young (Normal) (G1 Evacuation Pause) >>>> 67M->27M(94M) 18.075ms >>>> [6.058s][info][gc] GC(17) Pause Young (Normal) (G1 Evacuation Pause) >>>> 69M->30M(94M) 32.625ms >>>> [7.268s][info][gc] GC(18) Pause Young (Normal) (G1 Evacuation Pause) >>>> 71M->31M(156M) 18.999ms >>>> [7.458s][info][gc] GC(19) Pause Young (Concurrent Start) (Metadata >>>> GC Threshold) 40M->31M(156M) 20.217ms >>>> [7.459s][info][gc] GC(20) Concurrent Cycle >>>> [7.676s][info][gc] GC(20) Pause Remark 35M->35M(156M) 19.304ms >>>> [7.748s][info][gc] GC(20) Pause Cleanup 36M->36M(156M) 0.183ms >>>> [7.782s][info][gc] GC(20) Concurrent Cycle 323.765ms >>>> [8.899s][info][gc] GC(21) Pause Young (Concurrent Start) (G1 >>>> Evacuation Pause) 84M->40M(156M) 69.976ms >>>> [8.899s][info][gc] GC(22) Concurrent Cycle >>>> [9.152s][info][gc] GC(22) Pause Remark 47M->47M(156M) 21.133ms >>>> [9.244s][info][gc] GC(22) Pause Cleanup 49M->49M(156M) 0.127ms >>>> [9.247s][info][gc] GC(22) Concurrent Cycle 348.256ms >>>> [10.203s][info][gc] GC(23) Pause Young (Normal) (G1 Evacuation >>>> Pause) 97M->55M(156M) 74.572ms >>>> [11.102s][info][gc] GC(24) Pause Full (System.gc()) 100M->21M(77M) >>>> 115.166ms >>>> [15.382s][info][gc] GC(25) Pause Young (Normal) (G1 Evacuation >>>> Pause) 48M->22M(154M) 6.371ms >>> >>> When the same Java 11 JVM is loaded via JNI, with the same VM arguments >>> than using the java launcher, the log look like this (to reach the same >>> point at startup) >>> >>>> [0.015s][info][gc] Using G1 >>>> [0.501s][info][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) >>>> 6M->1M(124M) 8.550ms >>>> [0.785s][info][gc] GC(1) Pause Young (Normal) (G1 Evacuation Pause) >>>> 8M->3M(124M) 6.533ms >>>> [0.940s][info][gc] GC(2) Pause Young (Normal) (G1 Evacuation 
Pause) >>>> 14M->7M(124M) 17.822ms >>>> [1.212s][info][gc] GC(3) Pause Young (Normal) (G1 Evacuation Pause) >>>> 22M->10M(124M) 11.145ms >>>> [1.462s][info][gc] GC(4) Pause Young (Normal) (G1 Evacuation Pause) >>>> 29M->15M(180M) 16.451ms >>>> [1.695s][info][gc] GC(5) Pause Young (Normal) (G1 Evacuation Pause) >>>> 47M->25M(180M) 38.828ms >>>> [2.041s][info][gc] GC(6) Pause Young (Normal) (G1 Evacuation Pause) >>>> 55M->30M(180M) 22.151ms >>>> [2.346s][info][gc] GC(7) Pause Young (Normal) (G1 Evacuation Pause) >>>> 80M->49M(180M) 53.093ms >>>> Jan 19, 2019 1:01:05 PM test.Test init >>>> FINE: Starting application >>>> [4.094s][info][gc] GC(9) Pause Young (Concurrent Start) (Metadata GC >>>> Threshold) 64M->40M(258M) 50.663ms >>>> [4.094s][info][gc] GC(10) Concurrent Cycle >>>> [4.480s][info][gc] GC(10) Pause Remark 48M->48M(258M) 7.442ms >>>> [4.652s][info][gc] GC(10) Pause Cleanup 55M->55M(258M) 0.186ms >>>> [4.656s][info][gc] GC(10) Concurrent Cycle 562.009ms >>>> [5.174s][info][gc] GC(11) Pause Young (Concurrent Start) (G1 >>>> Evacuation Pause) 82M->46M(258M) 24.726ms >>>> [5.174s][info][gc] GC(12) Concurrent Cycle >>>> [5.513s][info][gc] GC(12) Pause Remark 56M->56M(258M) 11.695ms >>>> [5.657s][info][gc] GC(12) Pause Cleanup 62M->62M(258M) 0.176ms >>>> [5.660s][info][gc] GC(12) Concurrent Cycle 486.466ms >>>> [6.430s][info][gc] GC(13) Pause Young (Normal) (G1 Evacuation Pause) >>>> 107M->58M(258M) 55.665ms >>>> [7.538s][info][gc] GC(14) Pause Young (Normal) (G1 Evacuation Pause) >>>> 107M->63M(258M) 58.642ms >>>> [8.724s][info][gc] GC(15) Pause Young (Concurrent Start) (Metadata >>>> GC Threshold) 91M->68M(496M) 47.374ms >>>> [8.724s][info][gc] GC(16) Concurrent Cycle >>>> [9.417s][info][gc] GC(16) Pause Remark 79M->79M(496M) 19.350ms >>>> [9.557s][info][gc] GC(16) Pause Cleanup 81M->81M(496M) 0.340ms >>>> [9.575s][info][gc] GC(16) Concurrent Cycle 850.788ms >>>> [10.954s][info][gc] GC(17) Pause Young (Concurrent Start) (G1 >>>> Evacuation Pause) 
134M->85M(496M) 162.779ms >>>> [10.954s][info][gc] GC(18) Concurrent Cycle >>>> [11.389s][info][gc] GC(19) Pause Young (Normal) (G1 Evacuation >>>> Pause) 99M->90M(496M) 79.954ms >>>> [11.519s][info][gc] GC(18) Pause Remark 93M->92M(496M) 24.636ms >>>> [11.896s][info][gc] GC(18) Pause Cleanup 102M->102M(496M) 0.326ms >>>> [11.903s][info][gc] GC(18) Concurrent Cycle 949.231ms >>>> [17.356s][info][gc] GC(22) Pause Young (Concurrent Start) (G1 >>>> Humongous Allocation) 131M->81M(365M) 56.745ms >>>> [17.357s][info][gc] GC(23) Concurrent Cycle >>>> [17.706s][info][gc] GC(23) Pause Remark 116M->113M(365M) 31.351ms >>>> [18.008s][info][gc] GC(23) Pause Cleanup 117M->117M(365M) 0.250ms >>>> [18.018s][info][gc] GC(23) Concurrent Cycle 661.284ms >>>> [18.888s][info][gc] GC(24) Pause Young (Normal) (G1 Evacuation >>>> Pause) 205M->91M(365M) 93.737ms >>>> [20.355s][info][gc] GC(25) Pause Young (Normal) (G1 Evacuation >>>> Pause) 324M->136M(378M) 125.037ms >>>> [21.091s][info][gc] GC(26) Pause Young (Normal) (G1 Evacuation >>>> Pause) 295M->150M(378M) 93.607ms >>>> [21.631s][info][gc] GC(27) Pause Young (Concurrent Start) (G1 >>>> Humongous Allocation) 261M->164M(516M) 77.013ms >>>> [21.631s][info][gc] GC(28) Concurrent Cycle >>>> [21.981s][info][gc] GC(28) Pause Remark 212M->169M(516M) 43.679ms >>>> [22.243s][info][gc] GC(28) Pause Cleanup 214M->214M(516M) 0.257ms >>>> [22.250s][info][gc] GC(28) Concurrent Cycle 619.296ms >>>> [22.827s][info][gc] GC(29) Pause Young (Normal) (G1 Evacuation >>>> Pause) 321M->155M(516M) 106.789ms >>>> [23.758s][info][gc] GC(30) Pause Young (Normal) (G1 Evacuation >>>> Pause) 363M->187M(516M) 121.025ms >>>> [24.957s][info][gc] GC(31) Pause Young (Normal) (GCLocker Initiated >>>> GC) 386M->207M(516M) 111.118ms >>>> [25.697s][info][gc] GC(32) Pause Young (Concurrent Start) (G1 >>>> Humongous Allocation) 346M->222M(656M) 104.765ms >>>> [25.697s][info][gc] GC(33) Concurrent Cycle >>>> [26.128s][info][gc] GC(33) Pause Remark 270M->192M(656M) 55.681ms 
>>>> [26.342s][info][gc] GC(33) Pause Cleanup 203M->203M(656M) 0.371ms >>>> [26.349s][info][gc] GC(33) Concurrent Cycle 651.801ms >>>> [27.421s][info][gc] GC(34) Pause Young (Prepare Mixed) (G1 >>>> Evacuation Pause) 380M->180M(656M) 195.335ms >>>> [27.543s][info][gc] GC(35) Pause Young (Mixed) (G1 Evacuation Pause) >>>> 195M->172M(656M) 38.254ms >>>> [28.694s][info][gc] GC(36) Pause Young (Normal) (G1 Evacuation >>>> Pause) 427M->214M(656M) 107.817ms >>>> [31.109s][info][gc] GC(37) Pause Young (Normal) (G1 Evacuation >>>> Pause) 486M->239M(656M) 117.275ms >>> >>> Notice, the higher memory usage. If a more complex applications is >>> started, the heap continue to grow indefinitely, with long GC pauses and >>> a growing heap. It is my understanding that the JVM should setup default >>> GC and or memory related configuration by itself, be it from the java >>> launcher or JNI_CreateJavaVM. >>> >>> The same custom launcher does not experience this when the embedded JVM >>> is version 8. >>> >>> Note: the launcher is a simple Rust program, not a complex thing, just >>> locate libjvm.so/jvm.dll, build the VM options, Use JNI_CreateJavaVM, >>> locate the main class and invoke the main method. These logs are for the >>> Linux version. >>> >>> Any help is appreciated, or a pointer to the correct mailing list. >>> >>> > From kim.barrett at oracle.com Mon Jan 21 18:49:02 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 21 Jan 2019 13:49:02 -0500 Subject: [13] RFR (S): 8217404: --with-jvm-features doesn't work when multiple features are explicitly disabled In-Reply-To: References: <5a505bea-ca69-3dd0-6748-e2391005b574@oracle.com> <93c9b305-292d-5779-7afe-bb21d7cd9672@oracle.com> Message-ID: > On Jan 21, 2019, at 5:06 AM, Magnus Ihse Bursie wrote: > Vladimir, if you're okay with it I'd like to propose this as a patch to the problem instead: > > http://cr.openjdk.java.net/~ihse/JDK-8217404-fix-multiple-disabled-jvm-features/webrev.01 Looks good! 
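For reference, the grep behaviour behind JDK-8217404, as described in Vladimir's report quoted earlier in this thread, can be reproduced in isolation. The following is a minimal sketch, assuming a POSIX shell and plain echo/grep in place of the build's $ECHO/$GREP variables. The helper is_disabled is hypothetical: it illustrates a whole-word membership test that needs no "grep -w", and it is neither the fix in webrev.01 nor the HOTSPOT_CHECK_JVM_FEATURE macro mentioned by Magnus.

```shell
#!/bin/sh
# The value configure ends up with after --with-jvm-features="-aot -jvmci -graal".
DISABLED_JVM_FEATURES="aot jvmci graal"

# Original pattern from the report: grep prints the whole matching line,
# so DISABLE_AOT becomes "aot jvmci graal" rather than just "aot".
DISABLE_AOT=$(echo "$DISABLED_JVM_FEATURES" | grep aot)
if test "x$DISABLE_AOT" = "xaot"; then
  ORIGINAL_CHECK="disabled"
else
  ORIGINAL_CHECK="enabled"   # taken here: the exact comparison fails
fi
echo "original check sees aot as: $ORIGINAL_CHECK"

# Hypothetical helper (an assumption for illustration only): pad the list
# and the word with spaces so only whole words can match.
is_disabled() {
  case " $DISABLED_JVM_FEATURES " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

for feature in aot jvmci graal jvm; do
  if is_disabled "$feature"; then
    echo "$feature: disabled"
  else
    echo "$feature: not disabled"
  fi
done
```

The last loop item, "jvm", is a substring of "jvmci" but not a listed feature; a plain grep would match it, which is the substring concern Igor raised, while the padded comparison does not.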
From david.holmes at oracle.com Mon Jan 21 20:38:00 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 22 Jan 2019 06:38:00 +1000 Subject: High memory usage / leaks was: Best mailing list for JVM embedding In-Reply-To: References: <54df1401-857f-35c4-9973-ab0ac7818194@marcanoonline.com> Message-ID: Hi Robert, I've cc'd core-libs-dev as this is now about signed-jars and the launcher. David On 22/01/2019 4:48 am, Robert Marcano wrote: > On 1/21/19 8:25 AM, Robert Marcano wrote: >> On 1/21/19 5:19 AM, Volker Simonis wrote: >>> -- Moved to hotspot-dev -- >>> >>> Hi Robert, >>> >>> You can use "-XX:+PrintFlagsFinal" and compare the output for the two >>> variants to see if for some reason there are differing option >>> settings. >> >> Thanks, compared the output of that on a java launcher call and my >> launcher and get the same flagsa values, so it doesn't look like >> different defaults isn't the problem > > When testing this, trying to discard some weird Rust / OpenJDK 11 > interactions, I wrote a simpler test case using the JVM invocation API > from plain C. Noticed the same pattern of high memory usage, but it > allowed me to detect there was a difference when using the provided java > launcher and our custom launcher. > > Every VM option was the same (as strings), including the classpath. Both > have something like -Djava.class.path=../lib/a.jar:../lib/b.jar > > But for some error in the configuration of our test environment, ../lib > pointed to different directories for both launchers. Different ../lib > directories with the same JARs, the difference between them is that the > use for the java launcher are unsigned and ../lib for the custom > launcher are signed. These jars are signed because they come from a JNLP > application, the new launcher is part of our migration out of JNLP. > > So I managed to replicate the high memory usage using the standard java > launcher. 
> > So the question now is, why signed jars could affect the memory usage of > an application (we aren't doing JAR verification on our custom launcher, > yet), just by being on the java.class.path? IIRC the initial application > classpath JARs were never verified previously (by the java launcher > alone, without JNLP around). > > Note: Tested with JARs signed with a self signed certificate and with > one signed with a private CA. At most, signing the JARs could slow down > the start up if it is now expected to these being verified by the java > launcher (is it true?) but not higher memory usage and no reductions > after a GC cycle but constants heap size increases. > > > >> >>> >>> Regards, >>> Volker >>> >>> On Sat, Jan 19, 2019 at 6:23 PM Robert Marcano >>> wrote: >>>> >>>> Greetings, which is the best mailing list for discussions about >>>> embedding the JVM (via JNI_CreateJavaVM)? >>>> >>>> The JVM is being embedded for desktop integration issues, for >>>> example to >>>> show the appropriate application name on the process list instead of >>>> java/java.exe, among many other things. >>>> >>>> I am experiencing what looks like higher memory usage and/or failure to >>>> garbage collect correctly when OpenJDK 11 is the embedded JVM. 
Starting >>>> a test application using the java launcher, I get a GC log like this: >>>> >>>>> [0.007s][info][gc] Using G1 >>>>> [0.389s][info][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) >>>>> 10M->8M(124M) 8.661ms >>>>> [0.705s][info][gc] GC(1) Pause Young (Normal) (G1 Evacuation Pause) >>>>> 13M->10M(124M) 6.148ms >>>>> Jan 19, 2019 1:04:26 PM test.Test init >>>>> FINE: Starting application >>>>> [1.376s][info][gc] GC(3) Pause Young (Normal) (G1 Evacuation Pause) >>>>> 18M->10M(40M) 4.763ms >>>>> [2.288s][info][gc] GC(4) Pause Young (Normal) (G1 Evacuation Pause) >>>>> 23M->12M(40M) 6.382ms >>>>> [2.444s][info][gc] GC(5) Pause Young (Concurrent Start) (Metadata >>>>> GC Threshold) 18M->12M(48M) 7.579ms >>>>> [2.444s][info][gc] GC(6) Concurrent Cycle >>>>> [2.481s][info][gc] GC(6) Pause Remark 13M->13M(48M) 5.255ms >>>>> [2.498s][info][gc] GC(6) Pause Cleanup 13M->13M(48M) 0.090ms >>>>> [2.499s][info][gc] GC(6) Concurrent Cycle 54.811ms >>>>> [2.905s][info][gc] GC(7) Pause Young (Normal) (G1 Evacuation Pause) >>>>> 26M->13M(48M) 12.726ms >>>>> [3.204s][info][gc] GC(8) Pause Young (Normal) (GCLocker Initiated >>>>> GC) 29M->15M(48M) 11.216ms >>>>> [3.462s][info][gc] GC(9) Pause Young (Normal) (G1 Evacuation Pause) >>>>> 30M->17M(48M) 18.043ms >>>>> [3.679s][info][gc] GC(10) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 31M->18M(64M) 15.195ms >>>>> [3.933s][info][gc] GC(11) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 38M->20M(64M) 9.412ms >>>>> [4.230s][info][gc] GC(12) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 40M->21M(64M) 16.319ms >>>>> [4.536s][info][gc] GC(13) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 41M->23M(64M) 23.897ms >>>>> [4.750s][info][gc] GC(14) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 43M->24M(94M) 8.776ms >>>>> [5.180s][info][gc] GC(15) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 58M->26M(94M) 15.610ms >>>>> [5.546s][info][gc] GC(16) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 67M->27M(94M) 18.075ms 
>>>>> [6.058s][info][gc] GC(17) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 69M->30M(94M) 32.625ms >>>>> [7.268s][info][gc] GC(18) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 71M->31M(156M) 18.999ms >>>>> [7.458s][info][gc] GC(19) Pause Young (Concurrent Start) (Metadata >>>>> GC Threshold) 40M->31M(156M) 20.217ms >>>>> [7.459s][info][gc] GC(20) Concurrent Cycle >>>>> [7.676s][info][gc] GC(20) Pause Remark 35M->35M(156M) 19.304ms >>>>> [7.748s][info][gc] GC(20) Pause Cleanup 36M->36M(156M) 0.183ms >>>>> [7.782s][info][gc] GC(20) Concurrent Cycle 323.765ms >>>>> [8.899s][info][gc] GC(21) Pause Young (Concurrent Start) (G1 >>>>> Evacuation Pause) 84M->40M(156M) 69.976ms >>>>> [8.899s][info][gc] GC(22) Concurrent Cycle >>>>> [9.152s][info][gc] GC(22) Pause Remark 47M->47M(156M) 21.133ms >>>>> [9.244s][info][gc] GC(22) Pause Cleanup 49M->49M(156M) 0.127ms >>>>> [9.247s][info][gc] GC(22) Concurrent Cycle 348.256ms >>>>> [10.203s][info][gc] GC(23) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 97M->55M(156M) 74.572ms >>>>> [11.102s][info][gc] GC(24) Pause Full (System.gc()) 100M->21M(77M) >>>>> 115.166ms >>>>> [15.382s][info][gc] GC(25) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 48M->22M(154M) 6.371ms >>>> >>>> When the same Java 11 JVM is loaded via JNI, with the same VM arguments >>>> than using the java launcher, the log look like this (to reach the same >>>> point at startup) >>>> >>>>> [0.015s][info][gc] Using G1 >>>>> [0.501s][info][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) >>>>> 6M->1M(124M) 8.550ms >>>>> [0.785s][info][gc] GC(1) Pause Young (Normal) (G1 Evacuation Pause) >>>>> 8M->3M(124M) 6.533ms >>>>> [0.940s][info][gc] GC(2) Pause Young (Normal) (G1 Evacuation Pause) >>>>> 14M->7M(124M) 17.822ms >>>>> [1.212s][info][gc] GC(3) Pause Young (Normal) (G1 Evacuation Pause) >>>>> 22M->10M(124M) 11.145ms >>>>> [1.462s][info][gc] GC(4) Pause Young (Normal) (G1 Evacuation Pause) >>>>> 29M->15M(180M) 16.451ms >>>>> [1.695s][info][gc] GC(5) 
Pause Young (Normal) (G1 Evacuation Pause) >>>>> 47M->25M(180M) 38.828ms >>>>> [2.041s][info][gc] GC(6) Pause Young (Normal) (G1 Evacuation Pause) >>>>> 55M->30M(180M) 22.151ms >>>>> [2.346s][info][gc] GC(7) Pause Young (Normal) (G1 Evacuation Pause) >>>>> 80M->49M(180M) 53.093ms >>>>> Jan 19, 2019 1:01:05 PM test.Test init >>>>> FINE: Starting application >>>>> [4.094s][info][gc] GC(9) Pause Young (Concurrent Start) (Metadata >>>>> GC Threshold) 64M->40M(258M) 50.663ms >>>>> [4.094s][info][gc] GC(10) Concurrent Cycle >>>>> [4.480s][info][gc] GC(10) Pause Remark 48M->48M(258M) 7.442ms >>>>> [4.652s][info][gc] GC(10) Pause Cleanup 55M->55M(258M) 0.186ms >>>>> [4.656s][info][gc] GC(10) Concurrent Cycle 562.009ms >>>>> [5.174s][info][gc] GC(11) Pause Young (Concurrent Start) (G1 >>>>> Evacuation Pause) 82M->46M(258M) 24.726ms >>>>> [5.174s][info][gc] GC(12) Concurrent Cycle >>>>> [5.513s][info][gc] GC(12) Pause Remark 56M->56M(258M) 11.695ms >>>>> [5.657s][info][gc] GC(12) Pause Cleanup 62M->62M(258M) 0.176ms >>>>> [5.660s][info][gc] GC(12) Concurrent Cycle 486.466ms >>>>> [6.430s][info][gc] GC(13) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 107M->58M(258M) 55.665ms >>>>> [7.538s][info][gc] GC(14) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 107M->63M(258M) 58.642ms >>>>> [8.724s][info][gc] GC(15) Pause Young (Concurrent Start) (Metadata >>>>> GC Threshold) 91M->68M(496M) 47.374ms >>>>> [8.724s][info][gc] GC(16) Concurrent Cycle >>>>> [9.417s][info][gc] GC(16) Pause Remark 79M->79M(496M) 19.350ms >>>>> [9.557s][info][gc] GC(16) Pause Cleanup 81M->81M(496M) 0.340ms >>>>> [9.575s][info][gc] GC(16) Concurrent Cycle 850.788ms >>>>> [10.954s][info][gc] GC(17) Pause Young (Concurrent Start) (G1 >>>>> Evacuation Pause) 134M->85M(496M) 162.779ms >>>>> [10.954s][info][gc] GC(18) Concurrent Cycle >>>>> [11.389s][info][gc] GC(19) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 99M->90M(496M) 79.954ms >>>>> [11.519s][info][gc] GC(18) Pause Remark 93M->92M(496M) 24.636ms 
>>>>> [11.896s][info][gc] GC(18) Pause Cleanup 102M->102M(496M) 0.326ms >>>>> [11.903s][info][gc] GC(18) Concurrent Cycle 949.231ms >>>>> [17.356s][info][gc] GC(22) Pause Young (Concurrent Start) (G1 >>>>> Humongous Allocation) 131M->81M(365M) 56.745ms >>>>> [17.357s][info][gc] GC(23) Concurrent Cycle >>>>> [17.706s][info][gc] GC(23) Pause Remark 116M->113M(365M) 31.351ms >>>>> [18.008s][info][gc] GC(23) Pause Cleanup 117M->117M(365M) 0.250ms >>>>> [18.018s][info][gc] GC(23) Concurrent Cycle 661.284ms >>>>> [18.888s][info][gc] GC(24) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 205M->91M(365M) 93.737ms >>>>> [20.355s][info][gc] GC(25) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 324M->136M(378M) 125.037ms >>>>> [21.091s][info][gc] GC(26) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 295M->150M(378M) 93.607ms >>>>> [21.631s][info][gc] GC(27) Pause Young (Concurrent Start) (G1 >>>>> Humongous Allocation) 261M->164M(516M) 77.013ms >>>>> [21.631s][info][gc] GC(28) Concurrent Cycle >>>>> [21.981s][info][gc] GC(28) Pause Remark 212M->169M(516M) 43.679ms >>>>> [22.243s][info][gc] GC(28) Pause Cleanup 214M->214M(516M) 0.257ms >>>>> [22.250s][info][gc] GC(28) Concurrent Cycle 619.296ms >>>>> [22.827s][info][gc] GC(29) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 321M->155M(516M) 106.789ms >>>>> [23.758s][info][gc] GC(30) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 363M->187M(516M) 121.025ms >>>>> [24.957s][info][gc] GC(31) Pause Young (Normal) (GCLocker Initiated >>>>> GC) 386M->207M(516M) 111.118ms >>>>> [25.697s][info][gc] GC(32) Pause Young (Concurrent Start) (G1 >>>>> Humongous Allocation) 346M->222M(656M) 104.765ms >>>>> [25.697s][info][gc] GC(33) Concurrent Cycle >>>>> [26.128s][info][gc] GC(33) Pause Remark 270M->192M(656M) 55.681ms >>>>> [26.342s][info][gc] GC(33) Pause Cleanup 203M->203M(656M) 0.371ms >>>>> [26.349s][info][gc] GC(33) Concurrent Cycle 651.801ms >>>>> [27.421s][info][gc] GC(34) Pause Young (Prepare Mixed) (G1 >>>>> Evacuation 
Pause) 380M->180M(656M) 195.335ms >>>>> [27.543s][info][gc] GC(35) Pause Young (Mixed) (G1 Evacuation >>>>> Pause) 195M->172M(656M) 38.254ms >>>>> [28.694s][info][gc] GC(36) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 427M->214M(656M) 107.817ms >>>>> [31.109s][info][gc] GC(37) Pause Young (Normal) (G1 Evacuation >>>>> Pause) 486M->239M(656M) 117.275ms >>>> >>>> Notice the higher memory usage. If a more complex application is >>>> started, the heap continues to grow indefinitely, with long GC pauses >>>> and >>>> a growing heap. It is my understanding that the JVM should set up >>>> default >>>> GC and/or memory-related configuration by itself, be it from the java >>>> launcher or JNI_CreateJavaVM. >>>> >>>> The same custom launcher does not experience this when the embedded JVM >>>> is version 8. >>>> >>>> Note: the launcher is a simple Rust program, not a complex thing; it just >>>> locates libjvm.so/jvm.dll, builds the VM options, calls JNI_CreateJavaVM, >>>> locates the main class and invokes the main method. These logs are for >>>> the >>>> Linux version. >>>> >>>> Any help is appreciated, or a pointer to the correct mailing list. >>>> >>>> >> > From david.holmes at oracle.com Mon Jan 21 21:42:00 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 22 Jan 2019 07:42:00 +1000 Subject: URGENT RFR: 8217466: [BACKOUT] Optimize CodeHeap Analytics Message-ID: <5f7e8b83-f5a2-7001-3a84-532b16234f50@oracle.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8217466 webrev: http://cr.openjdk.java.net/~dholmes/8217466/webrev/ This is the "hg backout" of: http://hg.openjdk.java.net/jdk/jdk/rev/6e993d9ae8a7 8217250: Optimize CodeHeap Analytics The fix breaks the build on Windows (and I don't believe it necessarily work as intended).
Thanks, David From claes.redestad at oracle.com Mon Jan 21 21:48:59 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Mon, 21 Jan 2019 22:48:59 +0100 Subject: URGENT RFR: 8217466: [BACKOUT] Optimize CodeHeap Analytics In-Reply-To: <5f7e8b83-f5a2-7001-3a84-532b16234f50@oracle.com> References: <5f7e8b83-f5a2-7001-3a84-532b16234f50@oracle.com> Message-ID: <34d732d2-a70a-31fe-5383-3b2e37aae858@oracle.com> Looks ok to me! (are hg backout:s implicitly trivial?) /Claes On 2019-01-21 22:42, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8217466 > webrev: http://cr.openjdk.java.net/~dholmes/8217466/webrev/ > > This is the "hg backout" of: > > http://hg.openjdk.java.net/jdk/jdk/rev/6e993d9ae8a7 > > 8217250: Optimize CodeHeap Analytics > > The fix breaks the build on Windows (and I don't believe it necessarily > work as intended). > > Thanks, > David From shade at redhat.com Mon Jan 21 21:52:03 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 21 Jan 2019 22:52:03 +0100 Subject: URGENT RFR: 8217466: [BACKOUT] Optimize CodeHeap Analytics In-Reply-To: <5f7e8b83-f5a2-7001-3a84-532b16234f50@oracle.com> References: <5f7e8b83-f5a2-7001-3a84-532b16234f50@oracle.com> Message-ID: On 1/21/19 10:42 PM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8217466 > webrev: http://cr.openjdk.java.net/~dholmes/8217466/webrev/ Okay. > This is the "hg backout" of: > > http://hg.openjdk.java.net/jdk/jdk/rev/6e993d9ae8a7 > > 8217250: Optimize CodeHeap Analytics > > The fix breaks the build on Windows (and I don't believe it necessarily work as intended). Please put some build output into bug? 
-Aleksey From david.holmes at oracle.com Mon Jan 21 21:50:44 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 22 Jan 2019 07:50:44 +1000 Subject: URGENT RFR: 8217466: [BACKOUT] Optimize CodeHeap Analytics In-Reply-To: <34d732d2-a70a-31fe-5383-3b2e37aae858@oracle.com> References: <5f7e8b83-f5a2-7001-3a84-532b16234f50@oracle.com> <34d732d2-a70a-31fe-5383-3b2e37aae858@oracle.com> Message-ID: <93b26c86-0b50-d935-a493-adc9a589f2c2@oracle.com> On 22/01/2019 7:48 am, Claes Redestad wrote: > Looks ok to me! Thanks Claes! > (are hg backout:s implicitly trivial?) Yes they are. Cheers, David > /Claes > > On 2019-01-21 22:42, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8217466 >> webrev: http://cr.openjdk.java.net/~dholmes/8217466/webrev/ >> >> This is the "hg backout" of: >> >> http://hg.openjdk.java.net/jdk/jdk/rev/6e993d9ae8a7 >> >> 8217250: Optimize CodeHeap Analytics >> >> The fix breaks the build on Windows (and I don't believe it >> necessarily work as intended). >> >> Thanks, >> David From igor.ignatyev at oracle.com Mon Jan 21 21:53:04 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Mon, 21 Jan 2019 13:53:04 -0800 Subject: URGENT RFR: 8217466: [BACKOUT] Optimize CodeHeap Analytics In-Reply-To: <34d732d2-a70a-31fe-5383-3b2e37aae858@oracle.com> References: <5f7e8b83-f5a2-7001-3a84-532b16234f50@oracle.com> <34d732d2-a70a-31fe-5383-3b2e37aae858@oracle.com> Message-ID: <599D14A0-7FC0-408A-9307-A1790FC93C92@oracle.com> David, Looks good to me as well. -- Igor > On Jan 21, 2019, at 1:48 PM, Claes Redestad wrote: > > Looks ok to me! > > (are hg backout:s implicitly trivial?) 
> > /Claes > > On 2019-01-21 22:42, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8217466 >> webrev: http://cr.openjdk.java.net/~dholmes/8217466/webrev/ >> This is the "hg backout" of: >> http://hg.openjdk.java.net/jdk/jdk/rev/6e993d9ae8a7 >> 8217250: Optimize CodeHeap Analytics >> The fix breaks the build on Windows (and I don't believe it necessarily work as intended). >> Thanks, >> David From JESPER.WILHELMSSON at ORACLE.COM Mon Jan 21 21:53:40 2019 From: JESPER.WILHELMSSON at ORACLE.COM (Jesper Wilhelmsson) Date: Mon, 21 Jan 2019 22:53:40 +0100 Subject: URGENT RFR: 8217466: [BACKOUT] Optimize CodeHeap Analytics In-Reply-To: <34d732d2-a70a-31fe-5383-3b2e37aae858@oracle.com> References: <5f7e8b83-f5a2-7001-3a84-532b16234f50@oracle.com> <34d732d2-a70a-31fe-5383-3b2e37aae858@oracle.com> Message-ID: <3DE7EF7E-8C93-44B9-8D44-D51C67887DEB@ORACLE.COM> Looks good to me too. And, yes, hg backouts are considered trivial. /Jesper > On 21 Jan 2019, at 22:48, Claes Redestad wrote: > > Looks ok to me! > > (are hg backout:s implicitly trivial?) > > /Claes > >> On 2019-01-21 22:42, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8217466 >> webrev: http://cr.openjdk.java.net/~dholmes/8217466/webrev/ >> This is the "hg backout" of: >> http://hg.openjdk.java.net/jdk/jdk/rev/6e993d9ae8a7 >> 8217250: Optimize CodeHeap Analytics >> The fix breaks the build on Windows (and I don't believe it necessarily work as intended).
>> Thanks, >> David From claes.redestad at oracle.com Mon Jan 21 21:56:15 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Mon, 21 Jan 2019 22:56:15 +0100 Subject: URGENT RFR: 8217466: [BACKOUT] Optimize CodeHeap Analytics In-Reply-To: References: <5f7e8b83-f5a2-7001-3a84-532b16234f50@oracle.com> Message-ID: <599e3731-d326-59ea-df43-9b470018c722@oracle.com> On 2019-01-21 22:52, Aleksey Shipilev wrote: > On 1/21/19 10:42 PM, David Holmes wrote: >> The fix breaks the build on Windows (and I don't believe it necessarily work as intended). > > Please put some build output into bug? Done. Seems like a problem with Windows builds being strict about not having anything precede the precompiled.hpp include. /Claes From david.holmes at oracle.com Mon Jan 21 21:54:36 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 22 Jan 2019 07:54:36 +1000 Subject: URGENT RFR: 8217466: [BACKOUT] Optimize CodeHeap Analytics In-Reply-To: References: <5f7e8b83-f5a2-7001-3a84-532b16234f50@oracle.com> Message-ID: On 22/01/2019 7:52 am, Aleksey Shipilev wrote: > On 1/21/19 10:42 PM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8217466 >> webrev: http://cr.openjdk.java.net/~dholmes/8217466/webrev/ > > Okay. Thanks. >> This is the "hg backout" of: >> >> http://hg.openjdk.java.net/jdk/jdk/rev/6e993d9ae8a7 >> >> 8217250: Optimize CodeHeap Analytics >> >> The fix breaks the build on Windows (and I don't believe it necessarily work as intended). > > Please put some build output into bug? Done. 
The REDO bug is: https://bugs.openjdk.java.net/browse/JDK-8217465 David > -Aleksey > > From david.holmes at oracle.com Mon Jan 21 21:57:31 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 22 Jan 2019 07:57:31 +1000 Subject: URGENT RFR: 8217466: [BACKOUT] Optimize CodeHeap Analytics In-Reply-To: <3DE7EF7E-8C93-44B9-8D44-D51C67887DEB@ORACLE.COM> References: <5f7e8b83-f5a2-7001-3a84-532b16234f50@oracle.com> <34d732d2-a70a-31fe-5383-3b2e37aae858@oracle.com> <3DE7EF7E-8C93-44B9-8D44-D51C67887DEB@ORACLE.COM> Message-ID: <780445c7-9663-1813-66d6-29fb9624d660@oracle.com> Thanks Jesper! David On 22/01/2019 7:53 am, Jesper Wilhelmsson wrote: > Looks good to me too. > And, yes, hg backouts are considered trivial. > /Jesper > >> On 21 Jan 2019, at 22:48, Claes Redestad wrote: >> >> Looks ok to me! >> >> (are hg backout:s implicitly trivial?) >> >> /Claes >> >>> On 2019-01-21 22:42, David Holmes wrote: >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8217466 >>> webrev: http://cr.openjdk.java.net/~dholmes/8217466/webrev/ >>> This is the "hg backout" of: >>> http://hg.openjdk.java.net/jdk/jdk/rev/6e993d9ae8a7 >>> 8217250: Optimize CodeHeap Analytics >>> The fix breaks the build on Windows (and I don't believe it necessarily work as intended). >>> Thanks, >>> David > From david.holmes at oracle.com Mon Jan 21 21:56:58 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 22 Jan 2019 07:56:58 +1000 Subject: URGENT RFR: 8217466: [BACKOUT] Optimize CodeHeap Analytics In-Reply-To: <599D14A0-7FC0-408A-9307-A1790FC93C92@oracle.com> References: <5f7e8b83-f5a2-7001-3a84-532b16234f50@oracle.com> <34d732d2-a70a-31fe-5383-3b2e37aae858@oracle.com> <599D14A0-7FC0-408A-9307-A1790FC93C92@oracle.com> Message-ID: <91d25598-a8e1-1901-6ebf-fe9df26c7451@oracle.com> Thanks Igor! On 22/01/2019 7:53 am, Igor Ignatyev wrote: > David, > > Looks good to me as well. > > -- Igor > >> On Jan 21, 2019, at 1:48 PM, Claes Redestad wrote: >> >> Looks ok to me!
>> >> (are hg backout:s implicitly trivial?) >> >> /Claes >> >> On 2019-01-21 22:42, David Holmes wrote: >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8217466 >>> webrev: http://cr.openjdk.java.net/~dholmes/8217466/webrev/ >>> This is the "hg backout" of: >>> http://hg.openjdk.java.net/jdk/jdk/rev/6e993d9ae8a7 >>> 8217250: Optimize CodeHeap Analytics >>> The fix breaks the build on Windows (and I don't believe it necessarily work as intended). >>> Thanks, >>> David > From hohensee at amazon.com Tue Jan 22 14:10:13 2019 From: hohensee at amazon.com (Hohensee, Paul) Date: Tue, 22 Jan 2019 14:10:13 +0000 Subject: RFR (L) 8213501 : Deploy ExceptionJniWrapper for a few tests In-Reply-To: References: <895ef766-9c96-7185-4222-178379629ce4@oracle.com> <04a464fa-c1c8-5d86-3633-0b532840561c@oracle.com> <7ef06464-a614-8941-bb51-ce1c467889b2@oracle.com> <45341168-e7e0-90d1-449f-210500882b8f@oracle.com> <55283958-de3d-07f2-51e3-ad34c5046a96@oracle.com> <31613f88-5f7d-938d-e9f6-69cdaf857268@oracle.com> <839301b7-c247-df3b-e485-283e8bb7388b@oracle.com> <95fe277d-ba6e-4fec-77aa-d1f1051751aa@oracle.com> <72bf2f4a-5bf7-98de-5f00-68485072923d@oracle.com> Message-ID: Lgtm :) Paul On 1/14/19, 7:46 AM, "hotspot-dev on behalf of JC Beyler" wrote: Hi all, Friendly ping on this one. I know that it has been a long process with a lot of back-and-forth, for which I apologize... But is there any way I could get a final LGTM for version 6? Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/ Bug: https://bugs.openjdk.java.net/browse/JDK-8213501 Thanks! Jc On Tue, Jan 8, 2019 at 10:05 AM JC Beyler wrote: > Happy new year all! > > Could I get a final LGTM for version 6? > > Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8213501 > > Thanks!
> Jc > > On Mon, Dec 17, 2018 at 8:43 AM JC Beyler wrote: > >> Hi all, >> >> I don't believe I got actual LGTM for this version: >> >> >> Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8213501 >> >> >> It removed the namespaces and uses explicit static instead :) >> >> Thanks! >> Jc >> >> On Wed, Dec 12, 2018 at 8:06 PM JC Beyler wrote: >> >>> So did I Alexey but with David & Serguei preferring static, it seems >>> more reasonable to go down their route :-) >>> >>> So here is the latest webrev with static instead of an anonymous >>> namespace: >>> >>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8213501 >>> >>> Let me know what you think, can I get a webrev 06 review? >>> >>> Thanks! >>> Jc >>> >>> On Wed, Dec 12, 2018 at 3:10 PM Alex Menkov >>> wrote: >>> >>>> Hm.. >>>> I considered unnamed namespaces "C++ style" (and static globals as "C >>>> style"). >>>> Static globals were deprecated in C++ (but some time ago the >>>> deprecation >>>> was reverted). >>>> >>>> --alex >>>> >>>> On 12/12/2018 13:55, serguei.spitsyn at oracle.com wrote: >>>> > Agreed. >>>> > >>>> > Thanks, >>>> > Serguei >>>> > >>>> > >>>> > On 12/12/18 13:52, David Holmes wrote: >>>> >> FWIW I think namespaces are overkill in all of this test code and >>>> just >>>> >> obfuscates things - the declaration is easily missed. A static >>>> >> variable in a .cpp is clearly a global variable to the file. >>>> >> >>>> >> Cheers, >>>> >> David >>>> >> >>>> >> >>>> >> >>>> >> On 13/12/2018 5:37 am, serguei.spitsyn at oracle.com wrote: >>>> >>> Hi Jc, >>>> >>> >>>> >>> >>>> >>> On 12/11/18 21:16, JC Beyler wrote: >>>> >>>> Hi all, >>>> >>>> >>>> >>>> Here is the new webrev with the TEST.groups change. 
Serguei, let >>>> me >>>> >>>> know if I convinced you with the static vs anonymous namespaces or >>>> >>>> if you'd still rather have a "static" for now :-) >>>> >>> >>>> >>> >>>> >>> What do you think about this post? : >>>> >>> >>>> https://stackoverflow.com/questions/11623451/static-vs-non-static-variables-in-namespace >>>> >>> >>>> >>> >>>> >>>> >>>> >>>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.05/ >>>> >>>> >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8213501 >>>> >>> >>>> >>> The update looks fine. >>>> >>> >>>> >>> Thanks, >>>> >>> Serguei >>>> >>> >>>> >>> >>>> >>> Thanks, >>>> >>> Serguei >>>> >>> >>>> >>>> >>>> >>>> Thanks again for the reviews! >>>> >>>> Jc >>>> >>>> >>>> >>>> On Mon, Dec 10, 2018 at 3:10 PM JC Beyler >>> >>>> > wrote: >>>> >>>> >>>> >>>> Hi Serguei, >>>> >>>> >>>> >>>> Yes basically it is equivalent :) I can put them in but they >>>> are >>>> >>>> not required. The norm actually wanted to deprecate it but then >>>> >>>> remembered that C compatibility would require the static >>>> key-word >>>> >>>> for this case [1] >>>> >>>> >>>> >>>> So, really, they are not required here and will amount to the >>>> same >>>> >>>> thing: only that file can refer to them and you cannot get to >>>> them >>>> >>>> without a globally available method to return a pointer to them >>>> >>>> (ie same as a static variable in C). >>>> >>>> >>>> >>>> I can put static if it makes it easier to see but, by being in >>>> an >>>> >>>> anonymous namespace they are only available for the file's >>>> >>>> translation unit. 
For example: >>>> >>>> >>>> >>>> $ cat main.cpp >>>> >>>> >>>> >>>> int totally_global; >>>> >>>> static int explicitly_static; >>>> >>>> >>>> >>>> namespace { >>>> >>>> int implicitly_static; >>>> >>>> } >>>> >>>> >>>> >>>> void foo(); >>>> >>>> int main() { >>>> >>>> foo(); >>>> >>>> } >>>> >>>> >>>> >>>> $ g++ -O3 main.cpp -c >>>> >>>> $ nm main.o >>>> >>>> U _GLOBAL_OFFSET_TABLE_ >>>> >>>> 0000000000000000 T main >>>> >>>> 0000000000000000 B totally_global >>>> >>>> U _Z3foov >>>> >>>> >>>> >>>> As you can see, the static and anonymous namespace variables >>>> are >>>> >>>> not in the file due to not being used. If you were to use them, >>>> >>>> you'd see them show up as something like: >>>> >>>> 0000000000000008 b _ZL17explicitly_static >>>> >>>> 0000000000000004 b _ZN12_GLOBAL__N_117implicitly_staticE >>>> >>>> >>>> >>>> Where again, it shows that it is mangling the names so that no >>>> >>>> external usage can happen without tinkering. >>>> >>>> >>>> >>>> Hopefully that helps :-), >>>> >>>> Jc >>>> >>>> >>>> >>>> [1] >>>> >>>> http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1012 >>>> >>>> >>>> >>>> >>>> >>>> On Mon, Dec 10, 2018 at 2:04 PM serguei.spitsyn at oracle.com >>>> >>>> < >>>> serguei.spitsyn at oracle.com >>>> >>>> > wrote: >>>> >>>> >>>> >>>> Hi Jc, >>>> >>>> >>>> >>>> I had little experience with the C++ namespaces. >>>> >>>> My understanding is that static in this context should mean >>>> >>>> internal linkage. >>>> >>>> >>>> >>>> Thanks, >>>> >>>> Serguei >>>> >>>> >>>> >>>> >>>> >>>> On 12/10/18 13:57, JC Beyler wrote: >>>> >>>>> Hi Serguei, >>>> >>>>> >>>> >>>>> The variables and functions are in an anonymous namespace; >>>> my >>>> >>>>> understanding of C++ is that this is equivalent to >>>> putting it >>>> >>>>> as static. Hence, I didn't add them there. Does that make >>>> >>>>> sense? >>>> >>>>> >>>> >>>>> Thanks!
>>>> >>>>> Jc >>>> >>>>> >>>> >>>>> On Mon, Dec 10, 2018 at 1:33 PM >>>> serguei.spitsyn at oracle.com >>>> >>>>> >>>> >>>>> >>> >>>>> > wrote: >>>> >>>>> >>>> >>>>> Hi Jc, >>>> >>>>> >>>> >>>>> It looks good in general. >>>> >>>>> One question though. >>>> >>>>> >>>> >>>>> >>>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.03a_04/test/hotspot/jtreg/vmTestbase/nsk/share/ExceptionCheckingJniEnv/exceptionjni001/exceptionjni001.cpp.html >>>> >>>>> >>>> >>>>> >>>> >>>>> I wonder if the variables and functions have to be >>>> static. >>>> >>>>> >>>> >>>>> Thanks, >>>> >>>>> Serguei >>>> >>>>> >>>> >>>>> >>>> >>>>> On 12/5/18 11:36, JC Beyler wrote: >>>> >>>>>> Hi all, >>>> >>>>>> >>>> >>>>>> My apologies to having to come back for another >>>> review >>>> >>>>>> for this change: I ran into a snag when trying to >>>> pull >>>> >>>>>> the latest changes compared to the base I was working >>>> >>>>>> on. I basically forgot that there was an issue with >>>> >>>>>> snprintf and that I had solved it via JDK-8213622. >>>> >>>>>> >>>> >>>>>> Could I have a new review of this webrev: >>>> >>>>>> Webrev: >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.04/ >>>> >>>>>> >>>> >>>>>> Bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8213501 >>>> >>>>>> Incremental from the port of webrev.03 that got >>>> LGTMs: >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03a_04/ >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> A few comments on this because it took me a while to >>>> get >>>> >>>>>> things in a state I thought was good: >>>> >>>>>> - I had to implement an itoa method, do we have >>>> >>>>>> something like that in the test base (remember that >>>> >>>>>> JDK-8213622 could not use sprintf due to being in the >>>> >>>>>> test code)? 
>>>> >>>>>> >>>> >>>>>> - The differences here compared to the one you all >>>> >>>>>> reviewed are: >>>> >>>>>> - I found that adding to the strlen/memcpy >>>> error >>>> >>>>>> prone and thought that I would try to make it less >>>> so. >>>> >>>>>> If you want to compare, I extended the strlen/memcpy >>>> >>>>>> with the new format to show you if you prefer [1] >>>> >>>>>> - Note that the diff between the "old >>>> >>>>>> extended way from [1]" to the webrev.04 can be found >>>> >>>>>> in [2] >>>> >>>>>> >>>> >>>>>> - I added a test to test the exception wrapper >>>> in >>>> >>>>>> tests :); I'm not sure it is deemed useful or not but >>>> >>>>>> helped me assure myself that I was not doing things >>>> >>>>>> wrong; you can find the base test file here [3]; >>>> should >>>> >>>>>> we have this or not? (I know that normally we don't >>>> add >>>> >>>>>> tests to vmTestbase but thought this might be an >>>> >>>>>> exception) >>>> >>>>>> >>>> >>>>>> Thanks for your help and my apologies for the snag, >>>> >>>>>> Jc >>>> >>>>>> >>>> >>>>>> [1]: >>>> >>>>>> >>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03a/test/hotspot/jtreg/vmTestbase/nsk/share/jni/ExceptionCheckingJniEnv.cpp.udiff.html >>>> >>>>>> >>>> >>>>>> < >>>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.03a/test/hotspot/jtreg/vmTestbase/nsk/share/jni/ExceptionCheckingJniEnv.cpp.udiff.html> >>>> >>>> >>>>>> >>>> >>>>>> [2]: >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03a_04 >>>> >>>>>> >>>> >>>>>> [3] >>>> >>>>>> >>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.04/test/hotspot/jtreg/vmTestbase/nsk/share/ExceptionCheckingJniEnv/exceptionjni001/exceptionjni001.cpp.html >>>> >>>>>> >>>> >>>>>> < >>>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.04/test/hotspot/jtreg/vmTestbase/nsk/share/ExceptionCheckingJniEnv/exceptionjni001/exceptionjni001.cpp.html> >>>> >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> On Mon, Dec 3, 2018 at 11:29 PM David Holmes >>>> >>>>>> >>> 
>>>>>> > wrote: >>>> >>>>>> >>>> >>>>>> Looks fine to me. >>>> >>>>>> >>>> >>>>>> Thanks, >>>> >>>>>> David >>>> >>>>>> >>>> >>>>>> On 4/12/2018 4:04 pm, JC Beyler wrote: >>>> >>>>>> > Hi both, >>>> >>>>>> > >>>> >>>>>> > Thanks for the reviews! Since Serguei did not >>>> >>>>>> insist on get_basename, I >>>> >>>>>> > went for get_dirname since the method is a >>>> local >>>> >>>>>> static method and won't >>>> >>>>>> > have its name start spreading, I think it's ok >>>> too. >>>> >>>>>> > >>>> >>>>>> > For the naming of the local variable, the idea >>>> >>>>>> initially was to use the >>>> >>>>>> > same name as the local variable for JNIEnv >>>> already >>>> >>>>>> used to reduce the >>>> >>>>>> > code change. Since I'm now adding the line >>>> macro >>>> >>>>>> at the end anyway, this >>>> >>>>>> > does not matter anymore so I converged all >>>> local >>>> >>>>>> variables to "jni". >>>> >>>>>> > >>>> >>>>>> > So, without further ado, here is the new >>>> version: >>>> >>>>>> > Webrev: >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03/ >>>> >>>>>> >>>> >>>>>> > Bug: >>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8213501 >>>> >>>>>> > >>>> >>>>>> > This passes the various tests changed by the >>>> >>>>>> webrev on my dev machine. >>>> >>>>>> > >>>> >>>>>> > Let me know what you think, >>>> >>>>>> > Jc >>>> >>>>>> > >>>> >>>>>> > On Mon, Dec 3, 2018 at 8:40 PM >>>> >>>>>> serguei.spitsyn at oracle.com >>>> >>>>>> >>>> >>>>>> > >>> >>>>>> > >>>> >>>>>> >>> >>>>>> >>>> >>>>>> > >>> >>>>>> >> wrote: >>>> >>>>>> > >>>> >>>>>> > On 12/3/18 20:15, Chris Plummer wrote: >>>> >>>>>> > > Hi JC, >>>> >>>>>> > > >>>> >>>>>> > > Overall it looks good. A few naming nits >>>> >>>>>> thought: >>>> >>>>>> > > >>>> >>>>>> > > In bi01t001.cpp, why have you declared >>>> the >>>> >>>>>> > ExceptionCheckingJniEnvPtr >>>> >>>>>> > > using jni_env(jni). 
Elsewhere you use >>>> >>>>>> jni(jni_env) and rename the >>>> >>>>>> > > method argument passed in from jni to >>>> >>>>>> jni_env. >>>> >>>>>> > > >>>> >>>>>> > > Related to this, I also noticed in some >>>> >>>>>> files that already are using >>>> >>>>>> > > ExceptionCheckingJniEnvPtr, such as >>>> >>>>>> CharArrayCriticalLocker.cpp, you >>>> >>>>>> > > delcared it as env(jni_env). So that >>>> means >>>> >>>>>> there are 3 different >>>> >>>>>> > names >>>> >>>>>> > > you have used for the >>>> >>>>>> ExceptionCheckingJniEnvPtr local variable. >>>> >>>>>> > They >>>> >>>>>> > > should be consistent. >>>> >>>>>> > > >>>> >>>>>> > > Also, can you rename get_basename() to >>>> >>>>>> get_dirname()? I know Serguei >>>> >>>>>> > > suggested get_basename() a while back, >>>> but >>>> >>>>>> unless "basename" is >>>> >>>>>> > > commonly used for this purpose, I think >>>> >>>>>> "dirname" is more self >>>> >>>>>> > > explanatory. >>>> >>>>>> > >>>> >>>>>> > In general, I'm Okay with get_dirname(). >>>> >>>>>> > Just to mention dirname can be both short >>>> or >>>> >>>>>> full, so it is a little >>>> >>>>>> > confusing as well. >>>> >>>>>> > It is the reason why the get_basename() was >>>> >>>>>> suggested. >>>> >>>>>> > However, I do not insist on get_basename() >>>> nor >>>> >>>>>> get_full_dirname(). :) >>>> >>>>>> > >>>> >>>>>> > Thanks, >>>> >>>>>> > Serguei >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> > > thanks, >>>> >>>>>> > > >>>> >>>>>> > > Chris >>>> >>>>>> > > >>>> >>>>>> > > On 12/2/18 10:29 PM, David Holmes wrote: >>>> >>>>>> > >> Hi Jc, >>>> >>>>>> > >> >>>> >>>>>> > >> I've been lurking on this one and have >>>> had >>>> >>>>>> a look through. I'm okay >>>> >>>>>> > >> with the FatalError approach for the >>>> tests >>>> >>>>>> - we don't expect >>>> >>>>>> > anything >>>> >>>>>> > >> to go wrong in a well written test in a >>>> >>>>>> correctly functioning VM. 
>>>> >>>>>> > >> >>>> >>>>>> > >> Thanks, >>>> >>>>>> > >> David >>>> >>>>>> > >> >>>> >>>>>> > >> >>>> >>>>>> > >> >>>> >>>>>> > >> On 3/12/2018 3:24 pm, JC Beyler wrote: >>>> >>>>>> > >>> Hi all, >>>> >>>>>> > >>> >>>> >>>>>> > >>> Would someone on the GC or runtime >>>> team >>>> >>>>>> be motivated to give >>>> >>>>>> > this a >>>> >>>>>> > >>> review? :) >>>> >>>>>> > >>> >>>> >>>>>> > >>> It would be much appreciated! >>>> >>>>>> > >>> >>>> >>>>>> > >>> Webrev: >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.02/ >>>> >>>>>> >>>> >>>>>> > >>> Bug: >>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8213501 >>>> >>>>>> > >>> >>>> >>>>>> > >>> Thanks for your help, >>>> >>>>>> > >>> Jc >>>> >>>>>> > >>> >>>> >>>>>> > >>> On Tue, Nov 27, 2018 at 4:36 PM JC >>>> Beyler >>>> >>>>>> >>> > >>>> >>>>>> > >>> >>>>>> > >>>> >>>>>> > >>> >>> >>>>>> >>>> >>>>>> >>> >>>>>> >>> wrote: >>>> >>>>>> > >>> >>>> >>>>>> > >>> Hi Chris, >>>> >>>>>> > >>> >>>> >>>>>> > >>> Yes I was waiting for another >>>> review >>>> >>>>>> since you had explicitly >>>> >>>>>> > >>> asked :) >>>> >>>>>> > >>> >>>> >>>>>> > >>> And sounds good that when someone >>>> >>>>>> from GC or runtime gives a >>>> >>>>>> > >>> review, >>>> >>>>>> > >>> I'll wait for your full review on >>>> the >>>> >>>>>> webrev.02! >>>> >>>>>> > >>> >>>> >>>>>> > >>> Thanks again for your help, >>>> >>>>>> > >>> Jc >>>> >>>>>> > >>> >>>> >>>>>> > >>> >>>> >>>>>> > >>> On Tue, Nov 27, 2018 at 12:48 PM >>>> >>>>>> Chris Plummer >>>> >>>>>> > >>> >>> >>>>>> >>>> >>>>>> >>> >>>>>> > >>>> >>>>>> > >>> >>>>>> >>>> >>>>>> >>> >>>>>> >>> >>>> >>>>>> > wrote: >>>> >>>>>> > >>> >>>> >>>>>> > >>> Hi JC, >>>> >>>>>> > >>> >>>> >>>>>> > >>> I think it would be good to >>>> get a >>>> >>>>>> review from the gc or >>>> >>>>>> > runtime >>>> >>>>>> > >>> teams, since this also affects >>>> >>>>>> their tests. 
>>>> >>>>>> > >>> >>>> >>>>>> > >>> Also, once we are settled on >>>> this >>>> >>>>>> FatalError approach, >>>> >>>>>> > I still >>>> >>>>>> > >>> need to give your webrev-02 a >>>> >>>>>> full review. I only >>>> >>>>>> > skimmed over >>>> >>>>>> > >>> parts of it (I did look at all >>>> >>>>>> the changes in webrevo-01). >>>> >>>>>> > >>> >>>> >>>>>> > >>> thanks, >>>> >>>>>> > >>> >>>> >>>>>> > >>> Chris >>>> >>>>>> > >>> >>>> >>>>>> > >>> On 11/27/18 8:58 AM, >>>> >>>>>> serguei.spitsyn at oracle.com >>>> >>>>>> >>>> >>>>>> > >>> >>>>>> > >>>> >>>>>> > >>> >>> >>>>>> >>>> >>>>>> > >>> >>>>>> >> wrote: >>>> >>>>>> > >>>> Hi Jc, >>>> >>>>>> > >>>> >>>> >>>>>> > >>>> I've already reviewed this >>>> too. >>>> >>>>>> > >>>> >>>> >>>>>> > >>>> Thanks, >>>> >>>>>> > >>>> Serguei >>>> >>>>>> > >>>> >>>> >>>>>> > >>>> >>>> >>>>>> > >>>> On 11/27/18 06:56, JC Beyler >>>> >>>>>> wrote: >>>> >>>>>> > >>>>> Thanks Chris, >>>> >>>>>> > >>>>> >>>> >>>>>> > >>>>> Anybody else motivated to look at >>>> this >>>> >>>>>> and review it? :) >>>> >>>>>> > >>>>> Jc >>>> >>>>>> > >>>>> >>>> >>>>>> > >>>>> On Mon, Nov 26, 2018 at >>>> 1:26 PM >>>> >>>>>> Chris Plummer >>>> >>>>>> > >>>>> >>> >>>>>> >>>> >>>>>> > >>> >>>>>> > >>>> >>>>>> >>> >>>>>> >>>> >>>>>> > >>> >>>>>> >>> >>>> >>>>>> > >>>>> wrote: >>>> >>>>>> > >>>>> >>>> >>>>>> > >>>>> Hi JC, >>>> >>>>>> > >>>>> >>>> >>>>>> > >>>>> I'm ok with the FatalError approach, >>>> >>>>>> but would >>>> >>>>>> > like to >>>> >>>>>> > >>>>> hear opinions from others also. 
>>>> >>>>>> > >>>>> >>>> >>>>>> > >>>>> thanks, >>>> >>>>>> > >>>>> >>>> >>>>>> > >>>>> Chris >>>> >>>>>> > >>>>> >>>> >>>>>> > >>>>> On 11/21/18 8:19 AM, JC Beyler >>>> wrote: >>>> >>>>>> > >>>>>> Hi Chris, >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> Thanks for taking the >>>> time >>>> >>>>>> to look at it and yes you >>>> >>>>>> > >>>>>> have raised exactly why >>>> >>>>>> the webrev is between two >>>> >>>>>> > >>>>>> worlds: in cases where >>>> a >>>> >>>>>> fatal error on failure is >>>> >>>>>> > >>>>>> wanted, should we >>>> simplify >>>> >>>>>> the code to remove >>>> >>>>>> > the return >>>> >>>>>> > >>>>>> tests since we do them >>>> >>>>>> internally? Now that I've >>>> >>>>>> > looked >>>> >>>>>> > >>>>>> around for non-fatal >>>> >>>>>> cases, I think the answer >>>> >>>>>> > is yes, >>>> >>>>>> > >>>>>> it simplifies the code >>>> >>>>>> while maintaining the checks. >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> I looked a bit and it >>>> >>>>>> seems that I can't find >>>> >>>>>> > easily a >>>> >>>>>> > >>>>>> case where the test >>>> >>>>>> accepts a JNI failure to >>>> >>>>>> > then move >>>> >>>>>> > >>>>>> on. Therefore, perhaps, >>>> >>>>>> for now, the fail with a >>>> >>>>>> > Fatal >>>> >>>>>> > >>>>>> is enough and we can >>>> work >>>> >>>>>> on the tests to clean >>>> >>>>>> > them up? 
>>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> That means that this is >>>> >>>>>> the new webrev with only >>>> >>>>>> > Fatal >>>> >>>>>> > >>>>>> and cleans up the >>>> tests so >>>> >>>>>> that it is no longer in >>>> >>>>>> > >>>>>> between two worlds: >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> Webrev: >>>> >>>>>> > >>>>>> >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.02/ >>>> >>>>>> >>>> >>>>>> > >>>>>> >>>> >>>>>> >>>> >>>>>> > >>>>>> Bug: >>>> >>>>>> > >>>> https://bugs.openjdk.java.net/browse/JDK-8213501 >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> (This passes testing >>>> on my >>>> >>>>>> dev machine for all the >>>> >>>>>> > >>>>>> modified tests) >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> with the example you >>>> >>>>>> provided, it now looks like: >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html >>>> >>>>>> >>>> >>>>>> < >>>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html> >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>> >>>>>> < >>>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html> >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> Where it does, to me at >>>> >>>>>> least, seem cleaner and less >>>> >>>>>> > >>>>>> "noisy". 
>>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> Let me know what you >>>> think, >>>> >>>>>> > >>>>>> Jc >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> On Tue, Nov 20, 2018 at >>>> >>>>>> 9:33 PM Chris Plummer >>>> >>>>>> > >>>>>> < >>>> chris.plummer at oracle.com >>>> >>>>>> >>>> >>>>>> > >>> >>>>>> > >>>> >>>>>> > >>>>>> >>> >>>>>> >>>> >>>>>> > >>> >>>>>> >>> wrote: >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> Hi JC, >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> Sorry about the >>>> delay. >>>> >>>>>> I had to go back an >>>> >>>>>> > look at >>>> >>>>>> > >>>>>> the initial 8210842 >>>> >>>>>> webrev and RFR thread to see >>>> >>>>>> > >>>>>> what this was >>>> >>>>>> initially all about. >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> In general the >>>> changes >>>> >>>>>> look good. >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> I don't have a good >>>> >>>>>> answer to your >>>> >>>>>> > >>>>>> FatalError/NonFatalError question. It >>>> makes >>>> >>>>>> > the code >>>> >>>>>> > >>>>>> a lot cleaner to >>>> use >>>> >>>>>> FatalError, but then it >>>> >>>>>> > is a >>>> >>>>>> > >>>>>> behavior change, >>>> and >>>> >>>>>> you also need to deal with >>>> >>>>>> > >>>>>> tests that >>>> >>>>>> intentionally induce errors (do >>>> >>>>>> > you have >>>> >>>>>> > >>>>>> an example of >>>> that). >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> In any case, right >>>> now >>>> >>>>>> your webrev seems to be >>>> >>>>>> > >>>>>> between two worlds. >>>> >>>>>> You are producing >>>> >>>>>> > FatalError, >>>> >>>>>> > >>>>>> but still checking >>>> >>>>>> results. 
Here's a good >>>> >>>>>> > example: >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html >>>> >>>>>> >>>> >>>>>> < >>>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html> >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>> >>>>>> < >>>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html> >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> I'm not sure if >>>> this >>>> >>>>>> is just a temporary >>>> >>>>>> > state until >>>> >>>>>> > >>>>>> it was decided >>>> which >>>> >>>>>> approach to take. >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> thanks, >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> Chris >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> On 11/20/18 2:14 >>>> PM, >>>> >>>>>> JC Beyler wrote: >>>> >>>>>> > >>>>>>> Hi all, >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> Chris thought it >>>> made >>>> >>>>>> sense to have more >>>> >>>>>> > eyes on >>>> >>>>>> > >>>>>>> this change than >>>> just >>>> >>>>>> serviceability as it will >>>> >>>>>> > >>>>>>> modify to tests >>>> that >>>> >>>>>> are not only >>>> >>>>>> > serviceability >>>> >>>>>> > >>>>>>> tests so I've >>>> moved >>>> >>>>>> this to conversation >>>> >>>>>> > here :) >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> For convenience, >>>> I've >>>> >>>>>> copy-pasted the >>>> >>>>>> > initial RFR: >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> Could I have a >>>> review >>>> >>>>>> for the extension and >>>> >>>>>> > usage >>>> >>>>>> > >>>>>>> of the >>>> >>>>>> ExceptionJniWrapper. 
This adds lines and >>>> >>>>>> > >>>>>>> filenames to the >>>> end >>>> >>>>>> of the wrapper JNI >>>> >>>>>> > methods, >>>> >>>>>> > >>>>>>> adds tracing, and >>>> >>>>>> throws an error if need >>>> >>>>>> > be. I've >>>> >>>>>> > >>>>>>> ported the gc/lock >>>> >>>>>> files to use the new >>>> >>>>>> > >>>>>>> TRACE_JNI_CALL >>>> add-on >>>> >>>>>> and I've ported a few >>>> >>>>>> > of the >>>> >>>>>> > >>>>>>> tests that were >>>> >>>>>> already changed for the >>>> >>>>>> > assignment >>>> >>>>>> > >>>>>>> webrev for >>>> >>>>>> JDK-8212884. >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> Webrev: >>>> >>>>>> > >>>>>>> >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.01 >>>> >>>>>> >>>> >>>>>> > >>>>>>> >>>> >>>>>> >>>> >>>>>> > >>>>>>> Bug: >>>> >>>>>> > >>>>>>> >>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8213501 >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> For illustration, >>>> if >>>> >>>>>> I force an error to the >>>> >>>>>> > >>>>>>> AP04/ap04t03 test >>>> and >>>> >>>>>> set the verbosity on, >>>> >>>>>> > I get >>>> >>>>>> > >>>>>>> something like: >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> >> Calling JNI >>>> method >>>> >>>>>> FindClass from >>>> >>>>>> > >>>>>>> ap04t003.cpp:343 >>>> >>>>>> > >>>>>>> >> Calling with >>>> these >>>> >>>>>> parameter(s): >>>> >>>>>> > >>>>>>> java/lang/Threadd >>>> >>>>>> > >>>>>>> Wait for thread >>>> to >>>> >>>>>> finish >>>> >>>>>> > >>>>>>> << Called JNI >>>> method >>>> >>>>>> FindClass from >>>> >>>>>> > >>>>>>> ap04t003.cpp:343 >>>> >>>>>> > >>>>>>> Exception in >>>> thread >>>> >>>>>> "Thread-0" >>>> >>>>>> > >>>>>>> java.lang.NoClassDefFoundError: >>>> >>>>>> > java/lang/Threadd >>>> >>>>>> > >>>>>>> at >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native >>>> >>>>>> >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> Method) >>>> >>>>>> > >>>>>>> at >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> 
nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140) >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> at >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201) >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> Caused by: >>>> >>>>>> java.lang.ClassNotFoundException: >>>> >>>>>> > >>>>>>> java.lang.Threadd >>>> >>>>>> > >>>>>>> at >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583) >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> at >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> at >>>> >>>>>> > >>>>>>> >>>> >>>>>> java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) >>>> >>>>>> > >>>>>>> ... 
3 more >>>> >>>>>> > >>>>>>> FATAL ERROR in >>>> native >>>> >>>>>> method: JNI method >>>> >>>>>> > FindClass >>>> >>>>>> > >>>>>>> : internal error >>>> from >>>> >>>>>> ap04t003.cpp:343 >>>> >>>>>> > >>>>>>> at >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native >>>> >>>>>> >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> Method) >>>> >>>>>> > >>>>>>> at >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140) >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> at >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201) >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> >>>> Questions/comments I >>>> >>>>>> have about this are: >>>> >>>>>> > >>>>>>> - Do we want to >>>> >>>>>> force fatal errors when a JNI >>>> >>>>>> > >>>>>>> call fails in >>>> >>>>>> general? Most of these tests >>>> >>>>>> > do the >>>> >>>>>> > >>>>>>> right thing and >>>> test >>>> >>>>>> the return of the JNI >>>> >>>>>> > calls, >>>> >>>>>> > >>>>>>> for example: >>>> >>>>>> > >>>>>>> thrClass = >>>> >>>>>> > jni->FindClass("java/lang/Threadd", >>>> >>>>>> > >>>>>>> TRACE_JNI_CALL); >>>> >>>>>> > >>>>>>> if (thrClass >>>> == >>>> >>>>>> NULL) { >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> but now the >>>> wrapper >>>> >>>>>> actually would do a >>>> >>>>>> > fatal if >>>> >>>>>> > >>>>>>> the FindClass call >>>> >>>>>> would return a nullptr, >>>> >>>>>> > so we >>>> >>>>>> > >>>>>>> could remove that >>>> >>>>>> test altogether. What do you >>>> >>>>>> > >>>>>>> think? 
>>>> >>>>>> > >>>>>>> - I prefer to >>>> >>>>>> leave them as the tests then >>>> >>>>>> > >>>>>>> become closer to >>>> what >>>> >>>>>> real users would have in >>>> >>>>>> > >>>>>>> their code and is >>>> the >>>> >>>>>> "recommended" way of >>>> >>>>>> > doing it >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> - The >>>> alternative >>>> >>>>>> is to use the >>>> >>>>>> > NonFatalError I >>>> >>>>>> > >>>>>>> added which then >>>> just >>>> >>>>>> prints out that something >>>> >>>>>> > >>>>>>> went wrong, >>>> letting >>>> >>>>>> the test continue. Question >>>> >>>>>> > >>>>>>> will be what >>>> should >>>> >>>>>> be the default? The >>>> >>>>>> > fatal or >>>> >>>>>> > >>>>>>> the non-fatal >>>> error >>>> >>>>>> handling? >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> On a different >>>> >>>>>> subject: >>>> >>>>>> > >>>>>>> - On the new >>>> tests, >>>> >>>>>> I've removed the >>>> >>>>>> > >>>>>>> NSK_JNI_VERIFY >>>> since >>>> >>>>>> the JNI wrapper >>>> >>>>>> > handles the >>>> >>>>>> > >>>>>>> tracing and the >>>> >>>>>> verify in almost the same >>>> >>>>>> > way; only >>>> >>>>>> > >>>>>>> difference I can >>>> >>>>>> really tell is that the >>>> >>>>>> > complain >>>> >>>>>> > >>>>>>> method from NSK >>>> has a >>>> >>>>>> max complain before >>>> >>>>>> > stopping >>>> >>>>>> > >>>>>>> to "complain"; I >>>> have >>>> >>>>>> not added that part >>>> >>>>>> > of the >>>> >>>>>> > >>>>>>> code in this >>>> webrev >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> Once we decide on >>>> >>>>>> these, I can continue on the >>>> >>>>>> > >>>>>>> files from >>>> >>>>>> JDK-8212884 and then do both the >>>> >>>>>> > >>>>>>> assignment in an >>>> if >>>> >>>>>> extraction followed-by this >>>> >>>>>> > >>>>>>> type of webrev in >>>> an >>>> >>>>>> easier fashion. >>>> >>>>>> > Depending on >>>> >>>>>> > >>>>>>> decisions here, >>>> >>>>>> NSK*VERIFY can be deprecated as >>>> >>>>>> > >>>>>>> well as we go >>>> forward. >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> Thanks! 
>>>> >>>>>> > >>>>>>> Jc >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> On Mon, Nov 19, >>>> 2018 >>>> >>>>>> at 11:34 AM Chris Plummer >>>> >>>>>> > >>>>>>> < >>>> chris.plummer at oracle.com >>>> >>>>>> >>>> >>>>>> > >>> >>>>>> > >>>> >>>>>> > >>>>>>> >>> >>>>>> >>>> >>>>>> > >>> >>>>>> >>> wrote: >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> On 11/19/18 >>>> 10:07 >>>> >>>>>> AM, JC Beyler wrote: >>>> >>>>>> > >>>>>>>> Hi all, >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> @David/Chris: >>>> >>>>>> should I then push this >>>> >>>>>> > RFR to >>>> >>>>>> > >>>>>>>> the hotspot >>>> >>>>>> mailing or the runtime >>>> >>>>>> > one? For >>>> >>>>>> > >>>>>>>> what it's >>>> worth, >>>> >>>>>> a lot of the tests >>>> >>>>>> > under the >>>> >>>>>> > >>>>>>>> vmTestbase >>>> are >>>> >>>>>> jvmti so the review also >>>> >>>>>> > >>>>>>>> affects >>>> >>>>>> serviceability; it just turns >>>> >>>>>> > out I >>>> >>>>>> > >>>>>>>> started with >>>> the >>>> >>>>>> GC originally and >>>> >>>>>> > then hit >>>> >>>>>> > >>>>>>>> some other >>>> tests >>>> >>>>>> I had touched via the >>>> >>>>>> > >>>>>>>> assignment >>>> >>>>>> extraction. >>>> >>>>>> > >>>>>>> I think >>>> hotspot >>>> >>>>>> would be best. >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> Chris >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> @Serguei: >>>> Done >>>> >>>>>> for the method >>>> >>>>>> > renaming, for >>>> >>>>>> > >>>>>>>> the indent, >>>> are >>>> >>>>>> you talking about >>>> >>>>>> > going from >>>> >>>>>> > >>>>>>>> the 8-indent >>>> to >>>> >>>>>> 4-indent? If so, would >>>> >>>>>> > it not >>>> >>>>>> > >>>>>>>> just be >>>> better >>>> >>>>>> to do a new JBS bug and >>>> >>>>>> > do the >>>> >>>>>> > >>>>>>>> whole files >>>> in >>>> >>>>>> one go? I ask because >>>> >>>>>> > >>>>>>>> otherwise, it >>>> >>>>>> will look a bit weird to >>>> >>>>>> > have >>>> >>>>>> > >>>>>>>> parts of the >>>> >>>>>> file as 8-indent and others >>>> >>>>>> > >>>>>>>> 4-indent? 
>>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> Thanks for >>>> >>>>>> looking at it! >>>> >>>>>> > >>>>>>>> Jc >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> On Mon, Nov >>>> 19, >>>> >>>>>> 2018 at 1:25 AM >>>> >>>>>> > >>>>>>>> serguei.spitsyn at oracle.com >>>> >>>>>> >>>> >>>>>> >>> >>>>>> > >>>> >>>>>> > >>>>>>>> >>> serguei.spitsyn at oracle.com >>>> >>>>>> >>>> >>>>>> > >>> >>>>>> >> >>>> >>>>>> > >>>>>>>> >>> >>>>>> >>>> >>>>>> > >>> >>>>>> > >>>> >>>>>> > >>>>>>>> >>> serguei.spitsyn at oracle.com >>>> >>>>>> >>>> >>>>>> > >>> >>>>>> >>> wrote: >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> Hi Jc, >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> We have >>>> to >>>> >>>>>> start this review >>>> >>>>>> > anyway. :) >>>> >>>>>> > >>>>>>>> It looks >>>> >>>>>> good to me in general. >>>> >>>>>> > >>>>>>>> Thank you >>>> >>>>>> for your consistency in this >>>> >>>>>> > >>>>>>>> >>>> refactoring! >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> Some >>>> minor >>>> >>>>>> comments. >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.00/test/hotspot/jtreg/vmTestbase/nsk/share/jni/ExceptionCheckingJniEnv.cpp.udiff.html >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> +static >>>> >>>>>> const char* >>>> >>>>>> > remove_folders(const >>>> >>>>>> > >>>>>>>> char* >>>> >>>>>> fullname) { I'd suggest to >>>> >>>>>> > rename >>>> >>>>>> > >>>>>>>> the >>>> function >>>> >>>>>> name to something >>>> >>>>>> > traditional >>>> >>>>>> > >>>>>>>> like >>>> >>>>>> get_basename. Otherwise, it >>>> >>>>>> > sounds >>>> >>>>>> > >>>>>>>> like this >>>> >>>>>> function has to really >>>> >>>>>> > remove >>>> >>>>>> > >>>>>>>> folders. >>>> :) >>>> >>>>>> Also, all *Locker.cpp have >>>> >>>>>> > >>>>>>>> wrong >>>> indent >>>> >>>>>> in the bodies of if >>>> >>>>>> > and while >>>> >>>>>> > >>>>>>>> >>>> statements. 
>>>> >>>>>> Could this be fixed >>>> >>>>>> > with the >>>> >>>>>> > >>>>>>>> >>>> refactoring? >>>> >>>>>> I did not look on how >>>> >>>>>> > this >>>> >>>>>> > >>>>>>>> impacts >>>> the >>>> >>>>>> tests other than >>>> >>>>>> > >>>>>>>> serviceability. >>>> Thanks, >>>> >>>>>> Serguei >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> On >>>> 11/16/18 >>>> >>>>>> 19:43, JC Beyler wrote: >>>> >>>>>> > >>>>>>>>> Hi all, >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Anybody >>>> >>>>>> motivated to review this? :) >>>> >>>>>> > >>>>>>>>> Jc >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> On Wed, Nov >>>> 7, >>>> >>>>>> 2018 at 9:53 PM JC >>>> >>>>>> > Beyler >>>> >>>>>> > >>>>>>>>> >>> >>>>>> >>>> >>>>>> > >>> >>>>>> > >>>> >>>>>> > >>>>>>>>> >>> >>>>>> >>>> >>>>>> > >>> >>>>>> >>> wrote: >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Hi all, >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Could I >>>> have >>>> >>>>>> a review for the >>>> >>>>>> > >>>>>>>>> extension >>>> >>>>>> and usage of the >>>> >>>>>> > >>>>>>>>> ExceptionJniWrapper. This >>>> >>>>>> > adds lines >>>> >>>>>> > >>>>>>>>> and >>>> >>>>>> filenames to the end of the >>>> >>>>>> > >>>>>>>>> wrapper >>>> JNI >>>> >>>>>> methods, adds >>>> >>>>>> > tracing, >>>> >>>>>> > >>>>>>>>> and >>>> throws >>>> >>>>>> an error if need >>>> >>>>>> > be. I've >>>> >>>>>> > >>>>>>>>> ported >>>> the >>>> >>>>>> gc/lock files to >>>> >>>>>> > use the >>>> >>>>>> > >>>>>>>>> new >>>> >>>>>> TRACE_JNI_CALL add-on and >>>> >>>>>> > I've >>>> >>>>>> > >>>>>>>>> ported a >>>> few >>>> >>>>>> of the tests >>>> >>>>>> > that were >>>> >>>>>> > >>>>>>>>> already >>>> >>>>>> changed for the >>>> >>>>>> > assignment >>>> >>>>>> > >>>>>>>>> webrev >>>> for >>>> >>>>>> JDK-8212884. 
>>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Webrev: >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.00/ >>>> >>>>>> >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> >>>> >>>>>> > >>>>>>>>> Bug: >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8213501 >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> For >>>> >>>>>> illustration, if I force >>>> >>>>>> > an error >>>> >>>>>> > >>>>>>>>> to the >>>> >>>>>> AP04/ap04t03 test and >>>> >>>>>> > set the >>>> >>>>>> > >>>>>>>>> verbosity >>>> >>>>>> on, I get something >>>> >>>>>> > like: >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> >> >>>> Calling >>>> >>>>>> JNI method >>>> >>>>>> > FindClass from >>>> >>>>>> > >>>>>>>>> ap04t003.cpp:343 >>>> >>>>>> > >>>>>>>>> >> >>>> Calling >>>> >>>>>> with these >>>> >>>>>> > parameter(s): >>>> >>>>>> > >>>>>>>>> java/lang/Threadd >>>> >>>>>> > >>>>>>>>> Wait for >>>> >>>>>> thread to finish >>>> >>>>>> > >>>>>>>>> << Called >>>> >>>>>> JNI method >>>> >>>>>> > FindClass from >>>> >>>>>> > >>>>>>>>> ap04t003.cpp:343 >>>> >>>>>> > >>>>>>>>> >>>> Exception in >>>> >>>>>> thread "Thread-0" >>>> >>>>>> > >>>>>>>>> java.lang.NoClassDefFoundError: >>>> >>>>>> > >>>>>>>>> java/lang/Threadd >>>> >>>>>> > >>>>>>>>> at >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native >>>> >>>>>> >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Method) >>>> >>>>>> > >>>>>>>>> at >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140) >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> at >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201) >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Caused >>>> by: >>>> >>>>>> > >>>>>>>>> java.lang.ClassNotFoundException: >>>> 
>>>>>> > >>>>>>>>> java.lang.Threadd >>>> >>>>>> > >>>>>>>>> at >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583) >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> at >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> at >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>> >>>>>> java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) >>>> >>>>>> > >>>>>>>>> ... 3 >>>> more >>>> >>>>>> > >>>>>>>>> FATAL >>>> ERROR >>>> >>>>>> in native method: JNI >>>> >>>>>> > >>>>>>>>> method >>>> >>>>>> FindClass : internal error >>>> >>>>>> > >>>>>>>>> from >>>> >>>>>> ap04t003.cpp:343 >>>> >>>>>> > >>>>>>>>> at >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native >>>> >>>>>> >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Method) >>>> >>>>>> > >>>>>>>>> at >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140) >>>> >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> at >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>> >>>>>> >>>> nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201) >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Questions/comments I have about >>>> >>>>>> > >>>>>>>>> this are: >>>> >>>>>> > >>>>>>>>> - Do we >>>> >>>>>> want to force fatal >>>> >>>>>> > errors >>>> >>>>>> > >>>>>>>>> when a >>>> JNI >>>> >>>>>> call fails in general? 
>>>> >>>>>> > >>>>>>>>> Most of >>>> >>>>>> these tests do the right >>>> >>>>>> > >>>>>>>>> thing and >>>> >>>>>> test the return of >>>> >>>>>> > the JNI >>>> >>>>>> > >>>>>>>>> calls, >>>> for >>>> >>>>>> example: >>>> >>>>>> > >>>>>>>>> thrClass >>>> = >>>> >>>>>> > >>>>>>>>> jni->FindClass("java/lang/Threadd", >>>> >>>>>> > >>>>>>>>> TRACE_JNI_CALL); >>>> >>>>>> > >>>>>>>>> if >>>> >>>>>> (thrClass == NULL) { >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> but now >>>> the >>>> >>>>>> wrapper actually >>>> >>>>>> > would do >>>> >>>>>> > >>>>>>>>> a fatal >>>> if >>>> >>>>>> the FindClass call >>>> >>>>>> > would >>>> >>>>>> > >>>>>>>>> return a >>>> >>>>>> nullptr, so we could >>>> >>>>>> > remove >>>> >>>>>> > >>>>>>>>> that test >>>> >>>>>> altogether. What do >>>> >>>>>> > you >>>> >>>>>> > >>>>>>>>> think? >>>> >>>>>> > >>>>>>>>> - I >>>> >>>>>> prefer to leave them >>>> >>>>>> > as the >>>> >>>>>> > >>>>>>>>> tests >>>> then >>>> >>>>>> become closer to >>>> >>>>>> > what real >>>> >>>>>> > >>>>>>>>> users >>>> would >>>> >>>>>> have in their >>>> >>>>>> > code and is >>>> >>>>>> > >>>>>>>>> the >>>> >>>>>> "recommended" way of doing it >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> - The >>>> >>>>>> alternative is to >>>> >>>>>> > use the >>>> >>>>>> > >>>>>>>>> NonFatalError I >>>> added >>>> >>>>>> which >>>> >>>>>> > then just >>>> >>>>>> > >>>>>>>>> prints >>>> out >>>> >>>>>> that something >>>> >>>>>> > went wrong, >>>> >>>>>> > >>>>>>>>> letting >>>> the >>>> >>>>>> test continue. >>>> >>>>>> > Question >>>> >>>>>> > >>>>>>>>> will be >>>> what >>>> >>>>>> should be the >>>> >>>>>> > default? >>>> >>>>>> > >>>>>>>>> The >>>> fatal or >>>> >>>>>> the non-fatal error >>>> >>>>>> > >>>>>>>>> handling? 
>>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> On a >>>> >>>>>> different subject: >>>> >>>>>> > >>>>>>>>> - On >>>> the >>>> >>>>>> new tests, I've >>>> >>>>>> > removed >>>> >>>>>> > >>>>>>>>> the >>>> >>>>>> NSK_JNI_VERIFY since the JNI >>>> >>>>>> > >>>>>>>>> wrapper >>>> >>>>>> handles the tracing >>>> >>>>>> > and the >>>> >>>>>> > >>>>>>>>> verify in >>>> >>>>>> almost the same >>>> >>>>>> > way; only >>>> >>>>>> > >>>>>>>>> >>>> difference I >>>> >>>>>> can really tell >>>> >>>>>> > is that >>>> >>>>>> > >>>>>>>>> the >>>> complain >>>> >>>>>> method from NSK >>>> >>>>>> > has a >>>> >>>>>> > >>>>>>>>> max >>>> complain >>>> >>>>>> before stopping to >>>> >>>>>> > >>>>>>>>> >>>> "complain"; >>>> >>>>>> I have not added that >>>> >>>>>> > >>>>>>>>> part of >>>> the >>>> >>>>>> code in this webrev >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Once we >>>> >>>>>> decide on these, I can >>>> >>>>>> > >>>>>>>>> continue >>>> on >>>> >>>>>> the files from >>>> >>>>>> > >>>>>>>>> >>>> JDK-8212884 >>>> >>>>>> and then do both the >>>> >>>>>> > >>>>>>>>> >>>> assignment >>>> >>>>>> in an if extraction >>>> >>>>>> > >>>>>>>>> >>>> followed-by >>>> >>>>>> this type of >>>> >>>>>> > webrev in an >>>> >>>>>> > >>>>>>>>> easier >>>> >>>>>> fashion. Depending on >>>> >>>>>> > >>>>>>>>> decisions >>>> >>>>>> here, NSK*VERIFY can be >>>> >>>>>> > >>>>>>>>> >>>> deprecated >>>> >>>>>> as well as we go >>>> >>>>>> > forward. 
>>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> Thank you >>>> >>>>>> for the >>>> >>>>>> > reviews/comments :) >>>> >>>>>> > >>>>>>>>> Jc >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>>>>>>> -- >>>> >>>>>> > >>>>>>>>> Thanks, >>>> >>>>>> > >>>>>>>>> Jc >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>>>>>> -- >>>> >>>>>> > >>>>>>>> Thanks, >>>> >>>>>> > >>>>>>>> Jc >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> >>>> >>>>>> > >>>>>>> -- >>>> >>>>>> > >>>>>>> Thanks, >>>> >>>>>> > >>>>>>> Jc >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>>>> -- >>>> >>>>>> > >>>>>> Thanks, >>>> >>>>>> > >>>>>> Jc >>>> >>>>>> > >>>>> >>>> >>>>>> > >>>>> >>>> >>>>>> > >>>>> >>>> >>>>>> > >>>>> >>>> >>>>>> > >>>>> -- >>>> >>>>>> > >>>>> Thanks, >>>> >>>>>> > >>>>> Jc >>>> >>>>>> > >>>> >>>> >>>>>> > >>> >>>> >>>>>> > >>> >>>> >>>>>> > >>> >>>> >>>>>> > >>> -- >>>> >>>>>> > >>> Thanks, >>>> >>>>>> > >>> Jc >>>> >>>>>> > >>> >>>> >>>>>> > >>> >>>> >>>>>> > >>> >>>> >>>>>> > >>> -- >>>> >>>>>> > >>> >>>> >>>>>> > >>> Thanks, >>>> >>>>>> > >>> Jc >>>> >>>>>> > > >>>> >>>>>> > > >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> > -- >>>> >>>>>> > >>>> >>>>>> > Thanks, >>>> >>>>>> > Jc >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> -- >>>> >>>>>> Thanks, >>>> >>>>>> Jc >>>> >>>>> >>>> >>>>> >>>> >>>>> >>>> >>>>> -- >>>> >>>>> Thanks, >>>> >>>>> Jc >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> -- >>>> >>>> Thanks, >>>> >>>> Jc >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> -- >>>> >>>> >>>> >>>> Thanks, >>>> >>>> Jc >>>> >>> >>>> > >>>> >>> >>> >>> -- >>> >>> Thanks, >>> Jc >>> >> >> >> -- >> >> Thanks, >> Jc >> > > > -- > > Thanks, > Jc > -- Thanks, Jc From robbin.ehn at oracle.com Tue Jan 22 15:39:26 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 22 Jan 2019 16:39:26 +0100 Subject: RFR(XL): 8203469: Faster 
safepoints
In-Reply-To:
References:
Message-ID: <5d6d97d6-fbda-28f4-b625-a9ae351af5ba@oracle.com>

Hi all, here is v01 and v02.

v01 contains updates after comments from the list:
http://cr.openjdk.java.net/~rehn/8203469/v01/
http://cr.openjdk.java.net/~rehn/8203469/v01/inc/

v02 contains a bug fix, explained below:
http://cr.openjdk.java.net/~rehn/8203469/v02/
http://cr.openjdk.java.net/~rehn/8203469/v02/inc/

Patricio had some good questions about try_stable_load_state.
In previous internal versions I did the stable load by loading the thread state
before and after the safepoint id. For some reason I changed this to the reverse
during a refactoring, which is incorrect.

Consider the following:

JavaThread: state / safepoint id / poll | VMThread: global state / safepoint counter / WaitBarrier
########################################|################################
_thread_in_native / 0 / disarmed        | _not_synchronized / 0 / disarmed
                                        | _not_synchronized / 0 / armed(1)
                                        | _not_synchronized / 1 / armed(1)
                                        | _synchronizing / 1 / armed(1)
_thread_in_native / 0 / armed           |
                                        |
                                        |
                                        |
                                        | _synchronized / 1 / armed(1)
                                        |
_thread_in_native_trans / 0 / armed     |
                                        |
                                        |
                                        | _not_synchronized / 1 / armed(1)
                                        | _not_synchronized / 2 / armed(1)
_thread_in_native_trans / 0 / disarmed  |
                                        | _not_synchronized / 2 / disarmed
Next safepoint starts:
                                        | _not_synchronized / 2 / armed(3)
                                        | _not_synchronized / 3 / armed(3)
                                        | _synchronizing / 3 / armed(3)
_thread_in_native_trans / 0 / armed     |
                                        |
                                        |
                                        |
_thread_in_native_trans / 1 / armed     |
_thread_blocked / 1 / armed             |
                                        |
                                        |
_thread_in_native_trans / 1 / armed     |
_thread_in_native_trans / 0 / armed     |
                                        |

A false positive is read.
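The intended ordering (thread state, then safepoint id, then thread state again) can be sketched roughly as below. This is a simplified standalone C++ illustration, not the code from the webrev: the `ThreadSnapshot` struct, the `is_safe` helper, the enum values, and the use of `std::atomic` are all assumptions made for the example, and the externally suspended flag is left out entirely.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the HotSpot thread states named in the mail.
enum ThreadState {
  thread_in_Java,
  thread_in_vm,
  thread_in_native,
  thread_in_native_trans,
  thread_blocked
};

// 0 plays the role of the "inactive" safepoint id (assumed name).
constexpr uint64_t InactiveSafepointCounter = 0;

// Illustrative per-thread data that the VMThread would inspect.
struct ThreadSnapshot {
  std::atomic<ThreadState> state{thread_in_Java};
  std::atomic<uint64_t> safepoint_id{InactiveSafepointCounter};
};

// Load the state, then the safepoint id, then the state again.
// Succeeds only if the id is 0 or the current safepoint's id and
// both state loads agree; anything else is treated as unsafe.
bool try_stable_load_state(const ThreadSnapshot& t, uint64_t current_id,
                           ThreadState* out) {
  ThreadState first = t.state.load();
  uint64_t id = t.safepoint_id.load();
  if (id != InactiveSafepointCounter && id != current_id) {
    return false;  // id belongs to some other safepoint
  }
  ThreadState second = t.state.load();
  if (first != second) {
    return false;  // state changed while we were looking
  }
  *out = first;
  return true;
}

// A stable state is safe only if it is in-native or blocked.
bool is_safe(const ThreadSnapshot& t, uint64_t current_id) {
  ThreadState s;
  if (!try_stable_load_state(t, current_id, &s)) {
    return false;
  }
  return s == thread_in_native || s == thread_blocked;
}
```

In this sketch a thread whose safepoint id is left over from an earlier safepoint is never reported as safe, whatever state it appears to be in, which is the false positive the reversed load order allowed.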

When done the correct way, the safe matrix looks like:

State load 1      | Safepoint id | State load 2     | Result
##################|##############|##################|#######
any               | !0/current   | any              | treat all as unsafe
any               | any          | !state1          | treat all as unsafe
any               | 0/current    | state1           | suspend flag is safe
thread_in_native  | 0/current    | thread_in_native | safe
thread_in_blocked | 0/current    | thread_in_blocked| safe
!thread_in_blocked && !thread_in_native | 0/current | state1 | unsafe

For the blocked/0/blocked case I added this comment:

// To handle the thread_blocked state on the backedge of the WaitBarrier from
// previous safepoint and reading the resetted (0/InactiveSafepointCounter) we
// re-read state after we read thread safepoint id. The JavaThread changes it
// state before resetting, the second read will either see a different thread
// state making this an unsafe state or it can see blocked again.
// When we see blocked twice with a 0 safepoint id, either:
// - It is normally blocked, e.g. on Mutex, TBIVM.
// - It was in SS:block(), looped around to SS:block() and is blocked on the WaitBarrier.
// - It was in SS:block() but now on a Mutex.
// Either case safe.

I hope the above explains why loading the state before and after the safepoint
id is sufficient.

Passes, with flying colors, t1-5, stress test, KS 24h stress.

Thanks, Robbin

On 1/15/19 11:39 AM, Robbin Ehn wrote:
> Hi all, please review.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469
> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/
>
> Thanks to Dan for pre-reviewing a lot!
>
> Background:
> ZGC often does very short safepoint operations. For perspective, in a
> specJBB2015 run, G1 can have young collection stops lasting about 170 ms, while
> in the same setup ZGC does 0.2 ms to 1.5 ms operations depending on which
> operation it is. The time it takes to stop and start the JavaThreads is
> very large relative to a ZGC safepoint.
> With an operation that just takes 0.2 ms the
> overhead of stopping and starting JavaThreads is several times the operation.
>
> High-level functionality change:
> Serializing the starting over Threads_lock takes time.
> - Don't wait on Threads_lock; use the WaitBarrier.
> Serializing the stopping over Safepoint_lock takes time.
> - Let threads stop in parallel, remove Safepoint_lock.
>
> Details:
> JavaThreads have 2 abstract logical states: unsafe or safe.
> - Safe means the JavaThread will not touch Java heap or VM internal structures
>   without doing a transition and blocking before doing so.
>         - The safe states are:
>                 - When polls armed: _thread_in_native and _thread_blocked.
>                 - When Threads_lock is held: externally suspended flag is set.
>         - The VM thread has polls armed and holds the Threads_lock during a
>           safepoint.
> - Unsafe means that either Java heap or VM internal structures can be accessed
>   by the JavaThread, e.g., _thread_in_Java, _thread_in_vm.
>         - All combinations that are not safe are unsafe.
>
> We cannot start a safepoint until all unsafe threads have transitioned to a safe
> state. To make them safe, we arm polls in compiled code and make sure any
> transition to another unsafe state will be blocked. JavaThreads which are unsafe
> with state _thread_in_Java may transition to _thread_in_native without being
> blocked, since the thread just became safe and we can proceed. Any safe thread
> may try to transition at any time to an unsafe state, thus coming into the
> safepoint blocking code at any moment, e.g., after the safepoint is over, or
> even at the beginning of the next safepoint.
>
> The VMThread cannot tolerate false positives from the JavaThread thread state
> because that would mean starting the safepoint without all JavaThreads being
> safe.
> The two locks (Threads_lock and Safepoint_lock) make sure we never observe
> false positives from the safepoint blocking code. If we remove them, how do we
> handle false positives?
>
> By first publishing which barrier tag (safepoint counter) we will call
> WaitBarrier.wait() with as the thread's safepoint id and then changing the state to
> _thread_blocked, the VMThread can ignore JavaThreads by doing a stable load of
> the state. A stable load of the thread state is successful if the thread
> safepoint id is the same both before and after the load of the state and the
> safepoint id is current or InactiveSafepointCounter. If the stable load fails,
> the thread is considered safepoint unsafe. It's no longer enough that the thread
> has state _thread_blocked; it must also have the correct safepoint id before and
> after we read the state.
>
> Performance:
> The result of faster safepoints is that the average CPU time for JavaThreads
> between safepoints is higher, thus increasing the allocation rate. The thread
> that stops first waits a shorter time until it gets started. Even the thread that
> stops last also has a shorter stop since we start them faster. If your
> application is using a concurrent GC it may need re-tuning since each java
> worker thread has an increased CPU time/allocation rate. Often this means max
> performance is achieved using slightly fewer java worker threads than before.
> Also the increased allocation rate means shorter time between GC safepoints.
> - If you are using a non-concurrent GC, you should see improved latency and
>   throughput.
> - After re-tuning with a concurrent GC throughput should be equal or better but
>   with better latency. But bear in mind this is a latency patch, not a
>   throughput one.
> With current code a java thread is not guaranteed to run between safepoints (in
> theory a java thread can be starved indefinitely), since the VM thread may
> re-grab the Threads_lock before it woke up from the previous safepoint.
> If the GC/VM doesn't respect MMU (minimum mutator utilization) or if your
> machine is very over-provisioned this can happen.
> The current scheme thus re-safepoints quickly if the java threads have not
> started yet, at the cost of latency. Since the new code uses the WaitBarrier with
> the safepoint counter, all threads must roll forward to the next safepoint by
> getting at least some CPU time between two safepoints. Meaning MMU violations
> are more obvious.
>
> Some example numbers:
> - On a 16 strand machine synchronization and un-synchronization/starting is at
>   least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and
>   starting ~400->~100us.
>   (Semaphore path is a bit slower than futex in the WaitBarrier on Linux).
> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster
>   synchronization time on 16 strands and ~5% score increase. In this case the GC
>   op is 1ms, so we reduce the overhead of synchronization from 100% to 10%.
> - specJBB2015 ParGC ~9% increase in critical-jops.
>
> Thanks, Robbin

From lois.foltan at oracle.com  Tue Jan 22 16:10:13 2019
From: lois.foltan at oracle.com (Lois Foltan)
Date: Tue, 22 Jan 2019 11:10:13 -0500
Subject: RFR (S) JDK-8216970: condy causes JVM crash
In-Reply-To:
References:
Message-ID:

Updated webrev that includes preliminary comments from John Rose.

open webrev at: http://cr.openjdk.java.net/~lfoltan/bug_jdk8216970.2/webrev/

Thanks,
Lois

On 1/18/2019 1:50 PM, Lois Foltan wrote:
> Please review this change that allows escape analysis to correctly
> handle a dynamic constant whose return type is an array.
>
> open webrev at:
> http://cr.openjdk.java.net/~lfoltan/bug_jdk8216970.1/webrev/
> bug link: https://bugs.openjdk.java.net/browse/JDK-8216970
>
> Testing: hs-tier1-3, jdk-tier1-3 (all platforms).
> hs-tier4-5 (linux only)
>
> Thanks,
> Lois
>
>

From jcbeyler at google.com  Tue Jan 22 18:29:21 2019
From: jcbeyler at google.com (JC Beyler)
Date: Tue, 22 Jan 2019 10:29:21 -0800
Subject: RFR (L) 8213501 : Deploy ExceptionJniWrapper for a few tests
In-Reply-To:
References: <895ef766-9c96-7185-4222-178379629ce4@oracle.com>
 <04a464fa-c1c8-5d86-3633-0b532840561c@oracle.com>
 <7ef06464-a614-8941-bb51-ce1c467889b2@oracle.com>
 <45341168-e7e0-90d1-449f-210500882b8f@oracle.com>
 <55283958-de3d-07f2-51e3-ad34c5046a96@oracle.com>
 <31613f88-5f7d-938d-e9f6-69cdaf857268@oracle.com>
 <839301b7-c247-df3b-e485-283e8bb7388b@oracle.com>
 <95fe277d-ba6e-4fec-77aa-d1f1051751aa@oracle.com>
 <72bf2f4a-5bf7-98de-5f00-68485072923d@oracle.com>
Message-ID:

Thanks Paul!

Anybody else for the review for version 6?

Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/
Bug: https://bugs.openjdk.java.net/browse/JDK-8213501

Thanks,
Jc

On Tue, Jan 22, 2019 at 6:10 AM Hohensee, Paul wrote:

> Lgtm :)
>
> Paul
>
> On 1/14/19, 7:46 AM, "hotspot-dev on behalf of JC Beyler" <hotspot-dev-bounces at openjdk.java.net on behalf of jcbeyler at google.com> wrote:
>
> Hi all,
>
> Friendly ping on this one, I know that it has been a long process with
> back and forths, for which I apologize...
>
> But is there any way I could get a final LGTM for version 6?
>
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8213501
>
> Thanks!
> Jc
>
> On Tue, Jan 8, 2019 at 10:05 AM JC Beyler wrote:
>
> > Happy new year all!
> >
> > Could I get a final LGTM for version 6?
> >
> > Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8213501
> >
> > Thanks!
> > Jc
> >
> > On Mon, Dec 17, 2018 at 8:43 AM JC Beyler wrote:
> >
> >> Hi all,
> >>
> >> I don't believe I got actual LGTM for this version:
> >>
> >>
> >> Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/
> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8213501
> >>
> >>
> >> It removed the namespaces and uses explicit static instead :)
> >>
> >> Thanks!
> >> Jc
> >>
> >> On Wed, Dec 12, 2018 at 8:06 PM JC Beyler wrote:
> >>
> >>> So did I Alexey but with David & Serguei preferring static, it seems
> >>> more reasonable to go down their route :-)
> >>>
> >>> So here is the latest webrev with static instead of an anonymous
> >>> namespace:
> >>>
> >>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/
> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8213501
> >>>
> >>> Let me know what you think, can I get a webrev 06 review?
> >>>
> >>> Thanks!
> >>> Jc
> >>>
> >>> On Wed, Dec 12, 2018 at 3:10 PM Alex Menkov <alexey.menkov at oracle.com>
> >>> wrote:
> >>>
> >>>> Hm..
> >>>> I considered unnamed namespaces "C++ style" (and static globals as "C
> >>>> style").
> >>>> Static globals were deprecated in C++ (but some time ago the
> >>>> deprecation was reverted).
> >>>>
> >>>> --alex
> >>>>
> >>>> On 12/12/2018 13:55, serguei.spitsyn at oracle.com wrote:
> >>>> > Agreed.
> >>>> >
> >>>> > Thanks,
> >>>> > Serguei
> >>>> >
> >>>> >
> >>>> > On 12/12/18 13:52, David Holmes wrote:
> >>>> >> FWIW I think namespaces are overkill in all of this test code and just
> >>>> >> obfuscates things - the declaration is easily missed. A static
> >>>> >> variable in a .cpp is clearly a global variable to the file.
> >>>> >>
> >>>> >> Cheers,
> >>>> >> David
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> On 13/12/2018 5:37 am, serguei.spitsyn at oracle.com wrote:
> >>>> >>> Hi Jc,
> >>>> >>>
> >>>> >>>
> >>>> >>> On 12/11/18 21:16, JC Beyler wrote:
> >>>> >>>> Hi all,
> >>>> >>>>
> >>>> >>>> Here is the new webrev with the TEST.groups change. Serguei, let me
> >>>> >>>> know if I convinced you with the static vs anonymous namespaces or
> >>>> >>>> if you'd still rather have a "static" for now :-)
> >>>> >>>
> >>>> >>>
> >>>> >>> What do you think about this post? :
> >>>> >>> https://stackoverflow.com/questions/11623451/static-vs-non-static-variables-in-namespace
> >>>> >>>
> >>>> >>>
> >>>> >>>>
> >>>> >>>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.05/
> >>>> >>>>
> >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8213501
> >>>> >>>
> >>>> >>> The update looks fine.
> >>>> >>>
> >>>> >>> Thanks,
> >>>> >>> Serguei
> >>>> >>>
> >>>> >>>
> >>>> >>> Thanks,
> >>>> >>> Serguei
> >>>> >>>
> >>>> >>>>
> >>>> >>>> Thanks again for the reviews!
> >>>> >>>> Jc
> >>>> >>>>
> >>>> >>>> On Mon, Dec 10, 2018 at 3:10 PM JC Beyler <jcbeyler at google.com> wrote:
> >>>> >>>>
> >>>> >>>> Hi Serguei,
> >>>> >>>>
> >>>> >>>> Yes basically it is equivalent :) I can put them in but they are
> >>>> >>>> not required. The norm actually wanted to deprecate it but then
> >>>> >>>> remembered that C compatibility would require the static key-word
> >>>> >>>> for this case [1]
> >>>> >>>>
> >>>> >>>> So, really, they are not required here and will amount to the same
> >>>> >>>> thing: only that file can refer to them and you cannot get to them
> >>>> >>>> without a globally available method to return a pointer to them
> >>>> >>>> (ie same as a static variable in C).
> >>>> >>>>
> >>>> >>>> I can put static if it makes it easier to see but, by being in an
> >>>> >>>> anonymous namespace they are only available for the file's
> >>>> >>>> translation unit. For example:
> >>>> >>>>
> >>>> >>>> $ cat main.cpp
> >>>> >>>>
> >>>> >>>> int totally_global;
> >>>> >>>> static int explicitly_static;
> >>>> >>>>
> >>>> >>>> namespace {
> >>>> >>>>   int implicitly_static;
> >>>> >>>> }
> >>>> >>>>
> >>>> >>>> void foo();
> >>>> >>>> int main() {
> >>>> >>>>   foo();
> >>>> >>>> }
> >>>> >>>>
> >>>> >>>> $ g++ -O3 main.cpp -c
> >>>> >>>> $ nm main.o
> >>>> >>>>                  U _GLOBAL_OFFSET_TABLE_
> >>>> >>>> 0000000000000000 T main
> >>>> >>>> 0000000000000000 B totally_global
> >>>> >>>>                  U _Z3foov
> >>>> >>>>
> >>>> >>>> As you can see, the static and anonymous namespace variables are
> >>>> >>>> not in the file due to not being used. If you were to use them,
> >>>> >>>> you'd see them show up as something like:
> >>>> >>>> 0000000000000008 b _ZL17explicitly_static
> >>>> >>>> 0000000000000004 b _ZN12_GLOBAL__N_117implicitly_staticE
> >>>> >>>>
> >>>> >>>> Where again, it shows that it is mangling the names so that no
> >>>> >>>> external usage can happen without tinkering.
> >>>> >>>>
> >>>> >>>> Hopefully that helps :-),
> >>>> >>>> Jc
> >>>> >>>>
> >>>> >>>> [1]
> >>>> >>>> http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1012
> >>>> >>>>
> >>>> >>>>
> >>>> >>>> On Mon, Dec 10, 2018 at 2:04 PM serguei.spitsyn at oracle.com
> >>>> >>>> <serguei.spitsyn at oracle.com> wrote:
> >>>> >>>>
> >>>> >>>> Hi Jc,
> >>>> >>>>
> >>>> >>>> I had little experience with the C++ namespaces.
> >>>> >>>> My understanding is that static in this context should mean
> >>>> >>>> internal linkage.
> >>>> >>>> > >>>> >>>> Thanks, > >>>> >>>> Serguei > >>>> >>>> > >>>> >>>> > >>>> >>>> On 12/10/18 13:57, JC Beyler wrote: > >>>> >>>>> Hi Serguei, > >>>> >>>>> > >>>> >>>>> The variables and functions are in a anonymous > namespace; > >>>> my > >>>> >>>>> understanding of C++ is that this is equivalent to > >>>> putting it > >>>> >>>>> as static.Hence, I didn't add them there. Does that > make > >>>> >>>>> sense? > >>>> >>>>> > >>>> >>>>> Thanks! > >>>> >>>>> Jc > >>>> >>>>> > >>>> >>>>> On Mon, Dec 10, 2018 at 1:33 PM > >>>> serguei.spitsyn at oracle.com > >>>> >>>>> > >>>> >>>>> >>>> >>>>> > wrote: > >>>> >>>>> > >>>> >>>>> Hi Jc, > >>>> >>>>> > >>>> >>>>> It looks good in general. > >>>> >>>>> One question though. > >>>> >>>>> > >>>> >>>>> > >>>> > http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.03a_04/test/hotspot/jtreg/vmTestbase/nsk/share/ExceptionCheckingJniEnv/exceptionjni001/exceptionjni001.cpp.html > >>>> >>>>> > >>>> >>>>> > >>>> >>>>> I wonder if the variables and functions have to > be > >>>> static. > >>>> >>>>> > >>>> >>>>> Thanks, > >>>> >>>>> Serguei > >>>> >>>>> > >>>> >>>>> > >>>> >>>>> On 12/5/18 11:36, JC Beyler wrote: > >>>> >>>>>> Hi all, > >>>> >>>>>> > >>>> >>>>>> My apologies to having to come back for another > >>>> review > >>>> >>>>>> for this change: I ran into a snag when trying > to > >>>> pull > >>>> >>>>>> the latest changes compared to the base I was > working > >>>> >>>>>> on. I basically forgot that there was an issue > with > >>>> >>>>>> snprintf and that I had solved it via > JDK-8213622. 
> >>>> >>>>>> > >>>> >>>>>> Could I have a new review of this webrev: > >>>> >>>>>> Webrev: > >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.04/ > >>>> >>>>>> > > >>>> >>>>>> Bug: > >>>> https://bugs.openjdk.java.net/browse/JDK-8213501 > >>>> >>>>>> Incremental from the port of webrev.03 that got > >>>> LGTMs: > >>>> >>>>>> > http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03a_04/ > >>>> >>>>>> < > http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.03a_04/> > >>>> >>>>>> > >>>> >>>>>> A few comments on this because it took me a > while to > >>>> get > >>>> >>>>>> things in a state I thought was good: > >>>> >>>>>> - I had to implement an itoa method, do we > have > >>>> >>>>>> something like that in the test base (remember > that > >>>> >>>>>> JDK-8213622 could not use sprintf due to being > in the > >>>> >>>>>> test code)? > >>>> >>>>>> > >>>> >>>>>> - The differences here compared to the one > you all > >>>> >>>>>> reviewed are: > >>>> >>>>>> - I found that adding to the > strlen/memcpy > >>>> error > >>>> >>>>>> prone and thought that I would try to make it > less > >>>> so. > >>>> >>>>>> If you want to compare, I extended the > strlen/memcpy > >>>> >>>>>> with the new format to show you if you prefer > [1] > >>>> >>>>>> - Note that the diff between the > "old > >>>> >>>>>> extended way from [1]" to the webrev.04 can be > found > >>>> >>>>>> in [2] > >>>> >>>>>> > >>>> >>>>>> - I added a test to test the exception > wrapper > >>>> in > >>>> >>>>>> tests :); I'm not sure it is deemed useful or > not but > >>>> >>>>>> helped me assure myself that I was not doing > things > >>>> >>>>>> wrong; you can find the base test file here > [3]; > >>>> should > >>>> >>>>>> we have this or not? 
(I know that normally we > don't > >>>> add > >>>> >>>>>> tests to vmTestbase but thought this might be > an > >>>> >>>>>> exception) > >>>> >>>>>> > >>>> >>>>>> Thanks for your help and my apologies for the > snag, > >>>> >>>>>> Jc > >>>> >>>>>> > >>>> >>>>>> [1]: > >>>> >>>>>> > >>>> > http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03a/test/hotspot/jtreg/vmTestbase/nsk/share/jni/ExceptionCheckingJniEnv.cpp.udiff.html > >>>> >>>>>> > >>>> >>>>>> < > >>>> > http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.03a/test/hotspot/jtreg/vmTestbase/nsk/share/jni/ExceptionCheckingJniEnv.cpp.udiff.html > > > >>>> > >>>> >>>>>> > >>>> >>>>>> [2]: > >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03a_04 > >>>> >>>>>> < > http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.03a_04> > >>>> >>>>>> [3] > >>>> >>>>>> > >>>> > http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.04/test/hotspot/jtreg/vmTestbase/nsk/share/ExceptionCheckingJniEnv/exceptionjni001/exceptionjni001.cpp.html > >>>> >>>>>> > >>>> >>>>>> < > >>>> > http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.04/test/hotspot/jtreg/vmTestbase/nsk/share/ExceptionCheckingJniEnv/exceptionjni001/exceptionjni001.cpp.html > > > >>>> > >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> On Mon, Dec 3, 2018 at 11:29 PM David Holmes > >>>> >>>>>> >>>> >>>>>> > wrote: > >>>> >>>>>> > >>>> >>>>>> Looks fine to me. > >>>> >>>>>> > >>>> >>>>>> Thanks, > >>>> >>>>>> David > >>>> >>>>>> > >>>> >>>>>> On 4/12/2018 4:04 pm, JC Beyler wrote: > >>>> >>>>>> > Hi both, > >>>> >>>>>> > > >>>> >>>>>> > Thanks for the reviews! Since Serguei > did not > >>>> >>>>>> insist on get_basename, I > >>>> >>>>>> > went for get_dirname since the method is > a > >>>> local > >>>> >>>>>> static method and won't > >>>> >>>>>> > have its name start spreading, I think > it's ok > >>>> too. 
> >>>> >>>>>> > > >>>> >>>>>> > For the naming of the local variable, > the idea > >>>> >>>>>> initially was to use the > >>>> >>>>>> > same name as the local variable for > JNIEnv > >>>> already > >>>> >>>>>> used to reduce the > >>>> >>>>>> > code change. Since I'm now adding the > line > >>>> macro > >>>> >>>>>> at the end anyway, this > >>>> >>>>>> > does not matter anymore so I converged > all > >>>> local > >>>> >>>>>> variables to "jni". > >>>> >>>>>> > > >>>> >>>>>> > So, without further ado, here is the new > >>>> version: > >>>> >>>>>> > Webrev: > >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03/ > >>>> >>>>>> > > >>>> >>>>>> > Bug: > >>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8213501 > >>>> >>>>>> > > >>>> >>>>>> > This passes the various tests changed by > the > >>>> >>>>>> webrev on my dev machine. > >>>> >>>>>> > > >>>> >>>>>> > Let me know what you think, > >>>> >>>>>> > Jc > >>>> >>>>>> > > >>>> >>>>>> > On Mon, Dec 3, 2018 at 8:40 PM > >>>> >>>>>> serguei.spitsyn at oracle.com > >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> >> wrote: > >>>> >>>>>> > > >>>> >>>>>> > On 12/3/18 20:15, Chris Plummer > wrote: > >>>> >>>>>> > > Hi JC, > >>>> >>>>>> > > > >>>> >>>>>> > > Overall it looks good. A few > naming nits > >>>> >>>>>> thought: > >>>> >>>>>> > > > >>>> >>>>>> > > In bi01t001.cpp, why have you > declared > >>>> the > >>>> >>>>>> > ExceptionCheckingJniEnvPtr > >>>> >>>>>> > > using jni_env(jni). Elsewhere you > use > >>>> >>>>>> jni(jni_env) and rename the > >>>> >>>>>> > > method argument passed in from > jni to > >>>> >>>>>> jni_env. > >>>> >>>>>> > > > >>>> >>>>>> > > Related to this, I also noticed > in some > >>>> >>>>>> files that already are using > >>>> >>>>>> > > ExceptionCheckingJniEnvPtr, such > as > >>>> >>>>>> CharArrayCriticalLocker.cpp, you > >>>> >>>>>> > > delcared it as env(jni_env). 
So > that > >>>> means > >>>> >>>>>> there are 3 different > >>>> >>>>>> > names > >>>> >>>>>> > > you have used for the > >>>> >>>>>> ExceptionCheckingJniEnvPtr local variable. > >>>> >>>>>> > They > >>>> >>>>>> > > should be consistent. > >>>> >>>>>> > > > >>>> >>>>>> > > Also, can you rename > get_basename() to > >>>> >>>>>> get_dirname()? I know Serguei > >>>> >>>>>> > > suggested get_basename() a while > back, > >>>> but > >>>> >>>>>> unless "basename" is > >>>> >>>>>> > > commonly used for this purpose, I > think > >>>> >>>>>> "dirname" is more self > >>>> >>>>>> > > explanatory. > >>>> >>>>>> > > >>>> >>>>>> > In general, I'm Okay with > get_dirname(). > >>>> >>>>>> > Just to mention dirname can be both > short > >>>> or > >>>> >>>>>> full, so it is a little > >>>> >>>>>> > confusing as well. > >>>> >>>>>> > It is the reason why the > get_basename() was > >>>> >>>>>> suggested. > >>>> >>>>>> > However, I do not insist on > get_basename() > >>>> nor > >>>> >>>>>> get_full_dirname(). :) > >>>> >>>>>> > > >>>> >>>>>> > Thanks, > >>>> >>>>>> > Serguei > >>>> >>>>>> > > >>>> >>>>>> > > >>>> >>>>>> > > thanks, > >>>> >>>>>> > > > >>>> >>>>>> > > Chris > >>>> >>>>>> > > > >>>> >>>>>> > > On 12/2/18 10:29 PM, David Holmes > wrote: > >>>> >>>>>> > >> Hi Jc, > >>>> >>>>>> > >> > >>>> >>>>>> > >> I've been lurking on this one > and have > >>>> had > >>>> >>>>>> a look through. I'm okay > >>>> >>>>>> > >> with the FatalError approach for > the > >>>> tests > >>>> >>>>>> - we don't expect > >>>> >>>>>> > anything > >>>> >>>>>> > >> to go wrong in a well written > test in a > >>>> >>>>>> correctly functioning VM. 
> >>>> >>>>>> > >> > >>>> >>>>>> > >> Thanks, > >>>> >>>>>> > >> David > >>>> >>>>>> > >> > >>>> >>>>>> > >> > >>>> >>>>>> > >> > >>>> >>>>>> > >> On 3/12/2018 3:24 pm, JC Beyler > wrote: > >>>> >>>>>> > >>> Hi all, > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> Would someone on the GC or > runtime > >>>> team > >>>> >>>>>> be motivated to give > >>>> >>>>>> > this a > >>>> >>>>>> > >>> review? :) > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> It would be much appreciated! > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> Webrev: > >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.02/ > >>>> >>>>>> > > >>>> >>>>>> > >>> Bug: > >>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8213501 > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> Thanks for your help, > >>>> >>>>>> > >>> Jc > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> On Tue, Nov 27, 2018 at 4:36 PM > JC > >>>> Beyler > >>>> >>>>>> jcbeyler at google.com > >>>> > > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>> >>>> >>>>>> > >>>> >>>>>> >>>> >>>>>> >>> wrote: > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> Hi Chris, > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> Yes I was waiting for > another > >>>> review > >>>> >>>>>> since you had explicitly > >>>> >>>>>> > >>> asked :) > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> And sounds good that when > someone > >>>> >>>>>> from GC or runtime gives a > >>>> >>>>>> > >>> review, > >>>> >>>>>> > >>> I'll wait for your full > review on > >>>> the > >>>> >>>>>> webrev.02! 
> >>>> >>>>>> > >>> > >>>> >>>>>> > >>> Thanks again for your help, > >>>> >>>>>> > >>> Jc > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> On Tue, Nov 27, 2018 at > 12:48 PM > >>>> >>>>>> Chris Plummer > >>>> >>>>>> > >>> >>>> >>>>>> > >>>> >>>>>> >>>> >>>>>> > > >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> >>>> >>>>>> >>> > >>>> >>>>>> > wrote: > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> Hi JC, > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> I think it would be > good to > >>>> get a > >>>> >>>>>> review from the gc or > >>>> >>>>>> > runtime > >>>> >>>>>> > >>> teams, since this also > affects > >>>> >>>>>> their tests. > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> Also, once we are > settled on > >>>> this > >>>> >>>>>> FatalError approach, > >>>> >>>>>> > I still > >>>> >>>>>> > >>> need to give your > webrev-02 a > >>>> >>>>>> full review. I only > >>>> >>>>>> > skimmed over > >>>> >>>>>> > >>> parts of it (I did look > at all > >>>> >>>>>> the changes in webrevo-01). > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> thanks, > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> Chris > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> On 11/27/18 8:58 AM, > >>>> >>>>>> serguei.spitsyn at oracle.com > >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>> serguei.spitsyn at oracle.com > >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> >> wrote: > >>>> >>>>>> > >>>> Hi Jc, > >>>> >>>>>> > >>>> > >>>> >>>>>> > >>>> I've already reviewed > this > >>>> too. > >>>> >>>>>> > >>>> > >>>> >>>>>> > >>>> Thanks, > >>>> >>>>>> > >>>> Serguei > >>>> >>>>>> > >>>> > >>>> >>>>>> > >>>> > >>>> >>>>>> > >>>> On 11/27/18 06:56, JC > Beyler > >>>> >>>>>> wrote: > >>>> >>>>>> > >>>>> Thanks Chris, > >>>> >>>>>> > >>>>> > >>>> >>>>>> > >>>>> Anybody else motivated to > look at > >>>> this > >>>> >>>>>> and review it? 
:) > >>>> >>>>>> > >>>>> Jc > >>>> >>>>>> > >>>>> > >>>> >>>>>> > >>>>> On Mon, Nov 26, 2018 > at > >>>> 1:26 PM > >>>> >>>>>> Chris Plummer > >>>> >>>>>> > >>>>> >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> >>> > >>>> >>>>>> > >>>>> wrote: > >>>> >>>>>> > >>>>> > >>>> >>>>>> > >>>>> Hi JC, > >>>> >>>>>> > >>>>> > >>>> >>>>>> > >>>>> I'm ok with the FatalError > approach, > >>>> >>>>>> but would > >>>> >>>>>> > like to > >>>> >>>>>> > >>>>> hear opinions from others > also. > >>>> >>>>>> > >>>>> > >>>> >>>>>> > >>>>> thanks, > >>>> >>>>>> > >>>>> > >>>> >>>>>> > >>>>> Chris > >>>> >>>>>> > >>>>> > >>>> >>>>>> > >>>>> On 11/21/18 8:19 AM, JC Beyler > >>>> wrote: > >>>> >>>>>> > >>>>>> Hi Chris, > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> Thanks for > taking the > >>>> time > >>>> >>>>>> to look at it and yes you > >>>> >>>>>> > >>>>>> have raised > exactly why > >>>> >>>>>> the webrev is between two > >>>> >>>>>> > >>>>>> worlds: in cases > where > >>>> a > >>>> >>>>>> fatal error on failure is > >>>> >>>>>> > >>>>>> wanted, should we > >>>> simplify > >>>> >>>>>> the code to remove > >>>> >>>>>> > the return > >>>> >>>>>> > >>>>>> tests since we > do them > >>>> >>>>>> internally? Now that I've > >>>> >>>>>> > looked > >>>> >>>>>> > >>>>>> around for > non-fatal > >>>> >>>>>> cases, I think the answer > >>>> >>>>>> > is yes, > >>>> >>>>>> > >>>>>> it simplifies > the code > >>>> >>>>>> while maintaining the checks. > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> I looked a bit > and it > >>>> >>>>>> seems that I can't find > >>>> >>>>>> > easily a > >>>> >>>>>> > >>>>>> case where the > test > >>>> >>>>>> accepts a JNI failure to > >>>> >>>>>> > then move > >>>> >>>>>> > >>>>>> on. Therefore, > perhaps, > >>>> >>>>>> for now, the fail with a > >>>> >>>>>> > Fatal > >>>> >>>>>> > >>>>>> is enough and we > can > >>>> work > >>>> >>>>>> on the tests to clean > >>>> >>>>>> > them up? 
> >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> That means that > this is > >>>> >>>>>> the new webrev with only > >>>> >>>>>> > Fatal > >>>> >>>>>> > >>>>>> and cleans up the > >>>> tests so > >>>> >>>>>> that it is no longer in > >>>> >>>>>> > >>>>>> between two > worlds: > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> Webrev: > >>>> >>>>>> > >>>>>> > >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.02/ > >>>> >>>>>> > > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>> Bug: > >>>> >>>>>> > > >>>> https://bugs.openjdk.java.net/browse/JDK-8213501 > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> (This passes > testing > >>>> on my > >>>> >>>>>> dev machine for all the > >>>> >>>>>> > >>>>>> modified tests) > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> with the example > you > >>>> >>>>>> provided, it now looks like: > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>> > http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html > >>>> >>>>>> > >>>> >>>>>> < > >>>> > http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html > > > >>>> > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > > >>>> >>>>>> < > >>>> > http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html > > > >>>> > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> Where it does, > to me at > >>>> >>>>>> least, seem cleaner and less > >>>> >>>>>> > >>>>>> "noisy". 
> >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> Let me know what > you > >>>> think, > >>>> >>>>>> > >>>>>> Jc > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> On Tue, Nov 20, > 2018 at > >>>> >>>>>> 9:33 PM Chris Plummer > >>>> >>>>>> > >>>>>> < > >>>> chris.plummer at oracle.com > >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> >>> wrote: > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> Hi JC, > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> Sorry about > the > >>>> delay. > >>>> >>>>>> I had to go back an > >>>> >>>>>> > look at > >>>> >>>>>> > >>>>>> the initial > 8210842 > >>>> >>>>>> webrev and RFR thread to see > >>>> >>>>>> > >>>>>> what this was > >>>> >>>>>> initially all about. > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> In general > the > >>>> changes > >>>> >>>>>> look good. > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> I don't have > a good > >>>> >>>>>> answer to your > >>>> >>>>>> > >>>>>> FatalError/NonFatalError > question. It > >>>> makes > >>>> >>>>>> > the code > >>>> >>>>>> > >>>>>> a lot > cleaner to > >>>> use > >>>> >>>>>> FatalError, but then it > >>>> >>>>>> > is a > >>>> >>>>>> > >>>>>> behavior > change, > >>>> and > >>>> >>>>>> you also need to deal with > >>>> >>>>>> > >>>>>> tests that > >>>> >>>>>> intentionally induce errors (do > >>>> >>>>>> > you have > >>>> >>>>>> > >>>>>> an example of > >>>> that). > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> In any case, > right > >>>> now > >>>> >>>>>> your webrev seems to be > >>>> >>>>>> > >>>>>> between two > worlds. > >>>> >>>>>> You are producing > >>>> >>>>>> > FatalError, > >>>> >>>>>> > >>>>>> but still > checking > >>>> >>>>>> results. 
Here's a good > >>>> >>>>>> > example: > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>> > http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html > >>>> >>>>>> > >>>> >>>>>> < > >>>> > http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html > > > >>>> > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > > >>>> >>>>>> < > >>>> > http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html > > > >>>> > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> I'm not sure > if > >>>> this > >>>> >>>>>> is just a temporary > >>>> >>>>>> > state until > >>>> >>>>>> > >>>>>> it was > decided > >>>> which > >>>> >>>>>> approach to take. 
> >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> thanks, > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> Chris > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> On 11/20/18 > 2:14 > >>>> PM, > >>>> >>>>>> JC Beyler wrote: > >>>> >>>>>> > >>>>>>> Hi all, > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> Chris > thought it > >>>> made > >>>> >>>>>> sense to have more > >>>> >>>>>> > eyes on > >>>> >>>>>> > >>>>>>> this change > than > >>>> just > >>>> >>>>>> serviceability as it will > >>>> >>>>>> > >>>>>>> modify to > tests > >>>> that > >>>> >>>>>> are not only > >>>> >>>>>> > serviceability > >>>> >>>>>> > >>>>>>> tests so > I've > >>>> moved > >>>> >>>>>> this to conversation > >>>> >>>>>> > here :) > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> For > convenience, > >>>> I've > >>>> >>>>>> copy-pasted the > >>>> >>>>>> > initial RFR: > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> Could I > have a > >>>> review > >>>> >>>>>> for the extension and > >>>> >>>>>> > usage > >>>> >>>>>> > >>>>>>> of the > >>>> >>>>>> ExceptionJniWrapper. This adds lines and > >>>> >>>>>> > >>>>>>> filenames > to the > >>>> end > >>>> >>>>>> of the wrapper JNI > >>>> >>>>>> > methods, > >>>> >>>>>> > >>>>>>> adds > tracing, and > >>>> >>>>>> throws an error if need > >>>> >>>>>> > be. I've > >>>> >>>>>> > >>>>>>> ported the > gc/lock > >>>> >>>>>> files to use the new > >>>> >>>>>> > >>>>>>> > TRACE_JNI_CALL > >>>> add-on > >>>> >>>>>> and I've ported a few > >>>> >>>>>> > of the > >>>> >>>>>> > >>>>>>> tests that > were > >>>> >>>>>> already changed for the > >>>> >>>>>> > assignment > >>>> >>>>>> > >>>>>>> webrev for > >>>> >>>>>> JDK-8212884. 
> >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> Webrev: > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.01 > >>>> >>>>>> > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>> >>>>>> > >>>>>>> Bug: > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8213501 > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> For > illustration, > >>>> if > >>>> >>>>>> I force an error to the > >>>> >>>>>> > >>>>>>> > AP04/ap04t03 test > >>>> and > >>>> >>>>>> set the verbosity on, > >>>> >>>>>> > I get > >>>> >>>>>> > >>>>>>> something > like: > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> >> Calling > JNI > >>>> method > >>>> >>>>>> FindClass from > >>>> >>>>>> > >>>>>>> ap04t003.cpp:343 > >>>> >>>>>> > >>>>>>> >> Calling > with > >>>> these > >>>> >>>>>> parameter(s): > >>>> >>>>>> > >>>>>>> java/lang/Threadd > >>>> >>>>>> > >>>>>>> Wait for > thread > >>>> to > >>>> >>>>>> finish > >>>> >>>>>> > >>>>>>> << Called > JNI > >>>> method > >>>> >>>>>> FindClass from > >>>> >>>>>> > >>>>>>> ap04t003.cpp:343 > >>>> >>>>>> > >>>>>>> Exception in > >>>> thread > >>>> >>>>>> "Thread-0" > >>>> >>>>>> > >>>>>>> > java.lang.NoClassDefFoundError: > >>>> >>>>>> > java/lang/Threadd > >>>> >>>>>> > >>>>>>> at > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>> > nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native > >>>> >>>>>> > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> Method) > >>>> >>>>>> > >>>>>>> at > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>> > nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140) > >>>> > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> at > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>> > nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201) > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> Caused by: > >>>> >>>>>> java.lang.ClassNotFoundException: 
> >>>> >>>>>> > >>>>>>> > java.lang.Threadd > >>>> >>>>>> > >>>>>>> at > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>> > java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583) > >>>> > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> at > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>> > java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) > >>>> > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> at > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) > >>>> >>>>>> > >>>>>>> ... > 3 more > >>>> >>>>>> > >>>>>>> FATAL ERROR > in > >>>> native > >>>> >>>>>> method: JNI method > >>>> >>>>>> > FindClass > >>>> >>>>>> > >>>>>>> : internal > error > >>>> from > >>>> >>>>>> ap04t003.cpp:343 > >>>> >>>>>> > >>>>>>> at > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>> > nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native > >>>> >>>>>> > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> Method) > >>>> >>>>>> > >>>>>>> at > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>> > nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140) > >>>> > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> at > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>> > nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201) > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> > >>>> Questions/comments I > >>>> >>>>>> have about this are: > >>>> >>>>>> > >>>>>>> - Do we > want to > >>>> >>>>>> force fatal errors when a JNI > >>>> >>>>>> > >>>>>>> call fails > in > >>>> >>>>>> general? 
Most of these tests > >>>> >>>>>> > do the > >>>> >>>>>> > >>>>>>> right thing > and > >>>> test > >>>> >>>>>> the return of the JNI > >>>> >>>>>> > calls, > >>>> >>>>>> > >>>>>>> for example: > >>>> >>>>>> > >>>>>>> > thrClass = > >>>> >>>>>> > jni->FindClass("java/lang/Threadd", > >>>> >>>>>> > >>>>>>> > TRACE_JNI_CALL); > >>>> >>>>>> > >>>>>>> if > (thrClass > >>>> == > >>>> >>>>>> NULL) { > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> but now the > >>>> wrapper > >>>> >>>>>> actually would do a > >>>> >>>>>> > fatal if > >>>> >>>>>> > >>>>>>> the > FindClass call > >>>> >>>>>> would return a nullptr, > >>>> >>>>>> > so we > >>>> >>>>>> > >>>>>>> could > remove that > >>>> >>>>>> test altogether. What do you > >>>> >>>>>> > >>>>>>> think? > >>>> >>>>>> > >>>>>>> - I > prefer to > >>>> >>>>>> leave them as the tests then > >>>> >>>>>> > >>>>>>> become > closer to > >>>> what > >>>> >>>>>> real users would have in > >>>> >>>>>> > >>>>>>> their code > and is > >>>> the > >>>> >>>>>> "recommended" way of > >>>> >>>>>> > doing it > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> - The > >>>> alternative > >>>> >>>>>> is to use the > >>>> >>>>>> > NonFatalError I > >>>> >>>>>> > >>>>>>> added which > then > >>>> just > >>>> >>>>>> prints out that something > >>>> >>>>>> > >>>>>>> went wrong, > >>>> letting > >>>> >>>>>> the test continue. Question > >>>> >>>>>> > >>>>>>> will be what > >>>> should > >>>> >>>>>> be the default? The > >>>> >>>>>> > fatal or > >>>> >>>>>> > >>>>>>> the > non-fatal > >>>> error > >>>> >>>>>> handling? 
> >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> On a > different > >>>> >>>>>> subject: > >>>> >>>>>> > >>>>>>> - On the > new > >>>> tests, > >>>> >>>>>> I've removed the > >>>> >>>>>> > >>>>>>> > NSK_JNI_VERIFY > >>>> since > >>>> >>>>>> the JNI wrapper > >>>> >>>>>> > handles the > >>>> >>>>>> > >>>>>>> tracing and > the > >>>> >>>>>> verify in almost the same > >>>> >>>>>> > way; only > >>>> >>>>>> > >>>>>>> difference > I can > >>>> >>>>>> really tell is that the > >>>> >>>>>> > complain > >>>> >>>>>> > >>>>>>> method from > NSK > >>>> has a > >>>> >>>>>> max complain before > >>>> >>>>>> > stopping > >>>> >>>>>> > >>>>>>> to > "complain"; I > >>>> have > >>>> >>>>>> not added that part > >>>> >>>>>> > of the > >>>> >>>>>> > >>>>>>> code in this > >>>> webrev > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> Once we > decide on > >>>> >>>>>> these, I can continue on the > >>>> >>>>>> > >>>>>>> files from > >>>> >>>>>> JDK-8212884 and then do both the > >>>> >>>>>> > >>>>>>> assignment > in an > >>>> if > >>>> >>>>>> extraction followed-by this > >>>> >>>>>> > >>>>>>> type of > webrev in > >>>> an > >>>> >>>>>> easier fashion. > >>>> >>>>>> > Depending on > >>>> >>>>>> > >>>>>>> decisions > here, > >>>> >>>>>> NSK*VERIFY can be deprecated as > >>>> >>>>>> > >>>>>>> well as we > go > >>>> forward. > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> Thanks! 
> >>>> >>>>>> > >>>>>>> Jc > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> On Mon, Nov > 19, > >>>> 2018 > >>>> >>>>>> at 11:34 AM Chris Plummer > >>>> >>>>>> > >>>>>>> < > >>>> chris.plummer at oracle.com > >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>>> chris.plummer at oracle.com > >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> >>> wrote: > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> On > 11/19/18 > >>>> 10:07 > >>>> >>>>>> AM, JC Beyler wrote: > >>>> >>>>>> > >>>>>>>> Hi all, > >>>> >>>>>> > >>>>>>>> > >>>> >>>>>> > >>>>>>>> > @David/Chris: > >>>> >>>>>> should I then push this > >>>> >>>>>> > RFR to > >>>> >>>>>> > >>>>>>>> the > hotspot > >>>> >>>>>> mailing or the runtime > >>>> >>>>>> > one? For > >>>> >>>>>> > >>>>>>>> what > it's > >>>> worth, > >>>> >>>>>> a lot of the tests > >>>> >>>>>> > under the > >>>> >>>>>> > >>>>>>>> > vmTestbase > >>>> are > >>>> >>>>>> jvmti so the review also > >>>> >>>>>> > >>>>>>>> affects > >>>> >>>>>> serviceability; it just turns > >>>> >>>>>> > out I > >>>> >>>>>> > >>>>>>>> > started with > >>>> the > >>>> >>>>>> GC originally and > >>>> >>>>>> > then hit > >>>> >>>>>> > >>>>>>>> some > other > >>>> tests > >>>> >>>>>> I had touched via the > >>>> >>>>>> > >>>>>>>> > assignment > >>>> >>>>>> extraction. > >>>> >>>>>> > >>>>>>> I think > >>>> hotspot > >>>> >>>>>> would be best. > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> Chris > >>>> >>>>>> > >>>>>>>> > >>>> >>>>>> > >>>>>>>> > @Serguei: > >>>> Done > >>>> >>>>>> for the method > >>>> >>>>>> > renaming, for > >>>> >>>>>> > >>>>>>>> the > indent, > >>>> are > >>>> >>>>>> you talking about > >>>> >>>>>> > going from > >>>> >>>>>> > >>>>>>>> the > 8-indent > >>>> to > >>>> >>>>>> 4-indent? If so, would > >>>> >>>>>> > it not > >>>> >>>>>> > >>>>>>>> just be > >>>> better > >>>> >>>>>> to do a new JBS bug and > >>>> >>>>>> > do the > >>>> >>>>>> > >>>>>>>> whole > files > >>>> in > >>>> >>>>>> one go? 
I ask because > >>>> >>>>>> > >>>>>>>> > otherwise, it > >>>> >>>>>> will look a bit weird to > >>>> >>>>>> > have > >>>> >>>>>> > >>>>>>>> parts > of the > >>>> >>>>>> file as 8-indent and others > >>>> >>>>>> > >>>>>>>> 4-indent? > >>>> >>>>>> > >>>>>>>> > >>>> >>>>>> > >>>>>>>> Thanks > for > >>>> >>>>>> looking at it! > >>>> >>>>>> > >>>>>>>> Jc > >>>> >>>>>> > >>>>>>>> > >>>> >>>>>> > >>>>>>>> On > Mon, Nov > >>>> 19, > >>>> >>>>>> 2018 at 1:25 AM > >>>> >>>>>> > >>>>>>>> serguei.spitsyn at oracle.com > >>>> >>>>>> > >>>> >>>>>> >>>> >>>>>> > > >>>> >>>>>> > >>>>>>>> >>>> serguei.spitsyn at oracle.com > >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> >> > >>>> >>>>>> > >>>>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>>>> >>>> serguei.spitsyn at oracle.com > >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> >>> wrote: > >>>> >>>>>> > >>>>>>>> > >>>> >>>>>> > >>>>>>>> Hi > Jc, > >>>> >>>>>> > >>>>>>>> > >>>> >>>>>> > >>>>>>>> We > have > >>>> to > >>>> >>>>>> start this review > >>>> >>>>>> > anyway. :) > >>>> >>>>>> > >>>>>>>> It > looks > >>>> >>>>>> good to me in general. > >>>> >>>>>> > >>>>>>>> > Thank you > >>>> >>>>>> for your consistency in this > >>>> >>>>>> > >>>>>>>> > >>>> refactoring! > >>>> >>>>>> > >>>>>>>> > >>>> >>>>>> > >>>>>>>> > Some > >>>> minor > >>>> >>>>>> comments. > >>>> >>>>>> > >>>>>>>> > >>>> >>>>>> > >>>>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>> > http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.00/test/hotspot/jtreg/vmTestbase/nsk/share/jni/ExceptionCheckingJniEnv.cpp.udiff.html > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>>>> > >>>> >>>>>> > >>>>>>>> > +static > >>>> >>>>>> const char* > >>>> >>>>>> > remove_folders(const > >>>> >>>>>> > >>>>>>>> > char* > >>>> >>>>>> fullname) { I'd suggest to > >>>> >>>>>> > rename > >>>> >>>>>> > >>>>>>>> the > >>>> function > >>>> >>>>>> name to something > >>>> >>>>>> > traditional > >>>> >>>>>> > >>>>>>>> > like > >>>> >>>>>> get_basename. 
Otherwise, it > >>>> >>>>>> > sounds > >>>> >>>>>> > >>>>>>>> > like this > >>>> >>>>>> function has to really > >>>> >>>>>> > remove > >>>> >>>>>> > >>>>>>>> > folders. > >>>> :) > >>>> >>>>>> Also, all *Locker.cpp have > >>>> >>>>>> > >>>>>>>> > wrong > >>>> indent > >>>> >>>>>> in the bodies of if > >>>> >>>>>> > and while > >>>> >>>>>> > >>>>>>>> > >>>> statements. > >>>> >>>>>> Could this be fixed > >>>> >>>>>> > with the > >>>> >>>>>> > >>>>>>>> > >>>> refactoring? > >>>> >>>>>> I did not look on how > >>>> >>>>>> > this > >>>> >>>>>> > >>>>>>>> > impacts > >>>> the > >>>> >>>>>> tests other than > >>>> >>>>>> > >>>>>>>> serviceability. > >>>> Thanks, > >>>> >>>>>> Serguei > >>>> >>>>>> > >>>>>>>> > >>>> >>>>>> > >>>>>>>> > >>>> >>>>>> > >>>>>>>> On > >>>> 11/16/18 > >>>> >>>>>> 19:43, JC Beyler wrote: > >>>> >>>>>> > >>>>>>>>> Hi all, > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> Anybody > >>>> >>>>>> motivated to review this? :) > >>>> >>>>>> > >>>>>>>>> Jc > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> On > Wed, Nov > >>>> 7, > >>>> >>>>>> 2018 at 9:53 PM JC > >>>> >>>>>> > Beyler > >>>> >>>>>> > >>>>>>>>> < > jcbeyler at google.com > >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>>>>> >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> >>> wrote: > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> Hi > all, > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> > Could I > >>>> have > >>>> >>>>>> a review for the > >>>> >>>>>> > >>>>>>>>> > extension > >>>> >>>>>> and usage of the > >>>> >>>>>> > >>>>>>>>> ExceptionJniWrapper. This > >>>> >>>>>> > adds lines > >>>> >>>>>> > >>>>>>>>> and > >>>> >>>>>> filenames to the end of the > >>>> >>>>>> > >>>>>>>>> > wrapper > >>>> JNI > >>>> >>>>>> methods, adds > >>>> >>>>>> > tracing, > >>>> >>>>>> > >>>>>>>>> and > >>>> throws > >>>> >>>>>> an error if need > >>>> >>>>>> > be. 
I've > >>>> >>>>>> > >>>>>>>>> > ported > >>>> the > >>>> >>>>>> gc/lock files to > >>>> >>>>>> > use the > >>>> >>>>>> > >>>>>>>>> new > >>>> >>>>>> TRACE_JNI_CALL add-on and > >>>> >>>>>> > I've > >>>> >>>>>> > >>>>>>>>> > ported a > >>>> few > >>>> >>>>>> of the tests > >>>> >>>>>> > that were > >>>> >>>>>> > >>>>>>>>> > already > >>>> >>>>>> changed for the > >>>> >>>>>> > assignment > >>>> >>>>>> > >>>>>>>>> > webrev > >>>> for > >>>> >>>>>> JDK-8212884. > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> > Webrev: > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.00/ > >>>> >>>>>> > > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>>>>> > Bug: > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8213501 > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> For > >>>> >>>>>> illustration, if I force > >>>> >>>>>> > an error > >>>> >>>>>> > >>>>>>>>> to > the > >>>> >>>>>> AP04/ap04t03 test and > >>>> >>>>>> > set the > >>>> >>>>>> > >>>>>>>>> > verbosity > >>>> >>>>>> on, I get something > >>>> >>>>>> > like: > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> >> > >>>> Calling > >>>> >>>>>> JNI method > >>>> >>>>>> > FindClass from > >>>> >>>>>> > >>>>>>>>> ap04t003.cpp:343 > >>>> >>>>>> > >>>>>>>>> >> > >>>> Calling > >>>> >>>>>> with these > >>>> >>>>>> > parameter(s): > >>>> >>>>>> > >>>>>>>>> java/lang/Threadd > >>>> >>>>>> > >>>>>>>>> > Wait for > >>>> >>>>>> thread to finish > >>>> >>>>>> > >>>>>>>>> << > Called > >>>> >>>>>> JNI method > >>>> >>>>>> > FindClass from > >>>> >>>>>> > >>>>>>>>> ap04t003.cpp:343 > >>>> >>>>>> > >>>>>>>>> > >>>> Exception in > >>>> >>>>>> thread "Thread-0" > >>>> >>>>>> > >>>>>>>>> java.lang.NoClassDefFoundError: > >>>> >>>>>> > >>>>>>>>> java/lang/Threadd > >>>> >>>>>> > >>>>>>>>> at > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>> > nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native > >>>> >>>>>> > >>>> 
>>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> > Method) > >>>> >>>>>> > >>>>>>>>> at > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>> > nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140) > >>>> > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> at > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>> > nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201) > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> > Caused > >>>> by: > >>>> >>>>>> > >>>>>>>>> > java.lang.ClassNotFoundException: > >>>> >>>>>> > >>>>>>>>> java.lang.Threadd > >>>> >>>>>> > >>>>>>>>> at > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>> > java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583) > >>>> > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> at > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>> > java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) > >>>> > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> at > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > > >>>> >>>>>> > java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) > >>>> >>>>>> > >>>>>>>>> > ... 
3 > >>>> more > >>>> >>>>>> > >>>>>>>>> > FATAL > >>>> ERROR > >>>> >>>>>> in native method: JNI > >>>> >>>>>> > >>>>>>>>> > method > >>>> >>>>>> FindClass : internal error > >>>> >>>>>> > >>>>>>>>> > from > >>>> >>>>>> ap04t003.cpp:343 > >>>> >>>>>> > >>>>>>>>> at > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>> > nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native > >>>> >>>>>> > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> > Method) > >>>> >>>>>> > >>>>>>>>> at > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>> > nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140) > >>>> > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> at > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>> > nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201) > >>>> >>>>>> > >>>> >>>>>> > > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> Questions/comments I have about > >>>> >>>>>> > >>>>>>>>> this are: > >>>> >>>>>> > >>>>>>>>> > - Do we > >>>> >>>>>> want to force fatal > >>>> >>>>>> > errors > >>>> >>>>>> > >>>>>>>>> > when a > >>>> JNI > >>>> >>>>>> call fails in general? 
> >>>> >>>>>> > >>>>>>>>> > Most of > >>>> >>>>>> these tests do the right > >>>> >>>>>> > >>>>>>>>> > thing and > >>>> >>>>>> test the return of > >>>> >>>>>> > the JNI > >>>> >>>>>> > >>>>>>>>> > calls, > >>>> for > >>>> >>>>>> example: > >>>> >>>>>> > >>>>>>>>> > thrClass > >>>> = > >>>> >>>>>> > >>>>>>>>> > jni->FindClass("java/lang/Threadd", > >>>> >>>>>> > >>>>>>>>> TRACE_JNI_CALL); > >>>> >>>>>> > >>>>>>>>> > if > >>>> >>>>>> (thrClass == NULL) { > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> > but now > >>>> the > >>>> >>>>>> wrapper actually > >>>> >>>>>> > would do > >>>> >>>>>> > >>>>>>>>> a > fatal > >>>> if > >>>> >>>>>> the FindClass call > >>>> >>>>>> > would > >>>> >>>>>> > >>>>>>>>> > return a > >>>> >>>>>> nullptr, so we could > >>>> >>>>>> > remove > >>>> >>>>>> > >>>>>>>>> > that test > >>>> >>>>>> altogether. What do > >>>> >>>>>> > you > >>>> >>>>>> > >>>>>>>>> think? > >>>> >>>>>> > >>>>>>>>> > - I > >>>> >>>>>> prefer to leave them > >>>> >>>>>> > as the > >>>> >>>>>> > >>>>>>>>> > tests > >>>> then > >>>> >>>>>> become closer to > >>>> >>>>>> > what real > >>>> >>>>>> > >>>>>>>>> > users > >>>> would > >>>> >>>>>> have in their > >>>> >>>>>> > code and is > >>>> >>>>>> > >>>>>>>>> the > >>>> >>>>>> "recommended" way of doing it > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> > - The > >>>> >>>>>> alternative is to > >>>> >>>>>> > use the > >>>> >>>>>> > >>>>>>>>> NonFatalError I > >>>> added > >>>> >>>>>> which > >>>> >>>>>> > then just > >>>> >>>>>> > >>>>>>>>> > prints > >>>> out > >>>> >>>>>> that something > >>>> >>>>>> > went wrong, > >>>> >>>>>> > >>>>>>>>> > letting > >>>> the > >>>> >>>>>> test continue. > >>>> >>>>>> > Question > >>>> >>>>>> > >>>>>>>>> > will be > >>>> what > >>>> >>>>>> should be the > >>>> >>>>>> > default? > >>>> >>>>>> > >>>>>>>>> The > >>>> fatal or > >>>> >>>>>> the non-fatal error > >>>> >>>>>> > >>>>>>>>> > handling? 
> >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> On > a > >>>> >>>>>> different subject: > >>>> >>>>>> > >>>>>>>>> > - On > >>>> the > >>>> >>>>>> new tests, I've > >>>> >>>>>> > removed > >>>> >>>>>> > >>>>>>>>> the > >>>> >>>>>> NSK_JNI_VERIFY since the JNI > >>>> >>>>>> > >>>>>>>>> > wrapper > >>>> >>>>>> handles the tracing > >>>> >>>>>> > and the > >>>> >>>>>> > >>>>>>>>> > verify in > >>>> >>>>>> almost the same > >>>> >>>>>> > way; only > >>>> >>>>>> > >>>>>>>>> > >>>> difference I > >>>> >>>>>> can really tell > >>>> >>>>>> > is that > >>>> >>>>>> > >>>>>>>>> the > >>>> complain > >>>> >>>>>> method from NSK > >>>> >>>>>> > has a > >>>> >>>>>> > >>>>>>>>> max > >>>> complain > >>>> >>>>>> before stopping to > >>>> >>>>>> > >>>>>>>>> > >>>> "complain"; > >>>> >>>>>> I have not added that > >>>> >>>>>> > >>>>>>>>> > part of > >>>> the > >>>> >>>>>> code in this webrev > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> > Once we > >>>> >>>>>> decide on these, I can > >>>> >>>>>> > >>>>>>>>> > continue > >>>> on > >>>> >>>>>> the files from > >>>> >>>>>> > >>>>>>>>> > >>>> JDK-8212884 > >>>> >>>>>> and then do both the > >>>> >>>>>> > >>>>>>>>> > >>>> assignment > >>>> >>>>>> in an if extraction > >>>> >>>>>> > >>>>>>>>> > >>>> followed-by > >>>> >>>>>> this type of > >>>> >>>>>> > webrev in an > >>>> >>>>>> > >>>>>>>>> > easier > >>>> >>>>>> fashion. Depending on > >>>> >>>>>> > >>>>>>>>> > decisions > >>>> >>>>>> here, NSK*VERIFY can be > >>>> >>>>>> > >>>>>>>>> > >>>> deprecated > >>>> >>>>>> as well as we go > >>>> >>>>>> > forward. 
> >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> > Thank you > >>>> >>>>>> for the > >>>> >>>>>> > reviews/comments :) > >>>> >>>>>> > >>>>>>>>> Jc > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> > >>>> >>>>>> > >>>>>>>>> -- > >>>> >>>>>> > >>>>>>>>> Thanks, > >>>> >>>>>> > >>>>>>>>> Jc > >>>> >>>>>> > >>>>>>>> > >>>> >>>>>> > >>>>>>>> > >>>> >>>>>> > >>>>>>>> > >>>> >>>>>> > >>>>>>>> -- > >>>> >>>>>> > >>>>>>>> Thanks, > >>>> >>>>>> > >>>>>>>> Jc > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> > >>>> >>>>>> > >>>>>>> -- > >>>> >>>>>> > >>>>>>> Thanks, > >>>> >>>>>> > >>>>>>> Jc > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> > >>>> >>>>>> > >>>>>> -- > >>>> >>>>>> > >>>>>> Thanks, > >>>> >>>>>> > >>>>>> Jc > >>>> >>>>>> > >>>>> > >>>> >>>>>> > >>>>> > >>>> >>>>>> > >>>>> > >>>> >>>>>> > >>>>> > >>>> >>>>>> > >>>>> -- > >>>> >>>>>> > >>>>> Thanks, > >>>> >>>>>> > >>>>> Jc > >>>> >>>>>> > >>>> > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> -- > >>>> >>>>>> > >>> Thanks, > >>>> >>>>>> > >>> Jc > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> -- > >>>> >>>>>> > >>> > >>>> >>>>>> > >>> Thanks, > >>>> >>>>>> > >>> Jc > >>>> >>>>>> > > > >>>> >>>>>> > > > >>>> >>>>>> > > >>>> >>>>>> > > >>>> >>>>>> > > >>>> >>>>>> > -- > >>>> >>>>>> > > >>>> >>>>>> > Thanks, > >>>> >>>>>> > Jc > >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> -- > >>>> >>>>>> Thanks, > >>>> >>>>>> Jc > >>>> >>>>> > >>>> >>>>> > >>>> >>>>> > >>>> >>>>> -- > >>>> >>>>> Thanks, > >>>> >>>>> Jc > >>>> >>>> > >>>> >>>> > >>>> >>>> > >>>> >>>> -- > >>>> >>>> Thanks, > >>>> >>>> Jc > >>>> >>>> > >>>> >>>> > >>>> >>>> > >>>> >>>> -- > >>>> >>>> > >>>> >>>> Thanks, > >>>> >>>> Jc > >>>> >>> > >>>> > > >>>> > >>> > >>> > >>> -- > >>> > >>> Thanks, > >>> Jc > >>> > >> > >> > >> -- > >> > >> Thanks, > >> Jc > >> > > > > > > 
-- > > > > Thanks, > > Jc > > > > > -- > > Thanks, > Jc > > > -- Thanks, Jc

From daniel.daugherty at oracle.com Tue Jan 22 20:34:08 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 22 Jan 2019 15:34:08 -0500
Subject: RFR(XL): 8203469: Faster safepoints
In-Reply-To: <5d6d97d6-fbda-28f4-b625-a9ae351af5ba@oracle.com>
References: <5d6d97d6-fbda-28f4-b625-a9ae351af5ba@oracle.com>
Message-ID:

On 1/22/19 10:39 AM, Robbin Ehn wrote:
> Hi all, here is v01 and v02.
>
> v01 contains update after comments from list:
> http://cr.openjdk.java.net/~rehn/8203469/v01/
> http://cr.openjdk.java.net/~rehn/8203469/v01/inc/
>
> v02 contains a bug fix, explained below:
> http://cr.openjdk.java.net/~rehn/8203469/v02/

src/hotspot/share/code/dependencyContext.hpp
    No comments.

src/hotspot/share/jfr/recorder/repository/jfrEmergencyDump.cpp
    No comments.

src/hotspot/share/jfr/recorder/stacktrace/jfrStackTraceRepository.cpp
    No comments.

src/hotspot/share/runtime/handshake.cpp
    No comments.

src/hotspot/share/runtime/mutex.cpp
    No comments.

src/hotspot/share/runtime/mutex.hpp
    No comments.

src/hotspot/share/runtime/mutexLocker.cpp
    No comments.

src/hotspot/share/runtime/mutexLocker.hpp
    No comments.

src/hotspot/share/runtime/safepoint.cpp
    L136, L138, L141 - nit - why add the blank lines?

    L148: // We need to save the desc since it is removed before we need it.
        Perhaps:
          // We need a place to save the desc since it is released before we need it.

    L400:   // Save the starting time, so that it can be compared to see if this has taken
    L401:   // too long to complete.
    L402:   jlong safepoint_limit_time = 0;
    L403:   if (SafepointTimeout) {
    L404:     safepoint_limit_time = os::javaTimeNanos() + (jlong)SafepointTimeoutDelay * MICROUNITS;
        Mostly pre-existing, but this should be:
            jlong safepoint_limit_time = 0;
            if (SafepointTimeout) {
              // Set the limit time, so that it can be compared to see if this has taken
              // too long to complete.
              safepoint_limit_time = os::javaTimeNanos() + (jlong)SafepointTimeoutDelay * MICROUNITS;
            }

    L477:   // We first do the safepoint cleanup since if this is a GC safepoint,
    L478:   // needs it to be completed before running the GC op.
        Perhaps this rewrite:
            // We do the safepoint cleanup first since a GC related safepoint
            // needs cleanup to be completed before running the GC op.

    L515:     OrderAccess::fence();
        Perhaps add comment:
             OrderAccess::fence();  // keep read and write of _state from floating up

    L526:     // Keep the local state from floating up.
    L527:     OrderAccess::fence();
        nit - comment on L526 can be on L527, e.g.:
              OrderAccess::fence();  // Keep the local state from floating up.

    L749: nit - please delete blank line at top of function.

    L750:   assert((safepoint_count != InactiveSafepointCounter &&
    L751:           Thread::current() == (Thread*)VMThread::vm_thread() &&
    L752:           SafepointSynchronize::_state != _not_synchronized) ||
    L753:           safepoint_count == InactiveSafepointCounter, "Invalid check");
        Had to read this a couple of times. Perhaps this would be more clear:
           assert((safepoint_count != InactiveSafepointCounter &&
                   Thread::current() == (Thread*)VMThread::vm_thread() &&
                   SafepointSynchronize::_state != _not_synchronized)
                  || safepoint_count == InactiveSafepointCounter, "Invalid check");
        Please note that the final expression's '||' is lined up with
        first parenthetical expression...

    L755:   // To handle the thread_blocked state on the backedge of the WaitBarrier from
    L756:   // previous safepoint and reading the resetted (0/InactiveSafepointCounter) we
    L757:   // re-read state after we read thread safepoint id. The JavaThread changes it
    L758:   // state before resetting, the second read will either see a different thread
    L759:   // state making this an unsafe state or it can see blocked again.
    L760:   // When we see blocked twice with a 0 safepoint id, either:
    L761:   // - It is normally blocked, e.g. on Mutex, TBIVM.
    L762:   // - It was in SS:block(), looped around to SS:block() and is blocked on the WaitBarrier.
    L763:   // - It was in SS:block() but now on a Mutex.
    L764:   // Either case safe.
        Please consider these minor tweaks:
           // To handle the thread_blocked state on the backedge of the WaitBarrier from a
           // previous safepoint and reading the possibly reset (0/InactiveSafepointCounter)
           // id, re-read state after we read thread safepoint id. If the JavaThread changes
           // its state before resetting, the second read will either see a different thread
           // state making this an unsafe state or it can see blocked again.
           // When we see blocked twice with a 0 safepoint id, this means:
           // - It is normally blocked, e.g., on Mutex, TBIVM.
           // - It was in SS:block(), looped around to SS:block() and is blocked on the WaitBarrier.
           // - It was in SS:block() but now on a Mutex.
           // All of these cases are safe.

    old L743:     return !thread->has_last_Java_frame() || thread->frame_anchor()->walkable();
    new L780:     return !thread->has_last_Java_frame() ||
    new L781: thread->frame_anchor()->walkable();
        nit - Why was this line reformatted/split?

    L893:       // Load here stopped by above release.
        Perhaps:
                // Load here cannot float because of the above release.

src/hotspot/share/runtime/safepoint.hpp
    No comment.

src/hotspot/share/runtime/safepointMechanism.inline.hpp
    No comments.

src/hotspot/share/runtime/thread.hpp
    No comments.

src/hotspot/share/runtime/vmThread.cpp
    No comments.

src/hotspot/share/services/runtimeService.cpp
    No comments.

src/hotspot/share/services/runtimeService.hpp
    No comments.

test/hotspot/jtreg/runtime/logging/SafepointTest.java
    No comments.

Thanks for persisting with this work.

Thumbs up! All of my comments in this round are editorial. I don't
need to see another webrev if you choose to make the above changes.

Dan


> http://cr.openjdk.java.net/~rehn/8203469/v02/inc/
>
> Patricio had some good questions about try_stable_load_state.
> In previous internal versions I did the stable load by loading the
> thread state before and after the safepoint id. For some reason I changed
> it to the reverse during a refactoring, which is incorrect. Consider the following:
>
> JavaThread: state / safepoint id / poll |VMThread: global state / safepoint counter / WaitBarrier
> ########################################|################################
> _thread_in_native       / 0 / disarmed  | _not_synchronized / 0 / disarmed
>                                         | _not_synchronized / 0 / armed(1)
>                                         | _not_synchronized / 1 / armed(1)
>                                         | _synchronizing    / 1 / armed(1)
> _thread_in_native       / 0 / armed     |
>                                         | id:0>
>                                         | state id:_thread_in_native>
>                                         | id:0>
>                                         | _synchronized     / 1 / armed(1)
>           |
> _thread_in_native_trans / 0 / armed     |
>              |
>               |
>                                         | _not_synchronized / 1 / armed(1)
>                                         | _not_synchronized / 2 / armed(1)
> _thread_in_native_trans / 0 / disarmed  |
>                                         | _not_synchronized / 2 / disarmed
> Next safepoint starts:
>                                         | _not_synchronized / 2 / armed(3)
>                                         | _not_synchronized / 3 / armed(3)
>                                         | _synchronizing    / 3 / armed(3)
> _thread_in_native_trans / 0 / armed     |
>                                         | id:0>
>                |
>      |
> _thread_in_native_trans / 1 / armed     |
> _thread_blocked         / 1 / armed     |
>            |
>                                         | state id:_thread_blocked>
> _thread_in_native_trans / 1 / armed     |
> _thread_in_native_trans / 0 / armed     |
>                                         | id:0>
>
> A false positive is read.
>
> When done correctly, the safe matrix looks like:
> State load 1      | Safepoint id | State load 2     | Result
> ##################|##############|##################|#######
> any               | !0/current   | any              | treat all as unsafe
> any               | any          | !state1          | treat all as unsafe
> any               | 0/current    | state1           | suspend flag is safe
> thread_in_native  | 0/current    | thread_in_native | safe
> thread_in_blocked | 0/current    | thread_in_blocked| safe
> !thread_in_blocked &&
> !thread_in_native | 0/current    | state1           | unsafe
>
> For the blocked/0/blocked case I added this comment:
>
>  755   // To handle the thread_blocked state on the backedge of the WaitBarrier from
>  756   // previous safepoint and reading the resetted (0/InactiveSafepointCounter) we
>  757   // re-read state after we read thread safepoint id. The JavaThread changes it
>  758   // state before resetting, the second read will either see a different thread
>  759   // state making this an unsafe state or it can see blocked again.
>  760   // When we see blocked twice with a 0 safepoint id, either:
>  761   // - It is normally blocked, e.g. on Mutex, TBIVM.
>  762   // - It was in SS:block(), looped around to SS:block() and is blocked on the WaitBarrier.
>  763   // - It was in SS:block() but now on a Mutex.
>  764   // Either case safe.
>
> I hope the above explains why loading the state before and after the
> safepoint id is sufficient.
>
> Passes, with flying colors, t1-5, stress test, KS 24h stress.
>
> Thanks, Robbin
>
> On 1/15/19 11:39 AM, Robbin Ehn wrote:
>> Hi all, please review.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469
>> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/
>>
>> Thanks to Dan for pre-reviewing a lot!
>>
>> Background:
>> ZGC often does very short safepoint operations. For perspective, in a
>> specJBB2015 run, G1 can have young collection stops lasting about 170 ms,
>> while in the same setup ZGC does 0.2 ms to 1.5 ms operations, depending on
>> the operation. The time it takes to stop and start the JavaThreads is very
>> large relative to a ZGC safepoint. With an operation that takes just 0.2 ms,
>> the overhead of stopping and starting JavaThreads is several times the
>> operation itself.
>>
>> High-level functionality change:
>> Serializing the starting over Threads_lock takes time.
>> - Don't wait on Threads_lock; use the WaitBarrier.
>> Serializing the stopping over Safepoint_lock takes time.
>> - Let threads stop in parallel, remove Safepoint_lock.
>>
>> Details:
>> JavaThreads have 2 abstract logical states: unsafe or safe.
>> - Safe means the JavaThread will not touch the Java heap or VM internal structures
>>    without doing a transition and blocking before doing so.
>>         - The safe states are:
>>                 - When polls are armed: _thread_in_native and _thread_blocked.
>>                 - When Threads_lock is held: externally suspended flag is set.
>>         - The VM Thread has polls armed and holds the Threads_lock during a
>>           safepoint.
>> - Unsafe means that either the Java heap or VM internal structures can be accessed
>>    by the JavaThread, e.g., _thread_in_Java, _thread_in_vm.
>>         - All combinations that are not safe are unsafe.
>>
>> We cannot start a safepoint until all unsafe threads have transitioned to a
>> safe state. To make them safe, we arm polls in compiled code and make sure
>> any transition to another unsafe state will be blocked. A JavaThread which is
>> unsafe with state _thread_in_Java may transition to _thread_in_native without
>> being blocked, since it has just become a safe thread and we can proceed. Any
>> safe thread may try to transition at any time to an unsafe state, thus coming
>> into the safepoint blocking code at any moment, e.g., after the safepoint is
>> over, or even at the beginning of the next safepoint.
>>
>> The VMThread cannot tolerate false positives from the JavaThread thread state
>> because that would mean starting the safepoint without all JavaThreads being
>> safe. The two locks (Threads_lock and Safepoint_lock) make sure we never
>> observe false positives from the safepoint blocking code. If we remove them,
>> how do we handle false positives?
>>
>> By first publishing which barrier tag (safepoint counter) we will call
>> WaitBarrier.wait() with as the thread's safepoint id, and then changing the
>> state to _thread_blocked, the VMThread can ignore JavaThreads by doing a
>> stable load of the state. A stable load of the thread state is successful if
>> the thread safepoint id is the same both before and after the load of the
>> state and the safepoint id is current or InactiveSafepointCounter. If the
>> stable load fails, the thread is considered safepoint unsafe. It is no longer
>> enough that a thread has state _thread_blocked; it must also have the correct
>> safepoint id before and after we read the state.
>>
>> Performance:
>> The result of faster safepoints is that the average CPU time for JavaThreads
>> between safepoints is higher, thus increasing the allocation rate.
>> The thread that stops first waits a shorter time until it gets started. Even
>> the thread that stops last has a shorter stop, since we start them faster. If
>> your application is using a concurrent GC it may need re-tuning, since each
>> Java worker thread has an increased CPU time/allocation rate. Often this means
>> max performance is achieved using slightly fewer Java worker threads than
>> before. Also, the increased allocation rate means a shorter time between GC
>> safepoints.
>> - If you are using a non-concurrent GC, you should see improved latency and
>>    throughput.
>> - After re-tuning with a concurrent GC, throughput should be equal or better but
>>    with better latency. But bear in mind this is a latency patch, not a
>>    throughput one.
>> With the current code a Java thread is not guaranteed to run between
>> safepoints (in theory a Java thread can be starved indefinitely), since the VM
>> thread may re-grab the Threads_lock before the thread has woken up from the
>> previous safepoint. If the GC/VM doesn't respect MMU (minimum mutator
>> utilization) or if your machine is very over-provisioned this can happen.
>> The current scheme thus re-safepoints quickly if the Java threads have not
>> started yet, at the cost of latency. Since the new code uses the WaitBarrier
>> with the safepoint counter, all threads must roll forward to the next
>> safepoint by getting at least some CPU time between two safepoints. This means
>> MMU violations are more obvious.
>>
>> Some example numbers:
>> - On a 16 strand machine synchronization and un-synchronization/starting is at
>>    least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and
>>    starting ~400->~100us.
>>    (Semaphore path is a bit slower than futex in the WaitBarrier on Linux).
>> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster
>>    synchronization time on 16 strands and ~5% score increase. In this case the GC
>>    op is 1ms, so we reduce the overhead of synchronization from 100% to 10%.
>> - specJBB2015 ParGC ~9% increase in critical-jops.
>>
>> Thanks, Robbin

From jesper.wilhelmsson at oracle.com  Tue Jan 22 22:11:16 2019
From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com)
Date: Tue, 22 Jan 2019 23:11:16 +0100
Subject: RFR: JDK-8217580 - Remove tests from problemList as bugs has been closed
Message-ID: <0140719C-4D7D-42A6-8667-DBB0F086DC73@oracle.com>

Hi,

Please review this patch that removes tests from the problemLists. The bugs referred to in these problemList entries have been closed and therefore the tests should not be problemlisted anymore.

Please note that some of the bugs were closed as "Can not reproduce" and "Will not fix". If these tests start failing again we need to re-evaluate these bugs. If a bug is closed as "Will not fix" and there are tests that reproduce that failure, the tests need to be rewritten to work around the bug or be removed.

Bug: https://bugs.openjdk.java.net/browse/JDK-8217580
Webrev: http://cr.openjdk.java.net/~jwilhelm/8217580/webrev.00/

Thanks,
/Jesper

From igor.ignatyev at oracle.com  Tue Jan 22 22:23:29 2019
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 22 Jan 2019 14:23:29 -0800
Subject: RFR: JDK-8217580 - Remove tests from problemList as bugs has been closed
In-Reply-To: <0140719C-4D7D-42A6-8667-DBB0F086DC73@oracle.com>
References: <0140719C-4D7D-42A6-8667-DBB0F086DC73@oracle.com>
Message-ID:

Hi Jesper,

Looks good, thanks for taking care of it. (I haven't checked all the bugs, but I trust you did the right thing.) One question: as it affects 12 (and hence 12u), should we push it to the 12 repo?

Thanks,
-- Igor

> On Jan 22, 2019, at 2:11 PM, jesper.wilhelmsson at oracle.com wrote:
>
> Hi,
>
> Please review this patch that removes tests from the problemLists. The bugs referred to in these problemList entries have been closed and therefore the tests should not be problemlisted anymore.
>
> Please note that some of the bugs were closed as "Can not reproduce" and "Will not fix". If these tests start failing again we need to re-evaluate these bugs. If a bug is closed as "Will not fix" and there are tests that reproduce that failure, the tests need to be rewritten to work around the bug or be removed.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8217580
> Webrev: http://cr.openjdk.java.net/~jwilhelm/8217580/webrev.00/
>
> Thanks,
> /Jesper
>

From mikhailo.seledtsov at oracle.com  Tue Jan 22 23:06:39 2019
From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com)
Date: Tue, 22 Jan 2019 15:06:39 -0800
Subject: RFR: JDK-8217580 - Remove tests from problemList as bugs has been closed
In-Reply-To:
References: <0140719C-4D7D-42A6-8667-DBB0F086DC73@oracle.com>
Message-ID: <43ce478b-5145-3d56-591e-012eaf120cb5@oracle.com>

Looks good. I double-checked the bug numbers.

Misha

On 1/22/19 2:23 PM, Igor Ignatyev wrote:
> Hi Jesper,
>
> Looks good, thanks for taking care of it. (I haven't checked all the bugs, but I trust you did the right thing.) One question: as it affects 12 (and hence 12u), should we push it to the 12 repo?
>
> Thanks,
> -- Igor
>
>> On Jan 22, 2019, at 2:11 PM, jesper.wilhelmsson at oracle.com wrote:
>>
>> Hi,
>>
>> Please review this patch that removes tests from the problemLists. The bugs referred to in these problemList entries have been closed and therefore the tests should not be problemlisted anymore.
>>
>> Please note that some of the bugs were closed as "Can not reproduce" and "Will not fix". If these tests start failing again we need to re-evaluate these bugs. If a bug is closed as "Will not fix" and there are tests that reproduce that failure, the tests need to be rewritten to work around the bug or be removed.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8217580
>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8217580/webrev.00/
>>
>> Thanks,
>> /Jesper
>>

From alexey.menkov at oracle.com  Wed Jan 23 02:03:40 2019
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Tue, 22 Jan 2019 18:03:40 -0800
Subject: RFR (L) 8213501 : Deploy ExceptionJniWrapper for a few tests
In-Reply-To:
References: <45341168-e7e0-90d1-449f-210500882b8f@oracle.com>
 <55283958-de3d-07f2-51e3-ad34c5046a96@oracle.com>
 <31613f88-5f7d-938d-e9f6-69cdaf857268@oracle.com>
 <839301b7-c247-df3b-e485-283e8bb7388b@oracle.com>
 <95fe277d-ba6e-4fec-77aa-d1f1051751aa@oracle.com>
 <72bf2f4a-5bf7-98de-5f00-68485072923d@oracle.com>
Message-ID: <25a50bc3-222c-a915-5517-37a2f9375c42@oracle.com>

+1

--alex

On 01/22/2019 10:29, JC Beyler wrote:
> Thanks Paul!
>
> Anybody else for the review for version 6?
>
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8213501
>
> Thanks,
> Jc
>
> On Tue, Jan 22, 2019 at 6:10 AM Hohensee, Paul wrote:
>
> Lgtm :)
>
> Paul
>
> On 1/14/19, 7:46 AM, "hotspot-dev on behalf of JC Beyler" on behalf of jcbeyler at google.com wrote:
>
>     Hi all,
>
>     Friendly ping on this one, I know that it has been a long process with back
>     and forths, to which I apologize...
>
>     But is there any way I could get a final LGTM for version 6?
>
>     Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/
>     Bug: https://bugs.openjdk.java.net/browse/JDK-8213501
>
>     Thanks!
>     Jc
>
>     On Tue, Jan 8, 2019 at 10:05 AM JC Beyler wrote:
>
>     > Happy new year all!
>     >
>     > Could I get a final LGTM for version 6?
>     >
>     > Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/
>     > Bug: https://bugs.openjdk.java.net/browse/JDK-8213501
>     >
>     > Thanks!
>     > Jc
>     >
>     > On Mon, Dec 17, 2018 at 8:43 AM JC Beyler wrote:
>     >
>     >> Hi all,
>     >>
>     >> I don't believe I got actual LGTM for this version:
>     >>
>     >>
>     >> Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/
>     >> Bug: https://bugs.openjdk.java.net/browse/JDK-8213501
>     >>
>     >>
>     >> It removed the namespaces and uses explicit static instead :)
>     >>
>     >> Thanks!
>     >> Jc
>     >>
>     >> On Wed, Dec 12, 2018 at 8:06 PM JC Beyler wrote:
>     >>
>     >>> So did I Alexey but with David & Serguei preferring static, it seems
>     >>> more reasonable to go down their route :-)
>     >>>
>     >>> So here is the latest webrev with static instead of an anonymous
>     >>> namespace:
>     >>>
>     >>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.06/
>     >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8213501
>     >>>
>     >>> Let me know what you think, can I get a webrev 06 review?
>     >>>
>     >>> Thanks!
>     >>> Jc
>     >>>
>     >>> On Wed, Dec 12, 2018 at 3:10 PM Alex Menkov
>     >>> wrote:
>     >>>
>     >>>> Hm..
>     >>>> I considered unnamed namespaces "C++ style" (and static globals as "C
>     >>>> style").
>     >>>> Static globals were deprecated in C++ (but some time ago the
>     >>>> deprecation was reverted).
>     >>>>
>     >>>> --alex
>     >>>>
>     >>>> On 12/12/2018 13:55, serguei.spitsyn at oracle.com wrote:
>     >>>> > Agreed.
>     >>>> >
>     >>>> > Thanks,
>     >>>> > Serguei
>     >>>> >
>     >>>> >
>     >>>> > On 12/12/18 13:52, David Holmes wrote:
>     >>>> >> FWIW I think namespaces are overkill in all of this test code and
>     >>>> >> just obfuscates things - the declaration is easily missed. A static
>     >>>> >> variable in a .cpp is clearly a global variable to the file.
>     >>>> >>
>     >>>> >> Cheers,
>     >>>> >> David
>     >>>> >>
>     >>>> >>
>     >>>> >>
>     >>>> >> On 13/12/2018 5:37 am, serguei.spitsyn at oracle.com wrote:
>     >>>> >>> Hi Jc,
>     >>>> >>>
>     >>>> >>>
>     >>>> >>> On 12/11/18 21:16, JC Beyler wrote:
>     >>>> >>>> Hi all,
>     >>>> >>>>
>     >>>> >>>> Here is the new webrev with the TEST.groups change. Serguei, let me
>     >>>> >>>> know if I convinced you with the static vs anonymous namespaces or
>     >>>> >>>> if you'd still rather have a "static" for now :-)
>     >>>> >>>
>     >>>> >>>
>     >>>> >>> What do you think about this post? :
>     >>>> >>>
>     >>>> >>> https://stackoverflow.com/questions/11623451/static-vs-non-static-variables-in-namespace
>     >>>> >>>
>     >>>> >>>
>     >>>> >>>>
>     >>>> >>>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.05/
>     >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8213501
>     >>>> >>>
>     >>>> >>> The update looks fine.
>     >>>> >>>
>     >>>> >>> Thanks,
>     >>>> >>> Serguei
>     >>>> >>>
>     >>>> >>>
>     >>>> >>> Thanks,
>     >>>> >>> Serguei
>     >>>> >>>
>     >>>> >>>>
>     >>>> >>>> Thanks again for the reviews!
>     >>>> >>>> Jc
>     >>>> >>>>
>     >>>> >>>> On Mon, Dec 10, 2018 at 3:10 PM JC Beyler wrote:
>     >>>> >>>>
>     >>>> >>>>     Hi Serguei,
>     >>>> >>>>
>     >>>> >>>>     Yes basically it is equivalent :) I can put them in but they are
>     >>>> >>>>     not required. The standard actually wanted to deprecate it but then
>     >>>> >>>>     remembered that C compatibility would require the static keyword
>     >>>> >>>>     for this case [1]
>     >>>> >>>>
>     >>>> >>>>     So, really, they are not required here and will amount to the same
>     >>>> >>>>     thing: only that file can refer to them and you cannot get to them
>     >>>> >>>>     without a globally available method to return a pointer to them
>     >>>> >>>>     (i.e. same as a static variable in C).
>     >>>> >>>>
>     >>>> >>>>     I can put static if it makes it easier to see but, by being in an
>     >>>> >>>>     anonymous namespace they are only available for the file's
>     >>>> >>>>     translation unit. For example:
>     >>>> >>>>
>     >>>> >>>>     $ cat main.cpp
>     >>>> >>>>
>     >>>> >>>>     int totally_global;
>     >>>> >>>>     static int explicitly_static;
>     >>>> >>>>
>     >>>> >>>>     namespace {
>     >>>> >>>>     int implicitly_static;
>     >>>> >>>>     }
>     >>>> >>>>
>     >>>> >>>>     void foo();
>     >>>> >>>>     int main() {
>     >>>> >>>>       foo();
>     >>>> >>>>     }
>     >>>> >>>>
>     >>>> >>>>     $ g++ -O3 main.cpp -c
>     >>>> >>>>     $ nm main.o
>     >>>> >>>>                      U _GLOBAL_OFFSET_TABLE_
>     >>>> >>>>     0000000000000000 T main
>     >>>> >>>>     0000000000000000 B totally_global
>     >>>> >>>>                      U _Z3foov
>     >>>> >>>>
>     >>>> >>>>     As you can see, the static and anonymous namespace variables are
>     >>>> >>>>     not in the file due to not being used. If you were to use them,
>     >>>> >>>>     you'd see them show up as something like:
>     >>>> >>>>     0000000000000008 b _ZL17explicitly_static
>     >>>> >>>>     0000000000000004 b _ZN12_GLOBAL__N_117implicitly_staticE
>     >>>> >>>>
>     >>>> >>>>     Where again, it shows that it is mangling the names so that no
>     >>>> >>>>     external usage can happen without tinkering.
>     >>>> >>>>
>     >>>> >>>>     Hopefully that helps :-),
>     >>>> >>>>     Jc
>     >>>> >>>>
>     >>>> >>>>     [1]
>     >>>> >>>>     http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1012
>     >>>> >>>>
>     >>>> >>>>
>     >>>> >>>>     On Mon, Dec 10, 2018 at 2:04 PM serguei.spitsyn at oracle.com wrote:
>     >>>> >>>>
>     >>>> >>>>         Hi Jc,
>     >>>> >>>>
>     >>>> >>>>         I had little experience with the C++ namespaces.
>     >>>> >>>>         My understanding is that static in this context should mean
>     >>>> >>>>         internal linkage.
>     >>>> >>>>
>     >>>> >>>>         Thanks,
>     >>>> >>>>         Serguei
>     >>>> >>>>
>     >>>> >>>>
>     >>>> >>>>         On 12/10/18 13:57, JC Beyler wrote:
>     >>>> >>>>>         Hi Serguei,
>     >>>> >>>>>
>     >>>> >>>>>         The variables and functions are in an anonymous namespace; my
>     >>>> >>>>>         understanding of C++ is that this is equivalent to putting it
>     >>>> >>>>>         as static. Hence, I didn't add them there. Does that make
>     >>>> >>>>>         sense?
>     >>>> >>>>>
>     >>>> >>>>>         Thanks!
>     >>>> >>>>>         Jc
>     >>>> >>>>>
>     >>>> >>>>>         On Mon, Dec 10, 2018 at 1:33 PM serguei.spitsyn at oracle.com wrote:
>     >>>> >>>>>
>     >>>> >>>>>             Hi Jc,
>     >>>> >>>>>
>     >>>> >>>>>             It looks good in general.
>     >>>> >>>>>             One question though.
>     >>>> >>>>>
>     >>>> >>>>>             http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.03a_04/test/hotspot/jtreg/vmTestbase/nsk/share/ExceptionCheckingJniEnv/exceptionjni001/exceptionjni001.cpp.html
>     >>>> >>>>>
>     >>>> >>>>>
>     >>>> >>>>>             I wonder if the variables and functions have to be static.
>     >>>> >>>>>
>     >>>> >>>>>             Thanks,
>     >>>> >>>>>             Serguei
>     >>>> >>>>>
>     >>>> >>>>>
>     >>>> >>>>>             On 12/5/18 11:36, JC Beyler wrote:
>     >>>> >>>>>>             Hi all,
>     >>>> >>>>>>
>     >>>> >>>>>>             My apologies to having to come back for another review
>     >>>> >>>>>>             for this change: I ran into a snag when trying to pull
>     >>>> >>>>>>             the latest changes compared to the base I was working
>     >>>> >>>>>>             on. I basically forgot that there was an issue with
>     >>>> >>>>>>             snprintf and that I had solved it via JDK-8213622.
>     >>>> >>>>>>
>     >>>> >>>>>>             Could I have a new review of this webrev:
>     >>>> >>>>>>             Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.04/
>     >>>> >>>>>>             Bug: https://bugs.openjdk.java.net/browse/JDK-8213501
>     >>>> >>>>>>             Incremental from the port of webrev.03 that got LGTMs:
>     >>>> >>>>>>             http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03a_04/
>     >>>> >>>>>>
>     >>>> >>>>>>             A few comments on this because it took me a while to get
>     >>>> >>>>>>             things in a state I thought was good:
>     >>>> >>>>>>               - I had to implement an itoa method, do we have
>     >>>> >>>>>>             something like that in the test base (remember that
>     >>>> >>>>>>             JDK-8213622 could not use sprintf due to being in the
>     >>>> >>>>>>             test code)?
>     >>>> >>>>>>
>     >>>> >>>>>>               - The differences here compared to the one you all
>     >>>> >>>>>>             reviewed are:
>     >>>> >>>>>>                   - I found that adding to the strlen/memcpy error
>     >>>> >>>>>>             prone and thought that I would try to make it less so.
>     >>>> >>>>>>             If you want to compare, I extended the strlen/memcpy
>     >>>> >>>>>>             with the new format to show you if you prefer [1]
>     >>>> >>>>>>                         - Note that the diff between the "old
>     >>>> >>>>>>             extended way from [1]" to the webrev.04 can be found
>     >>>> >>>>>>             in [2]
>     >>>> >>>>>>
>     >>>> >>>>>>                   - I added a test to test the exception wrapper in
>     >>>> >>>>>>             tests :); I'm not sure it is deemed useful or not but
>     >>>> >>>>>>             helped me assure myself that I was not doing things
>     >>>> >>>>>>             wrong; you can find the base test file here [3]; should
>     >>>> >>>>>>             we have this or not? (I know that normally we don't add
>     >>>> >>>>>>             tests to vmTestbase but thought this might be an
>     >>>> >>>>>>             exception)
>     >>>> >>>>>>
>     >>>> >>>>>>             Thanks for your help and my apologies for the snag,
>     >>>> >>>>>>             Jc
>     >>>> >>>>>>
>     >>>> >>>>>>             [1]:
>     >>>> >>>>>>             http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03a/test/hotspot/jtreg/vmTestbase/nsk/share/jni/ExceptionCheckingJniEnv.cpp.udiff.html
>     >>>> >>>>>>
>     >>>> >>>>>>             [2]:
>     >>>> >>>>>>             http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03a_04
>     >>>> >>>>>>
>     >>>> >>>>>>             [3]
>     >>>> >>>>>>             http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.04/test/hotspot/jtreg/vmTestbase/nsk/share/ExceptionCheckingJniEnv/exceptionjni001/exceptionjni001.cpp.html
>     >>>> >>>>>>
>     >>>> >>>>>>
>     >>>> >>>>>>             On Mon, Dec 3, 2018 at 11:29 PM David Holmes wrote:
>     >>>> >>>>>>
>     >>>> >>>>>>                 Looks fine to me.
>     >>>> >>>>>>
>     >>>> >>>>>>                 Thanks,
>     >>>> >>>>>>                 David
>     >>>> >>>>>>
>     >>>> >>>>>>                 On 4/12/2018 4:04 pm, JC Beyler wrote:
>     >>>> >>>>>>                 > Hi both,
>     >>>> >>>>>>                 >
>     >>>> >>>>>>                 > Thanks for the reviews! Since Serguei did not insist on get_basename, I
>     >>>> >>>>>>                 > went for get_dirname since the method is a local static method and won't
>     >>>> >>>>>>                 > have its name start spreading, I think it's ok too.
>     >>>> >>>>>>                 >
>     >>>> >>>>>>                 > For the naming of the local variable, the idea initially was to use the
>     >>>> >>>>>>                 > same name as the local variable for JNIEnv already used to reduce the
>     >>>> >>>>>>                 > code change. Since I'm now adding the line macro at the end anyway, this
>     >>>> >>>>>>                 > does not matter anymore so I converged all local variables to "jni".
>     >>>> >>>>>>                 >
>     >>>> >>>>>>                 > So, without further ado, here is the new version:
>     >>>> >>>>>>                 > Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.03/
>     >>>> >>>>>>                 > Bug: https://bugs.openjdk.java.net/browse/JDK-8213501
>     >>>> >>>>>>                 >
>     >>>> >>>>>>                 > This passes the various tests changed by the webrev on my dev machine.
>     >>>> >>>>>>                 >
>     >>>> >>>>>>                 > Let me know what you think,
>     >>>> >>>>>>                 > Jc
>     >>>> >>>>>>                 >
>     >>>> >>>>>>                 > On Mon, Dec 3, 2018 at 8:40 PM serguei.spitsyn at oracle.com wrote:
>     >>>> >>>>>>                 >
>     >>>> >>>>>>                 >     On 12/3/18 20:15, Chris Plummer wrote:
>     >>>> >>>>>>                 >      > Hi JC,
>     >>>> >>>>>>                 >      >
>     >>>> >>>>>>                 >      > Overall it looks good. A few naming nits though:
>     >>>> >>>>>>                 >      >
>     >>>> >>>>>>                 >      > In bi01t001.cpp, why have you declared the ExceptionCheckingJniEnvPtr
>     >>>> >>>>>>                 >      > using jni_env(jni). Elsewhere you use jni(jni_env) and rename the
>     >>>> >>>>>>                 >      > method argument passed in from jni to jni_env.
>     >>>> >>>>>>                 >      >
>     >>>> >>>>>>                 >      > Related to this, I also noticed in some files that already are using
>     >>>> >>>>>>                 >      > ExceptionCheckingJniEnvPtr, such as CharArrayCriticalLocker.cpp, you
>     >>>> >>>>>>                 >      > declared it as env(jni_env). So that means there are 3 different names
>     >>>> >>>>>>                 >      > you have used for the ExceptionCheckingJniEnvPtr local variable. They
>     >>>> >>>>>>                 >      > should be consistent.
>     >>>> >>>>>>                 >      >
>     >>>> >>>>>>                 >      > Also, can you rename get_basename() to get_dirname()? I know Serguei
>     >>>> >>>>>>                 >      > suggested get_basename() a while back, but unless "basename" is
>     >>>> >>>>>>                 >      > commonly used for this purpose, I think "dirname" is more self
>     >>>> >>>>>>                 >      > explanatory.
>     >>>> >>>>>>                 >
>     >>>> >>>>>>                 >     In general, I'm Okay with get_dirname().
>     >>>> >>>>>>                 >     Just to mention dirname can be both short or full, so it is a little
>     >>>> >>>>>>                 >     confusing as well.
>     >>>> >>>>>>                 >     It is the reason why the get_basename() was suggested.
>     >>>> >>>>>>                 >     However, I do not insist on get_basename() nor get_full_dirname(). :)
>     >>>> >>>>>>                 >
>     >>>> >>>>>>                 >     Thanks,
>     >>>> >>>>>>                 >     Serguei
>     >>>> >>>>>>                 >
>     >>>> >>>>>>                 >
>     >>>> >>>>>>                 >      > thanks,
>     >>>> >>>>>>                 >      >
>     >>>> >>>>>>                 >      > Chris
>     >>>> >>>>>>                 >      >
>     >>>> >>>>>>                 >      > On 12/2/18 10:29 PM, David Holmes wrote:
>     >>>> >>>>>>                 >      >> Hi Jc,
>     >>>> >>>>>>                 >      >>
>     >>>> >>>>>>                 >      >> I've been lurking on this one and have had a look through. I'm okay
>     >>>> >>>>>>                 >      >> with the FatalError approach for the tests
>     >>>> >>>>>>                 >      >> - we don't expect anything
>     >>>> >>>>>>                 >      >> to go wrong in a well written test in a correctly functioning VM.
>     >>>> >>>>>>                 >      >>
>     >>>> >>>>>>                 >      >> Thanks,
>     >>>> >>>>>>                 >      >> David
>     >>>> >>>>>>                 >      >>
>     >>>> >>>>>>                 >      >>
>     >>>> >>>>>>                 >      >>
>     >>>> >>>>>>                 >      >> On 3/12/2018 3:24 pm, JC Beyler wrote:
>     >>>> >>>>>>                 >      >>> Hi all,
>     >>>> >>>>>>                 >      >>>
>     >>>> >>>>>>                 >      >>> Would someone on the GC or runtime team be motivated to give this a
>     >>>> >>>>>>                 >      >>> review? :)
>     >>>> >>>>>>                 >      >>>
>     >>>> >>>>>>                 >      >>> It would be much appreciated!
>     >>>> >>>>>>                 >      >>>
>     >>>> >>>>>>                 >      >>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.02/
>     >>>> >>>>>>                 >      >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8213501
>     >>>> >>>>>>                 >      >>>
>     >>>> >>>>>>                 >      >>> Thanks for your help,
>     >>>> >>>>>>                 >      >>> Jc
>     >>>> >>>>>>                 >      >>>
>     >>>> >>>>>>                 >      >>> On Tue, Nov 27, 2018 at 4:36 PM JC Beyler wrote:
>     >>>> >>>>>>                 >      >>>
>>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>? ? ?Hi Chris, > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>? ? ?Yes I was waiting > for another > ? ? >>>> review > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?since you had explicitly > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> asked :) > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>? ? ?And sounds good that > when someone > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?from GC or runtime gives a > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> review, > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>? ? ?I'll wait for your > full review on > ? ? >>>> the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?webrev.02! > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>? ? ?Thanks again for > your help, > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>? ? ?Jc > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>? ? ?On Tue, Nov 27, 2018 > at 12:48 PM > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?Chris Plummer > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> > > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ? > > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ? > ? ? >>>> >>>>>> >> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ? > > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ? > ? ? >>>> >>>>>> >>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?wrote: > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>? ? ? ? ?Hi JC, > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>? ? ? ? ?I think it would > be good to > ? ? >>>> get a > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?review from the gc or > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?runtime > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>? ? ? ? ?teams, since > this also affects > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?their tests. > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>? ? ? ? 
> >>>
> >>>         Also, once we are settled on this FatalError approach, I
> >>>         still need to give your webrev-02 a full review. I only
> >>>         skimmed over parts of it (I did look at all the changes in
> >>>         webrev-01).
> >>>
> >>>         thanks,
> >>>
> >>>         Chris
> >>>
> >>>         On 11/27/18 8:58 AM, serguei.spitsyn at oracle.com wrote:
> >>>>         Hi Jc,
> >>>>
> >>>>         I've already reviewed this too.
> >>>>
> >>>>         Thanks,
> >>>>         Serguei
> >>>>
> >>>>         On 11/27/18 06:56, JC Beyler wrote:
> >>>>>         Thanks Chris,
> >>>>>
> >>>>>         Anybody else motivated to look at this and review it? :)
> >>>>>         Jc
> >>>>>
> >>>>>         On Mon, Nov 26, 2018 at 1:26 PM Chris Plummer wrote:
> >>>>>
> >>>>>             Hi JC,
> >>>>>
> >>>>>             I'm ok with the FatalError approach, but would like to
> >>>>>             hear opinions from others also.
> >>>>>
> >>>>>             thanks,
> >>>>>
> >>>>>             Chris
> >>>>>
> >>>>>             On 11/21/18 8:19 AM, JC Beyler wrote:
> >>>>>>             Hi Chris,
> >>>>>>
> >>>>>>             Thanks for taking the time to look at it and yes you
> >>>>>>             have raised exactly why the webrev is between two
> >>>>>>             worlds: in cases where a fatal error on failure is
> >>>>>>             wanted, should we simplify
> >>>>>>             the code to remove the return tests since we do them
> >>>>>>             internally? Now that I've looked around for non-fatal
> >>>>>>             cases, I think the answer is yes, it simplifies the
> >>>>>>             code while maintaining the checks.
> >>>>>>
> >>>>>>             I looked a bit and it seems that I can't find easily a
> >>>>>>             case where the test accepts a JNI failure to then move
> >>>>>>             on. Therefore, perhaps, for now, the fail with a Fatal
> >>>>>>             is enough and we can work on the tests to clean them
> >>>>>>             up?
> >>>>>>
> >>>>>>             That means that this is the new webrev with only Fatal
> >>>>>>             and cleans up the tests so that it is no longer in
> >>>>>>             between two worlds:
> >>>>>>
> >>>>>>             Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.02/
> >>>>>>             Bug: https://bugs.openjdk.java.net/browse/JDK-8213501
> >>>>>>
> >>>>>>             (This passes testing on my dev machine for all the
> >>>>>>             modified tests)
> >>>>>>
> >>>>>>             with the example you provided, it now looks like:
> >>>>>>
> >>>>>>             http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html
> >>>>>>
> >>>>>>             Where it does, to me at least, seem cleaner and less
> >>>>>>             "noisy".
> >>>>>>
> >>>>>>             Let me know what you think,
> >>>>>>             Jc
> >>>>>>
> >>>>>>             On Tue, Nov 20, 2018 at 9:33 PM Chris Plummer
> >>>>>>             <chris.plummer at oracle.com> wrote:
> >>>>>>
> >>>>>>                 Hi JC,
> >>>>>>
> >>>>>>                 Sorry about the delay. I had to go back and look
> >>>>>>                 at the initial 8210842 webrev and RFR thread to
> >>>>>>                 see what this was initially all about.
> >>>>>>
> >>>>>>                 In general the changes look good.
> >>>>>>
> >>>>>>                 I don't have a good answer to your
> >>>>>>                 FatalError/NonFatalError question. It makes the
> >>>>>>                 code a lot cleaner to use FatalError, but then it
> >>>>>>                 is a behavior change, and you also need to deal
> >>>>>>                 with tests that intentionally induce errors (do
> >>>>>>                 you have an example of that?).
> >>>>>>
> >>>>>>                 In any case, right now your webrev seems to be
> >>>>>>                 between two worlds. You are producing FatalError,
> >>>>>>                 but still checking results. Here's a good example:
> >>>>>>
> >>>>>>                 http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t003/ap04t003.cpp.frames.html
> >>>>>>
> >>>>>>                 I'm not sure if this is just a temporary state
> >>>>>>                 until it was decided which approach to take.
> >>>>>>
> >>>>>>                 thanks,
> >>>>>>
> >>>>>>                 Chris
> >>>>>>
> >>>>>>                 On 11/20/18 2:14 PM, JC Beyler wrote:
> >>>>>>>                 Hi all,
> >>>>>>>
> >>>>>>>                 Chris thought it made sense to have more eyes on
> >>>>>>>                 this change than just serviceability, as it will
> >>>>>>>                 modify tests that are not only serviceability
> >>>>>>>                 tests, so I've moved this conversation here :)
> >>>>>>>
> >>>>>>>                 For convenience, I've copy-pasted the initial RFR:
> >>>>>>>
> >>>>>>>                 Could I have a review for the extension and usage
> >>>>>>>                 of the ExceptionJniWrapper. This adds lines and
> >>>>>>>                 filenames to the end of the wrapper JNI methods,
> >>>>>>>                 adds tracing, and throws an error if need be.
> >>>>>>>                 I've ported the gc/lock files to use the new
> >>>>>>>                 TRACE_JNI_CALL add-on and I've ported a few of
> >>>>>>>                 the tests that were already changed for the
> >>>>>>>                 assignment webrev for JDK-8212884.
> >>>>>>>
> >>>>>>>                 Webrev: http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.01
> >>>>>>>                 Bug: https://bugs.openjdk.java.net/browse/JDK-8213501
> >>>>>>>
> >>>>>>>                 For illustration, if I force an error to the
> >>>>>>>                 AP04/ap04t003 test and set the verbosity on, I
> >>>>>>>                 get something like:
> >>>>>>>
> >>>>>>>                 >> Calling JNI method FindClass from ap04t003.cpp:343
> >>>>>>>                 >> Calling with these parameter(s):
> >>>>>>>                 java/lang/Threadd
> >>>>>>>                 Wait for thread to finish
> >>>>>>>                 << Called JNI method FindClass from ap04t003.cpp:343
> >>>>>>>                 Exception in thread "Thread-0" java.lang.NoClassDefFoundError: java/lang/Threadd
> >>>>>>>                     at nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native Method)
> >>>>>>>                     at nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140)
> >>>>>>>                     at nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201)
> >>>>>>>                 Caused by: java.lang.ClassNotFoundException: java.lang.Threadd
> >>>>>>>                     at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583)
> >>>>>>>                     at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
> >>>>>>>                     at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
> >>>>>>>                     ... 3 more
> >>>>>>>                 FATAL ERROR in native method: JNI method FindClass : internal error from ap04t003.cpp:343
> >>>>>>>                     at nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native Method)
> >>>>>>>                     at nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140)
> >>>>>>>                     at nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201)
> >>>>>>>
> >>>>>>>                 Questions/comments I have about this are:
> >>>>>>>                   - Do we want to force fatal errors when a JNI
> >>>>>>>                 call fails in general? Most of these tests do the
> >>>>>>>                 right thing and test the return of the JNI calls,
> >>>>>>>                 for example:
> >>>>>>>
> >>>>>>>                     thrClass = jni->FindClass("java/lang/Threadd", TRACE_JNI_CALL);
> >>>>>>>                     if (thrClass == NULL) {
> >>>>>>>
> >>>>>>>                 but now the wrapper actually would do a fatal if
> >>>>>>>                 the FindClass call would return a nullptr, so we
> >>>>>>>                 could remove that test altogether. What do you
> >>>>>>>                 think?
> >>>>>>>                     - I prefer to leave them as the tests then
> >>>>>>>                 become closer to what real users would have in
> >>>>>>>                 their code and is the "recommended" way of doing
> >>>>>>>                 it
> >>>>>>>                     - The alternative
> >>>>>>>                 is to use the NonFatalError I added which then
> >>>>>>>                 just prints out that something went wrong,
> >>>>>>>                 letting the test continue. Question will be what
> >>>>>>>                 should be the default? The fatal or the non-fatal
> >>>>>>>                 error handling?
> >>>>>>>
> >>>>>>>                 On a different subject:
> >>>>>>>                   - On the new tests, I've removed the
> >>>>>>>                 NSK_JNI_VERIFY since the JNI wrapper handles the
> >>>>>>>                 tracing and the verify in almost the same way;
> >>>>>>>                 only difference I can really tell is that the
> >>>>>>>                 complain method from NSK has a max complain
> >>>>>>>                 before stopping to "complain"; I have not added
> >>>>>>>                 that part of the code in this webrev
> >>>>>>>
> >>>>>>>                 Once we decide on these, I can continue on the
> >>>>>>>                 files from JDK-8212884 and then do both the
> >>>>>>>                 assignment in an if extraction followed-by this
> >>>>>>>                 type of webrev in an easier fashion. Depending on
> >>>>>>>                 decisions here, NSK*VERIFY can be deprecated as
> >>>>>>>                 well as we go forward.
> >>>>>>>
> >>>>>>>                 Thanks!
> >>>>>>>                 Jc
> >>>>>>>
> >>>>>>>                 On Mon, Nov 19, 2018 at 11:34 AM Chris Plummer
> >>>>>>>                 <chris.plummer at oracle.com> wrote:
> >>>>>>>
> >>>>>>>                     On 11/19/18 10:07 AM, JC Beyler wrote:
> >>>>>>>>                     Hi all,
> >>>>>>>>
> >>>>>>>>                     @David/Chris: should I then push this RFR to
> >>>>>>>>                     the hotspot mailing or the runtime one? For
> >>>>>>>>                     what it's worth, a lot of the tests under
> >>>>>>>>                     the vmTestbase are jvmti so the review also
> >>>>>>>>                     affects serviceability; it just turns out I
> >>>>>>>>                     started with the GC originally and then hit
> >>>>>>>>                     some other tests I had touched via the
> >>>>>>>>                     assignment extraction.
> >>>>>>>                     I think hotspot would be best.
> >>>>>>>
> >>>>>>>                     Chris
> >>>>>>>>
> >>>>>>>>                     @Serguei: Done for the method renaming, for
> >>>>>>>>                     the indent, are you talking about going from
> >>>>>>>>                     the 8-indent to 4-indent? If so, would it
> >>>>>>>>                     not just be better to do a new JBS bug and
> >>>>>>>>                     do the whole files in one go? I ask because
> >>>>>>>>                     otherwise, it will look a bit weird to have
> >>>>>>>>                     parts of the file as 8-indent and others
> >>>>>>>>                     4-indent?
> >>>>>>>>
> >>>>>>>>                     Thanks for looking at it!
> >>>>>>>>                     Jc
> >>>>>>>>
> >>>>>>>>                     On Mon, Nov 19, 2018 at 1:25 AM
> >>>>>>>>                     serguei.spitsyn at oracle.com wrote:
> >>>>>>>>
> >>>>>>>>                         Hi Jc,
> >>>>>>>>
> >>>>>>>>                         We have to start this review anyway. :)
> >>>>>>>>                         It looks good to me in general.
> >>>>>>>>                         Thank you for your consistency in this
> >>>>>>>>                         refactoring!
> >>>>>>>>
> >>>>>>>>                         Some minor comments.
> >>>>>>>>
> >>>>>>>>                         http://cr.openjdk.java.net/%7Ejcbeyler/8213501/webrev.00/test/hotspot/jtreg/vmTestbase/nsk/share/jni/ExceptionCheckingJniEnv.cpp.udiff.html
> >>>>>>>>
> >>>>>>>>                         +static const char* remove_folders(const char* fullname) {
> >>>>>>>>
> >>>>>>>>                         I'd suggest to rename the function name
> >>>>>>>>                         to something traditional like
> >>>>>>>>                         get_basename. Otherwise, it sounds like
> >>>>>>>>                         this function has to really remove
> >>>>>>>>                         folders. :)
> >>>>>>>>
> >>>>>>>>                         Also, all *Locker.cpp have wrong indent
> >>>>>>>>                         in the bodies of if and while
> >>>>>>>>                         statements. Could this be fixed with the
> >>>>>>>>                         refactoring?
> >>>>>>>>
> >>>>>>>>                         I did not look on how this impacts the
> >>>>>>>>                         tests other than serviceability.
> >>>>>>>>
> >>>>>>>>                         Thanks,
> >>>>>>>>                         Serguei
> >>>>>>>>
> >>>>>>>>                         On 11/16/18 19:43, JC Beyler wrote:
> >>>>>>>>>                         Hi all,
> >>>>>>>>>
> >>>>>>>>>                         Anybody motivated to review this? :)
> >>>>>>>>>                         Jc
> >>>>>>>>>
> >>>>>>>>>                         On Wed, Nov 7, 2018 at 9:53 PM JC Beyler
>>>> >>>>>> >>>> wrote: > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?Hi all, > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?Could I > ? ? >>>> have > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?a review for the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?extension > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?and usage of the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> ExceptionJniWrapper. This > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?adds lines > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?and > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?filenames to the end of the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?wrapper > ? ? >>>> JNI > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?methods, adds > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?tracing, > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?and > ? ? >>>> throws > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?an error if need > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?be. I've > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?ported > ? ? >>>> the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?gc/lock files to > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?use the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?new > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?TRACE_JNI_CALL add-on and > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?I've > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?ported a > ? ? >>>> few > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?of the tests > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?that were > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?already > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?changed for the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?assignment > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?webrev > ? ? >>>> for > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?JDK-8212884. > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?Webrev: > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? 
>>>> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8213501/webrev.00/ > ? ? >>>> >>>>>> > > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>> > > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?Bug: > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8213501 > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?For > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?illustration, if I force > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?an error > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?to the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?AP04/ap04t03 test and > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?set the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?verbosity > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?on, I get something > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?like: > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?>> > ? ? >>>> Calling > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?JNI method > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?FindClass from > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> ap04t003.cpp:343 > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?>> > ? ? >>>> Calling > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?with these > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?parameter(s): > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> java/lang/Threadd > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?Wait for > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?thread to finish > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?<< Called > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?JNI method > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?FindClass from > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> ap04t003.cpp:343 > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> Exception in > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?thread "Thread-0" > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > java.lang.NoClassDefFoundError: > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> java/lang/Threadd > ? ? 
>>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?at > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> > ? ? >>>> >>>>>> > ? ? >>>> > nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native > ? ? >>>> >>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?Method) > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?at > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> > ? ? >>>> >>>>>> > ? ? >>>> > nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140) > ? ? >>>> > ? ? >>>> >>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?at > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> > ? ? >>>> >>>>>> > ? ? >>>> > nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201) > ? ? >>>> >>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?Caused > ? ? >>>> by: > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > java.lang.ClassNotFoundException: > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> java.lang.Threadd > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?at > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> > ? ? >>>> >>>>>> > ? ? >>>> > java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583) > ? ? >>>> > ? ? >>>> >>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?at > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> > ? ? >>>> >>>>>> > ? ? >>>> > java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) > ? ? >>>> > ? ? >>>> >>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> > ? ? 
>>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?at > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> > ? ? >>>> >>>>>> > java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?... 3 > ? ? >>>> more > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?FATAL > ? ? >>>> ERROR > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?in native method: JNI > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?method > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?FindClass : internal error > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?from > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?ap04t003.cpp:343 > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?at > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> > ? ? >>>> >>>>>> > ? ? >>>> > nsk.jvmti.scenarios.allocation.AP04.ap04t003.runIterateOverHeap(Native > ? ? >>>> >>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?Method) > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?at > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> > ? ? >>>> >>>>>> > ? ? >>>> > nsk.jvmti.scenarios.allocation.AP04.ap04t003HeapIterator.runIteration(ap04t003.java:140) > ? ? >>>> > ? ? >>>> >>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?at > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> > ? ? >>>> >>>>>> > ? ? >>>> > nsk.jvmti.scenarios.allocation.AP04.ap04t003Thread.run(ap04t003.java:201) > ? ? >>>> >>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> Questions/comments I > have about > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> this are: > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? 
?> >>>>>>>>> > ? ? ? ?- Do we > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?want to force fatal > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?errors > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?when a > ? ? >>>> JNI > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?call fails in general? > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?Most of > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?these tests do the right > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?thing and > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?test the return of > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?the JNI > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?calls, > ? ? >>>> for > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?example: > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?thrClass > ? ? >>>> = > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > jni->FindClass("java/lang/Threadd", > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> TRACE_JNI_CALL); > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ? ? ?if > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?(thrClass == NULL) { > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?but now > ? ? >>>> the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?wrapper actually > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?would do > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?a fatal > ? ? >>>> if > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?the FindClass call > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?would > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?return a > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?nullptr, so we could > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?remove > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?that test > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?altogether. What do > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?you > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> think? > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ? ? ?- I > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?prefer to leave them > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?as the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? 
?tests > ? ? >>>> then > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?become closer to > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?what real > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?users > ? ? >>>> would > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?have in their > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?code and is > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?"recommended" way of doing it > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ? ? - The > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?alternative is to > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?use the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ?NonFatalError I > ? ? >>>> added > ? ? >>>> >>>>>> which > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?then just > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?prints > ? ? >>>> out > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?that something > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?went wrong, > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?letting > ? ? >>>> the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?test continue. > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?Question > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?will be > ? ? >>>> what > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?should be the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?default? > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?The > ? ? >>>> fatal or > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?the non-fatal error > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?handling? > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?On a > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?different subject: > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ? ?- On > ? ? >>>> the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?new tests, I've > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?removed > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?NSK_JNI_VERIFY since the JNI > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? 
?> >>>>>>>>> > ? ? ?wrapper > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?handles the tracing > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?and the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?verify in > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?almost the same > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?way; only > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> difference I > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?can really tell > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?is that > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?the > ? ? >>>> complain > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?method from NSK > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?has a > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?max > ? ? >>>> complain > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?before stopping to > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> "complain"; > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?I have not added that > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?part of > ? ? >>>> the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?code in this webrev > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?Once we > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?decide on these, I can > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?continue > ? ? >>>> on > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?the files from > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> JDK-8212884 > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?and then do both the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> assignment > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?in an if extraction > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> followed-by > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?this type of > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?webrev in an > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?easier > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?fashion. Depending on > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?decisions > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?here, NSK*VERIFY can be > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? 
?> >>>>>>>>> > ? ? >>>> deprecated > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?as well as we go > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?forward. > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?Thank you > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?for the > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ?reviews/comments :) > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? ?Jc > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>>? ? ? ? ? ? ? ? ? ? ? ? ?-- > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>> > ?Thanks, > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> >>>>>>>>>? ? ? ? ? ? ? ? ? ? ? ? ?Jc > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>>>>? ? ? ? ? ? ? ? ? ? ?-- > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>>>> > ?Thanks, > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>>>>? ? ? ? ? ? ? ? ? ? ?Jc > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>>>? ? ? ? ? ? ? ? ?-- > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>>>? ? ? ? ? ? ? ? ?Thanks, > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>>>? ? ? ? ? ? ? ? ?Jc > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>>? ? ? ? ? ? ?-- > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>>? ? ? ? ? ? ?Thanks, > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>>? ? ? ? ? ? ?Jc > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>> > ? ? 
>>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>? ? ? ? ?-- > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>> Thanks, > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>>>? ? ? ? ?Jc > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>? ? ?-- > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>? ? ?Thanks, > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>>? ? ?Jc > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> -- > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> Thanks, > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? >>> Jc > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? > > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?>? ? ? > > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> -- > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> Thanks, > ? ? >>>> >>>>>>? ? ? ? ? ? ? ? ?> Jc > ? ? >>>> >>>>>> > ? ? >>>> >>>>>> > ? ? >>>> >>>>>> > ? ? >>>> >>>>>>? ? ? ? ? ? ?-- > ? ? >>>> >>>>>>? ? ? ? ? ? ?Thanks, > ? ? >>>> >>>>>>? ? ? ? ? ? ?Jc > ? ? >>>> >>>>> > ? ? >>>> >>>>> > ? ? >>>> >>>>> > ? ? >>>> >>>>>? ? ? ? ?-- > ? ? >>>> >>>>>? ? ? ? ?Thanks, > ? ? >>>> >>>>>? ? ? ? ?Jc > ? ? >>>> >>>> > ? ? >>>> >>>> > ? ? >>>> >>>> > ? ? >>>> >>>>? ? ?-- > ? ? >>>> >>>>? ? ?Thanks, > ? ? >>>> >>>>? ? ?Jc > ? ? >>>> >>>> > ? ? >>>> >>>> > ? ? >>>> >>>> > ? ? >>>> >>>> -- > ? ? >>>> >>>> > ? ? >>>> >>>> Thanks, > ? ? >>>> >>>> Jc > ? ? >>>> >>> > ? ? >>>> > > ? ? >>>> > ? ? >>> > ? ? >>> > ? ? >>> -- > ? ? >>> > ? ? >>> Thanks, > ? ? >>> Jc > ? ? >>> > ? ? >> > ? ? >> > ? ? >> -- > ? ? >> > ? ? 
>> Thanks,
>> Jc
>
> --
> Thanks,
> Jc

From david.holmes at oracle.com  Wed Jan 23 09:42:21 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 23 Jan 2019 19:42:21 +1000
Subject: RFR(XL): 8203469: Faster safepoints
In-Reply-To: <5d6d97d6-fbda-28f4-b625-a9ae351af5ba@oracle.com>
References: <5d6d97d6-fbda-28f4-b625-a9ae351af5ba@oracle.com>
Message-ID:

Hi Robbin,

Thanks for all the work on this! This is looking really good.

I have one concern and that is the fact the WaitBarrier only uses an int
as a tag, but the _safepoint_counter is a uint64_t. Seems to me that once
the _safepoint_counter rolls over to need 33-bits then casting it to an
int for the tag is going to give the dis-allowed zero value. ??

Specific comments below.

Oh and can you add all the high-level description in your initial RFR
email to the bug report please. Thanks.

On 23/01/2019 1:39 am, Robbin Ehn wrote:
> Hi all, here is v01 and v02.
>
> v01 contains update after comments from list:
> http://cr.openjdk.java.net/~rehn/8203469/v01/
> http://cr.openjdk.java.net/~rehn/8203469/v01/inc/
>
> v02 contains a bug fix, explained below:
> http://cr.openjdk.java.net/~rehn/8203469/v02/
> http://cr.openjdk.java.net/~rehn/8203469/v02/inc/

Minor comments:

src/hotspot/share/runtime/safepoint.cpp

check_thread_safepoint_state needs a better name - what is it checking
the state for?

---

 368   Thread* myThread = Thread::current();
 369   assert(myThread->is_VM_thread(), "Only VM thread may execute a safepoint");

 558   DEBUG_ONLY(Thread* myThread = Thread::current();)
 559   assert(myThread->is_VM_thread(), "Only VM thread can execute a safepoint");

You don't need the myThread local variables.

That's it! :) (Thanks to Dan for tackling the updates to the commentary ;-) ).

Thanks,
David
-----

> Patricio had some good questions about try_stable_load_state.
> In previous internal versions I have done the stable load by loading
> thread state before and after safepoint id. For some reason I changed
> during a refactoring to the reverse, which is incorrect. Consider the
> following:
>
> JavaThread: state / safepoint id / poll |VMThread: global state / safepoint counter / WaitBarrier
> ########################################|################################
> _thread_in_native       / 0 / disarmed  | _not_synchronized / 0 / disarmed
>                                         | _not_synchronized / 0 / armed(1)
>                                         | _not_synchronized / 1 / armed(1)
>                                         | _synchronizing    / 1 / armed(1)
> _thread_in_native       / 0 / armed     |
>                                         |
>                                         | id:_thread_in_native>
>                                         |
>                                         | _synchonized      / 1 / armed(1)
>           |
> _thread_in_native_trans / 0 / armed     |
>              |
>               |
>                                         | _not_synchonized  / 1 / armed(1)
>                                         | _not_synchonized  / 2 / armed(1)
> _thread_in_native_trans / 0 / disarmed  |
>                                         | _not_synchonized  / 2 / disarmed
> Next safepoint starts:
>                                         | _not_synchronized / 2 / armed(3)
>                                         | _not_synchronized / 3 / armed(3)
>                                         | _synchronizing    / 3 / armed(3)
> _thread_in_native_trans / 0 / armed     |
>                                         |
>                |
>      |
> _thread_in_native_trans / 1 / armed     |
> _thread_blocked         / 1 / armed     |
>            |
>                                         | id:_thread_blocked>
> _thread_in_native_trans / 1 / armed     |
> _thread_in_native_trans / 0 / armed     |
>                                         |
>
> A false positive is read.
>
> When done correctly, the safe matrix looks like:
> State load 1      | Safepoint id | State load 2     | Result
> ##################|##############|##################|#######
> any               | !0/current   | any              | treat all as unsafe
> any               | any          | !state1          | treat all as unsafe
> any               | 0/current    | state1           | suspend flag is safe
> thread_in_native  | 0/current    | thread_in_native | safe
> thread_in_blocked | 0/current    | thread_in_blocked| safe
> !thread_in_blocked
> &&
> !thread_in_native | 0/current    | state1           | unsafe
>
> The case with blocked/0/blocked I added this comment for:
>
>  755   // To handle the thread_blocked state on the backedge of the WaitBarrier from
>  756   // previous safepoint and reading the resetted (0/InactiveSafepointCounter) we
>  757   // re-read state after we read thread safepoint id. The JavaThread changes it
>  758   // state before resetting, the second read will either see a different thread
>  759   // state making this an unsafe state or it can see blocked again.
>  760   // When we see blocked twice with a 0 safepoint id, either:
>  761   // - It is normally blocked, e.g. on Mutex, TBIVM.
>  762   // - It was in SS:block(), looped around to SS:block() and is blocked on the WaitBarrier.
>  763   // - It was in SS:block() but now on a Mutex.
>  764   // Either case safe.
>
> I hope the above explains why loading state before and after safepoint
> id is sufficient.
>
> Passes, with flying colors, t1-5, stress test, KS 24h stress.
>
> Thanks, Robbin
>
> On 1/15/19 11:39 AM, Robbin Ehn wrote:
>> Hi all, please review.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469
>> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/
>>
>> Thanks to Dan for pre-reviewing a lot!
>>
>> Background:
>> ZGC often does very short safepoint operations.
>> For a perspective, in a specJBB2015 run, G1 can have young collection
>> stops lasting about 170 ms. While in the same setup ZGC does 0.2 ms to
>> 1.5 ms operations depending on which operation it is. The time it takes
>> to stop and start the JavaThreads is very large relative to a ZGC
>> safepoint. With an operation that just takes 0.2 ms the overhead of
>> stopping and starting JavaThreads is several times the operation.
>>
>> High-level functionality change:
>> Serializing the starting over Threads_lock takes time.
>> - Don't wait on Threads_lock, use the WaitBarrier.
>> Serializing the stopping over Safepoint_lock takes time.
>> - Let threads stop in parallel, remove Safepoint_lock.
>>
>> Details:
>> JavaThreads have 2 abstract logical states: unsafe or safe.
>> - Safe means the JavaThread will not touch Java heap or VM internal
>>   structures without doing a transition and block before doing so.
>>         - The safe states are:
>>                 - When polls armed: _thread_in_native and _thread_blocked.
>>                 - When Threads_lock is held: externally suspended flag is set.
>>         - The VM Thread has polls armed and holds the Threads_lock during
>>           a safepoint.
>> - Unsafe means that either Java heap or VM internal structures can be
>>   accessed by the JavaThread, e.g., _thread_in_Java, _thread_in_vm.
>>         - All combinations that are not safe are unsafe.
>>
>> We cannot start a safepoint until all unsafe threads have transitioned
>> to a safe state. To make them safe, we arm polls in compiled code and
>> make sure any transition to another unsafe state will be blocked.
>> JavaThreads which are unsafe with state _thread_in_Java may transition
>> to _thread_in_native without being blocked, since it just became a safe
>> thread and we can proceed. Any safe thread may try to transition at any
>> time to an unsafe state, thus coming into the safepoint blocking code at
>> any moment, e.g., after the safepoint is over, or even at the beginning
>> of next safepoint.
>>
>> The VMThread cannot tolerate false positives from the JavaThread thread
>> state because that would mean starting the safepoint without all
>> JavaThreads being safe. The two locks (Threads_lock and Safepoint_lock)
>> make sure we never observe false positives from the safepoint blocking
>> code, if we remove them, how do we handle false positives?
>>
>> By first publishing which barrier tag (safepoint counter) we will call
>> WaitBarrier.wait() with as the threads safepoint id and then change the
>> state to _thread_blocked, the VMThread can ignore JavaThreads by doing a
>> stable load of the state. A stable load of the thread state is
>> successful if the thread safepoint id is the same both before and after
>> the load of the state and safepoint id is current or
>> InactiveSafepointCounter. If the stable load fails, the thread is
>> considered safepoint unsafe. It's no longer enough that the thread has
>> state _thread_blocked; it must also have the correct safepoint id before
>> and after we read the state.
>>
>> Performance:
>> The result of faster safepoints is that the average CPU time for
>> JavaThreads between safepoints is higher, thus increasing the allocation
>> rate. The thread that stops first waits a shorter time until it gets
>> started. Even the thread that stops last also has a shorter stop since
>> we start them faster. If your application is using a concurrent GC it
>> may need re-tuning since each java worker thread has an increased CPU
>> time/allocation rate. Often this means max performance is achieved using
>> slightly less java worker threads than before. Also the increased
>> allocation rate means shorter time between GC safepoints.
>> - If you are using a non-concurrent GC, you should see improved latency
>>   and throughput.
>> - After re-tuning with a concurrent GC throughput should be equal or
>>   better but with better latency. But bear in mind this is a latency
>>   patch, not a throughput one.
>> With current code a java thread is not guaranteed to run between
>> safepoints (in theory a java thread can be starved indefinitely), since
>> the VM thread may re-grab the Threads_lock before it woke up from the
>> previous safepoint. If the GC/VM don't respect MMU (minimum mutator
>> utilization) or if your machine is very over-provisioned this can
>> happen.
>> The current scheme thus re-safepoints quickly if the java threads have
>> not started yet at the cost of latency. Since the new code uses the
>> WaitBarrier with the safepoint counter, all threads must roll forward to
>> next safepoint by getting at least some CPU time between two safepoints.
>> Meaning MMU violations are more obvious.
>>
>> Some example numbers:
>> - On a 16 strand machine synchronization and un-synchronization/starting
>>   is at least 3x faster (in non-trivial test). Synchronization ~600 ->
>>   ~100us and starting ~400->~100us.
>>   (Semaphore path is a bit slower than futex in the WaitBarrier on
>>   Linux).
>> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster
>>   synchronization time on 16 strands and ~5% score increase. In this
>>   case the GC op is 1ms, so we reduce the overhead of synchronization
>>   from 100% to 10%.
>> - specJBB2015 ParGC ~9% increase in critical-jops.
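The stable load discussed above can be sketched as follows. This is only a minimal, single-threaded model of the decision logic, assuming the corrected read order from the v02 note (thread state, then safepoint id, then thread state again). The names JavaThreadModel, try_stable_load_state and is_safepoint_safe are hypothetical stand-ins, not HotSpot's actual code, and the externally-suspended case is left out.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical model of the thread states named in the thread.
enum class ThreadState { InJava, InVm, InNative, InNativeTrans, Blocked };

// Assumed stand-in for InactiveSafepointCounter (the "0" in the matrix).
constexpr std::uint64_t kInactiveSafepointId = 0;

// Hypothetical per-thread data the VM thread would inspect.
struct JavaThreadModel {
    std::atomic<ThreadState> state{ThreadState::InJava};
    std::atomic<std::uint64_t> safepoint_id{kInactiveSafepointId};
};

// Reads state, safepoint id, state. Succeeds only if the id is either
// inactive or the current one AND both state reads agree.
bool try_stable_load_state(const JavaThreadModel& t,
                           std::uint64_t current_id,
                           ThreadState* out) {
    assert(current_id != kInactiveSafepointId);
    const ThreadState first = t.state.load(std::memory_order_acquire);
    const std::uint64_t id = t.safepoint_id.load(std::memory_order_acquire);
    if (id != kInactiveSafepointId && id != current_id) {
        return false;  // id from another safepoint: treat as unsafe
    }
    const ThreadState second = t.state.load(std::memory_order_acquire);
    if (second != first) {
        return false;  // state changed while we were looking: unsafe
    }
    *out = first;
    return true;
}

// A thread counts as safe only after a successful stable load that
// observed in-native or blocked; everything else is unsafe.
bool is_safepoint_safe(const JavaThreadModel& t, std::uint64_t current_id) {
    ThreadState s;
    if (!try_stable_load_state(t, current_id, &s)) {
        return false;
    }
    return s == ThreadState::InNative || s == ThreadState::Blocked;
}
```

The sketch fails closed: any doubt (a foreign safepoint id or a changed state) is reported as unsafe, which matches the "treat all as unsafe" rows of the matrix. It does not attempt to reproduce the memory-ordering requirements of the real implementation.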
>>
>> Thanks, Robbin

From robbin.ehn at oracle.com Wed Jan 23 11:29:00 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 23 Jan 2019 12:29:00 +0100
Subject: RFR(XL): 8203469: Faster safepoints
In-Reply-To:
References: <5d6d97d6-fbda-28f4-b625-a9ae351af5ba@oracle.com>
Message-ID: <6b69eba6-0293-c689-7a6a-c51bd8e4999a@oracle.com>

Hi David,

On 2019-01-23 10:42, David Holmes wrote:
> Hi Robbin,
>
> Thanks for all the work on this! This is looking really good.

Thanks!

>
> I have one concern and that is the fact the WaitBarrier only uses an int as a
> tag, but the _safepoint_counter is a uint64_t. Seems to me that once the
> _safepoint_counter rolls over to need 33-bits then casting it to an int for the
> tag is going to give the dis-allowed zero value. ??

I had the same concern, but since a safepoint only happens during odd
counters, we only arm the WaitBarrier with odd numbers
(even == no safepoint, odd == active safepoint). An odd counter truncated to
an int keeps its low bit set, so the tag is never zero.
There is an assert at ~L520 checking that the safepoint counter was odd during
the safepoint.
Roll-over manually tested.

>
> Specific comments below.
>
> Oh and can you add all the high-level description in your initial RFR email to
> the bug report please. Thanks.

Fixed, updated with the correct text for the stable load.

>
> On 23/01/2019 1:39 am, Robbin Ehn wrote:
>> Hi all, here is v01 and v02.
>>
>> v01 contains update after comments from list:
>> http://cr.openjdk.java.net/~rehn/8203469/v01/
>> http://cr.openjdk.java.net/~rehn/8203469/v01/inc/
>>
>> v02 contains a bug fix, explained below:
>> http://cr.openjdk.java.net/~rehn/8203469/v02/
>> http://cr.openjdk.java.net/~rehn/8203469/v02/inc/
>
> Minor comments:
>
> src/hotspot/share/runtime/safepoint.cpp
>
> check_thread_safepoint_state needs a better name - what is it checking the state
> for?

Changed to thread_not_running.

>
> ---
>
>  368   Thread* myThread = Thread::current();
>  369   assert(myThread->is_VM_thread(), "Only VM thread may execute a safepoint");
>
>  558   DEBUG_ONLY(Thread* myThread = Thread::current();)
>  559   assert(myThread->is_VM_thread(), "Only VM thread can execute a safepoint");
>
> You don't need the myThread local variables.

Removed

Sending out a v04 soon.

Thanks, Robbin

>
>
> That's it! :)  (Thanks to Dan for tackling the updates to the commentary ;-) ).
>
>
> Thanks,
> David
> -----
>
>
>> Patricio had some good questions about try_stable_load_state.
>> In previous internal versions I have done the stable load by loading thread
>> state before and after safepoint id. For some reason I changed during a
>> refactoring to the reverse, which is incorrect. Consider the following:
>>
>> JavaThread: state / safepoint id / poll |VMThread: global state / safepoint counter / WaitBarrier
>> ########################################|################################
>> _thread_in_native       / 0 / disarmed  | _not_synchronized / 0 / disarmed
>>                                         | _not_synchronized / 0 / armed(1)
>>                                         | _not_synchronized / 1 / armed(1)
>>                                         | _synchronizing    / 1 / armed(1)
>> _thread_in_native       / 0 / armed     |
>>                                         | [...]
>>                                         | [...] id:_thread_in_native>
>>                                         | [...]
>>                                         | _synchronized     / 1 / armed(1)
>> [...]                                   |
>> _thread_in_native_trans / 0 / armed     |
>> [...]                                   |
>> [...]                                   |
>>                                         | _not_synchronized / 1 / armed(1)
>>                                         | _not_synchronized / 2 / armed(1)
>> _thread_in_native_trans / 0 / disarmed  |
>>                                         | _not_synchronized / 2 / disarmed
>> Next safepoint starts:
>>                                         | _not_synchronized / 2 / armed(3)
>>                                         | _not_synchronized / 3 / armed(3)
>>                                         | _synchronizing    / 3 / armed(3)
>> _thread_in_native_trans / 0 / armed     |
>>                                         | [...]
>> [...]                                   |
>> [...]                                   |
>> _thread_in_native_trans / 1 / armed     |
>> _thread_blocked         / 1 / armed     |
>> [...]                                   |
>>                                         | [...] id:_thread_blocked>
>> _thread_in_native_trans / 1 / armed     |
>> _thread_in_native_trans / 0 / armed     |
>>                                         | [...]
>>
>> A false positive is read.
>>
>> When done in the correct order, the safe matrix looks like:
>> State load 1      | Safepoint id | State load 2     | Result
>> ##################|##############|##################|#######
>> any               | !0/current   | any              | treat all as unsafe
>> any               | any          | !state1          | treat all as unsafe
>> any               | 0/current    | state1           | suspend flag is safe
>> thread_in_native  | 0/current    | thread_in_native | safe
>> thread_blocked    | 0/current    | thread_blocked   | safe
>> !thread_blocked
>> &&
>> !thread_in_native | 0/current    | state1           | unsafe
>>
>> The case with blocked/0/blocked I added this comment for:
>>
>>   755   // To handle the thread_blocked state on the backedge of the WaitBarrier from
>>   756   // previous safepoint and reading the resetted (0/InactiveSafepointCounter) we
>>   757   // re-read state after we read thread safepoint id. The JavaThread changes it
>>   758   // state before resetting, the second read will either see a different thread
>>   759   // state making this an unsafe state or it can see blocked again.
>>   760   // When we see blocked twice with a 0 safepoint id, either:
>>   761   // - It is normally blocked, e.g. on Mutex, TBIVM.
>>   762   // - It was in SS:block(), looped around to SS:block() and is blocked on the WaitBarrier.
>>   763
// - It was in SS:block() but now on a Mutex. >> ??764?? // Either case safe. >> >> I hope above explains why loading state before and after safepoint id is >> sufficient. >> >> Passes, with flying colors, t1-5, stress test, KS 24h stress. >> >> Thanks, Robbin >> >> On 1/15/19 11:39 AM, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469 >>> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/ >>> >>> Thanks to Dan for pre-reviewing a lot! >>> >>> Background: >>> ZGC often does very short safepoint operations. For a perspective, in a >>> specJBB2015 run, G1 can have young collection stops lasting about 170 ms. While >>> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which >>> operation it is. The time it takes to stop and start the JavaThreads is relative >>> very large to a ZGC safepoint. With an operation that just takes 0.2ms the >>> overhead of stopping and starting JavaThreads is several times the operation. >>> >>> High-level functionality change: >>> Serializing the starting over Threads_lock takes time. >>> - Don't wait on Threads_lock use the WaitBarrier. >>> Serializing the stopping over Safepoint_lock takes time. >>> - Let threads stop in parallel, remove Safepoint_lock. >>> >>> Details: >>> JavaThreads have 2 abstract logical states: unsafe or safe. >>> - Safe means the JavaThread will not touch Java heap or VM internal structures >>> ?? without doing a transition and block before doing so. >>> ???????? - The safe states are: >>> ???????????????? - When polls armed: _thread_in_native and _thread_blocked. >>> ???????????????? - When Threads_lock is held: externally suspended flag is set. >>> ???????? - VM Thread have polls armed and holds the Threads_lock during a >>> ?????????? safepoint. >>> - Unsafe means that either Java heap or VM internal structures can be accessed >>> ?? by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. >>> ???????? 
- All combination that are not safe are unsafe. >>> >>> We cannot start a safepoint until all unsafe threads have transitioned to a safe >>> state. To make them safe, we arm polls in compiled code and make sure any >>> transition to another unsafe state will be blocked. JavaThreads which are unsafe >>> with state _thread_in_Java may transition to _thread_in_native without being >>> blocked, since it just became a safe thread and we can proceed. Any safe thread >>> may try to transition at any time to an unsafe state, thus coming into the >>> safepoint blocking code at any moment, e.g., after the safepoint is over, or >>> even at the beginning of next safepoint. >>> >>> The VMThread cannot tolerate false positives from the JavaThread thread state >>> because that would mean starting the safepoint without all JavaThreads being >>> safe. The two locks (Threads_lock and Safepoint_lock) make sure we never observe >>> false positives from the safepoint blocking code, if we remove them, how do we >>> handle false positives? >>> >>> By first publishing which barrier tag (safepoint counter) we will call >>> WaitBarrier.wait() with as the threads safepoint id and then change the state to >>> _thread_blocked, the VMThread can ignore JavaThreads by doing a stable load of >>> the state. A stable load of the thread state is successful if the thread >>> safepoint id is the same both before and after the load of the state and >>> safepoint id is current or InactiveSafepointCounter. If the stable load fails, >>> the thread is considered safepoint unsafe. It's no longer enough that thread is >>> have state _thread_blocked it must also have correct safepoint id before and >>> after we read the state. >>> >>> Performance: >>> The result of faster safepoints is that the average CPU time for JavaThreads >>> between safepoints is higher, thus increasing the allocation rate. The thread >>> that stops first waits shorter time until it gets started. 
Even the thread that >>> stops last also have shorter stop since we start them faster. If your >>> application is using a concurrent GC it may need re-tunning since each java >>> worker thread have an increased CPU time/allocation rate. Often this means max >>> performance is achieved using slightly less java worker threads than before. >>> Also the increase allocation rate means shorter time between GC safepoints. >>> - If you are using a non-concurrent GC, you should see improved latency and >>> ?? throughput. >>> - After re-tunning with a concurrent GC throughput should be equal or better but >>> ?? with better latency. But bear in mind this is a latency patch, not a >>> ?? throughput one. >>> With current code a java thread is not to guarantee to run between safepoint (in >>> theory a java thread can be starved indefinitely), since the VM thread may >>> re-grab the Threads_locks before it woke up from previous safepoint. If the >>> GC/VM don't respect MMU (minimum mutator utilization) or if your machine is very >>> over-provisioned this can happen. >>> The current schema thus re-safepoint quickly if the java threads have not >>> started yet at the cost of latency. Since the new code uses the WaitBarrier with >>> the safepoint counter, all threads must roll forward to next safepoint by >>> getting at least some CPU time between two safepoints. Meaning MMU violations >>> are more obvious. >>> >>> Some examples on numbers: >>> - On a 16 strand machine synchronization and un-synchronization/starting is at >>> ?? least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and >>> ?? starting ~400->~100us. >>> ?? (Semaphore path is a bit slower than futex in the WaitBarrier on Linux). >>> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster >>> ?? synchronization time on 16 strands and ~5% score increase. In this case >>> the GC >>> ?? op is 1ms, so we reduce the overhead of synchronization from 100% to 10%. 
>>> - specJBB2015 ParGC ~9% increase in critical-jops.
>>>
>>> Thanks, Robbin

From david.holmes at oracle.com Wed Jan 23 11:43:42 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 23 Jan 2019 21:43:42 +1000
Subject: RFR(XL): 8203469: Faster safepoints
In-Reply-To: <6b69eba6-0293-c689-7a6a-c51bd8e4999a@oracle.com>
References: <5d6d97d6-fbda-28f4-b625-a9ae351af5ba@oracle.com> <6b69eba6-0293-c689-7a6a-c51bd8e4999a@oracle.com>
Message-ID: <472a5740-0e0e-d380-860c-597f288f6b12@oracle.com>

On 23/01/2019 9:29 pm, Robbin Ehn wrote:
> Hi David,
>
> On 2019-01-23 10:42, David Holmes wrote:
>> Hi Robbin,
>>
>> Thanks for all the work on this! This is looking really good.
>
> Thanks!
>
>>
>> I have one concern and that is the fact the WaitBarrier only uses an
>> int as a tag, but the _safepoint_counter is a uint64_t. Seems to me
>> that once the _safepoint_counter rolls over to need 33-bits then
>> casting it to an int for the tag is going to give the dis-allowed zero
>> value. ??
>
> I had the same concern, but since a safepoint only happens during odd
> counters, we only arm the WaitBarrier with odd numbers
> (even == no safepoint, odd == active safepoint). An odd counter
> truncated to an int keeps its low bit set, so the tag is never zero.
> There is an assert at ~L520 checking that the safepoint counter was odd
> during the safepoint.
> Roll-over manually tested.

Ah I see. Thanks for clarifying.

David
-----

>>
>> Specific comments below.
>>
>> Oh and can you add all the high-level description in your initial RFR
>> email to the bug report please. Thanks.
>
> Fixed, updated with the correct text for the stable load.
>
>>
>> On 23/01/2019 1:39 am, Robbin Ehn wrote:
>>> Hi all, here is v01 and v02.
>>> >>> v01 contains update after comments from list: >>> http://cr.openjdk.java.net/~rehn/8203469/v01/ >>> http://cr.openjdk.java.net/~rehn/8203469/v01/inc/ >>> >>> v02 contains a bug fix, explained below: >>> http://cr.openjdk.java.net/~rehn/8203469/v02/ >>> http://cr.openjdk.java.net/~rehn/8203469/v02/inc/ >> >> Minor comments: >> >> src/hotspot/share/runtime/safepoint.cpp >> >> check_thread_safepoint_state needs a better name - what is it checking >> the state for? > > Changed to thread_not_running. > >> >> --- >> >> ??368?? Thread* myThread = Thread::current(); >> ??369?? assert(myThread->is_VM_thread(), "Only VM thread may execute a >> safepoint"); >> >> ??558?? DEBUG_ONLY(Thread* myThread = Thread::current();) >> ??559?? assert(myThread->is_VM_thread(), "Only VM thread can execute a >> safepoint"); >> >> You don't need the myThread local variables. > > Removed > > Sending out a v04 soon. > > Thanks, Robbin > >> >> >> That's it! :)? (Thanks to Dan for tackling the updates to the >> commentary ;-) ). >> >> >> Thanks, >> David >> ----- >> >> >>> Patricio had some good questions about try_stable_load_state. >>> In previous internal versions I have done the stable load by loading >>> thread state before and after safepoint id. For some reason I changed >>> during a >>> refactoring to the reverse, which is incorrect. Consider the following: >>> >>> JavaThread: state / safepoint id / poll |VMThread: global state / >>> safepoint counter / WaitBarrier >>> ########################################|################################ >>> >>> _thread_in_native?????? / 0 / disarmed? | _not_synchronized / 0 / >>> disarmed >>> ???????????????????????????????????????? | _not_synchronized / 0 / >>> armed(1) >>> ???????????????????????????????????????? | _not_synchronized / 1 / >>> armed(1) >>> ???????????????????????????????????????? | _synchronizing??? / 1 / >>> armed(1) >>> _thread_in_native?????? / 0 / armed???? | >>> ???????????????????????????????????????? 
| >> id:0> >>> ???????????????????????????????????????? | >> state id:_thread_in_native> >>> ???????????????????????????????????????? | >> id:0> >>> ???????????????????????????????????????? | _synchonized????? / 1 / >>> armed(1) >>> ????????? | >>> _thread_in_native_trans / 0 / armed???? | >>> ???????????? | >>> ????????????? | >>> ???????????????????????????????????????? | _not_synchonized? / 1 / >>> armed(1) >>> ???????????????????????????????????????? | _not_synchonized? / 2 / >>> armed(1) >>> _thread_in_native_trans / 0 / disarmed? | >>> ???????????????????????????????????????? | _not_synchonized? / 2 / >>> disarmed >>> Next safepoint starts: >>> ???????????????????????????????????????? | _not_synchronized / 2 / >>> armed(3) >>> ???????????????????????????????????????? | _not_synchronized / 3 / >>> armed(3) >>> ???????????????????????????????????????? | _synchronizing??? / 3 / >>> armed(3) >>> _thread_in_native_trans / 0 / armed???? | >>> ???????????????????????????????????????? | >> id:0> >>> ?????????????? | >>> ???? | >>> _thread_in_native_trans / 1 / armed???? | >>> _thread_blocked???????? / 1 / armed???? | >>> ?????????? | >>> ???????????????????????????????????????? | >> state id:_thread_blocked> >>> _thread_in_native_trans / 1 / armed???? | >>> _thread_in_native_trans / 0 / armed???? | >>> ???????????????????????????????????????? | >> id:0> >>> >>> A false positive is read. >>> >>> When do it the correct the safe matrix looks like: >>> State load 1????? | Safepoint id | State load 2???? | Result >>> ##################|##############|##################|####### >>> any?????????????? | !0/current?? | any????????????? | treat all as >>> unsafe >>> any?????????????? | any????????? | !state1????????? | treat all as >>> unsafe >>> any?????????????? | 0/current??? | state1?????????? | suspend flag is >>> safe >>> thread_in_native? | 0/current??? | thread_in_native | safe >>> thread_in_blocked | 0/current??? 
| thread_in_blocked| safe >>> !thread_in_blocked >>> && >>> !thread_in_native | 0/current??? | state1?????????? | unsafe >>> >>> The case with blocked/0/blocked I added this comment for: >>> >>> ??755?? // To handle the thread_blocked state on the backedge of the >>> WaitBarrier from >>> ??756?? // previous safepoint and reading the resetted >>> (0/InactiveSafepointCounter) we >>> ??757?? // re-read state after we read thread safepoint id. The >>> JavaThread changes it >>> ??758?? // state before resetting, the second read will either see a >>> different thread >>> ??759?? // state making this an unsafe state or it can see blocked >>> again. >>> ??760?? // When we see blocked twice with a 0 safepoint id, either: >>> ??761?? // - It is normally blocked, e.g. on Mutex, TBIVM. >>> ??762?? // - It was in SS:block(), looped around to SS:block() and is >>> blocked on the WaitBarrier. >>> ??763?? // - It was in SS:block() but now on a Mutex. >>> ??764?? // Either case safe. >>> >>> I hope above explains why loading state before and after safepoint id is >>> sufficient. >>> >>> Passes, with flying colors, t1-5, stress test, KS 24h stress. >>> >>> Thanks, Robbin >>> >>> On 1/15/19 11:39 AM, Robbin Ehn wrote: >>>> Hi all, please review. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469 >>>> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/ >>>> >>>> Thanks to Dan for pre-reviewing a lot! >>>> >>>> Background: >>>> ZGC often does very short safepoint operations. For a perspective, in a >>>> specJBB2015 run, G1 can have young collection stops lasting about >>>> 170 ms. While >>>> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on >>>> which >>>> operation it is. The time it takes to stop and start the JavaThreads >>>> is relative >>>> very large to a ZGC safepoint. With an operation that just takes >>>> 0.2ms the >>>> overhead of stopping and starting JavaThreads is several times the >>>> operation. 
>>>> >>>> High-level functionality change: >>>> Serializing the starting over Threads_lock takes time. >>>> - Don't wait on Threads_lock use the WaitBarrier. >>>> Serializing the stopping over Safepoint_lock takes time. >>>> - Let threads stop in parallel, remove Safepoint_lock. >>>> >>>> Details: >>>> JavaThreads have 2 abstract logical states: unsafe or safe. >>>> - Safe means the JavaThread will not touch Java heap or VM internal >>>> structures >>>> ?? without doing a transition and block before doing so. >>>> ???????? - The safe states are: >>>> ???????????????? - When polls armed: _thread_in_native and >>>> _thread_blocked. >>>> ???????????????? - When Threads_lock is held: externally suspended >>>> flag is set. >>>> ???????? - VM Thread have polls armed and holds the Threads_lock >>>> during a >>>> ?????????? safepoint. >>>> - Unsafe means that either Java heap or VM internal structures can >>>> be accessed >>>> ?? by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. >>>> ???????? - All combination that are not safe are unsafe. >>>> >>>> We cannot start a safepoint until all unsafe threads have >>>> transitioned to a safe >>>> state. To make them safe, we arm polls in compiled code and make >>>> sure any >>>> transition to another unsafe state will be blocked. JavaThreads >>>> which are unsafe >>>> with state _thread_in_Java may transition to _thread_in_native >>>> without being >>>> blocked, since it just became a safe thread and we can proceed. Any >>>> safe thread >>>> may try to transition at any time to an unsafe state, thus coming >>>> into the >>>> safepoint blocking code at any moment, e.g., after the safepoint is >>>> over, or >>>> even at the beginning of next safepoint. >>>> >>>> The VMThread cannot tolerate false positives from the JavaThread >>>> thread state >>>> because that would mean starting the safepoint without all >>>> JavaThreads being >>>> safe. 
The two locks (Threads_lock and Safepoint_lock) make sure we >>>> never observe >>>> false positives from the safepoint blocking code, if we remove them, >>>> how do we >>>> handle false positives? >>>> >>>> By first publishing which barrier tag (safepoint counter) we will call >>>> WaitBarrier.wait() with as the threads safepoint id and then change >>>> the state to >>>> _thread_blocked, the VMThread can ignore JavaThreads by doing a >>>> stable load of >>>> the state. A stable load of the thread state is successful if the >>>> thread >>>> safepoint id is the same both before and after the load of the state >>>> and >>>> safepoint id is current or InactiveSafepointCounter. If the stable >>>> load fails, >>>> the thread is considered safepoint unsafe. It's no longer enough >>>> that thread is >>>> have state _thread_blocked it must also have correct safepoint id >>>> before and >>>> after we read the state. >>>> >>>> Performance: >>>> The result of faster safepoints is that the average CPU time for >>>> JavaThreads >>>> between safepoints is higher, thus increasing the allocation rate. >>>> The thread >>>> that stops first waits shorter time until it gets started. Even the >>>> thread that >>>> stops last also have shorter stop since we start them faster. If your >>>> application is using a concurrent GC it may need re-tunning since >>>> each java >>>> worker thread have an increased CPU time/allocation rate. Often this >>>> means max >>>> performance is achieved using slightly less java worker threads than >>>> before. >>>> Also the increase allocation rate means shorter time between GC >>>> safepoints. >>>> - If you are using a non-concurrent GC, you should see improved >>>> latency and >>>> ?? throughput. >>>> - After re-tunning with a concurrent GC throughput should be equal >>>> or better but >>>> ?? with better latency. But bear in mind this is a latency patch, not a >>>> ?? throughput one. 
>>>> With current code a java thread is not to guarantee to run between >>>> safepoint (in >>>> theory a java thread can be starved indefinitely), since the VM >>>> thread may >>>> re-grab the Threads_locks before it woke up from previous safepoint. >>>> If the >>>> GC/VM don't respect MMU (minimum mutator utilization) or if your >>>> machine is very >>>> over-provisioned this can happen. >>>> The current schema thus re-safepoint quickly if the java threads >>>> have not >>>> started yet at the cost of latency. Since the new code uses the >>>> WaitBarrier with >>>> the safepoint counter, all threads must roll forward to next >>>> safepoint by >>>> getting at least some CPU time between two safepoints. Meaning MMU >>>> violations >>>> are more obvious. >>>> >>>> Some examples on numbers: >>>> - On a 16 strand machine synchronization and >>>> un-synchronization/starting is at >>>> ?? least 3x faster (in non-trivial test). Synchronization ~600 -> >>>> ~100us and >>>> ?? starting ~400->~100us. >>>> ?? (Semaphore path is a bit slower than futex in the WaitBarrier on >>>> Linux). >>>> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster >>>> ?? synchronization time on 16 strands and ~5% score increase. In >>>> this case the GC >>>> ?? op is 1ms, so we reduce the overhead of synchronization from 100% >>>> to 10%. >>>> - specJBB2015 ParGC ~9% increase in critical-jops. >>>> >>>> Thanks, Robbin From robbin.ehn at oracle.com Wed Jan 23 11:53:28 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 23 Jan 2019 12:53:28 +0100 Subject: RFR(XL): 8203469: Faster safepoints In-Reply-To: References: <5d6d97d6-fbda-28f4-b625-a9ae351af5ba@oracle.com> Message-ID: <7d02cedc-c3c7-fa9d-e7d1-7d5cfb8ab933@oracle.com> Hi Dan, > > ??? L755: ? // To handle the thread_blocked state on the backedge of the > WaitBarrier from > ??? L756: ? // previous safepoint and reading the resetted > (0/InactiveSafepointCounter) we > ??? L757: ? 
// re-read state after we read thread safepoint id. The JavaThread changes it
>     L758:   // state before resetting, the second read will either see a different thread
>     L759:   // state making this an unsafe state or it can see blocked again.
>     L760:   // When we see blocked twice with a 0 safepoint id, either:
>     L761:   // - It is normally blocked, e.g. on Mutex, TBIVM.
>     L762:   // - It was in SS:block(), looped around to SS:block() and is blocked on the WaitBarrier.
>     L763:   // - It was in SS:block() but now on a Mutex.
>     L764:   // Either case safe.
>         Please consider these minor tweaks:
>
>             // To handle the thread_blocked state on the backedge of the WaitBarrier from a
>             // previous safepoint and reading the possibly reset (0/InactiveSafepointCounter)
>             // id, re-read state after we read thread safepoint id. If the JavaThread changes
>             // its state before resetting, the second read will either see a different thread
>             // state making this an unsafe state or it can see blocked again.
>             // When we see blocked twice with a 0 safepoint id, this means:
>             // - It is normally blocked, e.g., on Mutex, TBIVM.
>             // - It was in SS:block(), looped around to SS:block() and is blocked on the WaitBarrier.
>             // - It was in SS:block() but now on a Mutex.
>             // All of these cases are safe.
>

I did change the text a bit here, hope you like it.
All other fixed.

>
> Thanks for persisting with this work. Thumbs up! All of my comments
> in this round are editorial. I don't need to see another webrev if
> you choose to make the above changes.

Thanks!

v03 to RFR mail.

/Robbin

>
> Dan
>
>
>> http://cr.openjdk.java.net/~rehn/8203469/v02/inc/
>>
>> Patricio had some good questions about try_stable_load_state.
>> In previous internal versions I have done the stable load by loading thread
>> state before and after safepoint id.
For some reason I changed during a >> refactoring to the reverse, which is incorrect. Consider the following: >> >> JavaThread: state / safepoint id / poll |VMThread: global state / safepoint >> counter / WaitBarrier >> ########################################|################################ >> _thread_in_native?????? / 0 / disarmed? | _not_synchronized / 0 / disarmed >> ??????????????????????????????????????? | _not_synchronized / 0 / armed(1) >> ??????????????????????????????????????? | _not_synchronized / 1 / armed(1) >> ??????????????????????????????????????? | _synchronizing??? / 1 / armed(1) >> _thread_in_native?????? / 0 / armed???? | >> ??????????????????????????????????????? | >> ??????????????????????????????????????? | > id:_thread_in_native> >> ??????????????????????????????????????? | >> ??????????????????????????????????????? | _synchonized????? / 1 / armed(1) >> ????????? | >> _thread_in_native_trans / 0 / armed???? | >> ???????????? | >> ????????????? | >> ??????????????????????????????????????? | _not_synchonized? / 1 / armed(1) >> ??????????????????????????????????????? | _not_synchonized? / 2 / armed(1) >> _thread_in_native_trans / 0 / disarmed? | >> ??????????????????????????????????????? | _not_synchonized? / 2 / disarmed >> Next safepoint starts: >> ??????????????????????????????????????? | _not_synchronized / 2 / armed(3) >> ??????????????????????????????????????? | _not_synchronized / 3 / armed(3) >> ??????????????????????????????????????? | _synchronizing??? / 3 / armed(3) >> _thread_in_native_trans / 0 / armed???? | >> ??????????????????????????????????????? | >> ?????????????? | >> ???? | >> _thread_in_native_trans / 1 / armed???? | >> _thread_blocked???????? / 1 / armed???? | >> ?????????? | >> ??????????????????????????????????????? | > id:_thread_blocked> >> _thread_in_native_trans / 1 / armed???? | >> _thread_in_native_trans / 0 / armed???? | >> ??????????????????????????????????????? | >> >> A false positive is read. 
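
To make the ordering issue in the trace above concrete, here is a minimal, self-contained C++ sketch of a stable load done in the corrected order: thread state first, then safepoint id, then thread state again. It is not the HotSpot code. MockJavaThread, ThreadState, kInactiveSafepointId and is_safepoint_safe are hypothetical names, std::atomic with its default ordering is only a stand-in for whatever the actual patch uses, and the externally-suspended flag is left out.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the thread states discussed in this thread.
enum class ThreadState { in_java, in_vm, in_native, in_native_trans, blocked };

// Plays the role of InactiveSafepointCounter (assumed here to be 0).
constexpr uint64_t kInactiveSafepointId = 0;

// Hypothetical, much reduced model of a JavaThread.
struct MockJavaThread {
  std::atomic<ThreadState> state{ThreadState::in_native};
  std::atomic<uint64_t> safepoint_id{kInactiveSafepointId};
};

// Stable load in the corrected order: state, safepoint id, state.
// Returns false (treat the thread as unsafe) unless the id is inactive or
// current AND the state did not change across the id load.
bool try_stable_load_state(const MockJavaThread& t, uint64_t current_id,
                           ThreadState* out) {
  ThreadState first = t.state.load();
  uint64_t id = t.safepoint_id.load();
  if (id != kInactiveSafepointId && id != current_id) {
    return false;  // id belongs to some other safepoint
  }
  ThreadState second = t.state.load();
  if (first != second) {
    return false;  // state changed while we were looking
  }
  *out = first;
  return true;
}

// A thread only counts as safe if the stable load succeeded and the state
// is one of the safe ones (in_native or blocked).
bool is_safepoint_safe(const MockJavaThread& t, uint64_t current_id) {
  ThreadState s;
  if (!try_stable_load_state(t, current_id, &s)) {
    return false;
  }
  return s == ThreadState::in_native || s == ThreadState::blocked;
}
```

With current_id 3, a thread that is blocked with a stale id of 1 is rejected, while blocked with id 3 or id 0 is accepted. A single-threaded check like this cannot reproduce the race itself; it only shows which combinations the check accepts.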
>>
>> When done the correct way, the safe matrix looks like:
>> State load 1      | Safepoint id | State load 2     | Result
>> ##################|##############|##################|#######
>> any               | !0/current   | any              | treat all as unsafe
>> any               | any          | !state1          | treat all as unsafe
>> any               | 0/current    | state1           | suspend flag is safe
>> thread_in_native  | 0/current    | thread_in_native | safe
>> thread_in_blocked | 0/current    | thread_in_blocked| safe
>> !thread_in_blocked
>> &&
>> !thread_in_native | 0/current    | state1           | unsafe
>>
>> The case with blocked/0/blocked I added this comment for:
>>
>>   // To handle the thread_blocked state on the backedge of the WaitBarrier from
>>   // previous safepoint and reading the resetted (0/InactiveSafepointCounter) we
>>   // re-read state after we read thread safepoint id. The JavaThread changes it
>>   // state before resetting, the second read will either see a different thread
>>   // state making this an unsafe state or it can see blocked again.
>>   // When we see blocked twice with a 0 safepoint id, either:
>>   // - It is normally blocked, e.g. on Mutex, TBIVM.
>>   // - It was in SS:block(), looped around to SS:block() and is blocked on the WaitBarrier.
>>   // - It was in SS:block() but now on a Mutex.
>>   // Either case safe.
>>
>> I hope the above explains why loading state before and after safepoint id is
>> sufficient.
>>
>> Passes, with flying colors, t1-5, stress test, KS 24h stress.
>>
>> Thanks, Robbin
>>
>> On 1/15/19 11:39 AM, Robbin Ehn wrote:
>>> Hi all, please review.
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469
>>> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/
>>>
>>> Thanks to Dan for pre-reviewing a lot!
>>>
>>> Background:
>>> ZGC often does very short safepoint operations. For a perspective, in a
>>> specJBB2015 run, G1 can have young collection stops lasting about 170 ms. While
>>> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which
>>> operation it is. The time it takes to stop and start the JavaThreads is relative
>>> very large to a ZGC safepoint. With an operation that just takes 0.2ms the
>>> overhead of stopping and starting JavaThreads is several times the operation.
>>>
>>> High-level functionality change:
>>> Serializing the starting over Threads_lock takes time.
>>> - Don't wait on Threads_lock use the WaitBarrier.
>>> Serializing the stopping over Safepoint_lock takes time.
>>> - Let threads stop in parallel, remove Safepoint_lock.
>>>
>>> Details:
>>> JavaThreads have 2 abstract logical states: unsafe or safe.
>>> - Safe means the JavaThread will not touch Java heap or VM internal structures
>>>    without doing a transition and block before doing so.
>>>          - The safe states are:
>>>                  - When polls armed: _thread_in_native and _thread_blocked.
>>>                  - When Threads_lock is held: externally suspended flag is set.
>>>          - VM Thread have polls armed and holds the Threads_lock during a
>>>            safepoint.
>>> - Unsafe means that either Java heap or VM internal structures can be accessed
>>>    by the JavaThread, e.g., _thread_in_Java, _thread_in_vm.
>>>          - All combination that are not safe are unsafe.
>>>
>>> We cannot start a safepoint until all unsafe threads have transitioned to a safe
>>> state. To make them safe, we arm polls in compiled code and make sure any
>>> transition to another unsafe state will be blocked. JavaThreads which are unsafe
>>> with state _thread_in_Java may transition to _thread_in_native without being
>>> blocked, since it just became a safe thread and we can proceed. Any safe thread
>>> may try to transition at any time to an unsafe state, thus coming into the
>>> safepoint blocking code at any moment, e.g., after the safepoint is over, or
>>> even at the beginning of next safepoint.
>>>
>>> The VMThread cannot tolerate false positives from the JavaThread thread state
>>> because that would mean starting the safepoint without all JavaThreads being
>>> safe. The two locks (Threads_lock and Safepoint_lock) make sure we never observe
>>> false positives from the safepoint blocking code, if we remove them, how do we
>>> handle false positives?
>>>
>>> By first publishing which barrier tag (safepoint counter) we will call
>>> WaitBarrier.wait() with as the threads safepoint id and then change the state to
>>> _thread_blocked, the VMThread can ignore JavaThreads by doing a stable load of
>>> the state. A stable load of the thread state is successful if the thread
>>> safepoint id is the same both before and after the load of the state and
>>> safepoint id is current or InactiveSafepointCounter. If the stable load fails,
>>> the thread is considered safepoint unsafe. It's no longer enough that thread is
>>> have state _thread_blocked it must also have correct safepoint id before and
>>> after we read the state.
>>>
>>> Performance:
>>> The result of faster safepoints is that the average CPU time for JavaThreads
>>> between safepoints is higher, thus increasing the allocation rate. The thread
>>> that stops first waits shorter time until it gets started. Even the thread that
>>> stops last also have shorter stop since we start them faster. If your
>>> application is using a concurrent GC it may need re-tunning since each java
>>> worker thread have an increased CPU time/allocation rate. Often this means max
>>> performance is achieved using slightly less java worker threads than before.
>>> Also the increase allocation rate means shorter time between GC safepoints.
>>> - If you are using a non-concurrent GC, you should see improved latency and
>>>    throughput.
>>> - After re-tunning with a concurrent GC throughput should be equal or better but
>>>    with better latency. But bear in mind this is a latency patch, not a
>>>    throughput one.
>>> With current code a java thread is not to guarantee to run between safepoint (in
>>> theory a java thread can be starved indefinitely), since the VM thread may
>>> re-grab the Threads_locks before it woke up from previous safepoint. If the
>>> GC/VM don't respect MMU (minimum mutator utilization) or if your machine is very
>>> over-provisioned this can happen.
>>> The current schema thus re-safepoint quickly if the java threads have not
>>> started yet at the cost of latency. Since the new code uses the WaitBarrier with
>>> the safepoint counter, all threads must roll forward to next safepoint by
>>> getting at least some CPU time between two safepoints. Meaning MMU violations
>>> are more obvious.
>>>
>>> Some examples on numbers:
>>> - On a 16 strand machine synchronization and un-synchronization/starting is at
>>>    least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and
>>>    starting ~400->~100us.
>>>    (Semaphore path is a bit slower than futex in the WaitBarrier on Linux).
>>> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster
>>>    synchronization time on 16 strands and ~5% score increase. In this case the GC
>>>    op is 1ms, so we reduce the overhead of synchronization from 100% to 10%.
>>> - specJBB2015 ParGC ~9% increase in critical-jops.
>>>
>>> Thanks, Robbin
>

From robbin.ehn at oracle.com Wed Jan 23 13:33:37 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 23 Jan 2019 14:33:37 +0100
Subject: RFR(XL): 8203469: Faster safepoints
In-Reply-To:
References:
Message-ID: <8838a5c8-f427-35f0-7af6-0cf21ee6827e@oracle.com>

Hi all, here is v03.

It contains the updates from the comments, and:
I noticed safepoint.hpp contained wrong/unneeded inline keywords for methods.
Those methods are either inline by default because they are defined in the
declaration (header), or they are defined in the same cpp unit as their callers
and thus can be inlined anyway.

http://cr.openjdk.java.net/~rehn/8203469/v03/inc/
http://cr.openjdk.java.net/~rehn/8203469/v03/

Passes t1.

Thanks, Robbin

On 2019-01-15 11:39, Robbin Ehn wrote:
> Hi all, please review.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469
> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/
>
> Thanks to Dan for pre-reviewing a lot!
>
> Background:
> ZGC often does very short safepoint operations. For a perspective, in a
> specJBB2015 run, G1 can have young collection stops lasting about 170 ms. While
> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which
> operation it is. The time it takes to stop and start the JavaThreads is relative
> very large to a ZGC safepoint. With an operation that just takes 0.2ms the
> overhead of stopping and starting JavaThreads is several times the operation.
>
> High-level functionality change:
> Serializing the starting over Threads_lock takes time.
> - Don't wait on Threads_lock use the WaitBarrier.
> Serializing the stopping over Safepoint_lock takes time.
> - Let threads stop in parallel, remove Safepoint_lock.
>
> Details:
> JavaThreads have 2 abstract logical states: unsafe or safe.
> - Safe means the JavaThread will not touch Java heap or VM internal structures
>   without doing a transition and block before doing so.
>         - The safe states are:
>                 - When polls armed: _thread_in_native and _thread_blocked.
>                 - When Threads_lock is held: externally suspended flag is set.
>         - VM Thread have polls armed and holds the Threads_lock during a
>           safepoint.
> - Unsafe means that either Java heap or VM internal structures can be accessed
>
by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. > ??????? - All combination that are not safe are unsafe. > > We cannot start a safepoint until all unsafe threads have transitioned to a safe > state. To make them safe, we arm polls in compiled code and make sure any > transition to another unsafe state will be blocked. JavaThreads which are unsafe > with state _thread_in_Java may transition to _thread_in_native without being > blocked, since it just became a safe thread and we can proceed. Any safe thread > may try to transition at any time to an unsafe state, thus coming into the > safepoint blocking code at any moment, e.g., after the safepoint is over, or > even at the beginning of next safepoint. > > The VMThread cannot tolerate false positives from the JavaThread thread state > because that would mean starting the safepoint without all JavaThreads being > safe. The two locks (Threads_lock and Safepoint_lock) make sure we never observe > false positives from the safepoint blocking code, if we remove them, how do we > handle false positives? > > By first publishing which barrier tag (safepoint counter) we will call > WaitBarrier.wait() with as the threads safepoint id and then change the state to > _thread_blocked, the VMThread can ignore JavaThreads by doing a stable load of > the state. A stable load of the thread state is successful if the thread > safepoint id is the same both before and after the load of the state and > safepoint id is current or InactiveSafepointCounter. If the stable load fails, > the thread is considered safepoint unsafe. It's no longer enough that thread is > have state _thread_blocked it must also have correct safepoint id before and > after we read the state. > > Performance: > The result of faster safepoints is that the average CPU time for JavaThreads > between safepoints is higher, thus increasing the allocation rate. The thread > that stops first waits shorter time until it gets started. 
Even the thread that > stops last also have shorter stop since we start them faster. If your > application is using a concurrent GC it may need re-tunning since each java > worker thread have an increased CPU time/allocation rate. Often this means max > performance is achieved using slightly less java worker threads than before. > Also the increase allocation rate means shorter time between GC safepoints. > - If you are using a non-concurrent GC, you should see improved latency and > ? throughput. > - After re-tunning with a concurrent GC throughput should be equal or better but > ? with better latency. But bear in mind this is a latency patch, not a > ? throughput one. > With current code a java thread is not to guarantee to run between safepoint (in > theory a java thread can be starved indefinitely), since the VM thread may > re-grab the Threads_locks before it woke up from previous safepoint. If the > GC/VM don't respect MMU (minimum mutator utilization) or if your machine is very > over-provisioned this can happen. > The current schema thus re-safepoint quickly if the java threads have not > started yet at the cost of latency. Since the new code uses the WaitBarrier with > the safepoint counter, all threads must roll forward to next safepoint by > getting at least some CPU time between two safepoints. Meaning MMU violations > are more obvious. > > Some examples on numbers: > - On a 16 strand machine synchronization and un-synchronization/starting is at > ? least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and > ? starting ~400->~100us. > ? (Semaphore path is a bit slower than futex in the WaitBarrier on Linux). > - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster > ? synchronization time on 16 strands and ~5% score increase. In this case the GC > ? op is 1ms, so we reduce the overhead of synchronization from 100% to 10%. > - specJBB2015 ParGC ~9% increase in critical-jops. 
>
> Thanks, Robbin

From coleen.phillimore at oracle.com Wed Jan 23 14:20:34 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 23 Jan 2019 09:20:34 -0500
Subject: RFR (S) 8216136: Don't take Compile_lock for SystemDictionary::_modification_counter
In-Reply-To: <182d84fc-4a14-e0c2-c374-a50538525f26@oracle.com>
References: <48a69ecb-1d3c-817e-7e2b-4e55a68a66b8@oracle.com> <081db1f4-1a52-ef60-7934-a4314e7c2c80@oracle.com> <182d84fc-4a14-e0c2-c374-a50538525f26@oracle.com>
Message-ID:

After some internal discussion, Dean convinced me that removing the
Compile_lock here might be too dangerous. So for these asserts and the
error condition, the compiler thread goes to VM from native to check the
SystemDictionary::modification_counter under the Compile_lock, with
safepoint checking always.

Tested with tier1,2,6 and 8.

open webrev at http://cr.openjdk.java.net/~coleenp/8216136.02/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8216136

Thanks,
Coleen

On 1/17/19 7:15 AM, coleen.phillimore at oracle.com wrote:
>
>
> On 1/16/19 10:53 PM, dean.long at oracle.com wrote:
>> Hi Coleen. You still can't safely call notice_modification() outside
>> of Compile_lock, (at least not without other changes), so this:
>>
>> - static inline void notice_modification() { assert_locked_or_safepoint(Compile_lock); ++_number_of_modifications; }
>> + static inline void notice_modification() { Atomic::inc(&_number_of_modifications); }
>>
>> should be:
>>
>> static inline void notice_modification() {
>>   assert_locked_or_safepoint(Compile_lock);
>>   Atomic::inc(&_number_of_modifications); }
>>
>>
>> Are you trying to eventually remove Compile_lock completely? If so,
>> then notice_modification() would have to be called *before* the
>> class hierarchy is changed, not after, and probably other changes
>> would be needed as well.
> Dean, Thank you for looking at this and your comments.
>
> No, I'm not trying to remove Compile_lock entirely and I can assert
> that notice_modification has the Compile_lock as above. The class
> hierarchy code has been changed to be lock free rather than requiring
> the Compile_lock, although I think the Compile_lock still protects
> some of this code.
>
> There are also some Compile_lock free ways of getting to dependencies,
> because putting notice_modification after flush_dependencies caused
> bugs that I'll ask to you offline about.
>
> Thanks for your help. I was just trying to peel off one place where
> Compile_lock seemed wrong.
>
> Thanks,
> Coleen
>>
>> dl
>>
>>
>> On 1/16/19 8:43 AM, coleen.phillimore at oracle.com wrote:
>>> Summary: make SystemDictionary::modification_counter atomic so not
>>> to require Compile_lock.
>>>
>>> I moved updating the modification counter when the class is defined
>>> and added to the hierarchy. I didn't remove the Compile_lock
>>> completely because there may be other code currently under the lock
>>> that needs it (flush_dependencies). Can someone from the compiler
>>> area also review this?
>>>
>>> Made Compile_lock an always safepointing lock.
>>>
>>> Tested with mach5 tier1-6.
>>>
>>> open webrev at http://cr.openjdk.java.net/~coleenp/8216136.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8216136
>>>
>>> Thanks,
>>> Coleen
>>
>

From daniel.daugherty at oracle.com Wed Jan 23 15:51:32 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 23 Jan 2019 10:51:32 -0500
Subject: RFR(XL): 8203469: Faster safepoints
In-Reply-To: <8838a5c8-f427-35f0-7af6-0cf21ee6827e@oracle.com>
References: <8838a5c8-f427-35f0-7af6-0cf21ee6827e@oracle.com>
Message-ID:

On 1/23/19 8:33 AM, Robbin Ehn wrote:
> Hi all, here is v03.
>
> It's contains the update from comments and:
> I notice safepoint.hpp contained wrong/not need inline keyword for
> methods.
> Those method are either default inline because they are defined in the
> declaration (header) or since they are defined in the same cpp unit as callers
> and thus can be inlined any way.
>
> http://cr.openjdk.java.net/~rehn/8203469/v03/inc/

src/hotspot/share/runtime/safepoint.cpp
    No comments.

src/hotspot/share/runtime/safepoint.hpp
    No comments.

Thumbs up.

Dan

> http://cr.openjdk.java.net/~rehn/8203469/v03/
>
> Passes t1.
>
> Thanks, Robbin
>
> On 2019-01-15 11:39, Robbin Ehn wrote:
>> Hi all, please review.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469
>> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/
>>
>> Thanks to Dan for pre-reviewing a lot!
>>
>> Background:
>> ZGC often does very short safepoint operations. For a perspective, in a
>> specJBB2015 run, G1 can have young collection stops lasting about 170 ms. While
>> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which
>> operation it is. The time it takes to stop and start the JavaThreads is relative
>> very large to a ZGC safepoint. With an operation that just takes 0.2ms the
>> overhead of stopping and starting JavaThreads is several times the operation.
>>
>> High-level functionality change:
>> Serializing the starting over Threads_lock takes time.
>> - Don't wait on Threads_lock use the WaitBarrier.
>> Serializing the stopping over Safepoint_lock takes time.
>> - Let threads stop in parallel, remove Safepoint_lock.
>>
>> Details:
>> JavaThreads have 2 abstract logical states: unsafe or safe.
>> - Safe means the JavaThread will not touch Java heap or VM internal structures
>>    without doing a transition and block before doing so.
>>          - The safe states are:
>>                  - When polls armed: _thread_in_native and _thread_blocked.
>>                  - When Threads_lock is held: externally suspended flag is set.
>>          - VM Thread have polls armed and holds the Threads_lock during a
>>            safepoint.
>> - Unsafe means that either Java heap or VM internal structures can be >> accessed >> ?? by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. >> ???????? - All combination that are not safe are unsafe. >> >> We cannot start a safepoint until all unsafe threads have >> transitioned to a safe >> state. To make them safe, we arm polls in compiled code and make sure >> any >> transition to another unsafe state will be blocked. JavaThreads which >> are unsafe >> with state _thread_in_Java may transition to _thread_in_native >> without being >> blocked, since it just became a safe thread and we can proceed. Any >> safe thread >> may try to transition at any time to an unsafe state, thus coming >> into the >> safepoint blocking code at any moment, e.g., after the safepoint is >> over, or >> even at the beginning of next safepoint. >> >> The VMThread cannot tolerate false positives from the JavaThread >> thread state >> because that would mean starting the safepoint without all >> JavaThreads being >> safe. The two locks (Threads_lock and Safepoint_lock) make sure we >> never observe >> false positives from the safepoint blocking code, if we remove them, >> how do we >> handle false positives? >> >> By first publishing which barrier tag (safepoint counter) we will call >> WaitBarrier.wait() with as the threads safepoint id and then change >> the state to >> _thread_blocked, the VMThread can ignore JavaThreads by doing a >> stable load of >> the state. A stable load of the thread state is successful if the thread >> safepoint id is the same both before and after the load of the state and >> safepoint id is current or InactiveSafepointCounter. If the stable >> load fails, >> the thread is considered safepoint unsafe. It's no longer enough that >> thread is >> have state _thread_blocked it must also have correct safepoint id >> before and >> after we read the state. 
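The stable load described in the paragraph above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the webrev: the names ThreadSnapshot, try_stable_load_state, is_safepoint_safe and kInactiveSafepointCounter are hypothetical, plain sequentially consistent std::atomic loads stand in for HotSpot's own ordering primitives, and the externally suspended flag is left out.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the JavaThread states named in the thread.
enum ThreadState : int {
  thread_in_Java,
  thread_in_vm,
  thread_in_native,
  thread_blocked
};

// Assumption: 0 plays the role of InactiveSafepointCounter.
constexpr uint64_t kInactiveSafepointCounter = 0;

// Hypothetical per-thread data that the VMThread inspects.
struct ThreadSnapshot {
  std::atomic<uint64_t> safepoint_id{kInactiveSafepointCounter};
  std::atomic<int> state{thread_in_Java};
};

// Reads the safepoint id, then the state, then the id again. The load is
// "stable" only if the id did not change across the state load and the id
// is either the current safepoint counter or the inactive value.
bool try_stable_load_state(const ThreadSnapshot& t, uint64_t current_id,
                           int* out_state) {
  assert(out_state != nullptr);
  const uint64_t id_before = t.safepoint_id.load();
  const int state = t.state.load();
  const uint64_t id_after = t.safepoint_id.load();
  if (id_before != id_after) {
    return false;  // the thread moved between the two id reads
  }
  if (id_after != kInactiveSafepointCounter && id_after != current_id) {
    return false;  // id belongs to some other safepoint
  }
  *out_state = state;
  return true;
}

// A failed stable load is treated as unsafe; otherwise only the two
// "safe" states from the description count as safe.
bool is_safepoint_safe(const ThreadSnapshot& t, uint64_t current_id) {
  int state = thread_in_Java;
  if (!try_stable_load_state(t, current_id, &state)) {
    return false;
  }
  return state == thread_in_native || state == thread_blocked;
}
```

In this sketch a thread that still carries the id of an older safepoint is reported as unsafe even if its state reads _thread_blocked, which is the "no longer enough" point made above.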
>> >> Performance: >> The result of faster safepoints is that the average CPU time for >> JavaThreads >> between safepoints is higher, thus increasing the allocation rate. >> The thread >> that stops first waits shorter time until it gets started. Even the >> thread that >> stops last also have shorter stop since we start them faster. If your >> application is using a concurrent GC it may need re-tunning since >> each java >> worker thread have an increased CPU time/allocation rate. Often this >> means max >> performance is achieved using slightly less java worker threads than >> before. >> Also the increase allocation rate means shorter time between GC >> safepoints. >> - If you are using a non-concurrent GC, you should see improved >> latency and >> ?? throughput. >> - After re-tunning with a concurrent GC throughput should be equal or >> better but >> ?? with better latency. But bear in mind this is a latency patch, not a >> ?? throughput one. >> With current code a java thread is not to guarantee to run between >> safepoint (in >> theory a java thread can be starved indefinitely), since the VM >> thread may >> re-grab the Threads_locks before it woke up from previous safepoint. >> If the >> GC/VM don't respect MMU (minimum mutator utilization) or if your >> machine is very >> over-provisioned this can happen. >> The current schema thus re-safepoint quickly if the java threads have >> not >> started yet at the cost of latency. Since the new code uses the >> WaitBarrier with >> the safepoint counter, all threads must roll forward to next >> safepoint by >> getting at least some CPU time between two safepoints. Meaning MMU >> violations >> are more obvious. >> >> Some examples on numbers: >> - On a 16 strand machine synchronization and >> un-synchronization/starting is at >> ?? least 3x faster (in non-trivial test). Synchronization ~600 -> >> ~100us and >> ?? starting ~400->~100us. >> ?? (Semaphore path is a bit slower than futex in the WaitBarrier on >> Linux). 
>> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster >> ?? synchronization time on 16 strands and ~5% score increase. In this >> case the GC >> ?? op is 1ms, so we reduce the overhead of synchronization from 100% >> to 10%. >> - specJBB2015 ParGC ~9% increase in critical-jops. >> >> Thanks, Robbin From patricio.chilano.mateo at oracle.com Wed Jan 23 16:45:24 2019 From: patricio.chilano.mateo at oracle.com (Patricio Chilano) Date: Wed, 23 Jan 2019 11:45:24 -0500 Subject: RFR(XL): 8203469: Faster safepoints In-Reply-To: <8838a5c8-f427-35f0-7af6-0cf21ee6827e@oracle.com> References: <8838a5c8-f427-35f0-7af6-0cf21ee6827e@oracle.com> Message-ID: Hi Robbin, Looks good to me! Thanks, Patricio On 1/23/19 8:33 AM, Robbin Ehn wrote: > Hi all, here is v03. > > It's contains the update from comments and: > I notice safepoint.hpp contained wrong/not need inline keyword for > methods. > Those method are either default inline because they are defined in the > declaration (header) or since they are defined in the same cpp unit as > callers > and thus can be inlined any way. > > http://cr.openjdk.java.net/~rehn/8203469/v03/inc/ > http://cr.openjdk.java.net/~rehn/8203469/v03/ > > Passes t1. > > Thanks, Robbin > > On 2019-01-15 11:39, Robbin Ehn wrote: >> Hi all, please review. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469 >> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/ >> >> Thanks to Dan for pre-reviewing a lot! >> >> Background: >> ZGC often does very short safepoint operations. For a perspective, in a >> specJBB2015 run, G1 can have young collection stops lasting about 170 >> ms. While >> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which >> operation it is. The time it takes to stop and start the JavaThreads >> is relative >> very large to a ZGC safepoint. With an operation that just takes >> 0.2ms the >> overhead of stopping and starting JavaThreads is several times the >> operation. 
>> >> High-level functionality change: >> Serializing the starting over Threads_lock takes time. >> - Don't wait on Threads_lock use the WaitBarrier. >> Serializing the stopping over Safepoint_lock takes time. >> - Let threads stop in parallel, remove Safepoint_lock. >> >> Details: >> JavaThreads have 2 abstract logical states: unsafe or safe. >> - Safe means the JavaThread will not touch Java heap or VM internal >> structures >> ?? without doing a transition and block before doing so. >> ???????? - The safe states are: >> ???????????????? - When polls armed: _thread_in_native and >> _thread_blocked. >> ???????????????? - When Threads_lock is held: externally suspended >> flag is set. >> ???????? - VM Thread have polls armed and holds the Threads_lock >> during a >> ?????????? safepoint. >> - Unsafe means that either Java heap or VM internal structures can be >> accessed >> ?? by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. >> ???????? - All combination that are not safe are unsafe. >> >> We cannot start a safepoint until all unsafe threads have >> transitioned to a safe >> state. To make them safe, we arm polls in compiled code and make sure >> any >> transition to another unsafe state will be blocked. JavaThreads which >> are unsafe >> with state _thread_in_Java may transition to _thread_in_native >> without being >> blocked, since it just became a safe thread and we can proceed. Any >> safe thread >> may try to transition at any time to an unsafe state, thus coming >> into the >> safepoint blocking code at any moment, e.g., after the safepoint is >> over, or >> even at the beginning of next safepoint. >> >> The VMThread cannot tolerate false positives from the JavaThread >> thread state >> because that would mean starting the safepoint without all >> JavaThreads being >> safe. 
The two locks (Threads_lock and Safepoint_lock) make sure we >> never observe >> false positives from the safepoint blocking code, if we remove them, >> how do we >> handle false positives? >> >> By first publishing which barrier tag (safepoint counter) we will call >> WaitBarrier.wait() with as the threads safepoint id and then change >> the state to >> _thread_blocked, the VMThread can ignore JavaThreads by doing a >> stable load of >> the state. A stable load of the thread state is successful if the thread >> safepoint id is the same both before and after the load of the state and >> safepoint id is current or InactiveSafepointCounter. If the stable >> load fails, >> the thread is considered safepoint unsafe. It's no longer enough that >> thread is >> have state _thread_blocked it must also have correct safepoint id >> before and >> after we read the state. >> >> Performance: >> The result of faster safepoints is that the average CPU time for >> JavaThreads >> between safepoints is higher, thus increasing the allocation rate. >> The thread >> that stops first waits shorter time until it gets started. Even the >> thread that >> stops last also have shorter stop since we start them faster. If your >> application is using a concurrent GC it may need re-tunning since >> each java >> worker thread have an increased CPU time/allocation rate. Often this >> means max >> performance is achieved using slightly less java worker threads than >> before. >> Also the increase allocation rate means shorter time between GC >> safepoints. >> - If you are using a non-concurrent GC, you should see improved >> latency and >> ?? throughput. >> - After re-tunning with a concurrent GC throughput should be equal or >> better but >> ?? with better latency. But bear in mind this is a latency patch, not a >> ?? throughput one. 
>> With current code a java thread is not to guarantee to run between safepoint (in
>> theory a java thread can be starved indefinitely), since the VM thread may
>> re-grab the Threads_locks before it woke up from previous safepoint. If the
>> GC/VM don't respect MMU (minimum mutator utilization) or if your machine is very
>> over-provisioned this can happen.
>> The current schema thus re-safepoint quickly if the java threads have not
>> started yet at the cost of latency. Since the new code uses the WaitBarrier with
>> the safepoint counter, all threads must roll forward to next safepoint by
>> getting at least some CPU time between two safepoints. Meaning MMU violations
>> are more obvious.
>>
>> Some examples on numbers:
>> - On a 16 strand machine synchronization and un-synchronization/starting is at
>>    least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and
>>    starting ~400->~100us.
>>    (Semaphore path is a bit slower than futex in the WaitBarrier on Linux).
>> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster
>>    synchronization time on 16 strands and ~5% score increase. In this case the GC
>>    op is 1ms, so we reduce the overhead of synchronization from 100% to 10%.
>> - specJBB2015 ParGC ~9% increase in critical-jops.
>>
>> Thanks, Robbin

From karen.kinnear at oracle.com Wed Jan 23 21:34:23 2019
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Wed, 23 Jan 2019 16:34:23 -0500
Subject: RFR(XL): 8203469: Faster safepoints
In-Reply-To: <8838a5c8-f427-35f0-7af6-0cf21ee6827e@oracle.com>
References: <8838a5c8-f427-35f0-7af6-0cf21ee6827e@oracle.com>
Message-ID: <8D9FA13C-C99D-4F02-B539-62C4269CE332@oracle.com>

This looks really good. Delighted with performance and cleaner logic.

Couple of minor questions/comments:

1. SafepointMechanism.inline.hpp
   added an OrderAccess::loadload() in block_if_requested_local_poll()
   do you also need one in block_if_requested()?

2. Tested on ARM? Stress test the OrderAccess
   Thank you for comments on OrderAccess lines - will help in future

3. minor
   safepoint.cpp 749: resetted -> reset

4. While you are in there
   Thank you for cleaning up CMS comments
   safepoint.hpp line 58
   _synchronized // All Java threads are stopped at a safepoint. Only VM thread is running
   -> All Java threads are running in native, blocked in OS or stopped at safepoint
   What other threads can run besides the VM thread at this point?
   e.g. safepoint cleanup threads
   e.g. any GC threads that can run during a safepoint?

5. Would it make sense to split the safepoint_safe and try_stable_load_state
   into code that works during a safepoint and separate logic that works
   not at a safepoint, for the InactiveSafepoint state?

thanks,
Karen

> On Jan 23, 2019, at 8:33 AM, Robbin Ehn wrote:
>
> Hi all, here is v03.
>
> It's contains the update from comments and:
> I notice safepoint.hpp contained wrong/not need inline keyword for methods.
> Those method are either default inline because they are defined in the
> declaration (header) or since they are defined in the same cpp unit as callers
> and thus can be inlined any way.
>
> http://cr.openjdk.java.net/~rehn/8203469/v03/inc/
> http://cr.openjdk.java.net/~rehn/8203469/v03/
>
> Passes t1.
>
> Thanks, Robbin
>
> On 2019-01-15 11:39, Robbin Ehn wrote:
>> Hi all, please review.
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469
>> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/
>> Thanks to Dan for pre-reviewing a lot!
>> Background:
>> ZGC often does very short safepoint operations. For a perspective, in a
>> specJBB2015 run, G1 can have young collection stops lasting about 170 ms. While
>> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which
>> operation it is. The time it takes to stop and start the JavaThreads is relative
>> very large to a ZGC safepoint.
With an operation that just takes 0.2ms the >> overhead of stopping and starting JavaThreads is several times the operation. >> High-level functionality change: >> Serializing the starting over Threads_lock takes time. >> - Don't wait on Threads_lock use the WaitBarrier. >> Serializing the stopping over Safepoint_lock takes time. >> - Let threads stop in parallel, remove Safepoint_lock. >> Details: >> JavaThreads have 2 abstract logical states: unsafe or safe. >> - Safe means the JavaThread will not touch Java heap or VM internal structures >> without doing a transition and block before doing so. >> - The safe states are: >> - When polls armed: _thread_in_native and _thread_blocked. >> - When Threads_lock is held: externally suspended flag is set. >> - VM Thread have polls armed and holds the Threads_lock during a >> safepoint. >> - Unsafe means that either Java heap or VM internal structures can be accessed >> by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. >> - All combination that are not safe are unsafe. >> We cannot start a safepoint until all unsafe threads have transitioned to a safe >> state. To make them safe, we arm polls in compiled code and make sure any >> transition to another unsafe state will be blocked. JavaThreads which are unsafe >> with state _thread_in_Java may transition to _thread_in_native without being >> blocked, since it just became a safe thread and we can proceed. Any safe thread >> may try to transition at any time to an unsafe state, thus coming into the >> safepoint blocking code at any moment, e.g., after the safepoint is over, or >> even at the beginning of next safepoint. >> The VMThread cannot tolerate false positives from the JavaThread thread state >> because that would mean starting the safepoint without all JavaThreads being >> safe. The two locks (Threads_lock and Safepoint_lock) make sure we never observe >> false positives from the safepoint blocking code, if we remove them, how do we >> handle false positives? 
>> By first publishing which barrier tag (safepoint counter) we will call >> WaitBarrier.wait() with as the threads safepoint id and then change the state to >> _thread_blocked, the VMThread can ignore JavaThreads by doing a stable load of >> the state. A stable load of the thread state is successful if the thread >> safepoint id is the same both before and after the load of the state and >> safepoint id is current or InactiveSafepointCounter. If the stable load fails, >> the thread is considered safepoint unsafe. It's no longer enough that thread is >> have state _thread_blocked it must also have correct safepoint id before and >> after we read the state. >> Performance: >> The result of faster safepoints is that the average CPU time for JavaThreads >> between safepoints is higher, thus increasing the allocation rate. The thread >> that stops first waits shorter time until it gets started. Even the thread that >> stops last also have shorter stop since we start them faster. If your >> application is using a concurrent GC it may need re-tunning since each java >> worker thread have an increased CPU time/allocation rate. Often this means max >> performance is achieved using slightly less java worker threads than before. >> Also the increase allocation rate means shorter time between GC safepoints. >> - If you are using a non-concurrent GC, you should see improved latency and >> throughput. >> - After re-tunning with a concurrent GC throughput should be equal or better but >> with better latency. But bear in mind this is a latency patch, not a >> throughput one. >> With current code a java thread is not to guarantee to run between safepoint (in >> theory a java thread can be starved indefinitely), since the VM thread may >> re-grab the Threads_locks before it woke up from previous safepoint. If the >> GC/VM don't respect MMU (minimum mutator utilization) or if your machine is very >> over-provisioned this can happen. 
>> The current schema thus re-safepoint quickly if the java threads have not
>> started yet at the cost of latency. Since the new code uses the WaitBarrier with
>> the safepoint counter, all threads must roll forward to next safepoint by
>> getting at least some CPU time between two safepoints. Meaning MMU violations
>> are more obvious.
>> Some examples on numbers:
>> - On a 16 strand machine synchronization and un-synchronization/starting is at
>>   least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and
>>   starting ~400->~100us.
>>   (Semaphore path is a bit slower than futex in the WaitBarrier on Linux).
>> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster
>>   synchronization time on 16 strands and ~5% score increase. In this case the GC
>>   op is 1ms, so we reduce the overhead of synchronization from 100% to 10%.
>> - specJBB2015 ParGC ~9% increase in critical-jops.
>> Thanks, Robbin

From coleen.phillimore at oracle.com Wed Jan 23 22:05:34 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 23 Jan 2019 17:05:34 -0500
Subject: RFR(XL): 8203469: Faster safepoints
In-Reply-To: <8838a5c8-f427-35f0-7af6-0cf21ee6827e@oracle.com>
References: <8838a5c8-f427-35f0-7af6-0cf21ee6827e@oracle.com>
Message-ID:

Robbin, this looks very clean and as understandable as it can be, I guess.
I have a couple of small suggestions.

http://cr.openjdk.java.net/~rehn/8203469/v03/webrev/src/hotspot/share/code/dependencyContext.hpp.udiff.html

+ assert((SafepointSynchronize::safepoint_counter() - _safepoint_counter) < 2, "safepoint happened");

This code shouldn't know the special safepoint counter semantics. I'm
surprised there aren't more of these. Can you make this a function in
safepoint.hpp like:

   static bool is_same_safepoint(int counter) { return (safepoint_counter() - counter) < 2; }
   // safepoint counter incremented by two during safepoint

http://cr.openjdk.java.net/~rehn/8203469/v03/webrev/src/hotspot/share/runtime/safepoint.cpp.udiff.html

+ int count = Atomic::add(-1, &_waiting_to_block);

There's an Atomic::sub which I think is preferable.

+WaitBarrier* SafepointSynchronize::_wait_barrier;
+Semaphore* SafepointSynchronize::_vm_wait;
+

Can you use this place to document briefly the interaction between the
threads using these barriers? I.e., one is the one the VM waits for while
waiting for threads to block, and the other is the barrier that the
threads block on. Maybe this can be a place to describe this in a little
bit of detail. That would help with reading the code below.

Thanks,
Coleen

On 1/23/19 8:33 AM, Robbin Ehn wrote:
> Hi all, here is v03.
>
> It's contains the update from comments and:
> I notice safepoint.hpp contained wrong/not need inline keyword for methods.
> Those method are either default inline because they are defined in the
> declaration (header) or since they are defined in the same cpp unit as callers
> and thus can be inlined any way.
>
> http://cr.openjdk.java.net/~rehn/8203469/v03/inc/
> http://cr.openjdk.java.net/~rehn/8203469/v03/
>
> Passes t1.
>
> Thanks, Robbin
>
> On 2019-01-15 11:39, Robbin Ehn wrote:
>> Hi all, please review.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469
>> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/
>>
>> Thanks to Dan for pre-reviewing a lot!
>>
>> Background:
>> ZGC often does very short safepoint operations. For a perspective, in a
>> specJBB2015 run, G1 can have young collection stops lasting about 170 ms. While
>> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which
>> operation it is. The time it takes to stop and start the JavaThreads is relative
>> very large to a ZGC safepoint. With an operation that just takes 0.2ms the
>> overhead of stopping and starting JavaThreads is several times the operation.
>> With current code a java thread is not to guarantee to run between safepoint (in
>> theory a java thread can be starved indefinitely), since the VM thread may
>> re-grab the Threads_locks before it woke up from previous safepoint. If the
>> GC/VM don't respect MMU (minimum mutator utilization) or if your machine is very
>> over-provisioned this can happen.
>> The current schema thus re-safepoint quickly if the java threads have not
>> started yet at the cost of latency. Since the new code uses the WaitBarrier with
>> the safepoint counter, all threads must roll forward to next safepoint by
>> getting at least some CPU time between two safepoints. Meaning MMU violations
>> are more obvious.
>>
>> Some examples on numbers:
>> - On a 16 strand machine synchronization and un-synchronization/starting is at
>>   least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and
>>   starting ~400->~100us.
>>   (Semaphore path is a bit slower than futex in the WaitBarrier on Linux).
>> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster
>>   synchronization time on 16 strands and ~5% score increase. In this case the GC
>>   op is 1ms, so we reduce the overhead of synchronization from 100% to 10%.
>> - specJBB2015 ParGC ~9% increase in critical-jops.
>>
>> Thanks, Robbin

From david.holmes at oracle.com Thu Jan 24 00:49:59 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 24 Jan 2019 10:49:59 +1000
Subject: RFR(XL): 8203469: Faster safepoints
In-Reply-To: <8838a5c8-f427-35f0-7af6-0cf21ee6827e@oracle.com>
References: <8838a5c8-f427-35f0-7af6-0cf21ee6827e@oracle.com>
Message-ID:

Hi Robbin,

One minor nit on the updates in safepoint.cpp

!   // re-read state after we read thread safepoint id. The JavaThread changes its
!   // thread state from thread_blocked before resetting safepoint id to 0.
!   // Guaranteeing the second read will be from an updated thread state. It can

The "sentence" starting "Guaranteeing" is not a sentence. Suggestion:

// This guarantees the second read ...

Thanks,
David

On 23/01/2019 11:33 pm, Robbin Ehn wrote:
> Hi all, here is v03.
>
> It's contains the update from comments and:
> I notice safepoint.hpp contained wrong/not need inline keyword for methods.
> Those method are either default inline because they are defined in the
> declaration (header) or since they are defined in the same cpp unit as callers
> and thus can be inlined any way.
>
> http://cr.openjdk.java.net/~rehn/8203469/v03/inc/
> http://cr.openjdk.java.net/~rehn/8203469/v03/
>
> Passes t1.
>
> Thanks, Robbin
>
> On 2019-01-15 11:39, Robbin Ehn wrote:
>> Hi all, please review.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469
>> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/
>>
>> Thanks to Dan for pre-reviewing a lot!
>>
>> Background:
>> ZGC often does very short safepoint operations. For a perspective, in a
>> specJBB2015 run, G1 can have young collection stops lasting about 170 ms. While
>> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which
>> operation it is. The time it takes to stop and start the JavaThreads is relative
>> very large to a ZGC safepoint. With an operation that just takes 0.2ms the
>> overhead of stopping and starting JavaThreads is several times the operation.
>>
>> High-level functionality change:
>> Serializing the starting over Threads_lock takes time.
>> - Don't wait on Threads_lock use the WaitBarrier.
>> Serializing the stopping over Safepoint_lock takes time.
>> - Let threads stop in parallel, remove Safepoint_lock.
>>
>> Details:
>> JavaThreads have 2 abstract logical states: unsafe or safe.
>> - Safe means the JavaThread will not touch Java heap or VM internal structures
>>   without doing a transition and block before doing so.
>>         - The safe states are:
>>                 - When polls armed: _thread_in_native and _thread_blocked.
>> ???????????????? - When Threads_lock is held: externally suspended >> flag is set. >> ???????? - VM Thread have polls armed and holds the Threads_lock during a >> ?????????? safepoint. >> - Unsafe means that either Java heap or VM internal structures can be >> accessed >> ?? by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. >> ???????? - All combination that are not safe are unsafe. >> >> We cannot start a safepoint until all unsafe threads have transitioned >> to a safe >> state. To make them safe, we arm polls in compiled code and make sure any >> transition to another unsafe state will be blocked. JavaThreads which >> are unsafe >> with state _thread_in_Java may transition to _thread_in_native without >> being >> blocked, since it just became a safe thread and we can proceed. Any >> safe thread >> may try to transition at any time to an unsafe state, thus coming into >> the >> safepoint blocking code at any moment, e.g., after the safepoint is >> over, or >> even at the beginning of next safepoint. >> >> The VMThread cannot tolerate false positives from the JavaThread >> thread state >> because that would mean starting the safepoint without all JavaThreads >> being >> safe. The two locks (Threads_lock and Safepoint_lock) make sure we >> never observe >> false positives from the safepoint blocking code, if we remove them, >> how do we >> handle false positives? >> >> By first publishing which barrier tag (safepoint counter) we will call >> WaitBarrier.wait() with as the threads safepoint id and then change >> the state to >> _thread_blocked, the VMThread can ignore JavaThreads by doing a stable >> load of >> the state. A stable load of the thread state is successful if the thread >> safepoint id is the same both before and after the load of the state and >> safepoint id is current or InactiveSafepointCounter. If the stable >> load fails, >> the thread is considered safepoint unsafe. 
It's no longer enough that >> thread is >> have state _thread_blocked it must also have correct safepoint id >> before and >> after we read the state. >> >> Performance: >> The result of faster safepoints is that the average CPU time for >> JavaThreads >> between safepoints is higher, thus increasing the allocation rate. The >> thread >> that stops first waits shorter time until it gets started. Even the >> thread that >> stops last also have shorter stop since we start them faster. If your >> application is using a concurrent GC it may need re-tunning since each >> java >> worker thread have an increased CPU time/allocation rate. Often this >> means max >> performance is achieved using slightly less java worker threads than >> before. >> Also the increase allocation rate means shorter time between GC >> safepoints. >> - If you are using a non-concurrent GC, you should see improved >> latency and >> ?? throughput. >> - After re-tunning with a concurrent GC throughput should be equal or >> better but >> ?? with better latency. But bear in mind this is a latency patch, not a >> ?? throughput one. >> With current code a java thread is not to guarantee to run between >> safepoint (in >> theory a java thread can be starved indefinitely), since the VM thread >> may >> re-grab the Threads_locks before it woke up from previous safepoint. >> If the >> GC/VM don't respect MMU (minimum mutator utilization) or if your >> machine is very >> over-provisioned this can happen. >> The current schema thus re-safepoint quickly if the java threads have not >> started yet at the cost of latency. Since the new code uses the >> WaitBarrier with >> the safepoint counter, all threads must roll forward to next safepoint by >> getting at least some CPU time between two safepoints. Meaning MMU >> violations >> are more obvious. >> >> Some examples on numbers: >> - On a 16 strand machine synchronization and >> un-synchronization/starting is at >> ?? least 3x faster (in non-trivial test). 
Synchronization ~600 -> >> ~100us and >> ?? starting ~400->~100us. >> ?? (Semaphore path is a bit slower than futex in the WaitBarrier on >> Linux). >> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster >> ?? synchronization time on 16 strands and ~5% score increase. In this >> case the GC >> ?? op is 1ms, so we reduce the overhead of synchronization from 100% >> to 10%. >> - specJBB2015 ParGC ~9% increase in critical-jops. >> >> Thanks, Robbin From david.holmes at oracle.com Thu Jan 24 02:31:55 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 24 Jan 2019 12:31:55 +1000 Subject: RFR (S) 8216136: Don't take Compile_lock for SystemDictionary::_modification_counter In-Reply-To: References: <48a69ecb-1d3c-817e-7e2b-4e55a68a66b8@oracle.com> <081db1f4-1a52-ef60-7934-a4314e7c2c80@oracle.com> <182d84fc-4a14-e0c2-c374-a50538525f26@oracle.com> Message-ID: Hi Coleen, On 24/01/2019 12:20 am, coleen.phillimore at oracle.com wrote: > > After some internal discussion, Dean convinced me that removing the > Compile_lock here might be too dangerous.?? So for these asserts and the > error condition, the compiler thread goes to VM from native to check the > SystemDictionary::modification_counter under the Compile_lock, with > safepoint checking always. That sounds quite reasonable. Reviewed. Though perhaps the bug synopsis should be updated to reflect the change in direction before pushing. Thanks, David ----- > Tested with tier1,2,6 and 8. > > open webrev at http://cr.openjdk.java.net/~coleenp/8216136.02/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8216136 > > Thanks, > Coleen > > On 1/17/19 7:15 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 1/16/19 10:53 PM, dean.long at oracle.com wrote: >>> Hi Coleen.? 
You still can't safely call notice_modification() outside >>> of Compile_lock, (at least not without other changes), so this: >>> >>> - static inline void notice_modification() { >>> assert_locked_or_safepoint(Compile_lock); ++_number_of_modifications; } >>> + static inline void notice_modification() { >>> Atomic::inc(&_number_of_modifications); } >>> >>> should be: >>> >>> static inline void notice_modification() { >>> assert_locked_or_safepoint(Compile_lock); >>> Atomic::inc(&_number_of_modifications); } >>> >>> >>> Are you trying to eventually remove Compile_lock completely?? If so, >>> then notice_modification() would have to be called *before* the >>> class hierarchy is changed, not after, and probably other changes >>> would be needed as well. >> Dean, Thank you for looking at this and your comments. >> >> No, I'm not trying to remove Compile_lock entirely and I can assert >> that notice_modification has the Compile_lock as above. The class >> hierarchy code has been changed to be lock free rather than requiring >> the Compile_lock, although I think the Compile_lock still protects >> some of this code. >> >> There are also some Compile_lock free ways of getting to dependencies, >> because putting notice_modification after flush_dependencies caused >> bugs that I'll ask to you offline about. >> >> Thanks for your help.? I was just trying to peel off one place where >> Compile_lock seemed wrong. >> >> Thanks, >> Coleen >>> >>> dl >>> >>> >>> On 1/16/19 8:43 AM, coleen.phillimore at oracle.com wrote: >>>> Summary: make SystemDictionary::modification_counter atomic so not >>>> to require Compile_lock. >>>> >>>> I moved updating the modification counter when the class is defined >>>> and added to the hierarchy.? I didn't remove the Compile_lock >>>> completely because there may be other code currently under the lock >>>> that needs it (flush_dependencies). Can someone from the compiler >>>> area also review this? 
>>>> >>>> Made Compile_lock an always safepointing lock. >>>> >>>> Tested with mach5 tier1-6. >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8216136.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8216136 >>>> >>>> Thanks, >>>> Coleen >>> >> > From coleen.phillimore at oracle.com Thu Jan 24 02:38:05 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 23 Jan 2019 21:38:05 -0500 Subject: RFR (S) 8216136: Don't take Compile_lock for SystemDictionary::_modification_counter In-Reply-To: References: <48a69ecb-1d3c-817e-7e2b-4e55a68a66b8@oracle.com> <081db1f4-1a52-ef60-7934-a4314e7c2c80@oracle.com> <182d84fc-4a14-e0c2-c374-a50538525f26@oracle.com> Message-ID: <8e900937-c994-f3dd-8180-0f1a90bc4819@oracle.com> On 1/23/19 9:31 PM, David Holmes wrote: > Hi Coleen, > > On 24/01/2019 12:20 am, coleen.phillimore at oracle.com wrote: >> >> After some internal discussion, Dean convinced me that removing the >> Compile_lock here might be too dangerous.?? So for these asserts and >> the error condition, the compiler thread goes to VM from native to >> check the SystemDictionary::modification_counter under the >> Compile_lock, with safepoint checking always. > > That sounds quite reasonable. Reviewed. > > Though perhaps the bug synopsis should be updated to reflect the > change in direction before pushing. Done, thanks! Coleen > > Thanks, > David > ----- > >> Tested with tier1,2,6 and 8. >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8216136.02/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8216136 >> >> Thanks, >> Coleen >> >> On 1/17/19 7:15 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 1/16/19 10:53 PM, dean.long at oracle.com wrote: >>>> Hi Coleen.? 
You still can't safely call notice_modification() >>>> outside of Compile_lock, (at least not without other changes), so >>>> this: >>>> >>>> - static inline void notice_modification() { >>>> assert_locked_or_safepoint(Compile_lock); >>>> ++_number_of_modifications; } >>>> + static inline void notice_modification() { >>>> Atomic::inc(&_number_of_modifications); } >>>> >>>> should be: >>>> >>>> static inline void notice_modification() { >>>> assert_locked_or_safepoint(Compile_lock); >>>> Atomic::inc(&_number_of_modifications); } >>>> >>>> >>>> Are you trying to eventually remove Compile_lock completely?? If >>>> so, then notice_modification() would have to be called *before* the >>>> class hierarchy is changed, not after, and probably other changes >>>> would be needed as well. >>> Dean, Thank you for looking at this and your comments. >>> >>> No, I'm not trying to remove Compile_lock entirely and I can assert >>> that notice_modification has the Compile_lock as above. The class >>> hierarchy code has been changed to be lock free rather than >>> requiring the Compile_lock, although I think the Compile_lock still >>> protects some of this code. >>> >>> There are also some Compile_lock free ways of getting to >>> dependencies, because putting notice_modification after >>> flush_dependencies caused bugs that I'll ask to you offline about. >>> >>> Thanks for your help.? I was just trying to peel off one place where >>> Compile_lock seemed wrong. >>> >>> Thanks, >>> Coleen >>>> >>>> dl >>>> >>>> >>>> On 1/16/19 8:43 AM, coleen.phillimore at oracle.com wrote: >>>>> Summary: make SystemDictionary::modification_counter atomic so not >>>>> to require Compile_lock. >>>>> >>>>> I moved updating the modification counter when the class is >>>>> defined and added to the hierarchy.? I didn't remove the >>>>> Compile_lock completely because there may be other code currently >>>>> under the lock that needs it (flush_dependencies). 
Can someone >>>>> from the compiler area also review this? >>>>> >>>>> Made Compile_lock an always safepointing lock. >>>>> >>>>> Tested with mach5 tier1-6. >>>>> >>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8216136.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8216136 >>>>> >>>>> Thanks, >>>>> Coleen >>>> >>> >> From dean.long at oracle.com Thu Jan 24 05:17:07 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Wed, 23 Jan 2019 21:17:07 -0800 Subject: RFR (S) 8216136: Don't take Compile_lock for SystemDictionary::_modification_counter In-Reply-To: References: <48a69ecb-1d3c-817e-7e2b-4e55a68a66b8@oracle.com> <081db1f4-1a52-ef60-7934-a4314e7c2c80@oracle.com> <182d84fc-4a14-e0c2-c374-a50538525f26@oracle.com> Message-ID: <5bbaa8a5-c75d-f01d-244c-0dc0ca0a1a96@oracle.com> Looks good. dl On 1/23/19 6:20 AM, coleen.phillimore at oracle.com wrote: > > After some internal discussion, Dean convinced me that removing the > Compile_lock here might be too dangerous.?? So for these asserts and > the error condition, the compiler thread goes to VM from native to > check the SystemDictionary::modification_counter under the > Compile_lock, with safepoint checking always. > > Tested with tier1,2,6 and 8. > > open webrev at http://cr.openjdk.java.net/~coleenp/8216136.02/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8216136 > > Thanks, > Coleen > > On 1/17/19 7:15 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 1/16/19 10:53 PM, dean.long at oracle.com wrote: >>> Hi Coleen.? 
You still can't safely call notice_modification() >>> outside of Compile_lock, (at least not without other changes), so this: >>> >>> - static inline void notice_modification() { >>> assert_locked_or_safepoint(Compile_lock); ++_number_of_modifications; } >>> + static inline void notice_modification() { >>> Atomic::inc(&_number_of_modifications); } >>> >>> should be: >>> >>> static inline void notice_modification() { >>> assert_locked_or_safepoint(Compile_lock); >>> Atomic::inc(&_number_of_modifications); } >>> >>> >>> Are you trying to eventually remove Compile_lock completely? If so, >>> then notice_modification() would have to be called *before* the >>> class hierarchy is changed, not after, and probably other changes >>> would be needed as well. >> Dean, Thank you for looking at this and your comments. >> >> No, I'm not trying to remove Compile_lock entirely and I can assert >> that notice_modification has the Compile_lock as above. The class >> hierarchy code has been changed to be lock free rather than requiring >> the Compile_lock, although I think the Compile_lock still protects >> some of this code. >> >> There are also some Compile_lock free ways of getting to >> dependencies, because putting notice_modification after >> flush_dependencies caused bugs that I'll ask to you offline about. >> >> Thanks for your help.? I was just trying to peel off one place where >> Compile_lock seemed wrong. >> >> Thanks, >> Coleen >>> >>> dl >>> >>> >>> On 1/16/19 8:43 AM, coleen.phillimore at oracle.com wrote: >>>> Summary: make SystemDictionary::modification_counter atomic so not >>>> to require Compile_lock. >>>> >>>> I moved updating the modification counter when the class is defined >>>> and added to the hierarchy.? I didn't remove the Compile_lock >>>> completely because there may be other code currently under the lock >>>> that needs it (flush_dependencies). Can someone from the compiler >>>> area also review this? 
>>>>
>>>> Made Compile_lock an always safepointing lock.
>>>>
>>>> Tested with mach5 tier1-6.
>>>>
>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8216136.01/webrev
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8216136
>>>>
>>>> Thanks,
>>>> Coleen
>>>
>>
>

From robbin.ehn at oracle.com Thu Jan 24 10:19:29 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Thu, 24 Jan 2019 11:19:29 +0100
Subject: RFR(XL): 8203469: Faster safepoints
In-Reply-To:
References: <8838a5c8-f427-35f0-7af6-0cf21ee6827e@oracle.com>
Message-ID:

Hi David,

On 1/24/19 1:49 AM, David Holmes wrote:
> Hi Robbin,
>
> One minor nit on the updates in safepoint.cpp
>
> !   // re-read state after we read thread safepoint id. The JavaThread changes its
> !   // thread state from thread_blocked before resetting safepoint id to 0.
> !   // Guaranteeing the second read will be from an updated thread state. It can
>
> The "sentence" starting "Guaranteeing" is not a sentence. Suggestion:
>
> // This guarantees the second read ...

Fixed, thanks!

A v04 is coming, including Karen's comments.

/Robbin

>
> Thanks,
> David
>
> On 23/01/2019 11:33 pm, Robbin Ehn wrote:
>> Hi all, here is v03.
>>
>> It's contains the update from comments and:
>> I notice safepoint.hpp contained wrong/not need inline keyword for methods.
>> Those method are either default inline because they are defined in the
>> declaration (header) or since they are defined in the same cpp unit as callers
>> and thus can be inlined any way.
>>
>> http://cr.openjdk.java.net/~rehn/8203469/v03/inc/
>> http://cr.openjdk.java.net/~rehn/8203469/v03/
>>
>> Passes t1.
>>
>> Thanks, Robbin
>>
>> On 2019-01-15 11:39, Robbin Ehn wrote:
>>> Hi all, please review.
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469
>>> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/
>>>
>>> Thanks to Dan for pre-reviewing a lot!
>>>
>>> Background:
>>> ZGC often does very short safepoint operations.
For perspective, in a
>>> specJBB2015 run, G1 can have young collection stops lasting about 170 ms, while
>>> in the same setup ZGC does 0.2 ms to 1.5 ms operations, depending on which
>>> operation it is. The time it takes to stop and start the JavaThreads is very
>>> large relative to a ZGC safepoint. With an operation that takes just 0.2 ms, the
>>> overhead of stopping and starting JavaThreads is several times the operation.
>>>
>>> High-level functionality change:
>>> Serializing the starting over Threads_lock takes time.
>>> - Don't wait on Threads_lock; use the WaitBarrier.
>>> Serializing the stopping over Safepoint_lock takes time.
>>> - Let threads stop in parallel; remove Safepoint_lock.
>>>
>>> Details:
>>> JavaThreads have 2 abstract logical states: unsafe or safe.
>>> - Safe means the JavaThread will not touch the Java heap or VM internal structures
>>>   without doing a transition and blocking before doing so.
>>>     - The safe states are:
>>>         - When polls are armed: _thread_in_native and _thread_blocked.
>>>         - When Threads_lock is held: the externally suspended flag is set.
>>>     - The VM thread has polls armed and holds the Threads_lock during a
>>>       safepoint.
>>> - Unsafe means that either the Java heap or VM internal structures can be accessed
>>>   by the JavaThread, e.g., _thread_in_Java, _thread_in_vm.
>>>     - All combinations that are not safe are unsafe.
>>>
>>> We cannot start a safepoint until all unsafe threads have transitioned to a safe
>>> state. To make them safe, we arm polls in compiled code and make sure any
>>> transition to another unsafe state will be blocked. A JavaThread which is unsafe
>>> with state _thread_in_Java may transition to _thread_in_native without being
>>> blocked, since it just became a safe thread and we can proceed. Any safe thread
>>> may try to transition at any time to an unsafe state, thus coming into the
>>> safepoint blocking code at any moment, e.g., after the safepoint is over, or
>>> even at the beginning of the next safepoint.
>>>
>>> The VMThread cannot tolerate false positives from the JavaThread thread state
>>> because that would mean starting the safepoint without all JavaThreads being
>>> safe. The two locks (Threads_lock and Safepoint_lock) make sure we never observe
>>> false positives from the safepoint blocking code. If we remove them, how do we
>>> handle false positives?
>>>
>>> By first publishing the barrier tag (safepoint counter) that we will call
>>> WaitBarrier.wait() with as the thread's safepoint id, and then changing the state to
>>> _thread_blocked, the VMThread can ignore JavaThreads by doing a stable load of
>>> the state. A stable load of the thread state is successful if the thread
>>> safepoint id is the same both before and after the load of the state and the
>>> safepoint id is current or InactiveSafepointCounter. If the stable load fails,
>>> the thread is considered safepoint unsafe. It's no longer enough that a thread
>>> has state _thread_blocked; it must also have the correct safepoint id before and
>>> after we read the state.
>>>
>>> Performance:
>>> The result of faster safepoints is that the average CPU time for JavaThreads
>>> between safepoints is higher, thus increasing the allocation rate. The thread
>>> that stops first waits a shorter time until it gets started. Even the thread that
>>> stops last has a shorter stop since we start them faster. If your
>>> application is using a concurrent GC it may need re-tuning since each java
>>> worker thread has an increased CPU time/allocation rate. Often this means max
>>> performance is achieved using slightly fewer java worker threads than before.
>>> Also the increased allocation rate means shorter time between GC safepoints.
>>> - If you are using a non-concurrent GC, you should see improved latency and
>>>   throughput.
>>> - After re-tuning with a concurrent GC, throughput should be equal or better but
>>>   with better latency. But bear in mind this is a latency patch, not a
>>>   throughput one.
>>> With the current code a java thread is not guaranteed to run between safepoints (in
>>> theory a java thread can be starved indefinitely), since the VM thread may
>>> re-grab the Threads_lock before it woke up from the previous safepoint. If the
>>> GC/VM doesn't respect MMU (minimum mutator utilization) or if your machine is very
>>> over-provisioned this can happen.
>>> The current scheme thus re-safepoints quickly if the java threads have not
>>> started yet, at the cost of latency. Since the new code uses the WaitBarrier with
>>> the safepoint counter, all threads must roll forward to the next safepoint by
>>> getting at least some CPU time between two safepoints, meaning MMU violations
>>> are more obvious.
>>>
>>> Some examples of numbers:
>>> - On a 16 strand machine synchronization and un-synchronization/starting is at
>>>   least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and
>>>   starting ~400->~100us.
>>>   (Semaphore path is a bit slower than futex in the WaitBarrier on Linux).
>>> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster
>>>   synchronization time on 16 strands and ~5% score increase. In this case the GC
>>>   op is 1ms, so we reduce the overhead of synchronization from 100% to 10%.
>>> - specJBB2015 ParGC ~9% increase in critical-jops.
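
The tag-based waiting that the message above attributes to the WaitBarrier can be sketched as follows. This is only an illustration built on std::mutex and std::condition_variable, not HotSpot's WaitBarrier (which, per the message, has futex and semaphore paths). The class name TaggedWaitBarrier, the armed_tag() accessor and the kDisarmed sentinel are assumptions made for the example.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical sketch of a barrier keyed by a tag (e.g. a safepoint counter).
// Not HotSpot code; names and the sentinel value are assumptions.
class TaggedWaitBarrier {
 public:
  static constexpr int kDisarmed = 0;  // assumed "no operation in progress" tag

  // Coordinator: arm the barrier with the tag of the upcoming operation.
  void arm(int tag) {
    std::lock_guard<std::mutex> guard(mutex_);
    armed_tag_ = tag;
  }

  // Coordinator: release every waiter with a single broadcast.
  void disarm() {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      armed_tag_ = kDisarmed;
    }
    wakeup_.notify_all();
  }

  // Worker: block only while the barrier is armed with exactly this tag.
  // A different (older or newer) armed tag lets the caller return at once.
  void wait(int tag) {
    std::unique_lock<std::mutex> lock(mutex_);
    wakeup_.wait(lock, [&] { return armed_tag_ != tag; });
  }

  // Test helper: read the currently armed tag.
  int armed_tag() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return armed_tag_;
  }

 private:
  mutable std::mutex mutex_;
  std::condition_variable wakeup_;
  int armed_tag_ = kDisarmed;
};
```

In this sketch, waiters are released together by one notify_all() rather than one by one through a shared lock, which is the kind of serialization the message says it removes. A caller holding a tag from a previous round is never held back by a barrier armed for a later round.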
>>>
>>> Thanks, Robbin

From robbin.ehn at oracle.com Thu Jan 24 11:40:51 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Thu, 24 Jan 2019 12:40:51 +0100
Subject: RFR(XL): 8203469: Faster safepoints
In-Reply-To: <8D9FA13C-C99D-4F02-B539-62C4269CE332@oracle.com>
References: <8838a5c8-f427-35f0-7af6-0cf21ee6827e@oracle.com>
 <8D9FA13C-C99D-4F02-B539-62C4269CE332@oracle.com>
Message-ID:

Hi Karen,

On 1/23/19 10:34 PM, Karen Kinnear wrote:
> This looks really good. Delighted with performance and cleaner logic.

Thanks!

>
> Couple of minor questions/comments:
>
> 1. SafepointMechanism.inline.hpp
>    added an OrderAccess::loadload() in block_if_requested_local_poll()
>    do you also need one in block_if_requested() ?

Yes, thanks.

>
> 2. Tested on ARM? Stress test the OrderAccess
>    Thank you for comments on OrderAccess lines - will help in future

AndrewH was going to test it, I have not heard from him.

>
> 3. minor safepoint.cpp 749: resetted -> reset

Fixed.

>
> 4. While you are in there
> Thank you for cleaning up CMS comments
> safepoint.hpp line 58 _synchronized // All Java threads are stopped at a
> safepoint. Only VM thread is running
>    -> All Java threads are running in native, blocked in OS or stopped at safepoint
>    What other threads can run besides the VM thread at this point?
>      e.g. safepoint cleanup threads
>      e.g. any GC threads that can run during a safepoint?

Updated.

>
> 5. Would it make sense to split the safepoint_safe and try_stable_load_state
> into code that works during a safepoint and separate logic that works not
> at a safepoint, for the InactiveSafepoint state?

The primary user of safepoint_safe() is handshakes. It has a second use-case in
an assert in jfr; if the previous usage was correct, it still should be. That
piece of jfr code should only be run inside a safepoint/handshake. It's not used
by the safepointing code at all.
It only works when asking a thread with its poll armed, thus only handshake and
safepoint should ask this _after_ arming. (IMHO the jfr assert should be changed.)

v04 to RFR mail coming.

Thanks, Robbin

>
> thanks,
> Karen
>
>> On Jan 23, 2019, at 8:33 AM, Robbin Ehn wrote:
>>
>> Hi all, here is v03.
>>
>> It contains the updates from comments and:
>> I noticed safepoint.hpp contained wrong/unneeded inline keywords for methods.
>> Those methods are either inline by default because they are defined in the
>> declaration (header), or they are defined in the same cpp unit as their callers
>> and thus can be inlined anyway.
>>
>> http://cr.openjdk.java.net/~rehn/8203469/v03/inc/
>> http://cr.openjdk.java.net/~rehn/8203469/v03/
>>
>> Passes t1.
>>
>> Thanks, Robbin

From coleen.phillimore at oracle.com Thu Jan 24 13:01:52 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 24 Jan 2019 08:01:52 -0500
Subject: RFR (S) 8216136: Don't take Compile_lock for SystemDictionary::_modification_counter
In-Reply-To: <5bbaa8a5-c75d-f01d-244c-0dc0ca0a1a96@oracle.com>
References: <48a69ecb-1d3c-817e-7e2b-4e55a68a66b8@oracle.com>
 <081db1f4-1a52-ef60-7934-a4314e7c2c80@oracle.com>
 <182d84fc-4a14-e0c2-c374-a50538525f26@oracle.com>
 <5bbaa8a5-c75d-f01d-244c-0dc0ca0a1a96@oracle.com>
Message-ID: <4104aaeb-7709-fc6e-9b52-755d7812e3ca@oracle.com>

Thanks Dean and thank you for the consultation on this RFE.
Coleen

On 1/24/19 12:17 AM, dean.long at oracle.com wrote:
> Looks good.
>
> dl
>
> On 1/23/19 6:20 AM, coleen.phillimore at oracle.com wrote:
>>
>> After some internal discussion, Dean convinced me that removing the
>> Compile_lock here might be too dangerous.  So for these asserts and
>> the error condition, the compiler thread goes to VM from native to
>> check the SystemDictionary::modification_counter under the
>> Compile_lock, with safepoint checking always.
>>
>> Tested with tier1,2,6 and 8.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8216136.02/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8216136
>>
>> Thanks,
>> Coleen
>>
>> On 1/17/19 7:15 AM, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>> On 1/16/19 10:53 PM, dean.long at oracle.com wrote:
>>>> Hi Coleen.  You still can't safely call notice_modification()
>>>> outside of Compile_lock, (at least not without other changes), so
>>>> this:
>>>>
>>>> - static inline void notice_modification() {
>>>> assert_locked_or_safepoint(Compile_lock);
>>>> ++_number_of_modifications; }
>>>> + static inline void notice_modification() {
>>>> Atomic::inc(&_number_of_modifications); }
>>>>
>>>> should be:
>>>>
>>>> static inline void notice_modification() {
>>>> assert_locked_or_safepoint(Compile_lock);
>>>> Atomic::inc(&_number_of_modifications); }
>>>>
>>>>
>>>> Are you trying to eventually remove Compile_lock completely? If so,
>>>> then notice_modification() would have to be called *before* the
>>>> class hierarchy is changed, not after, and probably other changes
>>>> would be needed as well.
>>> Dean, Thank you for looking at this and your comments.
>>>
>>> No, I'm not trying to remove Compile_lock entirely and I can assert
>>> that notice_modification has the Compile_lock as above. The class
>>> hierarchy code has been changed to be lock free rather than
>>> requiring the Compile_lock, although I think the Compile_lock still
>>> protects some of this code.
>>>
>>> There are also some Compile_lock free ways of getting to
>>> dependencies, because putting notice_modification after
>>> flush_dependencies caused bugs that I'll ask you about offline.
>>>
>>> Thanks for your help.  I was just trying to peel off one place where
>>> Compile_lock seemed wrong.
>>>
>>> Thanks,
>>> Coleen
>>>>
>>>> dl
>>>>
>>>>
>>>> On 1/16/19 8:43 AM, coleen.phillimore at oracle.com wrote:
>>>>> Summary: make SystemDictionary::modification_counter atomic so not
>>>>> to require Compile_lock.
>>>>>
>>>>> I moved updating the modification counter when the class is
>>>>> defined and added to the hierarchy.  I didn't remove the
>>>>> Compile_lock completely because there may be other code currently
>>>>> under the lock that needs it (flush_dependencies). Can someone
>>>>> from the compiler area also review this?
>>>>>
>>>>> Made Compile_lock an always safepointing lock.
>>>>>
>>>>> Tested with mach5 tier1-6.
>>>>>
>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8216136.01/webrev
>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8216136
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>
>>>
>>
>

From harold.seigel at oracle.com Thu Jan 24 13:34:52 2019
From: harold.seigel at oracle.com (Harold Seigel)
Date: Thu, 24 Jan 2019 08:34:52 -0500
Subject: RFR (S) JDK-8216970: condy causes JVM crash
In-Reply-To:
References:
Message-ID: <420efb8c-f001-c7ca-4f0a-ce637eefce1a@oracle.com>

Hi Lois,

These changes look good.

Thanks, Harold

On 1/22/2019 11:10 AM, Lois Foltan wrote:
>
> Updated webrev that includes preliminary comments from John Rose.
>
> open webrev at:
> http://cr.openjdk.java.net/~lfoltan/bug_jdk8216970.2/webrev/
>
> Thanks,
> Lois
>
> On 1/18/2019 1:50 PM, Lois Foltan wrote:
>> Please review this change that allows escape analysis to correctly
>> handle a dynamic constant whose return type is an array.
>>
>> open webrev at:
>> http://cr.openjdk.java.net/~lfoltan/bug_jdk8216970.1/webrev/
>> bug link: https://bugs.openjdk.java.net/browse/JDK-8216970
>>
>> Testing: hs-tier1-3, jdk-tier1-3 (all platforms).  hs-tier4-5 (linux
>> only)
>>
>> Thanks,
>> Lois
>>
>>
>>
>

From erik.osterlund at oracle.com Thu Jan 24 14:13:10 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Thu, 24 Jan 2019 15:13:10 +0100
Subject: RFR (S) 8216136: Don't take Compile_lock for SystemDictionary::_modification_counter
In-Reply-To: <4104aaeb-7709-fc6e-9b52-755d7812e3ca@oracle.com>
References: <48a69ecb-1d3c-817e-7e2b-4e55a68a66b8@oracle.com>
 <081db1f4-1a52-ef60-7934-a4314e7c2c80@oracle.com>
 <182d84fc-4a14-e0c2-c374-a50538525f26@oracle.com>
 <5bbaa8a5-c75d-f01d-244c-0dc0ca0a1a96@oracle.com>
 <4104aaeb-7709-fc6e-9b52-755d7812e3ca@oracle.com>
Message-ID: <6d102404-b34f-f330-80e8-362778646240@oracle.com>

Hi Coleen,

Looks good. Ship it!

Thanks,
/Erik

On 2019-01-24 14:01, coleen.phillimore at oracle.com wrote:
> Thanks Dean and thank you for the consultation on this RFE.
> Coleen

From coleen.phillimore at oracle.com Thu Jan 24 14:16:01 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 24 Jan 2019 09:16:01 -0500
Subject: RFR (S) 8216136: Don't take Compile_lock for SystemDictionary::_modification_counter
In-Reply-To: <6d102404-b34f-f330-80e8-362778646240@oracle.com>
References: <48a69ecb-1d3c-817e-7e2b-4e55a68a66b8@oracle.com>
 <081db1f4-1a52-ef60-7934-a4314e7c2c80@oracle.com>
 <182d84fc-4a14-e0c2-c374-a50538525f26@oracle.com>
 <5bbaa8a5-c75d-f01d-244c-0dc0ca0a1a96@oracle.com>
 <4104aaeb-7709-fc6e-9b52-755d7812e3ca@oracle.com>
 <6d102404-b34f-f330-80e8-362778646240@oracle.com>
Message-ID:

Thanks, Erik!
Coleen

On 1/24/19 9:13 AM, Erik Österlund wrote:
> Hi Coleen,
>
> Looks good. Ship it!
>
> Thanks,
> /Erik

From robbin.ehn at oracle.com Thu Jan 24 15:20:44 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Thu, 24 Jan 2019 16:20:44 +0100
Subject: RFR(XL): 8203469: Faster safepoints
In-Reply-To:
References: <8838a5c8-f427-35f0-7af6-0cf21ee6827e@oracle.com>
Message-ID:

Hi Coleen,

On 1/23/19 11:05 PM, coleen.phillimore at oracle.com wrote:
>
> Robbin, this looks very clean and as understandable as it can be I guess.  I
> have a couple of small suggestions.
>
> http://cr.openjdk.java.net/~rehn/8203469/v03/webrev/src/hotspot/share/code/dependencyContext.hpp.udiff.html
>
>
> + assert((SafepointSynchronize::safepoint_counter() - _safepoint_counter) < 2,
> "safepoint happened");
>
>
> This code shouldn't know the special safepoint counter semantics. I'm surprised
> there aren't more of these.  Can you make this a function in safepoint.hpp like:
>    static bool is_same_safepoint(int counter) { return (safepoint_counter() - counter) < 2;
> }  // safepoint counter incremented by two during safepoint
>
> http://cr.openjdk.java.net/~rehn/8203469/v03/webrev/src/hotspot/share/runtime/safepoint.cpp.udiff.html
>

Fixed.

>
> + int count = Atomic::add(-1, &_waiting_to_block);
>
>
> There's an Atomic::sub which I think is preferable.
>

Fixed.

> +WaitBarrier* SafepointSynchronize::_wait_barrier;
> +Semaphore* SafepointSynchronize::_vm_wait;
> +
>
>
> Can you use this place to document briefly the interaction between the threads
> using these barriers?  i.e. one is the one the vm waits for while waiting for
> threads to block and the other is the barrier that the threads block on.
> Maybe this can be a place to describe this in a little bit of detail.  That
> would help with reading the code below.
>

Fixed.

v04 to rfr.

Thanks, Robbin

> Thanks,
> Coleen

From karen.kinnear at oracle.com Thu Jan 24 15:31:06 2019
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Thu, 24 Jan 2019 10:31:06 -0500
Subject: RFR (S) JDK-8216970: condy causes JVM crash
In-Reply-To: <420efb8c-f001-c7ca-4f0a-ce637eefce1a@oracle.com>
References: <420efb8c-f001-c7ca-4f0a-ce637eefce1a@oracle.com>
Message-ID: <2948ECD8-FB98-4982-854A-D4F9B6CEB35E@oracle.com>

Lois,

Fix and test looks good.

thanks,
Karen

> On Jan 24, 2019, at 8:34 AM, Harold Seigel wrote:
>
> Hi Lois,
>
> These changes look good.
> > Thanks, Harold > > On 1/22/2019 11:10 AM, Lois Foltan wrote: >> >> Updated webrev that includes preliminary comments from John Rose. >> >> open webrev at: http://cr.openjdk.java.net/~lfoltan/bug_jdk8216970.2/webrev/ >> >> Thanks, >> Lois >> >> On 1/18/2019 1:50 PM, Lois Foltan wrote: >>> Please review this change that allows escape analysis to correctly handle a dynamic constant whose return type is an array. >>> >>> open webrev at: http://cr.openjdk.java.net/~lfoltan/bug_jdk8216970.1/webrev/ >>> bug link: https://bugs.openjdk.java.net/browse/JDK-8216970 >>> >>> Testing: hs-tier1-3, jdk-tier1-3 (all platforms). hs-tier4-5 (linux only) >>> >>> Thanks, >>> Lois >>> >>> >>> >> From lois.foltan at oracle.com Thu Jan 24 15:34:30 2019 From: lois.foltan at oracle.com (Lois Foltan) Date: Thu, 24 Jan 2019 10:34:30 -0500 Subject: RFR (S) JDK-8216970: condy causes JVM crash In-Reply-To: <2948ECD8-FB98-4982-854A-D4F9B6CEB35E@oracle.com> References: <420efb8c-f001-c7ca-4f0a-ce637eefce1a@oracle.com> <2948ECD8-FB98-4982-854A-D4F9B6CEB35E@oracle.com> Message-ID: <6ece1de0-bc35-e0e5-7dd6-2579b89aa822@oracle.com> Thank you for the reviews Harold & Karen! Lois On 1/24/2019 10:31 AM, Karen Kinnear wrote: > Lois, > > Fix and test looks good. > > thanks, > Karen > >> On Jan 24, 2019, at 8:34 AM, Harold Seigel wrote: >> >> Hi Lois, >> >> These changes look good. >> >> Thanks, Harold >> >> On 1/22/2019 11:10 AM, Lois Foltan wrote: >>> Updated webrev that includes preliminary comments from John Rose. >>> >>> open webrev at: http://cr.openjdk.java.net/~lfoltan/bug_jdk8216970.2/webrev/ >>> >>> Thanks, >>> Lois >>> >>> On 1/18/2019 1:50 PM, Lois Foltan wrote: >>>> Please review this change that allows escape analysis to correctly handle a dynamic constant whose return type is an array. 
>>>> >>>> open webrev at: http://cr.openjdk.java.net/~lfoltan/bug_jdk8216970.1/webrev/ >>>> bug link: https://bugs.openjdk.java.net/browse/JDK-8216970 >>>> >>>> Testing: hs-tier1-3, jdk-tier1-3 (all platforms). hs-tier4-5 (linux only) >>>> >>>> Thanks, >>>> Lois >>>> >>>> >>>> From robbin.ehn at oracle.com Thu Jan 24 15:51:15 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 24 Jan 2019 16:51:15 +0100 Subject: RFR(XL): 8203469: Faster safepoints In-Reply-To: References: Message-ID: Hi here is v04, updated after the comments. http://cr.openjdk.java.net/~rehn/8203469/v04/inc http://cr.openjdk.java.net/~rehn/8203469/v04/ Still running some tests. Thanks, Robbin On 1/15/19 11:39 AM, Robbin Ehn wrote: > Hi all, please review. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8203469 > Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/ > > Thanks to Dan for pre-reviewing a lot! > > Background: > ZGC often does very short safepoint operations. For a perspective, in a > specJBB2015 run, G1 can have young collection stops lasting about 170 ms. While > in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which > operation it is. The time it takes to stop and start the JavaThreads is relative > very large to a ZGC safepoint. With an operation that just takes 0.2ms the > overhead of stopping and starting JavaThreads is several times the operation. > > High-level functionality change: > Serializing the starting over Threads_lock takes time. > - Don't wait on Threads_lock use the WaitBarrier. > Serializing the stopping over Safepoint_lock takes time. > - Let threads stop in parallel, remove Safepoint_lock. > > Details: > JavaThreads have 2 abstract logical states: unsafe or safe. > - Safe means the JavaThread will not touch Java heap or VM internal structures > ? without doing a transition and block before doing so. > ??????? - The safe states are: > ??????????????? - When polls armed: _thread_in_native and _thread_blocked. > ??????????????? 
- When Threads_lock is held: externally suspended flag is set.
>         - The VM Thread has polls armed and holds the Threads_lock during a
>           safepoint.
> - Unsafe means that either Java heap or VM internal structures can be accessed
>   by the JavaThread, e.g., _thread_in_Java, _thread_in_vm.
>         - All combinations that are not safe are unsafe.
>
> We cannot start a safepoint until all unsafe threads have transitioned to a safe
> state. To make them safe, we arm polls in compiled code and make sure any
> transition to another unsafe state will be blocked. JavaThreads which are unsafe
> with state _thread_in_Java may transition to _thread_in_native without being
> blocked, since such a thread has just become safe and we can proceed. Any safe
> thread may try to transition at any time to an unsafe state, thus coming into the
> safepoint blocking code at any moment, e.g., after the safepoint is over, or
> even at the beginning of the next safepoint.
>
> The VMThread cannot tolerate false positives from the JavaThread thread state
> because that would mean starting the safepoint without all JavaThreads being
> safe. The two locks (Threads_lock and Safepoint_lock) make sure we never observe
> false positives from the safepoint blocking code. If we remove them, how do we
> handle false positives?
>
> A JavaThread first publishes, as its safepoint id, the barrier tag (safepoint
> counter) it will call WaitBarrier.wait() with, and only then changes its state to
> _thread_blocked. The VMThread can then ignore such JavaThreads by doing a stable
> load of the state. A stable load of the thread state is successful if the thread
> safepoint id is the same both before and after the load of the state and the
> safepoint id is current or InactiveSafepointCounter. If the stable load fails,
> the thread is considered safepoint unsafe. It is no longer enough that a thread
> has state _thread_blocked; it must also have the correct safepoint id before and
> after we read the state.
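The stable-load rule in the quoted paragraph above can be sketched as plain decision logic. This is only an illustration written in Python, not the HotSpot code: the names StateSample, stable_load_ok and is_safepoint_safe are made up for this sketch, the value 0 for the inactive counter is an assumption, and memory ordering (which the real code has to get right) is not modelled at all.

```python
from dataclasses import dataclass

# Assumed stand-in for InactiveSafepointCounter; the value is not given in this thread.
INACTIVE_SAFEPOINT_COUNTER = 0

# Thread states named in the mail, modelled here as plain strings.
THREAD_IN_JAVA = "_thread_in_Java"
THREAD_IN_VM = "_thread_in_vm"
THREAD_IN_NATIVE = "_thread_in_native"
THREAD_BLOCKED = "_thread_blocked"

# States the mail lists as safe while polls are armed.
SAFE_STATES = frozenset({THREAD_IN_NATIVE, THREAD_BLOCKED})


@dataclass(frozen=True)
class StateSample:
    """Hypothetical record of what the VMThread read, in order:
    safepoint id, then thread state, then safepoint id again."""
    id_before: int
    state: str
    id_after: int


def stable_load_ok(sample, current_id):
    """True if the safepoint id did not change around the state load and is
    either the current safepoint id or the inactive marker."""
    if sample.id_before != sample.id_after:
        return False
    return sample.id_before in (current_id, INACTIVE_SAFEPOINT_COUNTER)


def is_safepoint_safe(sample, current_id):
    """A failed stable load always counts as unsafe; otherwise the state decides."""
    return stable_load_ok(sample, current_id) and sample.state in SAFE_STATES
```

For example, a thread seen as _thread_blocked with a safepoint id left over from an earlier safepoint is treated as unsafe here, which is the false positive the two locks used to rule out.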
>
> Performance:
> The result of faster safepoints is that the average CPU time for JavaThreads
> between safepoints is higher, thus increasing the allocation rate. The thread
> that stops first waits a shorter time until it gets started. Even the thread that
> stops last has a shorter stop since we start them faster. If your
> application is using a concurrent GC it may need re-tuning since each java
> worker thread has an increased CPU time/allocation rate. Often this means max
> performance is achieved using slightly fewer java worker threads than before.
> Also the increased allocation rate means shorter time between GC safepoints.
> - If you are using a non-concurrent GC, you should see improved latency and
>   throughput.
> - After re-tuning with a concurrent GC throughput should be equal or better but
>   with better latency. But bear in mind this is a latency patch, not a
>   throughput one.
> With current code a java thread is not guaranteed to run between safepoints (in
> theory a java thread can be starved indefinitely), since the VM thread may
> re-grab the Threads_lock before the thread has woken up from the previous
> safepoint. If the GC/VM don't respect MMU (minimum mutator utilization) or if
> your machine is very over-provisioned this can happen.
> The current scheme thus re-safepoints quickly if the java threads have not
> started yet, at the cost of latency. Since the new code uses the WaitBarrier with
> the safepoint counter, all threads must roll forward to the next safepoint by
> getting at least some CPU time between two safepoints. This means MMU violations
> are more obvious.
>
> Some examples on numbers:
> - On a 16 strand machine synchronization and un-synchronization/starting is at
>   least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and
>   starting ~400->~100us.
>   (Semaphore path is a bit slower than futex in the WaitBarrier on Linux).
> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster
>   
synchronization time on 16 strands and ~5% score increase. In this case the GC > ? op is 1ms, so we reduce the overhead of synchronization from 100% to 10%. > - specJBB2015 ParGC ~9% increase in critical-jops. > > Thanks, Robbin From coleen.phillimore at oracle.com Thu Jan 24 15:59:49 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 24 Jan 2019 10:59:49 -0500 Subject: RFR(XL): 8203469: Faster safepoints In-Reply-To: References: Message-ID: Looks good to me. Coleen On 1/24/19 10:51 AM, Robbin Ehn wrote: > Hi here is v04, updated after the comments. > > http://cr.openjdk.java.net/~rehn/8203469/v04/inc > http://cr.openjdk.java.net/~rehn/8203469/v04/ > > Still running some tests. > > Thanks, Robbin > > > On 1/15/19 11:39 AM, Robbin Ehn wrote: >> Hi all, please review. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469 >> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/ >> >> Thanks to Dan for pre-reviewing a lot! >> >> Background: >> ZGC often does very short safepoint operations. For a perspective, in a >> specJBB2015 run, G1 can have young collection stops lasting about 170 >> ms. While >> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which >> operation it is. The time it takes to stop and start the JavaThreads >> is relative >> very large to a ZGC safepoint. With an operation that just takes >> 0.2ms the >> overhead of stopping and starting JavaThreads is several times the >> operation. >> >> High-level functionality change: >> Serializing the starting over Threads_lock takes time. >> - Don't wait on Threads_lock use the WaitBarrier. >> Serializing the stopping over Safepoint_lock takes time. >> - Let threads stop in parallel, remove Safepoint_lock. >> >> Details: >> JavaThreads have 2 abstract logical states: unsafe or safe. >> - Safe means the JavaThread will not touch Java heap or VM internal >> structures >> ?? without doing a transition and block before doing so. >> ???????? 
- The safe states are: >> ???????????????? - When polls armed: _thread_in_native and >> _thread_blocked. >> ???????????????? - When Threads_lock is held: externally suspended >> flag is set. >> ???????? - VM Thread have polls armed and holds the Threads_lock >> during a >> ?????????? safepoint. >> - Unsafe means that either Java heap or VM internal structures can be >> accessed >> ?? by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. >> ???????? - All combination that are not safe are unsafe. >> >> We cannot start a safepoint until all unsafe threads have >> transitioned to a safe >> state. To make them safe, we arm polls in compiled code and make sure >> any >> transition to another unsafe state will be blocked. JavaThreads which >> are unsafe >> with state _thread_in_Java may transition to _thread_in_native >> without being >> blocked, since it just became a safe thread and we can proceed. Any >> safe thread >> may try to transition at any time to an unsafe state, thus coming >> into the >> safepoint blocking code at any moment, e.g., after the safepoint is >> over, or >> even at the beginning of next safepoint. >> >> The VMThread cannot tolerate false positives from the JavaThread >> thread state >> because that would mean starting the safepoint without all >> JavaThreads being >> safe. The two locks (Threads_lock and Safepoint_lock) make sure we >> never observe >> false positives from the safepoint blocking code, if we remove them, >> how do we >> handle false positives? >> >> By first publishing which barrier tag (safepoint counter) we will call >> WaitBarrier.wait() with as the threads safepoint id and then change >> the state to >> _thread_blocked, the VMThread can ignore JavaThreads by doing a >> stable load of >> the state. A stable load of the thread state is successful if the thread >> safepoint id is the same both before and after the load of the state and >> safepoint id is current or InactiveSafepointCounter. 
If the stable >> load fails, >> the thread is considered safepoint unsafe. It's no longer enough that >> thread is >> have state _thread_blocked it must also have correct safepoint id >> before and >> after we read the state. >> >> Performance: >> The result of faster safepoints is that the average CPU time for >> JavaThreads >> between safepoints is higher, thus increasing the allocation rate. >> The thread >> that stops first waits shorter time until it gets started. Even the >> thread that >> stops last also have shorter stop since we start them faster. If your >> application is using a concurrent GC it may need re-tunning since >> each java >> worker thread have an increased CPU time/allocation rate. Often this >> means max >> performance is achieved using slightly less java worker threads than >> before. >> Also the increase allocation rate means shorter time between GC >> safepoints. >> - If you are using a non-concurrent GC, you should see improved >> latency and >> ?? throughput. >> - After re-tunning with a concurrent GC throughput should be equal or >> better but >> ?? with better latency. But bear in mind this is a latency patch, not a >> ?? throughput one. >> With current code a java thread is not to guarantee to run between >> safepoint (in >> theory a java thread can be starved indefinitely), since the VM >> thread may >> re-grab the Threads_locks before it woke up from previous safepoint. >> If the >> GC/VM don't respect MMU (minimum mutator utilization) or if your >> machine is very >> over-provisioned this can happen. >> The current schema thus re-safepoint quickly if the java threads have >> not >> started yet at the cost of latency. Since the new code uses the >> WaitBarrier with >> the safepoint counter, all threads must roll forward to next >> safepoint by >> getting at least some CPU time between two safepoints. Meaning MMU >> violations >> are more obvious. 
>> >> Some examples on numbers: >> - On a 16 strand machine synchronization and >> un-synchronization/starting is at >> ?? least 3x faster (in non-trivial test). Synchronization ~600 -> >> ~100us and >> ?? starting ~400->~100us. >> ?? (Semaphore path is a bit slower than futex in the WaitBarrier on >> Linux). >> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster >> ?? synchronization time on 16 strands and ~5% score increase. In this >> case the GC >> ?? op is 1ms, so we reduce the overhead of synchronization from 100% >> to 10%. >> - specJBB2015 ParGC ~9% increase in critical-jops. >> >> Thanks, Robbin From jesper.wilhelmsson at oracle.com Thu Jan 24 20:46:39 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Thu, 24 Jan 2019 21:46:39 +0100 Subject: RFR: JDK-8217580 - Remove tests from problemList as bugs has been closed In-Reply-To: References: <0140719C-4D7D-42A6-8667-DBB0F086DC73@oracle.com> Message-ID: <512A0786-DFBF-4CA9-9995-15468D404D99@oracle.com> Thanks Igor! Yes, I think it would make sense to backport this to JDK 12. /Jesper > On 22 Jan 2019, at 23:23, Igor Ignatyev wrote: > > Hi Jesper, > > looks good, thanks for taking care of it. (I haven't checked all the bugs, but I trust you did the right thing). one question, as it affects 12 (and hence 12u), should we push it to 12 repo? > > Thanks, > -- Igor > >> On Jan 22, 2019, at 2:11 PM, jesper.wilhelmsson at oracle.com wrote: >> >> Hi, >> >> Please review this patch that removes tests from the problemLists. The bugs referred to in these problemList entries has been closed and therefore the tests should not be problemlisted anymore. >> >> Please note that some of the bugs were closed as "Can not reproduce" and "Will not fix". If these tests starts failing again we need to re-evaluate these bugs. If a bug is closed as "Will not fix" and there are tests that reproduces that failure, the tests needs to be re-written to work around the bug or be removed. 
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8217580 >> Webrev: http://cr.openjdk.java.net/~jwilhelm/8217580/webrev.00/ >> >> Thanks, >> /Jesper >> > From jesper.wilhelmsson at oracle.com Thu Jan 24 20:47:12 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Thu, 24 Jan 2019 21:47:12 +0100 Subject: RFR: JDK-8217580 - Remove tests from problemList as bugs has been closed In-Reply-To: <43ce478b-5145-3d56-591e-012eaf120cb5@oracle.com> References: <0140719C-4D7D-42A6-8667-DBB0F086DC73@oracle.com> <43ce478b-5145-3d56-591e-012eaf120cb5@oracle.com> Message-ID: <6A24CB35-7AB1-47B6-A24C-784602C8CC0D@oracle.com> Thanks Misha! /Jesper > On 23 Jan 2019, at 00:06, mikhailo.seledtsov at oracle.com wrote: > > Looks good. I double-checked the bug numbers. > > Misha > > > On 1/22/19 2:23 PM, Igor Ignatyev wrote: >> Hi Jesper, >> >> looks good, thanks for taking care of it. (I haven't checked all the bugs, but I trust you did the right thing). one question, as it affects 12 (and hence 12u), should we push it to 12 repo? >> >> Thanks, >> -- Igor >> >>> On Jan 22, 2019, at 2:11 PM, jesper.wilhelmsson at oracle.com wrote: >>> >>> Hi, >>> >>> Please review this patch that removes tests from the problemLists. The bugs referred to in these problemList entries has been closed and therefore the tests should not be problemlisted anymore. >>> >>> Please note that some of the bugs were closed as "Can not reproduce" and "Will not fix". If these tests starts failing again we need to re-evaluate these bugs. If a bug is closed as "Will not fix" and there are tests that reproduces that failure, the tests needs to be re-written to work around the bug or be removed. 
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8217580
>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8217580/webrev.00/
>>>
>>> Thanks,
>>> /Jesper
>>>
>

From david.holmes at oracle.com Fri Jan 25 07:34:33 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 25 Jan 2019 17:34:33 +1000
Subject: RFR(XL): 8203469: Faster safepoints
In-Reply-To:
References:
Message-ID: <9c8e4ec5-495a-32dc-32e7-fdec92a2f9af@oracle.com>

Hi Robbin,

On 25/01/2019 1:51 am, Robbin Ehn wrote:
> Hi here is v04, updated after the comments.
>
> http://cr.openjdk.java.net/~rehn/8203469/v04/inc

src/hotspot/share/runtime/safepoint.hpp

+   // JavaThreads not blocking (e.g. mutex) the entire safepoint stops on the
+   // _wait_barrier, where they can quickly be started again.
    static WaitBarrier* _wait_barrier;

That comment doesn't make sense to me. What does "not blocking the entire
safepoint" mean ?? Also s/stops/stop/

Are you really just saying that "JavaThreads that need to block for the
safepoint will stop on the _wait_barrier ..." ?

+   // The last JavaThread doing the callback will singal the VM thread to continue.

Typo: singal -> signal

No need for updated webrev.

Thanks,
David

> http://cr.openjdk.java.net/~rehn/8203469/v04/
>
> Still running some tests.
>
> Thanks, Robbin
>
>
> On 1/15/19 11:39 AM, Robbin Ehn wrote:
>> Hi all, please review.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469
>> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/
>>
>> Thanks to Dan for pre-reviewing a lot!
>>
>> Background:
>> ZGC often does very short safepoint operations. For a perspective, in a
>> specJBB2015 run, G1 can have young collection stops lasting about 170
>> ms. While
>> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which
>> operation it is. The time it takes to stop and start the JavaThreads
>> is relative
>> very large to a ZGC safepoint.
With an operation that just takes 0.2ms >> the >> overhead of stopping and starting JavaThreads is several times the >> operation. >> >> High-level functionality change: >> Serializing the starting over Threads_lock takes time. >> - Don't wait on Threads_lock use the WaitBarrier. >> Serializing the stopping over Safepoint_lock takes time. >> - Let threads stop in parallel, remove Safepoint_lock. >> >> Details: >> JavaThreads have 2 abstract logical states: unsafe or safe. >> - Safe means the JavaThread will not touch Java heap or VM internal >> structures >> ?? without doing a transition and block before doing so. >> ???????? - The safe states are: >> ???????????????? - When polls armed: _thread_in_native and >> _thread_blocked. >> ???????????????? - When Threads_lock is held: externally suspended >> flag is set. >> ???????? - VM Thread have polls armed and holds the Threads_lock during a >> ?????????? safepoint. >> - Unsafe means that either Java heap or VM internal structures can be >> accessed >> ?? by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. >> ???????? - All combination that are not safe are unsafe. >> >> We cannot start a safepoint until all unsafe threads have transitioned >> to a safe >> state. To make them safe, we arm polls in compiled code and make sure any >> transition to another unsafe state will be blocked. JavaThreads which >> are unsafe >> with state _thread_in_Java may transition to _thread_in_native without >> being >> blocked, since it just became a safe thread and we can proceed. Any >> safe thread >> may try to transition at any time to an unsafe state, thus coming into >> the >> safepoint blocking code at any moment, e.g., after the safepoint is >> over, or >> even at the beginning of next safepoint. >> >> The VMThread cannot tolerate false positives from the JavaThread >> thread state >> because that would mean starting the safepoint without all JavaThreads >> being >> safe. 
The two locks (Threads_lock and Safepoint_lock) make sure we >> never observe >> false positives from the safepoint blocking code, if we remove them, >> how do we >> handle false positives? >> >> By first publishing which barrier tag (safepoint counter) we will call >> WaitBarrier.wait() with as the threads safepoint id and then change >> the state to >> _thread_blocked, the VMThread can ignore JavaThreads by doing a stable >> load of >> the state. A stable load of the thread state is successful if the thread >> safepoint id is the same both before and after the load of the state and >> safepoint id is current or InactiveSafepointCounter. If the stable >> load fails, >> the thread is considered safepoint unsafe. It's no longer enough that >> thread is >> have state _thread_blocked it must also have correct safepoint id >> before and >> after we read the state. >> >> Performance: >> The result of faster safepoints is that the average CPU time for >> JavaThreads >> between safepoints is higher, thus increasing the allocation rate. The >> thread >> that stops first waits shorter time until it gets started. Even the >> thread that >> stops last also have shorter stop since we start them faster. If your >> application is using a concurrent GC it may need re-tunning since each >> java >> worker thread have an increased CPU time/allocation rate. Often this >> means max >> performance is achieved using slightly less java worker threads than >> before. >> Also the increase allocation rate means shorter time between GC >> safepoints. >> - If you are using a non-concurrent GC, you should see improved >> latency and >> ?? throughput. >> - After re-tunning with a concurrent GC throughput should be equal or >> better but >> ?? with better latency. But bear in mind this is a latency patch, not a >> ?? throughput one. 
>> With current code a java thread is not to guarantee to run between >> safepoint (in >> theory a java thread can be starved indefinitely), since the VM thread >> may >> re-grab the Threads_locks before it woke up from previous safepoint. >> If the >> GC/VM don't respect MMU (minimum mutator utilization) or if your >> machine is very >> over-provisioned this can happen. >> The current schema thus re-safepoint quickly if the java threads have not >> started yet at the cost of latency. Since the new code uses the >> WaitBarrier with >> the safepoint counter, all threads must roll forward to next safepoint by >> getting at least some CPU time between two safepoints. Meaning MMU >> violations >> are more obvious. >> >> Some examples on numbers: >> - On a 16 strand machine synchronization and >> un-synchronization/starting is at >> ?? least 3x faster (in non-trivial test). Synchronization ~600 -> >> ~100us and >> ?? starting ~400->~100us. >> ?? (Semaphore path is a bit slower than futex in the WaitBarrier on >> Linux). >> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster >> ?? synchronization time on 16 strands and ~5% score increase. In this >> case the GC >> ?? op is 1ms, so we reduce the overhead of synchronization from 100% >> to 10%. >> - specJBB2015 ParGC ~9% increase in critical-jops. 
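As a quick check of the figures quoted in the list above, the arithmetic can be written out. This is just a worked example in Python using the numbers from the mail; the helper names speedup and overhead_ratio are invented for the illustration.

```python
def speedup(before_us, after_us):
    """How many times faster 'after' is than 'before' (both in microseconds)."""
    return before_us / after_us


def overhead_ratio(sync_us, op_us):
    """Synchronization time as a fraction of the safepoint operation time."""
    return sync_us / op_us


# 16 strand machine: synchronization ~600us -> ~100us, starting ~400us -> ~100us.
sync_gain = speedup(600, 100)    # 6x, consistent with "at least 3x faster"
start_gain = speedup(400, 100)   # 4x

# SPECjvm2008 serial: the GC op is 1 ms; synchronization drops from 1 ms to 100 us.
before = overhead_ratio(1000, 1000)  # 1.0, i.e. 100% overhead
after = overhead_ratio(100, 1000)    # 0.1, i.e. 10% overhead
```

The shorter the safepoint operation, the larger this ratio gets for a fixed synchronization cost, which is why the Background section singles out ZGC's 0.2 ms operations.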
>>
>> Thanks, Robbin

From igor.ignatyev at oracle.com Fri Jan 25 07:46:48 2019
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Thu, 24 Jan 2019 23:46:48 -0800
Subject: RFR(T) [12] : 8217770 : problem list org.graalvm.compiler.debug.test.DebugContextTest
Message-ID: <88E96595-0B8D-48F3-B7A1-B48A2A7922C4@oracle.com>

http://cr.openjdk.java.net/~iignatyev//8217770/webrev.00/index.html
> 1 line changed: 1 ins; 0 del; 0 mod;

> diff -r 6533b2b34593 test/hotspot/jtreg/ProblemList-graal.txt
> org.graalvm.compiler.core.test.OptionsVerifierTest 8205081
> org.graalvm.compiler.hotspot.test.CompilationWrapperTest 8205081
> org.graalvm.compiler.replacements.test.classfile.ClassfileBytecodeProviderTest 8205081
> +org.graalvm.compiler.debug.test.DebugContextTest 8205081
>
> org.graalvm.compiler.core.test.deopt.CompiledMethodTest 8202955

Hi all,

could you please review this tiny and trivial patch which puts org.graalvm.compiler.debug.test.DebugContextTest back into the problem list?

this graal unit test was prematurely removed from the problem list by 8217580[1-2]. this test is/was known to fail not only because of 8203504[3], but also because of 8205081[4]. this patch puts DebugContextTest on the problem list w/ 8205081 as the reason.
JBS: https://bugs.openjdk.java.net/browse/JDK-8217770 webrev: http://cr.openjdk.java.net/~iignatyev//8217770/webrev.00/index.html testing: compiler/graalunit/DebugTest.java (which includes this unit test) [1] https://bugs.openjdk.java.net/browse/JDK-8217580 [2] http://hg.openjdk.java.net/jdk/jdk12/rev/6533b2b34593#l2.44 [3] https://bugs.openjdk.java.net/browse/JDK-8203504 [4] https://bugs.openjdk.java.net/browse/JDK-8205081 Thanks, -- Igor From david.holmes at oracle.com Fri Jan 25 07:50:28 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 25 Jan 2019 17:50:28 +1000 Subject: RFR(T) [12] : 8217770 : problem list org.graalvm.compiler.debug.test.DebugContextTest In-Reply-To: <88E96595-0B8D-48F3-B7A1-B48A2A7922C4@oracle.com> References: <88E96595-0B8D-48F3-B7A1-B48A2A7922C4@oracle.com> Message-ID: Looks good. Thanks, David On 25/01/2019 5:46 pm, Igor Ignatyev wrote: > http://cr.openjdk.java.net/~iignatyev//8217770/webrev.00/index.html >> 1 line changed: 1 ins; 0 del; 0 mod; > >> diff -r 6533b2b34593 test/hotspot/jtreg/ProblemList-graal.txt >> org.graalvm.compiler.core.test.OptionsVerifierTest 8205081 >> org.graalvm.compiler.hotspot.test.CompilationWrapperTest 8205081 >> org.graalvm.compiler.replacements.test.classfile.ClassfileBytecodeProviderTest 8205081 >> +org.graalvm.compiler.debug.test.DebugContextTest 8205081 >> >> org.graalvm.compiler.core.test.deopt.CompiledMethodTest 8202955 > > Hi all, > > could you please review this tiny and trivial patch which puts org.graalvm.compiler.debug.test.DebugContextTest back into the problem list? > > this graal unit test was prematurely removed from the problem list by 8217580[1-2]. this test is/was known to fail not only because of 8203504[3], but also because of 8205081[4]. this patch put DebugContextTest test to the problem list w/ 8205081 as the reason. 
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8217770 > webrev: http://cr.openjdk.java.net/~iignatyev//8217770/webrev.00/index.html > testing: compiler/graalunit/DebugTest.java (which includes this unit test) > > [1] https://bugs.openjdk.java.net/browse/JDK-8217580 > [2] http://hg.openjdk.java.net/jdk/jdk12/rev/6533b2b34593#l2.44 > [3] https://bugs.openjdk.java.net/browse/JDK-8203504 > [4] https://bugs.openjdk.java.net/browse/JDK-8205081 > > Thanks, > -- Igor > From igor.ignatyev at oracle.com Fri Jan 25 07:53:38 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Thu, 24 Jan 2019 23:53:38 -0800 Subject: RFR(T) [12] : 8217770 : problem list org.graalvm.compiler.debug.test.DebugContextTest In-Reply-To: References: <88E96595-0B8D-48F3-B7A1-B48A2A7922C4@oracle.com> Message-ID: that was fast! thanks David. -- Igor > On Jan 24, 2019, at 11:50 PM, David Holmes wrote: > > Looks good. > > Thanks, > David > > On 25/01/2019 5:46 pm, Igor Ignatyev wrote: >> http://cr.openjdk.java.net/~iignatyev//8217770/webrev.00/index.html >>> 1 line changed: 1 ins; 0 del; 0 mod; >>> diff -r 6533b2b34593 test/hotspot/jtreg/ProblemList-graal.txt >>> org.graalvm.compiler.core.test.OptionsVerifierTest 8205081 >>> org.graalvm.compiler.hotspot.test.CompilationWrapperTest 8205081 >>> org.graalvm.compiler.replacements.test.classfile.ClassfileBytecodeProviderTest 8205081 >>> +org.graalvm.compiler.debug.test.DebugContextTest 8205081 >>> org.graalvm.compiler.core.test.deopt.CompiledMethodTest 8202955 >> Hi all, >> could you please review this tiny and trivial patch which puts org.graalvm.compiler.debug.test.DebugContextTest back into the problem list? >> this graal unit test was prematurely removed from the problem list by 8217580[1-2]. this test is/was known to fail not only because of 8203504[3], but also because of 8205081[4]. this patch put DebugContextTest test to the problem list w/ 8205081 as the reason. 
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8217770
>> webrev: http://cr.openjdk.java.net/~iignatyev//8217770/webrev.00/index.html
>> testing: compiler/graalunit/DebugTest.java (which includes this unit test)
>> [1] https://bugs.openjdk.java.net/browse/JDK-8217580
>> [2] http://hg.openjdk.java.net/jdk/jdk12/rev/6533b2b34593#l2.44
>> [3] https://bugs.openjdk.java.net/browse/JDK-8203504
>> [4] https://bugs.openjdk.java.net/browse/JDK-8205081
>> Thanks,
>> -- Igor

From robbin.ehn at oracle.com Fri Jan 25 08:31:57 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Fri, 25 Jan 2019 09:31:57 +0100
Subject: RFR(XL): 8203469: Faster safepoints
In-Reply-To: <9c8e4ec5-495a-32dc-32e7-fdec92a2f9af@oracle.com>
References: <9c8e4ec5-495a-32dc-32e7-fdec92a2f9af@oracle.com>
Message-ID: <93325393-94c3-7afc-cf6a-83a89e651659@oracle.com>

Hi David,

On 1/25/19 8:34 AM, David Holmes wrote:
> Hi Robbin,
>
> On 25/01/2019 1:51 am, Robbin Ehn wrote:
>> Hi here is v04, updated after the comments.
>>
>> http://cr.openjdk.java.net/~rehn/8203469/v04/inc
>
> src/hotspot/share/runtime/safepoint.hpp
>
> +   // JavaThreads not blocking (e.g. mutex) the entire safepoint stops on the
> +   // _wait_barrier, where they can quickly be started again.
>     static WaitBarrier* _wait_barrier;
>
> That comment doesn't make sense to me. What does "not blocking the entire
> safepoint" mean ?? Also s/stops/stop/
>
> Are you really just saying that "JavaThreads that need to block for the
> safepoint will stop on the _wait_barrier ..." ?

Yes, thanks!

>
>
> +   // The last JavaThread doing the callback will singal the VM thread to
> continue.
>
> Typo: singal -> signal

Thanks again David!

/Robbin

>
> No need for updated webrev.
>
> Thanks,
> David
>
>> http://cr.openjdk.java.net/~rehn/8203469/v04/
>>
>> Still running some tests.
>>
>> Thanks, Robbin
>>
>>
>> On 1/15/19 11:39 AM, Robbin Ehn wrote:
>>> Hi all, please review.
>>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469 >>> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/ >>> >>> Thanks to Dan for pre-reviewing a lot! >>> >>> Background: >>> ZGC often does very short safepoint operations. For a perspective, in a >>> specJBB2015 run, G1 can have young collection stops lasting about 170 ms. While >>> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which >>> operation it is. The time it takes to stop and start the JavaThreads is relative >>> very large to a ZGC safepoint. With an operation that just takes 0.2ms the >>> overhead of stopping and starting JavaThreads is several times the operation. >>> >>> High-level functionality change: >>> Serializing the starting over Threads_lock takes time. >>> - Don't wait on Threads_lock use the WaitBarrier. >>> Serializing the stopping over Safepoint_lock takes time. >>> - Let threads stop in parallel, remove Safepoint_lock. >>> >>> Details: >>> JavaThreads have 2 abstract logical states: unsafe or safe. >>> - Safe means the JavaThread will not touch Java heap or VM internal structures >>> ?? without doing a transition and block before doing so. >>> ???????? - The safe states are: >>> ???????????????? - When polls armed: _thread_in_native and _thread_blocked. >>> ???????????????? - When Threads_lock is held: externally suspended flag is set. >>> ???????? - VM Thread have polls armed and holds the Threads_lock during a >>> ?????????? safepoint. >>> - Unsafe means that either Java heap or VM internal structures can be accessed >>> ?? by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. >>> ???????? - All combination that are not safe are unsafe. >>> >>> We cannot start a safepoint until all unsafe threads have transitioned to a safe >>> state. To make them safe, we arm polls in compiled code and make sure any >>> transition to another unsafe state will be blocked. 
JavaThreads which are unsafe >>> with state _thread_in_Java may transition to _thread_in_native without being >>> blocked, since it just became a safe thread and we can proceed. Any safe thread >>> may try to transition at any time to an unsafe state, thus coming into the >>> safepoint blocking code at any moment, e.g., after the safepoint is over, or >>> even at the beginning of next safepoint. >>> >>> The VMThread cannot tolerate false positives from the JavaThread thread state >>> because that would mean starting the safepoint without all JavaThreads being >>> safe. The two locks (Threads_lock and Safepoint_lock) make sure we never observe >>> false positives from the safepoint blocking code, if we remove them, how do we >>> handle false positives? >>> >>> By first publishing which barrier tag (safepoint counter) we will call >>> WaitBarrier.wait() with as the threads safepoint id and then change the state to >>> _thread_blocked, the VMThread can ignore JavaThreads by doing a stable load of >>> the state. A stable load of the thread state is successful if the thread >>> safepoint id is the same both before and after the load of the state and >>> safepoint id is current or InactiveSafepointCounter. If the stable load fails, >>> the thread is considered safepoint unsafe. It's no longer enough that thread is >>> have state _thread_blocked it must also have correct safepoint id before and >>> after we read the state. >>> >>> Performance: >>> The result of faster safepoints is that the average CPU time for JavaThreads >>> between safepoints is higher, thus increasing the allocation rate. The thread >>> that stops first waits shorter time until it gets started. Even the thread that >>> stops last also have shorter stop since we start them faster. If your >>> application is using a concurrent GC it may need re-tunning since each java >>> worker thread have an increased CPU time/allocation rate. 
Often this means max >>> performance is achieved using slightly less java worker threads than before. >>> Also the increase allocation rate means shorter time between GC safepoints. >>> - If you are using a non-concurrent GC, you should see improved latency and >>>    throughput. >>> - After re-tunning with a concurrent GC throughput should be equal or better but >>>    with better latency. But bear in mind this is a latency patch, not a >>>    throughput one. >>> With current code a java thread is not to guarantee to run between safepoint (in >>> theory a java thread can be starved indefinitely), since the VM thread may >>> re-grab the Threads_locks before it woke up from previous safepoint. If the >>> GC/VM don't respect MMU (minimum mutator utilization) or if your machine is very >>> over-provisioned this can happen. >>> The current schema thus re-safepoint quickly if the java threads have not >>> started yet at the cost of latency. Since the new code uses the WaitBarrier with >>> the safepoint counter, all threads must roll forward to next safepoint by >>> getting at least some CPU time between two safepoints. Meaning MMU violations >>> are more obvious. >>> >>> Some examples on numbers: >>> - On a 16 strand machine synchronization and un-synchronization/starting is at >>>    least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and >>>    starting ~400->~100us. >>>    (Semaphore path is a bit slower than futex in the WaitBarrier on Linux). >>> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster >>>    synchronization time on 16 strands and ~5% score increase. In this case the GC >>>    op is 1ms, so we reduce the overhead of synchronization from 100% to 10%. >>> - specJBB2015 ParGC ~9% increase in critical-jops.
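The overhead percentages in the SPECjvm2008 bullet come from dividing the synchronization time by the length of the GC operation. A minimal C++ sketch of that arithmetic, assuming both inputs are in microseconds (the helper name `sync_overhead_percent` is made up for illustration and is not part of the patch):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not HotSpot code: synchronization time expressed as a
// percentage of the safepoint operation time. Both arguments are microseconds.
inline std::uint64_t sync_overhead_percent(std::uint64_t sync_us,
                                           std::uint64_t op_us) {
  assert(op_us > 0);  // a zero-length operation has no meaningful ratio
  return sync_us * 100 / op_us;
}
```

With the quoted numbers, a 1 ms (1000 us) operation with 1 ms of synchronization gives 100%, and the same operation with 100 us of synchronization gives 10%.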
>>> >>> Thanks, Robbin From robbin.ehn at oracle.com Fri Jan 25 08:32:24 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 25 Jan 2019 09:32:24 +0100 Subject: RFR(XL): 8203469: Faster safepoints In-Reply-To: References: Message-ID: <222c0b8b-8447-74e0-e908-8e0a5e7bb92f@oracle.com> Thanks Coleen! /Robbin On 1/24/19 4:59 PM, coleen.phillimore at oracle.com wrote: > Looks good to me. > Coleen > > On 1/24/19 10:51 AM, Robbin Ehn wrote: >> Hi here is v04, updated after the comments. >> >> http://cr.openjdk.java.net/~rehn/8203469/v04/inc >> http://cr.openjdk.java.net/~rehn/8203469/v04/ >> >> Still running some tests. >> >> Thanks, Robbin >> >> >> On 1/15/19 11:39 AM, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469 >>> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/ >>> >>> Thanks to Dan for pre-reviewing a lot! >>> >>> Background: >>> ZGC often does very short safepoint operations. For a perspective, in a >>> specJBB2015 run, G1 can have young collection stops lasting about 170 ms. While >>> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which >>> operation it is. The time it takes to stop and start the JavaThreads is relative >>> very large to a ZGC safepoint. With an operation that just takes 0.2ms the >>> overhead of stopping and starting JavaThreads is several times the operation. >>> >>> High-level functionality change: >>> Serializing the starting over Threads_lock takes time. >>> - Don't wait on Threads_lock use the WaitBarrier. >>> Serializing the stopping over Safepoint_lock takes time. >>> - Let threads stop in parallel, remove Safepoint_lock. >>> >>> Details: >>> JavaThreads have 2 abstract logical states: unsafe or safe. >>> - Safe means the JavaThread will not touch Java heap or VM internal structures >>> ?? without doing a transition and block before doing so. >>> ???????? - The safe states are: >>> ???????????????? 
- When polls armed: _thread_in_native and _thread_blocked. >>> ???????????????? - When Threads_lock is held: externally suspended flag is set. >>> ???????? - VM Thread have polls armed and holds the Threads_lock during a >>> ?????????? safepoint. >>> - Unsafe means that either Java heap or VM internal structures can be accessed >>> ?? by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. >>> ???????? - All combination that are not safe are unsafe. >>> >>> We cannot start a safepoint until all unsafe threads have transitioned to a safe >>> state. To make them safe, we arm polls in compiled code and make sure any >>> transition to another unsafe state will be blocked. JavaThreads which are unsafe >>> with state _thread_in_Java may transition to _thread_in_native without being >>> blocked, since it just became a safe thread and we can proceed. Any safe thread >>> may try to transition at any time to an unsafe state, thus coming into the >>> safepoint blocking code at any moment, e.g., after the safepoint is over, or >>> even at the beginning of next safepoint. >>> >>> The VMThread cannot tolerate false positives from the JavaThread thread state >>> because that would mean starting the safepoint without all JavaThreads being >>> safe. The two locks (Threads_lock and Safepoint_lock) make sure we never observe >>> false positives from the safepoint blocking code, if we remove them, how do we >>> handle false positives? >>> >>> By first publishing which barrier tag (safepoint counter) we will call >>> WaitBarrier.wait() with as the threads safepoint id and then change the state to >>> _thread_blocked, the VMThread can ignore JavaThreads by doing a stable load of >>> the state. A stable load of the thread state is successful if the thread >>> safepoint id is the same both before and after the load of the state and >>> safepoint id is current or InactiveSafepointCounter. If the stable load fails, >>> the thread is considered safepoint unsafe. 
It's no longer enough that thread is >>> have state _thread_blocked it must also have correct safepoint id before and >>> after we read the state. >>> >>> Performance: >>> The result of faster safepoints is that the average CPU time for JavaThreads >>> between safepoints is higher, thus increasing the allocation rate. The thread >>> that stops first waits shorter time until it gets started. Even the thread that >>> stops last also have shorter stop since we start them faster. If your >>> application is using a concurrent GC it may need re-tunning since each java >>> worker thread have an increased CPU time/allocation rate. Often this means max >>> performance is achieved using slightly less java worker threads than before. >>> Also the increase allocation rate means shorter time between GC safepoints. >>> - If you are using a non-concurrent GC, you should see improved latency and >>> ?? throughput. >>> - After re-tunning with a concurrent GC throughput should be equal or better but >>> ?? with better latency. But bear in mind this is a latency patch, not a >>> ?? throughput one. >>> With current code a java thread is not to guarantee to run between safepoint (in >>> theory a java thread can be starved indefinitely), since the VM thread may >>> re-grab the Threads_locks before it woke up from previous safepoint. If the >>> GC/VM don't respect MMU (minimum mutator utilization) or if your machine is very >>> over-provisioned this can happen. >>> The current schema thus re-safepoint quickly if the java threads have not >>> started yet at the cost of latency. Since the new code uses the WaitBarrier with >>> the safepoint counter, all threads must roll forward to next safepoint by >>> getting at least some CPU time between two safepoints. Meaning MMU violations >>> are more obvious. >>> >>> Some examples on numbers: >>> - On a 16 strand machine synchronization and un-synchronization/starting is at >>> ?? least 3x faster (in non-trivial test). 
Synchronization ~600 -> ~100us and >>> ?? starting ~400->~100us. >>> ?? (Semaphore path is a bit slower than futex in the WaitBarrier on Linux). >>> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster >>> ?? synchronization time on 16 strands and ~5% score increase. In this case >>> the GC >>> ?? op is 1ms, so we reduce the overhead of synchronization from 100% to 10%. >>> - specJBB2015 ParGC ~9% increase in critical-jops. >>> >>> Thanks, Robbin > From yumin.qi at gmail.com Fri Jan 25 08:52:14 2019 From: yumin.qi at gmail.com (yumin qi) Date: Fri, 25 Jan 2019 16:52:14 +0800 Subject: Anonymous class Message-ID: Hi, I have a question of anonymous class. We know the anonymous class with a host_klass, and the flag is set when the InstanceKlass is created after the class parsed. In case of a regular java class file, the flag will be set correctly but for the case it is not set: SystemDictionary::parse_stream or resolve_from_stream, which is called from JVM_DefineClassWithSource or jni_defineClass. 
The stack trace like: #1 0x00007f1127291ce7 in SystemDictionary::resolve_from_stream(class_name=0x7f0ee40243f0, class_loader=..., protection_domain=...,d__=0x7f0f04001000) at /home/ws/openjdk/8/jdk8u/hotspot/src/share/vm/classfile/systemDictionaryShared.cpp:656 #1 0x00007f1127291ce7 in SystemDictionary::resolve_from_stream (class_name=0x7f0ee40243f0, class_loader=..., protection_doma in=..., st=0x7f0f0a5f3850, verify=true, __the_thread__=0x7f0f04001000) at /home/ws/openjdk/8/jdk8u/hotspot/src/share/vm/classfile/systemDictionary.cpp:1234 #2 0x00007f1126f3cadb in jvm_define_class_common (env=0x7f0f04001220, name=0x7f0f0a5f3e60 "com/google/common/collect/Iterato rs$3", loader=0x7f0f0a5f3f20, buf=0x7f0ee401e210 "\312\376\272\276", len=943, pd=0x7f0f0a5f3f48, source=0x7f0f0a5f3a60 "jar:file:/home/<...>/lib/guava-19.0.jar!/", verify=1 '\001', __the_thread__=0x7f0f04001000) at /home/yumin.qi/ws/openjdk/8/jdk8u/hotspot/src/share/vm/prims/jvm.cpp:1082 #3 0x00007f1126f3d019 in JVM_DefineClassWithSource (env=0x7f0f04001220, name=0x7f0f0a5f3e60 "com/google/common/collect/Itera tors$3", loader=0x7f0f0a5f3f20, buf=0x7f0ee401e210 "\312\376\272\276", len=943, pd=0x7f0f0a5f3f48, source=0x7f0f0a5f3a60 "jar:file:/home/<...>/lib/guava-19.0.jar!/") at /home/ws/openjdk/8/jdk8u/hotspot/src/share/vm/prim/jvm.cpp:1102 #4 0x00007f112581214f in Java_java_lang_ClassLoader_defineClass1 () from /home/ws/openjdk/8/jdk8u/build/linux-x86_64-normal-server-slowdebug/images/j2sdk-image/jre/lib/amd64/libjava.so #5 0x00007f11118f5402 in ?? () #6 0x00007f0f0a5f3f48 in ?? () #7 0x00007f0f0a5f3f70 in ?? () #8 0x0000000755dbbc28 in ?? () The class name is com.google.common.collect.Iterators$3, and it is an anonymous class. This is an example using Guava. I also can reproduce with jdk13. 
The field after parsing: _nonstatic_oop_map_size = 1, _is_marked_dependent = false, _has_unloaded_dependent = false, _misc_flags = 38, ////// <------------ not set for anonymous _minor_version = 0, _major_version = 50, _init_thread = 0x0, _vtable_len = 8, _itable_len = 8, This path is from a custom loader defining a anonymous class, does not set anonymous correctly for the flag. Is this a bug? If in java land, supply a host as parameter, it can be set correctly in VM part. Should I file a bug for it? Thanks Yumin From robbin.ehn at oracle.com Fri Jan 25 10:42:56 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 25 Jan 2019 11:42:56 +0100 Subject: RFR(XL): 8203469: Faster safepoints In-Reply-To: References: <8838a5c8-f427-35f0-7af6-0cf21ee6827e@oracle.com> Message-ID: <942a57f7-0466-deed-9cee-28918b720de4@oracle.com> Thanks! /Robbin On 1/23/19 5:45 PM, Patricio Chilano wrote: > Hi Robbin, > > Looks good to me! > > Thanks, > Patricio > > On 1/23/19 8:33 AM, Robbin Ehn wrote: >> Hi all, here is v03. >> >> It's contains the update from comments and: >> I notice safepoint.hpp contained wrong/not need inline keyword for methods. >> Those method are either default inline because they are defined in the >> declaration (header) or since they are defined in the same cpp unit as callers >> and thus can be inlined any way. >> >> http://cr.openjdk.java.net/~rehn/8203469/v03/inc/ >> http://cr.openjdk.java.net/~rehn/8203469/v03/ >> >> Passes t1. >> >> Thanks, Robbin >> >> On 2019-01-15 11:39, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469 >>> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/ >>> >>> Thanks to Dan for pre-reviewing a lot! >>> >>> Background: >>> ZGC often does very short safepoint operations. For a perspective, in a >>> specJBB2015 run, G1 can have young collection stops lasting about 170 ms. 
While >>> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which >>> operation it is. The time it takes to stop and start the JavaThreads is relative >>> very large to a ZGC safepoint. With an operation that just takes 0.2ms the >>> overhead of stopping and starting JavaThreads is several times the operation. >>> >>> High-level functionality change: >>> Serializing the starting over Threads_lock takes time. >>> - Don't wait on Threads_lock use the WaitBarrier. >>> Serializing the stopping over Safepoint_lock takes time. >>> - Let threads stop in parallel, remove Safepoint_lock. >>> >>> Details: >>> JavaThreads have 2 abstract logical states: unsafe or safe. >>> - Safe means the JavaThread will not touch Java heap or VM internal structures >>> ?? without doing a transition and block before doing so. >>> ???????? - The safe states are: >>> ???????????????? - When polls armed: _thread_in_native and _thread_blocked. >>> ???????????????? - When Threads_lock is held: externally suspended flag is set. >>> ???????? - VM Thread have polls armed and holds the Threads_lock during a >>> ?????????? safepoint. >>> - Unsafe means that either Java heap or VM internal structures can be accessed >>> ?? by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. >>> ???????? - All combination that are not safe are unsafe. >>> >>> We cannot start a safepoint until all unsafe threads have transitioned to a safe >>> state. To make them safe, we arm polls in compiled code and make sure any >>> transition to another unsafe state will be blocked. JavaThreads which are unsafe >>> with state _thread_in_Java may transition to _thread_in_native without being >>> blocked, since it just became a safe thread and we can proceed. Any safe thread >>> may try to transition at any time to an unsafe state, thus coming into the >>> safepoint blocking code at any moment, e.g., after the safepoint is over, or >>> even at the beginning of next safepoint. 
>>> >>> The VMThread cannot tolerate false positives from the JavaThread thread state >>> because that would mean starting the safepoint without all JavaThreads being >>> safe. The two locks (Threads_lock and Safepoint_lock) make sure we never observe >>> false positives from the safepoint blocking code, if we remove them, how do we >>> handle false positives? >>> >>> By first publishing which barrier tag (safepoint counter) we will call >>> WaitBarrier.wait() with as the threads safepoint id and then change the state to >>> _thread_blocked, the VMThread can ignore JavaThreads by doing a stable load of >>> the state. A stable load of the thread state is successful if the thread >>> safepoint id is the same both before and after the load of the state and >>> safepoint id is current or InactiveSafepointCounter. If the stable load fails, >>> the thread is considered safepoint unsafe. It's no longer enough that thread is >>> have state _thread_blocked it must also have correct safepoint id before and >>> after we read the state. >>> >>> Performance: >>> The result of faster safepoints is that the average CPU time for JavaThreads >>> between safepoints is higher, thus increasing the allocation rate. The thread >>> that stops first waits shorter time until it gets started. Even the thread that >>> stops last also have shorter stop since we start them faster. If your >>> application is using a concurrent GC it may need re-tunning since each java >>> worker thread have an increased CPU time/allocation rate. Often this means max >>> performance is achieved using slightly less java worker threads than before. >>> Also the increase allocation rate means shorter time between GC safepoints. >>> - If you are using a non-concurrent GC, you should see improved latency and >>> ?? throughput. >>> - After re-tunning with a concurrent GC throughput should be equal or better but >>> ?? with better latency. But bear in mind this is a latency patch, not a >>> ?? throughput one. 
>>> With current code a java thread is not to guarantee to run between safepoint (in >>> theory a java thread can be starved indefinitely), since the VM thread may >>> re-grab the Threads_locks before it woke up from previous safepoint. If the >>> GC/VM don't respect MMU (minimum mutator utilization) or if your machine is very >>> over-provisioned this can happen. >>> The current schema thus re-safepoint quickly if the java threads have not >>> started yet at the cost of latency. Since the new code uses the WaitBarrier with >>> the safepoint counter, all threads must roll forward to next safepoint by >>> getting at least some CPU time between two safepoints. Meaning MMU violations >>> are more obvious. >>> >>> Some examples on numbers: >>> - On a 16 strand machine synchronization and un-synchronization/starting is at >>> ?? least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and >>> ?? starting ~400->~100us. >>> ?? (Semaphore path is a bit slower than futex in the WaitBarrier on Linux). >>> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster >>> ?? synchronization time on 16 strands and ~5% score increase. In this case >>> the GC >>> ?? op is 1ms, so we reduce the overhead of synchronization from 100% to 10%. >>> - specJBB2015 ParGC ~9% increase in critical-jops. >>> >>> Thanks, Robbin > From robbin.ehn at oracle.com Fri Jan 25 10:44:18 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 25 Jan 2019 11:44:18 +0100 Subject: RFR(XL): 8203469: Faster safepoints In-Reply-To: References: <8838a5c8-f427-35f0-7af6-0cf21ee6827e@oracle.com> Message-ID: Thanks! /Robbin On 1/23/19 4:51 PM, Daniel D. Daugherty wrote: > On 1/23/19 8:33 AM, Robbin Ehn wrote: >> Hi all, here is v03. >> >> It's contains the update from comments and: >> I notice safepoint.hpp contained wrong/not need inline keyword for methods. 
>> Those method are either default inline because they are defined in the >> declaration (header) or since they are defined in the same cpp unit as callers >> and thus can be inlined any way. >> >> http://cr.openjdk.java.net/~rehn/8203469/v03/inc/ > > src/hotspot/share/runtime/safepoint.cpp > ??? No comments. > > src/hotspot/share/runtime/safepoint.hpp > ??? No comments. > > Thumbs up. > > Dan > > >> http://cr.openjdk.java.net/~rehn/8203469/v03/ >> >> Passes t1. >> >> Thanks, Robbin >> >> On 2019-01-15 11:39, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469 >>> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/ >>> >>> Thanks to Dan for pre-reviewing a lot! >>> >>> Background: >>> ZGC often does very short safepoint operations. For a perspective, in a >>> specJBB2015 run, G1 can have young collection stops lasting about 170 ms. While >>> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which >>> operation it is. The time it takes to stop and start the JavaThreads is relative >>> very large to a ZGC safepoint. With an operation that just takes 0.2ms the >>> overhead of stopping and starting JavaThreads is several times the operation. >>> >>> High-level functionality change: >>> Serializing the starting over Threads_lock takes time. >>> - Don't wait on Threads_lock use the WaitBarrier. >>> Serializing the stopping over Safepoint_lock takes time. >>> - Let threads stop in parallel, remove Safepoint_lock. >>> >>> Details: >>> JavaThreads have 2 abstract logical states: unsafe or safe. >>> - Safe means the JavaThread will not touch Java heap or VM internal structures >>> ?? without doing a transition and block before doing so. >>> ???????? - The safe states are: >>> ???????????????? - When polls armed: _thread_in_native and _thread_blocked. >>> ???????????????? - When Threads_lock is held: externally suspended flag is set. >>> ???????? 
- VM Thread have polls armed and holds the Threads_lock during a >>> ?????????? safepoint. >>> - Unsafe means that either Java heap or VM internal structures can be accessed >>> ?? by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. >>> ???????? - All combination that are not safe are unsafe. >>> >>> We cannot start a safepoint until all unsafe threads have transitioned to a safe >>> state. To make them safe, we arm polls in compiled code and make sure any >>> transition to another unsafe state will be blocked. JavaThreads which are unsafe >>> with state _thread_in_Java may transition to _thread_in_native without being >>> blocked, since it just became a safe thread and we can proceed. Any safe thread >>> may try to transition at any time to an unsafe state, thus coming into the >>> safepoint blocking code at any moment, e.g., after the safepoint is over, or >>> even at the beginning of next safepoint. >>> >>> The VMThread cannot tolerate false positives from the JavaThread thread state >>> because that would mean starting the safepoint without all JavaThreads being >>> safe. The two locks (Threads_lock and Safepoint_lock) make sure we never observe >>> false positives from the safepoint blocking code, if we remove them, how do we >>> handle false positives? >>> >>> By first publishing which barrier tag (safepoint counter) we will call >>> WaitBarrier.wait() with as the threads safepoint id and then change the state to >>> _thread_blocked, the VMThread can ignore JavaThreads by doing a stable load of >>> the state. A stable load of the thread state is successful if the thread >>> safepoint id is the same both before and after the load of the state and >>> safepoint id is current or InactiveSafepointCounter. If the stable load fails, >>> the thread is considered safepoint unsafe. It's no longer enough that thread is >>> have state _thread_blocked it must also have correct safepoint id before and >>> after we read the state. 
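The stable load described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the webrev: the struct `ThreadSafepointInfo`, the function names, the constant `kInactiveSafepointCounter`, and the use of `std::atomic` with default sequentially consistent ordering are all stand-ins (HotSpot uses its own thread state fields and ordering primitives), and the actual `WaitBarrier` call is left out.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Illustrative stand-ins for the thread states named in the mail.
enum ThreadState : int {
  thread_in_Java,
  thread_in_vm,
  thread_in_native,
  thread_blocked
};

// Assumed value for "no safepoint in progress"; the real constant may differ.
constexpr std::uint64_t kInactiveSafepointCounter = 0;

// Hypothetical per-thread data: the published safepoint id and the state.
struct ThreadSafepointInfo {
  std::atomic<std::uint64_t> safepoint_id{kInactiveSafepointCounter};
  std::atomic<int> state{thread_in_Java};
};

// JavaThread side: publish the barrier tag first, then change the state.
// The real code would go on to wait on the barrier with the same tag.
inline void block_for_safepoint(ThreadSafepointInfo& t,
                                std::uint64_t safepoint_counter) {
  assert(safepoint_counter != kInactiveSafepointCounter);
  t.safepoint_id.store(safepoint_counter);
  t.state.store(thread_blocked);
}

// VMThread side: the load is stable only if the id is unchanged across the
// state load and is either the current counter or the inactive value.
inline bool try_stable_load_state(const ThreadSafepointInfo& t,
                                  std::uint64_t current_counter,
                                  int* out_state) {
  const std::uint64_t before = t.safepoint_id.load();
  const int s = t.state.load();
  const std::uint64_t after = t.safepoint_id.load();
  if (before != after) return false;
  if (before != current_counter && before != kInactiveSafepointCounter) {
    return false;
  }
  *out_state = s;
  return true;
}

// A failed stable load is treated as "unsafe", so the VMThread never sees a
// false positive; it simply has to look at that thread again later.
inline bool is_safepoint_safe(const ThreadSafepointInfo& t,
                              std::uint64_t current_counter) {
  int s = thread_in_Java;
  if (!try_stable_load_state(t, current_counter, &s)) return false;
  return s == thread_blocked || s == thread_in_native;
}
```

The point of reading the id twice is that a thread still carrying the id of an earlier safepoint, or one that changes its id while the state is being read, is never counted as stopped for the current safepoint.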
>>> >>> Performance: >>> The result of faster safepoints is that the average CPU time for JavaThreads >>> between safepoints is higher, thus increasing the allocation rate. The thread >>> that stops first waits shorter time until it gets started. Even the thread that >>> stops last also have shorter stop since we start them faster. If your >>> application is using a concurrent GC it may need re-tunning since each java >>> worker thread have an increased CPU time/allocation rate. Often this means max >>> performance is achieved using slightly less java worker threads than before. >>> Also the increase allocation rate means shorter time between GC safepoints. >>> - If you are using a non-concurrent GC, you should see improved latency and >>> ?? throughput. >>> - After re-tunning with a concurrent GC throughput should be equal or better but >>> ?? with better latency. But bear in mind this is a latency patch, not a >>> ?? throughput one. >>> With current code a java thread is not to guarantee to run between safepoint (in >>> theory a java thread can be starved indefinitely), since the VM thread may >>> re-grab the Threads_locks before it woke up from previous safepoint. If the >>> GC/VM don't respect MMU (minimum mutator utilization) or if your machine is very >>> over-provisioned this can happen. >>> The current schema thus re-safepoint quickly if the java threads have not >>> started yet at the cost of latency. Since the new code uses the WaitBarrier with >>> the safepoint counter, all threads must roll forward to next safepoint by >>> getting at least some CPU time between two safepoints. Meaning MMU violations >>> are more obvious. >>> >>> Some examples on numbers: >>> - On a 16 strand machine synchronization and un-synchronization/starting is at >>> ?? least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and >>> ?? starting ~400->~100us. >>> ?? (Semaphore path is a bit slower than futex in the WaitBarrier on Linux). 
>>> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster >>> ?? synchronization time on 16 strands and ~5% score increase. In this case >>> the GC >>> ?? op is 1ms, so we reduce the overhead of synchronization from 100% to 10%. >>> - specJBB2015 ParGC ~9% increase in critical-jops. >>> >>> Thanks, Robbin > From david.holmes at oracle.com Fri Jan 25 11:18:27 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 25 Jan 2019 21:18:27 +1000 Subject: Anonymous class In-Reply-To: References: Message-ID: <8ccbfa3b-6297-d76c-97db-28121c965fdd@oracle.com> On 25/01/2019 6:52 pm, yumin qi wrote: > Hi, > > I have a question of anonymous class. We know the anonymous class with a > host_klass, and the flag is set when the InstanceKlass is created after the > class parsed. In case of a regular java class file, the flag will be set > correctly but for the case it is not set: > SystemDictionary::parse_stream or resolve_from_stream, which is called > from JVM_DefineClassWithSource or jni_defineClass. 
The stack trace like: > > #1 0x00007f1127291ce7 in > SystemDictionary::resolve_from_stream(class_name=0x7f0ee40243f0, > class_loader=..., protection_domain=...,d__=0x7f0f04001000) at > /home/ws/openjdk/8/jdk8u/hotspot/src/share/vm/classfile/systemDictionaryShared.cpp:656 > #1 0x00007f1127291ce7 in SystemDictionary::resolve_from_stream > (class_name=0x7f0ee40243f0, class_loader=..., protection_doma > in=..., > st=0x7f0f0a5f3850, verify=true, __the_thread__=0x7f0f04001000) > at > /home/ws/openjdk/8/jdk8u/hotspot/src/share/vm/classfile/systemDictionary.cpp:1234 > #2 0x00007f1126f3cadb in jvm_define_class_common (env=0x7f0f04001220, > name=0x7f0f0a5f3e60 "com/google/common/collect/Iterato > rs$3", > loader=0x7f0f0a5f3f20, buf=0x7f0ee401e210 "\312\376\272\276", len=943, > pd=0x7f0f0a5f3f48, > source=0x7f0f0a5f3a60 "jar:file:/home/<...>/lib/guava-19.0.jar!/", > verify=1 '\001', __the_thread__=0x7f0f04001000) at > /home/yumin.qi/ws/openjdk/8/jdk8u/hotspot/src/share/vm/prims/jvm.cpp:1082 > #3 0x00007f1126f3d019 in JVM_DefineClassWithSource (env=0x7f0f04001220, > name=0x7f0f0a5f3e60 "com/google/common/collect/Itera > tors$3", > loader=0x7f0f0a5f3f20, buf=0x7f0ee401e210 "\312\376\272\276", len=943, > pd=0x7f0f0a5f3f48, > source=0x7f0f0a5f3a60 "jar:file:/home/<...>/lib/guava-19.0.jar!/") at > /home/ws/openjdk/8/jdk8u/hotspot/src/share/vm/prim/jvm.cpp:1102 > #4 0x00007f112581214f in Java_java_lang_ClassLoader_defineClass1 () > from > /home/ws/openjdk/8/jdk8u/build/linux-x86_64-normal-server-slowdebug/images/j2sdk-image/jre/lib/amd64/libjava.so > #5 0x00007f11118f5402 in ?? () > #6 0x00007f0f0a5f3f48 in ?? () > #7 0x00007f0f0a5f3f70 in ?? () > #8 0x0000000755dbbc28 in ?? () > > The class name is com.google.common.collect.Iterators$3, and it is an > anonymous class. I'm confused. Are you talking about Java level anonymous classes or VM anonymous classes as created by Unsafe.defineAnonymousClass? Only VM anonymous classes have a "host class". 
Java level "anonymous classes" are just regular classes. David ----- > This is an example using Guava. I also can reproduce with jdk13. > > The field after parsing: > > _nonstatic_oop_map_size = 1, > _is_marked_dependent = false, > _has_unloaded_dependent = false, > _misc_flags = 38, ////// <------------ not set for anonymous > _minor_version = 0, > _major_version = 50, > _init_thread = 0x0, > _vtable_len = 8, > _itable_len = 8, > > This path is from a custom loader defining a anonymous class, does not > set anonymous correctly for the flag. Is this a bug? If in java land, > supply a host as parameter, it can be set correctly in VM part. Should I > file a bug for it? > > Thanks > Yumin > From david.griffiths at gmail.com Fri Jan 25 12:09:35 2019 From: david.griffiths at gmail.com (David Griffiths) Date: Fri, 25 Jan 2019 12:09:35 +0000 Subject: Modify JVM to produce safepoints at every PcDesc? Message-ID: Hi, how feasible would it be to modify the JVM to create a safepoint at every PcDesc? I know performance would suffer, but a slowdown on the order of 30% would be fine. Basically I want access to all local variables at every PcDesc rather than just at method calls/returns and loop back-edges. Cheers, David From yumin.qi at gmail.com Fri Jan 25 13:06:34 2019 From: yumin.qi at gmail.com (yumin qi) Date: Fri, 25 Jan 2019 21:06:34 +0800 Subject: Anonymous class In-Reply-To: <8ccbfa3b-6297-d76c-97db-28121c965fdd@oracle.com> References: <8ccbfa3b-6297-d76c-97db-28121c965fdd@oracle.com> Message-ID: Hi, David I am confused here. As you pointed out: Java anonymous classes are not VM anonymous classes. VM anonymous classes are created by Unsafe_DefineAnonymousClass only. Is my understanding right? Are all lambda classes VM anonymous? Thanks Yumin On Fri, Jan 25, 2019 at 7:18 PM David Holmes wrote: > > > On 25/01/2019 6:52 pm, yumin qi wrote: > > Hi, > > > > I have a question of anonymous class.
We know the anonymous class > with a > > host_klass, and the flag is set when the InstanceKlass is created after > the > > class parsed. In case of a regular java class file, the flag will be set > > correctly but for the case it is not set: > > SystemDictionary::parse_stream or resolve_from_stream, which is called > > from JVM_DefineClassWithSource or jni_defineClass. The stack trace like: > > > > #1 0x00007f1127291ce7 in > > SystemDictionary::resolve_from_stream(class_name=0x7f0ee40243f0, > > class_loader=..., protection_domain=...,d__=0x7f0f04001000) at > > > /home/ws/openjdk/8/jdk8u/hotspot/src/share/vm/classfile/systemDictionaryShared.cpp:656 > > #1 0x00007f1127291ce7 in SystemDictionary::resolve_from_stream > > (class_name=0x7f0ee40243f0, class_loader=..., protection_doma > > in=..., > > st=0x7f0f0a5f3850, verify=true, __the_thread__=0x7f0f04001000) > > at > > > /home/ws/openjdk/8/jdk8u/hotspot/src/share/vm/classfile/systemDictionary.cpp:1234 > > #2 0x00007f1126f3cadb in jvm_define_class_common (env=0x7f0f04001220, > > name=0x7f0f0a5f3e60 "com/google/common/collect/Iterato > > rs$3", > > loader=0x7f0f0a5f3f20, buf=0x7f0ee401e210 "\312\376\272\276", > len=943, > > pd=0x7f0f0a5f3f48, > > source=0x7f0f0a5f3a60 "jar:file:/home/<...>/lib/guava-19.0.jar!/", > > verify=1 '\001', __the_thread__=0x7f0f04001000) at > > /home/yumin.qi/ws/openjdk/8/jdk8u/hotspot/src/share/vm/prims/jvm.cpp:1082 > > #3 0x00007f1126f3d019 in JVM_DefineClassWithSource (env=0x7f0f04001220, > > name=0x7f0f0a5f3e60 "com/google/common/collect/Itera > > tors$3", > > loader=0x7f0f0a5f3f20, buf=0x7f0ee401e210 "\312\376\272\276", > len=943, > > pd=0x7f0f0a5f3f48, > > source=0x7f0f0a5f3a60 "jar:file:/home/<...>/lib/guava-19.0.jar!/") > at > > /home/ws/openjdk/8/jdk8u/hotspot/src/share/vm/prim/jvm.cpp:1102 > > #4 0x00007f112581214f in Java_java_lang_ClassLoader_defineClass1 () > > from > > > /home/ws/openjdk/8/jdk8u/build/linux-x86_64-normal-server-slowdebug/images/j2sdk-image/jre/lib/amd64/libjava.so 
> > #5 0x00007f11118f5402 in ?? () > > #6 0x00007f0f0a5f3f48 in ?? () > > #7 0x00007f0f0a5f3f70 in ?? () > > #8 0x0000000755dbbc28 in ?? () > > > > The class name is com.google.common.collect.Iterators$3, and it is an > > anonymous class. > > I'm confused. Are you talking about Java level anonymous classes or VM > anonymous classes as created by Unsafe.defineAnonymousClass? Only VM > anonymous classes have a "host class". Java level "anonymous classes" > are just regular classes. > > David > ----- > > > > > This is an example using Guava. I also can reproduce with jdk13. > > > > The field after parsing: > > > > _nonstatic_oop_map_size = 1, > > _is_marked_dependent = false, > > _has_unloaded_dependent = false, > > _misc_flags = 38, ////// <------------ not set for anonymous > > _minor_version = 0, > > _major_version = 50, > > _init_thread = 0x0, > > _vtable_len = 8, > > _itable_len = 8, > > > > This path is from a custom loader defining a anonymous class, does not > > set anonymous correctly for the flag. Is this a bug? If in java land, > > supply a host as parameter, it can be set correctly in VM part. Should I > > file a bug for it? > > > > Thanks > > Yumin > > > From david.holmes at oracle.com Fri Jan 25 13:37:12 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 25 Jan 2019 23:37:12 +1000 Subject: Anonymous class In-Reply-To: References: <8ccbfa3b-6297-d76c-97db-28121c965fdd@oracle.com> Message-ID: <684252eb-391c-4ca1-a164-090c33160c2c@oracle.com> On 25/01/2019 11:06 pm, yumin qi wrote: > Hi, David > > I am confused here. > As you pointed out: >    Java anonymous classes are not VM anonymous classes. >    VM anonymous classes are created by Unsafe_DefineAnonymousClass only. > > Is my understanding right? All all lambda classes are vm anonymous? I don't know if it is correct to say "all lambda classes" as I don't know the full translation strategy for lambda expressions, but yes some lambda generated classes are VM anonymous classes.
David > Thanks > Yumin > > On Fri, Jan 25, 2019 at 7:18 PM David Holmes > wrote: > > > > On 25/01/2019 6:52 pm, yumin qi wrote: > > Hi, > > > > I have a question of anonymous class. We know the anonymous > class with a > > host_klass, and the flag is set when the InstanceKlass is created > after the > > class parsed. In case of a regular java class file, the flag will > be set > > correctly but for the case it is not set: > > SystemDictionary::parse_stream or resolve_from_stream, which > is called > > from JVM_DefineClassWithSource or jni_defineClass. The stack > trace like: > > > > #1 0x00007f1127291ce7 in > > SystemDictionary::resolve_from_stream(class_name=0x7f0ee40243f0, > > class_loader=..., protection_domain=...,d__=0x7f0f04001000) at > > > /home/ws/openjdk/8/jdk8u/hotspot/src/share/vm/classfile/systemDictionaryShared.cpp:656 > > #1 0x00007f1127291ce7 in SystemDictionary::resolve_from_stream > > (class_name=0x7f0ee40243f0, class_loader=..., protection_doma > > in=..., > > st=0x7f0f0a5f3850, verify=true, __the_thread__=0x7f0f04001000) > > at > > > /home/ws/openjdk/8/jdk8u/hotspot/src/share/vm/classfile/systemDictionary.cpp:1234 > > #2 0x00007f1126f3cadb in jvm_define_class_common > (env=0x7f0f04001220, > > name=0x7f0f0a5f3e60 "com/google/common/collect/Iterato > > rs$3", > > loader=0x7f0f0a5f3f20, buf=0x7f0ee401e210 > "\312\376\272\276", len=943, > > pd=0x7f0f0a5f3f48, > > source=0x7f0f0a5f3a60 > "jar:file:/home/<...>/lib/guava-19.0.jar!/", > > verify=1 '\001', __the_thread__=0x7f0f04001000) at > > > /home/yumin.qi/ws/openjdk/8/jdk8u/hotspot/src/share/vm/prims/jvm.cpp:1082 > > #3 0x00007f1126f3d019 in JVM_DefineClassWithSource > (env=0x7f0f04001220, > > name=0x7f0f0a5f3e60 "com/google/common/collect/Itera > > tors$3", > > loader=0x7f0f0a5f3f20, buf=0x7f0ee401e210 > "\312\376\272\276", len=943, > > pd=0x7f0f0a5f3f48, > >
source=0x7f0f0a5f3a60 > "jar:file:/home/<...>/lib/guava-19.0.jar!/") at > > /home/ws/openjdk/8/jdk8u/hotspot/src/share/vm/prim/jvm.cpp:1102 > > #4 0x00007f112581214f in Java_java_lang_ClassLoader_defineClass1 () > > from > > > /home/ws/openjdk/8/jdk8u/build/linux-x86_64-normal-server-slowdebug/images/j2sdk-image/jre/lib/amd64/libjava.so > > #5 0x00007f11118f5402 in ?? () > > #6 0x00007f0f0a5f3f48 in ?? () > > #7 0x00007f0f0a5f3f70 in ?? () > > #8 0x0000000755dbbc28 in ?? () > > > > The class name is com.google.common.collect.Iterators$3, and it is an > > anonymous class. > > I'm confused. Are you talking about Java level anonymous classes or VM > anonymous classes as created by Unsafe.defineAnonymousClass? Only VM > anonymous classes have a "host class". Java level "anonymous classes" > are just regular classes. > > David > ----- > > > > > This is an example using Guava. I also can reproduce with jdk13. > > > > The field after parsing: > > > > _nonstatic_oop_map_size = 1, > > _is_marked_dependent = false, > > _has_unloaded_dependent = false, > > _misc_flags = 38, ////// <------------ not set for anonymous > > _minor_version = 0, > > _major_version = 50, > > _init_thread = 0x0, > > _vtable_len = 8, > > _itable_len = 8, > > > > This path is from a custom loader defining a anonymous class, > does not > > set anonymous correctly for the flag. Is this a bug? If in java land, > > supply a host as parameter, it can be set correctly in VM part. > Should I > > file a bug for it?
> > > > Thanks > > Yumin > > > From mandy.chung at oracle.com Fri Jan 25 16:12:22 2019 From: mandy.chung at oracle.com (Mandy Chung) Date: Fri, 25 Jan 2019 08:12:22 -0800 Subject: Anonymous class In-Reply-To: <684252eb-391c-4ca1-a164-090c33160c2c@oracle.com> References: <8ccbfa3b-6297-d76c-97db-28121c965fdd@oracle.com> <684252eb-391c-4ca1-a164-090c33160c2c@oracle.com> Message-ID: <089ba997-998f-cf3e-0d83-433015797335@oracle.com> On 1/25/19 5:37 AM, David Holmes wrote: >> >> > The class name is com.google.common.collect.Iterators$3, and >> it is an >> > anonymous class. >> >> I'm confused. Are you talking about Java level anonymous classes >> or VM >> anonymous classes as created by Unsafe.defineAnonymousClass? Only VM >> anonymous classes have a "host class". Java level "anonymous >> classes" >> are just regular classes. Based on the class name com.google.common.collect.Iterators$3, this is a Java anonymous class. As the stack trace shows, it's defined from JVM_DefineClassWithSource. A VM anonymous class is defined via a different VM entry point, Unsafe.defineAnonymousClass, as David said. Having no host class is correct for this class if it is a Java anonymous class. You can use -Djdk.internal.lambda.dumpProxyClasses= that will dump the generated lambda proxy classes to the specified path (mkdir path first). This might help. Mandy From jesper.wilhelmsson at oracle.com Fri Jan 25 16:22:04 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Fri, 25 Jan 2019 17:22:04 +0100 Subject: RFR: ProblemList two tests failing in tier 3 Message-ID: Hi, Two of the tests that I removed from the ProblemList yesterday fail in tier 3. They seem to be run in a different way than I ran them in my verification. New bugs filed and putting them back on the ProblemList. Please review the diff below. (Trivial change, will push asap with one Reviewer.)
Bugs: https://bugs.openjdk.java.net/browse/JDK-8217801 https://bugs.openjdk.java.net/browse/JDK-8217797 Diff: diff --git a/test/hotspot/jtreg/ProblemList-graal.txt b/test/hotspot/jtreg/ProblemList-graal.txt --- a/test/hotspot/jtreg/ProblemList-graal.txt +++ b/test/hotspot/jtreg/ProblemList-graal.txt @@ -39,6 +39,8 @@ compiler/graalunit/JttThreadsTest.java 8208066 generic-all +compiler/intrinsics/mathexact/LongMulOverflowTest.java 8217796 generic-all + compiler/jvmci/SecurityRestrictionsTest.java 8181837 generic-all compiler/unsafe/UnsafeGetConstantField.java 8181833 generic-all diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt --- a/test/hotspot/jtreg/ProblemList.txt +++ b/test/hotspot/jtreg/ProblemList.txt @@ -186,6 +186,7 @@ vmTestbase/vm/mlvm/meth/stress/java/sequences/Test.java 8208255 generic-all vmTestbase/vm/mlvm/meth/stress/jdi/breakpointInCompiledCode/Test.java 8208255 generic-all vmTestbase/vm/mlvm/mixed/stress/java/findDeadlock/TestDescription.java 8208278 generic-all +vmTestbase/vm/mlvm/mixed/stress/regression/b6969574/INDIFY_Test.java 8217800 generic-all vmTestbase/vm/mlvm/indy/func/jvmti/mergeCP_indy2none_a/TestDescription.java 8013267 generic-all vmTestbase/vm/mlvm/indy/func/jvmti/mergeCP_indy2manyDiff_b/TestDescription.java 8013267 generic-all vmTestbase/vm/mlvm/indy/func/jvmti/mergeCP_indy2manySame_b/TestDescription.java 8013267 generic-all Thanks, /Jesper From tobias.hartmann at oracle.com Fri Jan 25 16:29:11 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 25 Jan 2019 17:29:11 +0100 Subject: RFR: ProblemList two tests failing in tier 3 In-Reply-To: References: Message-ID: <7d3b08c3-2205-4f87-72fc-9626eb74763e@oracle.com> Hi Jesper, reviewed. Best regards, Tobias On 25.01.19 17:22, jesper.wilhelmsson at oracle.com wrote: > Hi, > > Two of the tests that I removed from the ProblemList yesterday fails in tier 3. They seems to be run in a different way than I ran them in my verification. 
> New bugs filed and putting them back on the ProblemList. Please review the diff below. (Trivial change, will push asap with one Reviewer.) > > Bugs: > https://bugs.openjdk.java.net/browse/JDK-8217801 > https://bugs.openjdk.java.net/browse/JDK-8217797 > > Diff: > diff --git a/test/hotspot/jtreg/ProblemList-graal.txt b/test/hotspot/jtreg/ProblemList-graal.txt > --- a/test/hotspot/jtreg/ProblemList-graal.txt > +++ b/test/hotspot/jtreg/ProblemList-graal.txt > @@ -39,6 +39,8 @@ > > compiler/graalunit/JttThreadsTest.java 8208066 generic-all > > +compiler/intrinsics/mathexact/LongMulOverflowTest.java 8217796 generic-all > + > compiler/jvmci/SecurityRestrictionsTest.java 8181837 generic-all > > compiler/unsafe/UnsafeGetConstantField.java 8181833 generic-all > diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt > --- a/test/hotspot/jtreg/ProblemList.txt > +++ b/test/hotspot/jtreg/ProblemList.txt > @@ -186,6 +186,7 @@ > vmTestbase/vm/mlvm/meth/stress/java/sequences/Test.java 8208255 generic-all > vmTestbase/vm/mlvm/meth/stress/jdi/breakpointInCompiledCode/Test.java 8208255 generic-all > vmTestbase/vm/mlvm/mixed/stress/java/findDeadlock/TestDescription.java 8208278 generic-all > +vmTestbase/vm/mlvm/mixed/stress/regression/b6969574/INDIFY_Test.java 8217800 generic-all > vmTestbase/vm/mlvm/indy/func/jvmti/mergeCP_indy2none_a/TestDescription.java 8013267 generic-all > vmTestbase/vm/mlvm/indy/func/jvmti/mergeCP_indy2manyDiff_b/TestDescription.java 8013267 generic-all > vmTestbase/vm/mlvm/indy/func/jvmti/mergeCP_indy2manySame_b/TestDescription.java 8013267 generic-all > > > Thanks, > /Jesper > From jesper.wilhelmsson at oracle.com Fri Jan 25 16:33:11 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Fri, 25 Jan 2019 17:33:11 +0100 Subject: RFR: ProblemList two tests failing in tier 3 In-Reply-To: <7d3b08c3-2205-4f87-72fc-9626eb74763e@oracle.com> References: <7d3b08c3-2205-4f87-72fc-9626eb74763e@oracle.com> 
Message-ID: <397F141F-0E1A-47CA-BE44-9FFA34C3BA6B@oracle.com> Thanks! /Jesper > On 25 Jan 2019, at 17:29, Tobias Hartmann wrote: > > Hi Jesper, > > reviewed. > > Best regards, > Tobias > > On 25.01.19 17:22, jesper.wilhelmsson at oracle.com wrote: >> Hi, >> >> Two of the tests that I removed from the ProblemList yesterday fails in tier 3. They seems to be run in a different way than I ran them in my verification. >> New bugs filed and putting them back on the ProblemList. Please review the diff below. (Trivial change, will push asap with one Reviewer.) >> >> Bugs: >> https://bugs.openjdk.java.net/browse/JDK-8217801 >> https://bugs.openjdk.java.net/browse/JDK-8217797 >> >> Diff: >> diff --git a/test/hotspot/jtreg/ProblemList-graal.txt b/test/hotspot/jtreg/ProblemList-graal.txt >> --- a/test/hotspot/jtreg/ProblemList-graal.txt >> +++ b/test/hotspot/jtreg/ProblemList-graal.txt >> @@ -39,6 +39,8 @@ >> >> compiler/graalunit/JttThreadsTest.java 8208066 generic-all >> >> +compiler/intrinsics/mathexact/LongMulOverflowTest.java 8217796 generic-all >> + >> compiler/jvmci/SecurityRestrictionsTest.java 8181837 generic-all >> >> compiler/unsafe/UnsafeGetConstantField.java 8181833 generic-all >> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt >> --- a/test/hotspot/jtreg/ProblemList.txt >> +++ b/test/hotspot/jtreg/ProblemList.txt >> @@ -186,6 +186,7 @@ >> vmTestbase/vm/mlvm/meth/stress/java/sequences/Test.java 8208255 generic-all >> vmTestbase/vm/mlvm/meth/stress/jdi/breakpointInCompiledCode/Test.java 8208255 generic-all >> vmTestbase/vm/mlvm/mixed/stress/java/findDeadlock/TestDescription.java 8208278 generic-all >> +vmTestbase/vm/mlvm/mixed/stress/regression/b6969574/INDIFY_Test.java 8217800 generic-all >> vmTestbase/vm/mlvm/indy/func/jvmti/mergeCP_indy2none_a/TestDescription.java 8013267 generic-all >> vmTestbase/vm/mlvm/indy/func/jvmti/mergeCP_indy2manyDiff_b/TestDescription.java 8013267 generic-all >> 
vmTestbase/vm/mlvm/indy/func/jvmti/mergeCP_indy2manySame_b/TestDescription.java 8013267 generic-all >> >> >> Thanks, >> /Jesper >> From daniel.daugherty at oracle.com Fri Jan 25 16:36:56 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 25 Jan 2019 11:36:56 -0500 Subject: RFR(XL): 8203469: Faster safepoints In-Reply-To: References: Message-ID: <374c3294-f903-dcc1-5672-bc25c5a668d8@oracle.com> On 1/24/19 10:51 AM, Robbin Ehn wrote: > Hi here is v04, updated after the comments. > > http://cr.openjdk.java.net/~rehn/8203469/v04/inc src/hotspot/share/code/dependencyContext.hpp L110: assert(SafepointSynchronize::is_same_safepoint(_safepoint_counter), "safepoint happened"); Perhaps: "must be the same safepoint" for the mesg. src/hotspot/share/runtime/safepoint.cpp L749: // previous safepoint and reading the reset (0/InactiveSafepointCounter) we Not quite grammatically correct. Perhaps: // previous safepoint and reading the reset value (0/InactiveSafepointCounter) we src/hotspot/share/runtime/safepoint.hpp L59: // VM thread and any non-Java thread may be running. Perhaps: // VM thread and any NonJavaThread may be running. L102: // If VM thread only waits for callback threads, we wait for them on this semaphore. Perhaps: // If VM thread has to wait for callback threads, it will wait for them on this semaphore. src/hotspot/share/runtime/safepointMechanism.inline.hpp Nice catch here! Thumbs up! Dan > > http://cr.openjdk.java.net/~rehn/8203469/v04/ > > Still running some tests. > > Thanks, Robbin > > > On 1/15/19 11:39 AM, Robbin Ehn wrote: >> Hi all, please review. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469 >> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/ >> >> Thanks to Dan for pre-reviewing a lot!
>> >> Background: >> ZGC often does very short safepoint operations. For a perspective, in a >> specJBB2015 run, G1 can have young collection stops lasting about 170 >> ms. While >> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which >> operation it is. The time it takes to stop and start the JavaThreads >> is relatively >> very large compared to a ZGC safepoint. With an operation that just takes >> 0.2ms the >> overhead of stopping and starting JavaThreads is several times the >> operation. >> >> High-level functionality change: >> Serializing the starting over Threads_lock takes time. >> - Don't wait on Threads_lock, use the WaitBarrier. >> Serializing the stopping over Safepoint_lock takes time. >> - Let threads stop in parallel, remove Safepoint_lock. >> >> Details: >> JavaThreads have 2 abstract logical states: unsafe or safe. >> - Safe means the JavaThread will not touch Java heap or VM internal >> structures >> without doing a transition and block before doing so. >> - The safe states are: >> - When polls armed: _thread_in_native and >> _thread_blocked. >> - When Threads_lock is held: externally suspended >> flag is set. >> - VM Thread has polls armed and holds the Threads_lock >> during a >> safepoint. >> - Unsafe means that either Java heap or VM internal structures can be >> accessed >> by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. >> - All combinations that are not safe are unsafe. >> >> We cannot start a safepoint until all unsafe threads have >> transitioned to a safe >> state. To make them safe, we arm polls in compiled code and make sure >> any >> transition to another unsafe state will be blocked. JavaThreads which >> are unsafe >> with state _thread_in_Java may transition to _thread_in_native >> without being >> blocked, since it just became a safe thread and we can proceed.
Any >> safe thread >> may try to transition at any time to an unsafe state, thus coming >> into the >> safepoint blocking code at any moment, e.g., after the safepoint is >> over, or >> even at the beginning of next safepoint. >> >> The VMThread cannot tolerate false positives from the JavaThread >> thread state >> because that would mean starting the safepoint without all >> JavaThreads being >> safe. The two locks (Threads_lock and Safepoint_lock) make sure we >> never observe >> false positives from the safepoint blocking code, if we remove them, >> how do we >> handle false positives? >> >> By first publishing which barrier tag (safepoint counter) we will call >> WaitBarrier.wait() with as the thread's safepoint id and then change >> the state to >> _thread_blocked, the VMThread can ignore JavaThreads by doing a >> stable load of >> the state. A stable load of the thread state is successful if the thread >> safepoint id is the same both before and after the load of the state and >> safepoint id is current or InactiveSafepointCounter. If the stable >> load fails, >> the thread is considered safepoint unsafe. It's no longer enough that >> a thread >> has state _thread_blocked; it must also have the correct safepoint id >> before and >> after we read the state. >> >> Performance: >> The result of faster safepoints is that the average CPU time for >> JavaThreads >> between safepoints is higher, thus increasing the allocation rate. >> The thread >> that stops first waits a shorter time until it gets started. Even the >> thread that >> stops last also has a shorter stop since we start them faster. If your >> application is using a concurrent GC it may need re-tuning since >> each java >> worker thread has an increased CPU time/allocation rate. Often this >> means max >> performance is achieved using slightly less java worker threads than >> before. >> Also the increased allocation rate means shorter time between GC >> safepoints.
>> - If you are using a non-concurrent GC, you should see improved >> latency and >> throughput. >> - After re-tuning with a concurrent GC throughput should be equal or >> better but >> with better latency. But bear in mind this is a latency patch, not a >> throughput one. >> With current code a java thread is not guaranteed to run between >> safepoints (in >> theory a java thread can be starved indefinitely), since the VM >> thread may >> re-grab the Threads_lock before it woke up from previous safepoint. >> If the >> GC/VM don't respect MMU (minimum mutator utilization) or if your >> machine is very >> over-provisioned this can happen. >> The current schema thus re-safepoints quickly if the java threads have >> not >> started yet at the cost of latency. Since the new code uses the >> WaitBarrier with >> the safepoint counter, all threads must roll forward to next >> safepoint by >> getting at least some CPU time between two safepoints. Meaning MMU >> violations >> are more obvious. >> >> Some examples on numbers: >> - On a 16 strand machine synchronization and >> un-synchronization/starting is at >> least 3x faster (in non-trivial test). Synchronization ~600 -> >> ~100us and >> starting ~400->~100us. >> (Semaphore path is a bit slower than futex in the WaitBarrier on >> Linux). >> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster >> synchronization time on 16 strands and ~5% score increase. In this >> case the GC >> op is 1ms, so we reduce the overhead of synchronization from 100% >> to 10%. >> - specJBB2015 ParGC ~9% increase in critical-jops.
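The "stable load" described in the Details section above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the webrev: SketchThread, block_for_safepoint and is_safe are hypothetical names, the state enum is reduced to four values, and std::atomic with its default ordering stands in for whatever memory ordering HotSpot actually uses.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical, simplified stand-ins; none of these names come from HotSpot.
enum ThreadState : int { thread_in_Java, thread_in_vm, thread_in_native, thread_blocked };
constexpr uint64_t InactiveSafepointCounter = 0;

struct SketchThread {
  std::atomic<uint64_t> safepoint_id{InactiveSafepointCounter};
  std::atomic<int> state{thread_in_Java};

  // Blocking side: publish the barrier tag first, then change the state.
  void block_for_safepoint(uint64_t tag) {
    safepoint_id.store(tag);
    state.store(thread_blocked);
  }
};

// VMThread side: read the id, the state, and the id again. The load is stable
// only if the id did not change and is either the current safepoint counter or
// the inactive value. A failed stable load means "treat the thread as unsafe".
bool is_safe(const SketchThread& t, uint64_t current) {
  uint64_t before = t.safepoint_id.load();
  int s = t.state.load();
  uint64_t after = t.safepoint_id.load();
  bool stable = (before == after) &&
                (before == current || before == InactiveSafepointCounter);
  if (!stable) return false;
  return s == thread_blocked || s == thread_in_native;
}
```

The point of the double read is that a thread seen as _thread_blocked with an id from an earlier safepoint is not trusted, which is how the false positives mentioned above are filtered out without the two locks.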
>> >> Thanks, Robbin > From igor.ignatyev at oracle.com Sat Jan 26 16:32:21 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Sat, 26 Jan 2019 08:32:21 -0800 Subject: RFR(T)[12] : 8217852 : problem-list ctw of jdk.jconsole and java.desktop on windows Message-ID: http://cr.openjdk.java.net/~iignatyev//8217852/webrev.00/index.html > 4 lines changed: 4 ins; 0 del; 0 mod; Hi all, could you please review this tiny and trivial patch which problem-lists jdk.jconsole and java.desktop* ctw tests on windows? The tests were un-problem-listed by 8217580[1] as 8189604[2] was resolved, but it appears we still get similar problems in the same tests. JBS: https://bugs.openjdk.java.net/browse/JDK-8217852 webrev: http://cr.openjdk.java.net/~iignatyev//8217852/webrev.00/index.html Thanks, -- Igor From vladimir.kozlov at oracle.com Sat Jan 26 19:53:24 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Sat, 26 Jan 2019 11:53:24 -0800 Subject: RFR(T)[12] : 8217852 : problem-list ctw of jdk.jconsole and java.desktop on windows In-Reply-To: References: Message-ID: <1992B74B-6EF8-459A-BCB2-28A683FA0545@oracle.com> Good. Thanks Vladimir > On Jan 26, 2019, at 8:32 AM, Igor Ignatyev wrote: > > http://cr.openjdk.java.net/~iignatyev//8217852/webrev.00/index.html >> 4 lines changed: 4 ins; 0 del; 0 mod; > > Hi all, > > could you please review this tiny and trivial patch which problem-lists jdk.jconsole and java.desktop* ctw tests on windows? The tests were un-problem-listed by 8217580[1] as 8189604[2] was resolved, but it appears we still get similar problems in the same tests.
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8217852 > webrev: http://cr.openjdk.java.net/~iignatyev//8217852/webrev.00/index.html > > Thanks, > -- Igor From igor.ignatyev at oracle.com Sat Jan 26 20:51:51 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Sat, 26 Jan 2019 12:51:51 -0800 Subject: RFR(T)[12] : 8217852 : problem-list ctw of jdk.jconsole and java.desktop on windows In-Reply-To: <1992B74B-6EF8-459A-BCB2-28A683FA0545@oracle.com> References: <1992B74B-6EF8-459A-BCB2-28A683FA0545@oracle.com> Message-ID: <990315BB-103A-498A-A98E-E20A7A89EDB7@oracle.com> thanks for your review Vladimir. pushed. -- Igor > On Jan 26, 2019, at 11:53 AM, Vladimir Kozlov wrote: > > Good. > > Thanks > Vladimir > >> On Jan 26, 2019, at 8:32 AM, Igor Ignatyev wrote: >> >> http://cr.openjdk.java.net/~iignatyev//8217852/webrev.00/index.html >>> 4 lines changed: 4 ins; 0 del; 0 mod; >> >> Hi all, >> >> could you please review this ting and trivial patch which problem lists jdk.jconsole and java.desktop* ctw tests on windows? the tests were un-problem listed by 8217580[1] as 8189604[2] was resolved, but it appears we still get similar problems in the same tests. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8217852 >> webrev: http://cr.openjdk.java.net/~iignatyev//8217852/webrev.00/index.html >> >> Thanks, >> -- Igor > From manc at google.com Sun Jan 27 00:35:35 2019 From: manc at google.com (Man Cao) Date: Sat, 26 Jan 2019 16:35:35 -0800 Subject: RFR (M): 8212206: Refactor AdaptiveSizePolicy to separate out code related to GC overhead In-Reply-To: References: <6b1e59ec7f4746e8e071fd44ec91ca966fac8d78.camel@oracle.com> <7e0c775d-86c1-b80c-b1a6-373ca21206ba@oracle.com> Message-ID: Friendly ping. Could anyone give a second "looks good"? As for the develop flag AdaptiveSizePolicyGCTimeLimitThreshold/GCOverheadLimitThreshold, I added a note about it in https://bugs.openjdk.java.net/browse/JDK-8212084. 
-Man On Tue, Jan 15, 2019 at 6:41 PM Man Cao wrote: > Hi, > > I rebased the patch to tip and updated year in some headers to 2019, > without making any real change: > http://cr.openjdk.java.net/~manc/8212206/webrev.02/ > > > I don't foresee that this will be implemented, or even makes sense, for >> ZGC. As I see it, this is only a thing STW collectors. For that reason, >> I don't think it belongs in CollectedHeap. Keeping it as a separate >> utility class for collectors that want to use it sounds better. >> > Sounds good to keep this patch in the current state, without further > changing the CollectedHeap class. > > I haven't looked very closely at the patch, but couldn't help to notice >> that the option is called "GCOverheapLimitThreshold" (and >> "AdaptiveSizePolicyGCTimeLimitThreshold" before that), which is a >> tautology and a not very good description of what it is. >> How about we take the opportunity to clean this up and completely ditch >> the "gc_overhead_limit_count" thing and get rid of this option? It's a >> "develop" option, so it's not available to normal users anyway. Has >> anyone of you ever used this option and actually find it valuable? > > I didn't find any users inside Google that require changing this option. > That said, some users did complain that UseGCOverheadLimit for ParallelGC > or CMS is too difficult to get > triggered, because of the requirement for 5 consecutive full GCs, which is > set by this option. > I think if it were a normal "product" option, there will definitely be > users setting it. > I never understand why it is a "develop" option. I think we could either > remove it, > or make it an "experimental" option. > I'm leaning towards not removing it for now, as I'm not sure if 5 is still > a reasonable > default value for UseGCOverheadLimit for G1. > How about we decide whether to keep or remove this option after > JDK-8212084 (UseGCOverheadLimit for G1) is fixed? 
> > Also for the hsperfdata counter change, I created > https://bugs.openjdk.java.net/browse/JDK-8217221. I will draft a CSR for > it later. > > -Man > From shade at redhat.com Sun Jan 27 13:49:14 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Sun, 27 Jan 2019 14:49:14 +0100 Subject: RFR [12] (XS) 8217854: [TESTBUG] runtime/CompressedOops/UseCompressedOops.java fails with Shenandoah Message-ID: <628cba73-0e66-51e5-703d-0a644673d627@redhat.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8217854 Fix: diff -r 21bcd9cdffb3 test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java --- a/test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java Sat Jan 26 12:51:27 2019 -0800 +++ b/test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java Sun Jan 27 14:48:37 2019 +0100 @@ -63,7 +63,7 @@ testCompressedOopsModes(args, "-XX:+UseParallelGC"); testCompressedOopsModes(args, "-XX:+UseParallelOldGC"); if (GC.Shenandoah.isSupported()) { - testCompressedOopsModes(args, "-XX:+UseShenandoahGC"); + testCompressedOopsModes(args, "-XX:+UnlockExperimentalVMOptions", "-XX:+UseShenandoahGC"); } } This started to happen after recent removal of this test from ProblemList. The removal from ProblemList was pushed to jdk/jdk12, so this patch goes there as well. Testing: Linux x86_64 build, jtreg test Thanks, -Aleksey From daniel.daugherty at oracle.com Sun Jan 27 14:23:04 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Sun, 27 Jan 2019 09:23:04 -0500 Subject: RFR [12] (XS) 8217854: [TESTBUG] runtime/CompressedOops/UseCompressedOops.java fails with Shenandoah In-Reply-To: <628cba73-0e66-51e5-703d-0a644673d627@redhat.com> References: <628cba73-0e66-51e5-703d-0a644673d627@redhat.com> Message-ID: <646bbc11-c8dd-2636-0288-eb95423aea60@oracle.com> Thumbs up! 
Dan On 1/27/19 8:49 AM, Aleksey Shipilev wrote: > Bug: > https://bugs.openjdk.java.net/browse/JDK-8217854 > > Fix: > > diff -r 21bcd9cdffb3 test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java > --- a/test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java Sat Jan 26 12:51:27 2019 -0800 > +++ b/test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java Sun Jan 27 14:48:37 2019 +0100 > @@ -63,7 +63,7 @@ > testCompressedOopsModes(args, "-XX:+UseParallelGC"); > testCompressedOopsModes(args, "-XX:+UseParallelOldGC"); > if (GC.Shenandoah.isSupported()) { > - testCompressedOopsModes(args, "-XX:+UseShenandoahGC"); > + testCompressedOopsModes(args, "-XX:+UnlockExperimentalVMOptions", "-XX:+UseShenandoahGC"); > } > } > > > This started to happen after recent removal of this test from ProblemList. The removal from > ProblemList was pushed to jdk/jdk12, so this patch goes there as well. > > Testing: Linux x86_64 build, jtreg test > > Thanks, > -Aleksey > > From zgu at redhat.com Sun Jan 27 19:18:00 2019 From: zgu at redhat.com (zgu at redhat.com) Date: Sun, 27 Jan 2019 14:18:00 -0500 Subject: RFR [12] (XS) 8217854: [TESTBUG] runtime/CompressedOops/UseCompressedOops.java fails with Shenandoah In-Reply-To: <628cba73-0e66-51e5-703d-0a644673d627@redhat.com> References: <628cba73-0e66-51e5-703d-0a644673d627@redhat.com> Message-ID: <1548616680.31327.41.camel@redhat.com> Looks good. 
-Zhengyu On Sun, 2019-01-27 at 14:49 +0100, Aleksey Shipilev wrote: > Bug: > https://bugs.openjdk.java.net/browse/JDK-8217854 > > Fix: > > diff -r 21bcd9cdffb3 > test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java > --- > a/test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java S > at Jan 26 12:51:27 2019 -0800 > +++ > b/test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java S > un Jan 27 14:48:37 2019 +0100 > @@ -63,7 +63,7 @@ > testCompressedOopsModes(args, "-XX:+UseParallelGC"); > testCompressedOopsModes(args, "-XX:+UseParallelOldGC"); > if (GC.Shenandoah.isSupported()) { > - testCompressedOopsModes(args, "-XX:+UseShenandoahGC"); > + testCompressedOopsModes(args, "- > XX:+UnlockExperimentalVMOptions", "-XX:+UseShenandoahGC"); > } > } > > > This started to happen after recent removal of this test from > ProblemList. The removal from > ProblemList was pushed to jdk/jdk12, so this patch goes there as > well. > > Testing: Linux x86_64 build, jtreg test > > Thanks, > -Aleksey > From patricio.chilano.mateo at oracle.com Mon Jan 28 08:42:12 2019 From: patricio.chilano.mateo at oracle.com (Patricio Chilano) Date: Mon, 28 Jan 2019 03:42:12 -0500 Subject: RFR: 8210832: Remove sneaky locking in class Monitor Message-ID: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> Hi all, Please review the following patch: Bug URL: https://bugs.openjdk.java.net/browse/JDK-8210832 Webrev: http://cr.openjdk.java.net/~pchilanomate/8210832/v01/webrev/ The current implementation of native monitors uses a technique that we name "sneaky locking" to prevent possible deadlocks of the JVM during safepoints. The implementation of this technique though introduces a race when a monitor is shared between the VMThread and non-JavaThreads. This patch aims to solve that problem and at the same time simplify the code. 
The proposal is based on the introduction of the new class PlatformMonitor, which serves as a wrapper for the actual synchronization primitives in each platform (mutexes and condition variables). Most of the API calls can thus be implemented as simple wrappers around PlatformMonitor, adding more assertions and very little extra metadata. To be able to remove the lock sneaking code and at the same time avoid deadlocking scenarios, we combine two techniques: -When a JavaThread that has just acquired the lock, detects there is a safepoint request in the ThreadLockBlockInVM destructor, it releases the lock before blocking at the safepoint. After resuming from it, the JavaThread will have to acquire the lock again. - In the ThreadLockBlockInVM constructor for the Monitor::wait() method, in order to avoid blocking we allow for a possible safepoint request to make progress but without letting the JavaThread block for it (since we would be stopped by the destructor anyways). We also do that for the Monitor::lock() case although no deadlock is being prevented there. The ThreadLockBlockInVM jacket is a new ThreadStateTransition class used instead of the ThreadBlockInVM one. This allowed more flexibility to handle the two techniques mentioned above. Also, ThreadBlockInVM calls SafepointMechanism::block_if_requested() which creates some problems when trying to allow safepoints to continue without stopping, since that method not only checks for safepoints but also processes handshakes. In terms of performance, benchmarks show very similar results to what we have now. So far mach5 hs-tier1-6 on Linux, OS X, Windows and Solaris have been tested. 
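As a rough illustration of the layering described above (a thin platform wrapper, with the monitor API adding assertions and a little metadata on top), here is a minimal sketch. It assumes standard C++ primitives in place of the per-platform mutex and condition variable; PlatformMonitorSketch and MonitorSketch are hypothetical names, not the classes in the webrev, and the safepoint handling done by ThreadLockBlockInVM is deliberately left out.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the platform layer: just a mutex plus a condition variable.
class PlatformMonitorSketch {
  std::mutex _mutex;
  std::condition_variable _cond;
 public:
  void lock()   { _mutex.lock(); }
  void unlock() { _mutex.unlock(); }

  // The caller must hold the lock. Waits for a notify or, if millis > 0, for a
  // timeout. Returns true on timeout. Spurious wakeups are possible.
  bool wait(long millis) {
    std::unique_lock<std::mutex> ul(_mutex, std::adopt_lock);
    bool timed_out = false;
    if (millis > 0) {
      timed_out = _cond.wait_for(ul, std::chrono::milliseconds(millis)) ==
                  std::cv_status::timeout;
    } else {
      _cond.wait(ul);
    }
    ul.release();  // keep the mutex locked for the caller
    return timed_out;
  }
  void notify()     { _cond.notify_one(); }
  void notify_all() { _cond.notify_all(); }
};

// Hypothetical monitor built on top: only an owner field and assertions are added.
class MonitorSketch {
  PlatformMonitorSketch _impl;
  std::atomic<std::thread::id> _owner{std::thread::id()};
 public:
  bool owned_by_self() const { return _owner.load() == std::this_thread::get_id(); }
  void lock() {
    _impl.lock();
    _owner.store(std::this_thread::get_id());
  }
  void unlock() {
    assert(owned_by_self());
    _owner.store(std::thread::id());
    _impl.unlock();
  }
  bool wait(long millis) {
    assert(owned_by_self());
    _owner.store(std::thread::id());  // ownership is given up while waiting
    bool timed_out = _impl.wait(millis);
    _owner.store(std::this_thread::get_id());
    return timed_out;
  }
  void notify_all() {
    assert(owned_by_self());
    _impl.notify_all();
  }
};
```

A real implementation would additionally need the thread state transitions around lock() and wait() that the two techniques above describe; this sketch only shows why most of the API can reduce to simple forwarding calls.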
Thanks, Patricio From matthias.baesken at sap.com Mon Jan 28 08:48:52 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Mon, 28 Jan 2019 08:48:52 +0000 Subject: RFR : 8217786: Provide virtualization related info in the hs_error file on linux s390x Message-ID: Hello, please review this change; it adds virtualization related info in the hs_error file on linux s390x. On linux s390x, we usually (always?) run in virtualized environments (LPAR and/or z/VM / KVM). It is helpful, for instance in support cases, to get some information about the virtualized environment in the hs_error file. A lot of info can be taken from the /proc/sysinfo file on linux s390x. Bug/webrev : https://bugs.openjdk.java.net/browse/JDK-8217786 http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.1/ Best regards, Matthias From thomas.stuefe at gmail.com Mon Jan 28 10:38:51 2019 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Mon, 28 Jan 2019 11:38:51 +0100 Subject: RFR : 8217786: Provide virtualization related info in the hs_error file on linux s390x In-Reply-To: References: Message-ID: Hi Matthias, I would reformulate the _scan_and_print_sysinfo_file() function in linux code to be more generic, e.g. by handing down some sort of matching condition. In its simplest form this can be one or a collection of keywords, for example: -- // keywords_to_match - NULL terminated array of keywords static bool print_matching_lines_from_sysinfo_file(outputStream* st, const char* keywords_to_match[]) { .. int i = 0; while (keywords_to_match[i]) { if (strstr(line, keywords_to_match[i]) == line) print line; i++; } } and call this on s390 with: #ifdef s390 const char* kw[] = { "LPAR", "CPU", "VM", NULL }; print_matching_lines_from_sysinfo_file (st, kw); #endif --- That way this coding can be easily reused on other architectures. Alternatively, I would fan out the coding for Linux to the cpu specific files (os_linux_.cpp) and leave all but s390 empty.
But I am personally not fond of those many empty functions. Cheers, Thomas On Mon, Jan 28, 2019 at 9:49 AM Baesken, Matthias wrote: > Hello, please review this change ; it adds virtualization related info > in the hs_error file on linux s390x . > > On linux s390x, we usually (always?) run in virtualized environments > (LPAR and/or z/VM / KVM ). > > It is helpful for instance in support cases to get some information about > the virtualized environment in the hs_error file . > A lot of info can be taken from the /proc/sysinfo file on linux s390x . > > > Bug/webrev : > > https://bugs.openjdk.java.net/browse/JDK-8217786 > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.1/ > > > > Best regards, Matthias > From matthias.baesken at sap.com Mon Jan 28 11:28:36 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Mon, 28 Jan 2019 11:28:36 +0000 Subject: RFR : 8217786: Provide virtualization related info in the hs_error file on linux s390x In-Reply-To: References: Message-ID: Hi Thomas, at first I wanted to do such a generic solution (a function that gets the path to the file + a pattern list); but then I thought that such generic solutions are often not liked in OpenJDK. But let's see what others think; I am open to going for a more generic approach. Best regards, Matthias From: Thomas Stüfe Sent: Monday, 28 January 2019 11:39 To: Baesken, Matthias Cc: hotspot-dev at openjdk.java.net Subject: Re: RFR : 8217786: Provide virtualization related info in the hs_error file on linux s390x Hi Matthias, I would reformulate the _scan_and_print_sysinfo_file() function in linux code to be more generic, e.g. by handing down some sort of matching condition. In its simplest form this can be one or a collection of keywords, for example: -- // keywords_to_match - NULL terminated array of keywords static bool print_matching_lines_from_sysinfo_file(outputStream* st, const char* keywords_to_match[]) { ..
int i = 0; while (keywords_to_match[i]) { if (strstr(line, keywords_to_match[i]) == line) print line; i ++ } } and call this on s390 with: #ifdef s390 const char* kw[] = { "LPAR", "CPU", "VM", NULL }; #endif print_matching_lines_from_sysinfo_file (st, kw); --- That way this coding can be easily reused on other architectures. Alternatively, I would fan out the coding for Linux to the cpu specific files (os_linux_.cpp) and leave all but s390 empty. But I am personally not fond of those many empty functions. Cheers, Thomas On Mon, Jan 28, 2019 at 9:49 AM Baesken, Matthias wrote: Hello, please review this change ; it adds virtualization related info in the hs_error file on linux s390x . On linux s390x, we usually (always?) run in virtualized environments (LPAR and/or z/VM / KVM ). It is helpful for instance in support cases to get some information about the virtualized environment in the hs_error file . A lot of info can be taken from the /proc/sysinfo file on linux s390x . Bug/webrev : https://bugs.openjdk.java.net/browse/JDK-8217786 http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.1/ Best regards, Matthias From david.holmes at oracle.com Mon Jan 28 11:34:30 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 28 Jan 2019 21:34:30 +1000 Subject: RFR : 8217786: Provide virtualization related info in the hs_error file on linux s390x In-Reply-To: References: Message-ID: <816bec8b-b9ca-f0b8-f72d-be6ede83d63b@oracle.com> Hi Matthias, On 28/01/2019 6:48 pm, Baesken, Matthias wrote: > Hello, please review this change ; it adds virtualization related info in the hs_error file on linux s390x . Can't you include this information in an existing section of the error processing code instead of adding a new function that is empty everywhere except Linux? Thanks, David > On linux s390x, we usually (always?) run in virtualized environments (LPAR and/or z/VM / KVM ).
> > It is helpful for instance in support cases to get some information about the virtualized environment in the hs_error file . > A lot of info can be taken from the /proc/sysinfo file on linux s390x . > > > Bug/webrev : > > https://bugs.openjdk.java.net/browse/JDK-8217786 > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.1/ > > > > Best regards, Matthias > From matthias.baesken at sap.com Mon Jan 28 12:23:37 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Mon, 28 Jan 2019 12:23:37 +0000 Subject: RFR : 8217786: Provide virtualization related info in the hs_error file on linux s390x In-Reply-To: <816bec8b-b9ca-f0b8-f72d-be6ede83d63b@oracle.com> References: <816bec8b-b9ca-f0b8-f72d-be6ede83d63b@oracle.com> Message-ID: > > Can't you include this information in an existing section of the error > processing code instead of adding a new function that is empty > everywhere except Linux? > Hi David , do you mean something like #if defined(S390) STEP("printing virtualization info") ... #endif in vmError.cpp ? I thought about doing this. But on the other hand , the now still empty os::pd_print_virtualization_info in platforms != linux might fill over time ( we could add [at least for some platforms] other virtualization related info ). Best regards, Matthias > -----Original Message----- > From: David Holmes > Sent: Montag, 28. Januar 2019 12:35 > To: Baesken, Matthias ; 'hotspot- > dev at openjdk.java.net' > Subject: Re: RFR : 8217786: Provide virtualization related info in the hs_error > file on linux s390x > > Hi Matthias, > > On 28/01/2019 6:48 pm, Baesken, Matthias wrote: > > Hello, please review this change ; it adds virtualization related info in the > hs_error file on linux s390x . > > Can't you include this information in an existing section of the error > processing code instead of adding a new function that is empty > everywhere except Linux? > > Thanks, > David > > > On linux s390x, we usually (always?) 
run in virtualized environments > (LPAR and/or z/VM / KVM ). > > > > It is helpful for instance in support cases to get some information about the > virtualized environment in the hs_error file . > > A lot of info can be taken from the /proc/sysinfo file on linux s390x . > > > > > > Bug/webrev : > > > > https://bugs.openjdk.java.net/browse/JDK-8217786 > > > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.1/ > > > > > > > > Best regards, Matthias > > From robbin.ehn at oracle.com Mon Jan 28 13:04:52 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 28 Jan 2019 14:04:52 +0100 Subject: RFR(XL): 8203469: Faster safepoints In-Reply-To: References: Message-ID: <0ba510c1-6a1a-68f1-811c-0f538c3a472b@oracle.com> Hi all, here is v05. http://cr.openjdk.java.net/~rehn/8203469/v05/ http://cr.openjdk.java.net/~rehn/8203469/v05/inc/ I have been asked to go on-top-of: https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-January/036425.html With a small grace-period. There will be a v06 rebase on-top of that. Updated after comments and changes regarding safepoint_safe(). In JFR code path, thread is always current, so it should not be calling safepoint_safe. It also don't control polls, so even if it returns true it is not safe in that case. Updated to a handshake_safe() private method with a friend for handshakes. Test t1-3, stress testing and JFR. Thanks, Robbin On 1/15/19 11:39 AM, Robbin Ehn wrote: > Hi all, please review. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8203469 > Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/ > > Thanks to Dan for pre-reviewing a lot! > > Background: > ZGC often does very short safepoint operations. For a perspective, in a > specJBB2015 run, G1 can have young collection stops lasting about 170 ms. While > in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which > operation it is. The time it takes to stop and start the JavaThreads is relative > very large to a ZGC safepoint. 
With an operation that just takes 0.2ms the > overhead of stopping and starting JavaThreads is several times the operation. > > High-level functionality change: > Serializing the starting over Threads_lock takes time. > - Don't wait on Threads_lock use the WaitBarrier. > Serializing the stopping over Safepoint_lock takes time. > - Let threads stop in parallel, remove Safepoint_lock. > > Details: > JavaThreads have 2 abstract logical states: unsafe or safe. > - Safe means the JavaThread will not touch Java heap or VM internal structures > without doing a transition and block before doing so. > - The safe states are: > - When polls armed: _thread_in_native and _thread_blocked. > - When Threads_lock is held: externally suspended flag is set. > - VM Thread have polls armed and holds the Threads_lock during a > safepoint. > - Unsafe means that either Java heap or VM internal structures can be accessed > by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. > - All combination that are not safe are unsafe. > > We cannot start a safepoint until all unsafe threads have transitioned to a safe > state. To make them safe, we arm polls in compiled code and make sure any > transition to another unsafe state will be blocked. JavaThreads which are unsafe > with state _thread_in_Java may transition to _thread_in_native without being > blocked, since it just became a safe thread and we can proceed. Any safe thread > may try to transition at any time to an unsafe state, thus coming into the > safepoint blocking code at any moment, e.g., after the safepoint is over, or > even at the beginning of next safepoint. > > The VMThread cannot tolerate false positives from the JavaThread thread state > because that would mean starting the safepoint without all JavaThreads being > safe.
The two locks (Threads_lock and Safepoint_lock) make sure we never observe > false positives from the safepoint blocking code. If we remove them, how do we > handle false positives? > > By first publishing which barrier tag (safepoint counter) we will call > WaitBarrier.wait() with as the thread's safepoint id and then changing the state to > _thread_blocked, the VMThread can ignore JavaThreads by doing a stable load of > the state. A stable load of the thread state is successful if the thread > safepoint id is the same both before and after the load of the state and > the safepoint id is current or InactiveSafepointCounter. If the stable load fails, > the thread is considered safepoint unsafe. It's no longer enough that a thread > has state _thread_blocked; it must also have the correct safepoint id before and > after we read the state. > > Performance: > The result of faster safepoints is that the average CPU time for JavaThreads > between safepoints is higher, thus increasing the allocation rate. The thread > that stops first waits a shorter time until it gets started. Even the thread that > stops last has a shorter stop since we start them faster. If your > application is using a concurrent GC it may need re-tuning since each java > worker thread has an increased CPU time/allocation rate. Often this means max > performance is achieved using slightly fewer java worker threads than before. > Also the increased allocation rate means shorter time between GC safepoints. > - If you are using a non-concurrent GC, you should see improved latency and > throughput. > - After re-tuning with a concurrent GC throughput should be equal or better but > with better latency. But bear in mind this is a latency patch, not a > throughput one. > With current code a java thread is not guaranteed to run between safepoints (in > theory a java thread can be starved indefinitely), since the VM thread may > re-grab the Threads_lock before it woke up from the previous safepoint.
If the > GC/VM don't respect MMU (minimum mutator utilization) or if your machine is very > over-provisioned this can happen. > The current schema thus re-safepoint quickly if the java threads have not > started yet at the cost of latency. Since the new code uses the WaitBarrier with > the safepoint counter, all threads must roll forward to next safepoint by > getting at least some CPU time between two safepoints. Meaning MMU violations > are more obvious. > > Some examples on numbers: > - On a 16 strand machine synchronization and un-synchronization/starting is at > least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and > starting ~400->~100us. > (Semaphore path is a bit slower than futex in the WaitBarrier on Linux). > - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster > synchronization time on 16 strands and ~5% score increase. In this case the GC > op is 1ms, so we reduce the overhead of synchronization from 100% to 10%. > - specJBB2015 ParGC ~9% increase in critical-jops. > > Thanks, Robbin From robbin.ehn at oracle.com Mon Jan 28 13:31:47 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 28 Jan 2019 14:31:47 +0100 Subject: RFR: 8210832: Remove sneaky locking in class Monitor In-Reply-To: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> Message-ID: Hi Patricio, Mostly looks good! block_at_safepoint is always called with block_in_safepoint_check = true. (correct?) Changing that to a local state instead of global simplifies the code. So I'm suggesting something like below.
Thanks, Robbin diff -r e65cc445234c src/hotspot/share/runtime/interfaceSupport.inline.hpp --- a/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon Jan 28 13:10:15 2019 +0100 +++ b/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon Jan 28 14:10:59 2019 +0100 @@ -308,2 +308,1 @@ - thread->block_in_safepoint_check = false; - SafepointMechanism::block_at_safepoint(thread); + SafepointMechanism::callback_if_safepoint(thread); @@ -323,2 +322,1 @@ - SafepointMechanism::block_at_safepoint(_thread); - _thread->block_in_safepoint_check = true; + SafepointMechanism::callback_if_safepoint(_thread); @@ -335,2 +332,0 @@ - } else { - _thread->block_in_safepoint_check = true; @@ -337,0 +334,1 @@ + diff -r e65cc445234c src/hotspot/share/runtime/safepoint.cpp --- a/src/hotspot/share/runtime/safepoint.cpp Mon Jan 28 13:10:15 2019 +0100 +++ b/src/hotspot/share/runtime/safepoint.cpp Mon Jan 28 14:10:59 2019 +0100 @@ -795,1 +795,1 @@ -void SafepointSynchronize::block(JavaThread *thread) { +void SafepointSynchronize::block(JavaThread *thread, bool block_in_safepoint_check) { @@ -850,1 +850,1 @@ - if (thread->block_in_safepoint_check) { + if (block_in_safepoint_check) { @@ -880,1 +880,1 @@ - thread->block_in_safepoint_check) { + block_in_safepoint_check) { diff -r e65cc445234c src/hotspot/share/runtime/safepoint.hpp --- a/src/hotspot/share/runtime/safepoint.hpp Mon Jan 28 13:10:15 2019 +0100 +++ b/src/hotspot/share/runtime/safepoint.hpp Mon Jan 28 14:10:59 2019 +0100 @@ -146,1 +146,1 @@ - static void block(JavaThread *thread); + static void block(JavaThread *thread, bool block_in_safepoint_check = true); diff -r e65cc445234c src/hotspot/share/runtime/safepointMechanism.hpp --- a/src/hotspot/share/runtime/safepointMechanism.hpp Mon Jan 28 13:10:15 2019 +0100 +++ b/src/hotspot/share/runtime/safepointMechanism.hpp Mon Jan 28 14:10:59 2019 +0100 @@ -82,1 +82,1 @@ - static inline void block_at_safepoint(JavaThread* thread); + static inline void callback_if_safepoint(JavaThread* 
thread); diff -r e65cc445234c src/hotspot/share/runtime/safepointMechanism.inline.hpp --- a/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon Jan 28 13:10:15 2019 +0100 +++ b/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon Jan 28 14:10:59 2019 +0100 @@ -82,1 +82,1 @@ -void SafepointMechanism::block_at_safepoint(JavaThread* thread) { +void SafepointMechanism::callback_if_safepoint(JavaThread* thread) { @@ -84,1 +84,1 @@ - SafepointSynchronize::block(thread); + SafepointSynchronize::block(thread, false); diff -r e65cc445234c src/hotspot/share/runtime/thread.cpp --- a/src/hotspot/share/runtime/thread.cpp Mon Jan 28 13:10:15 2019 +0100 +++ b/src/hotspot/share/runtime/thread.cpp Mon Jan 28 14:10:59 2019 +0100 @@ -298,2 +297,0 @@ - block_in_safepoint_check = true; - diff -r e65cc445234c src/hotspot/share/runtime/thread.hpp --- a/src/hotspot/share/runtime/thread.hpp Mon Jan 28 13:10:15 2019 +0100 +++ b/src/hotspot/share/runtime/thread.hpp Mon Jan 28 14:10:59 2019 +0100 @@ -788,2 +787,0 @@ - bool block_in_safepoint_check; // to decide whether to block in SS::block or not - On 1/28/19 9:42 AM, Patricio Chilano wrote: > Hi all, > > Please review the following patch: > > Bug URL: https://bugs.openjdk.java.net/browse/JDK-8210832 > Webrev: http://cr.openjdk.java.net/~pchilanomate/8210832/v01/webrev/ > > The current implementation of native monitors uses a technique that we name > "sneaky locking" to prevent possible deadlocks of the JVM during safepoints. The > implementation of this technique though introduces a race when a monitor is > shared between the VMThread and non-JavaThreads. This patch aims to solve that > problem and at the same time simplify the code. > > The proposal is based on the introduction of the new class PlatformMonitor, > which serves as a wrapper for the actual synchronization primitives in each > platform (mutexes and condition variables). 
Most of the API calls can thus be > implemented as simple wrappers around PlatformMonitor, adding more assertions > and very little extra metadata. > To be able to remove the lock sneaking code and at the same time avoid > deadlocking scenarios, we combine two techniques: > > -When a JavaThread that has just acquired the lock, detects there is a safepoint > request in the ThreadLockBlockInVM destructor, it releases the lock before > blocking at the safepoint. After resuming from it, the JavaThread will have to > acquire the lock again. > > - In the ThreadLockBlockInVM constructor for the Monitor::wait() method, in > order to avoid blocking we allow for a possible safepoint request to make > progress but without letting the JavaThread block for it (since we would be > stopped by the destructor anyways). We also do that for the Monitor::lock() case > although no deadlock is being prevented there. > > The ThreadLockBlockInVM jacket is a new ThreadStateTransition class used instead > of the ThreadBlockInVM one. This allowed more flexibility to handle the two > techniques mentioned above. Also, ThreadBlockInVM calls > SafepointMechanism::block_if_requested() which creates some problems when trying > to allow safepoints to continue without stopping, since that method not only > checks for safepoints but also processes handshakes. > > In terms of performance, benchmarks show very similar results to what we have now. > > So far mach5 hs-tier1-6 on Linux, OS X, Windows and Solaris have been tested. > > Thanks, > Patricio > From erik.osterlund at oracle.com Mon Jan 28 13:56:28 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Mon, 28 Jan 2019 14:56:28 +0100 Subject: 8216541: CompiledICHolders of VM locked unloaded nmethods are released too late Message-ID: <0345dd70-fa19-a7e3-7892-8f4500f8339f@oracle.com> Hi, There is a bit of an anomaly how we deal with freeing up resources used by inline caches when an nmethod dies. 
An inline cache might have an associated ICStub if it is in a transitional state. But it might also have a CompiledICHolder if it is referring to a c2i adapter or an itable stub. These resources need to be freed up when an nmethod dies. But they get cleared at completely different times. As for IC stubs, they get "cleared" when is_alive() nmethods get converted to zombie. As for nmethods being made unloaded, they will (presumably) not have IC caches in transitional states. As for CompiledICHolders, rather than clearing them out when the nmethod dies (transitions to zombie or unloaded), as we did with IC stubs, we defer clearing of CompiledICHolders until the nmethod gets deleted, or "flushed". Because reasons. However, unless precaution is taken, it is not safe to find and clear out CompiledICHolders when the nmethod gets deleted (as opposed to when the nmethod died), due to how we detect there is a CompiledICHolder associated with the CompiledIC. We look at the destination of the call instruction, and ask said code blob if it is an adapter blob. If it is an adapter blob, it is assumed that the CompiledIC must have a CompiledICHolder associated with it. What can happen then is that between the nmethod dying (in particular due to becoming unloaded), and the nmethod being freed, the slot in the CodeCache that the stale destination pointed at, has had a c2i adapter allocated over it. So asking the question whether a CompiledIC has a CompiledICHolder is not safe in general, if the destination is stale, and points at dead nmethods. It will then have false positives. In order to deal with that awkwardness, rather than clearing CompiledICHolders when the nmethod dies and it is perfectly safe to query whether it has a CompiledICHolder or not, we insist on deferring clearing of CompiledICHolders to when the nmethod dies. To make that safe and avoid the false positives, the sweeper cleans CompiledICs of unloaded nmethods... most of the time... 
...except when they are locked in VM. Then we don't do that. So when the sweeper finds unloaded nmethods that are locked in VM, then it skips cleaning of CompiledICs one cycle, its (stale) destination possibly gets reclaimed and possibly has a c2i adapter allocated in the same memory location as that stale destination points at. When we subsequently "flush" the nmethod, we incorrectly find that it has a CompiledICHolder (due to its destination pointing at a c2i adapter blob), even though it never had a CompiledICHolder when it died. Ouch. It goes downhill from there. It happens to be that when using code loading/unloading JVMTI events, this bug is more easily provoked. The reason for this is that the events lock the nmethods with the nmethodLocker until some event is being processed in the service thread, making the sweeper skip over cleaning the ICs of the unloaded nmethods in this corner case of the nmethod life cycle. My proposed solution to this bug is to simply nuke all CompiledIC metadata, whether that is ICStubs or CompiledICHolders, the instant the nmethod dies (transitions to zombie or unloaded), as well as setting state of such CompiledICs to clean. It seems much more fragile to have to maintain dead CompiledICs only for the sake of releasing CompiledICHolders when freeing the nmethod rather than when it dies. It also seems unnecessarily complicated to free CompiledICHolders and ICStubs of CompiledICs to get cleared at different times. By just clearing all of this when the nmethods die, we remove this complexity. Bug: https://bugs.openjdk.java.net/browse/JDK-8216541 Webrev: http://cr.openjdk.java.net/~eosterlund/8216541/webrev.00/ Going forward, I intend to perform a bunch of cleanups in this area, but I wanted to keep the change small ish for the bug fix, to make it easier to backport. It will probably need to get backported pretty far back. I don't think this has ever worked I'm afraid. 
The proposed change has survived 200 rounds of kitchensink, hs-tier1-3 and hs-precheckin-comp. Thanks, /Erik From bsrbnd at gmail.com Mon Jan 28 15:20:37 2019 From: bsrbnd at gmail.com (B. Blaser) Date: Mon, 28 Jan 2019 16:20:37 +0100 Subject: test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java failing with -XX:+UseShenandoahGC on x86_64 In-Reply-To: References: Message-ID: I've seen this has just been pushed to 12 but I think we'd need this to 13 too, but: https://bugs.openjdk.java.net/browse/JDK-8217873 has been resolved as duplicate. I can help reopening it and pushing the fix to 13 if necessary? Thanks, Bernard On Sat, 26 Jan 2019 at 16:47, B. Blaser wrote: > > Hi, > > Maybe you're already aware of this failing test, but it seems that > -XX:+UnlockExperimentalVMOptions is necessary with > -XX:+UseShenandoahGC as here under. > > Regards, > Bernard > > diff --git a/test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java > b/test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java > --- a/test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java > +++ b/test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java > @@ -1,5 +1,5 @@ > /* > - * Copyright (c) 2014, 2018, Oracle and/or its affiliates. All rights reserved. > + * Copyright (c) 2014, 2019, Oracle and/or its affiliates. All rights reserved. > * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. > * > * This code is free software; you can redistribute it and/or modify it > @@ -63,7 +63,7 @@ > testCompressedOopsModes(args, "-XX:+UseParallelGC"); > testCompressedOopsModes(args, "-XX:+UseParallelOldGC"); > if (GC.Shenandoah.isSupported()) { > - testCompressedOopsModes(args, "-XX:+UseShenandoahGC"); > + testCompressedOopsModes(args, > "-XX:+UnlockExperimentalVMOptions", "-XX:+UseShenandoahGC"); > } > } From daniel.daugherty at oracle.com Mon Jan 28 15:39:08 2019 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Mon, 28 Jan 2019 10:39:08 -0500 Subject: test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java failing with -XX:+UseShenandoahGC on x86_64 In-Reply-To: References: Message-ID: <9eb1d081-8cd5-eb13-48f0-7ea4c26fa229@oracle.com> It will get pushed to jdk/jdk (for JDK13) on Jesper's next sync from jdk/jdk12 -> jdk/jdk. Dan On 1/28/19 10:20 AM, B. Blaser wrote: > I've seen this has just been pushed to 12 but I think we'd need this > to 13 too, but: > > https://bugs.openjdk.java.net/browse/JDK-8217873 > > has been resolved as duplicate. > > I can help reopening it and pushing the fix to 13 if necessary? > > Thanks, > Bernard > > On Sat, 26 Jan 2019 at 16:47, B. Blaser wrote: >> Hi, >> >> Maybe you're already aware of this failing test, but it seems that >> -XX:+UnlockExperimentalVMOptions is necessary with >> -XX:+UseShenandoahGC as here under. >> >> Regards, >> Bernard >> >> diff --git a/test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java >> b/test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java >> --- a/test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java >> +++ b/test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java >> @@ -1,5 +1,5 @@ >> /* >> - * Copyright (c) 2014, 2018, Oracle and/or its affiliates. All rights reserved. >> + * Copyright (c) 2014, 2019, Oracle and/or its affiliates. All rights reserved. >> * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
>> * >> * This code is free software; you can redistribute it and/or modify it >> @@ -63,7 +63,7 @@ >> testCompressedOopsModes(args, "-XX:+UseParallelGC"); >> testCompressedOopsModes(args, "-XX:+UseParallelOldGC"); >> if (GC.Shenandoah.isSupported()) { >> - testCompressedOopsModes(args, "-XX:+UseShenandoahGC"); >> + testCompressedOopsModes(args, >> "-XX:+UnlockExperimentalVMOptions", "-XX:+UseShenandoahGC"); >> } >> } From rkennke at redhat.com Mon Jan 28 16:16:11 2019 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 28 Jan 2019 17:16:11 +0100 Subject: test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java failing with -XX:+UseShenandoahGC on x86_64 In-Reply-To: References: Message-ID: Isn't anything pushed to 12 (kindof) automatically pushed to 13 too, eventually? Roman > I've seen this has just been pushed to 12 but I think we'd need this > to 13 too, but: > > https://bugs.openjdk.java.net/browse/JDK-8217873 > > has been resolved as duplicate. > > I can help reopening it and pushing the fix to 13 if necessary? > > Thanks, > Bernard > > On Sat, 26 Jan 2019 at 16:47, B. Blaser wrote: >> >> Hi, >> >> Maybe you're already aware of this failing test, but it seems that >> -XX:+UnlockExperimentalVMOptions is necessary with >> -XX:+UseShenandoahGC as here under. >> >> Regards, >> Bernard >> >> diff --git a/test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java >> b/test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java >> --- a/test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java >> +++ b/test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java >> @@ -1,5 +1,5 @@ >> /* >> - * Copyright (c) 2014, 2018, Oracle and/or its affiliates. All rights reserved. >> + * Copyright (c) 2014, 2019, Oracle and/or its affiliates. All rights reserved. >> * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
>> * >> * This code is free software; you can redistribute it and/or modify it >> @@ -63,7 +63,7 @@ >> testCompressedOopsModes(args, "-XX:+UseParallelGC"); >> testCompressedOopsModes(args, "-XX:+UseParallelOldGC"); >> if (GC.Shenandoah.isSupported()) { >> - testCompressedOopsModes(args, "-XX:+UseShenandoahGC"); >> + testCompressedOopsModes(args, >> "-XX:+UnlockExperimentalVMOptions", "-XX:+UseShenandoahGC"); >> } >> } From shade at redhat.com Mon Jan 28 16:48:44 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 28 Jan 2019 17:48:44 +0100 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump Message-ID: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> RFE: https://bugs.openjdk.java.net/browse/JDK-8217879 Fix: http://cr.openjdk.java.net/~shade/8217879/webrev.01/ "Instructions" block is useful when following up on hs_errs that happened without the disassembler attached, which is usually the case coming from users. One can use the disassembler [1] to look around the code that was crashing, and get extended conjectures why the error happened, including rewinding a bit of history. However, current window is sometimes too small to infer enough context. I propose we extend it! The patch also commons the paths across OS/Arch-specific files to that current "delta" appears less of the magic number. Plus, it adds cr()-s for consistency across the arches and within the methods. Testing: eyeballing hs_errs from artificial crashes, Linux x86_64 build, jdk-submit Thanks, -Aleksey [1] I use https://onlinedisassembler.com, for example. From patricio.chilano.mateo at oracle.com Mon Jan 28 19:18:13 2019 From: patricio.chilano.mateo at oracle.com (Patricio Chilano) Date: Mon, 28 Jan 2019 14:18:13 -0500 Subject: RFR: 8210832: Remove sneaky locking in class Monitor In-Reply-To: References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> Message-ID: <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> Hi Robbin, Thanks for reviewing this! 
Removing the block_in_safepoint_check thread local attribute is a great idea, here is v02: Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/webrev Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/inc/webrev/ Running mach5 again. Thanks, Patricio On 1/28/19 8:31 AM, Robbin Ehn wrote: > Hi Patricio, > > Mostly looks good! > > block_at_safepoint is always called with block_in_safepoint_check = > true. (correct?) > Changing that to a local state instead of global simplifies the code. > > So I'm suggesting something like below. > > Thanks, Robbin > > diff -r e65cc445234c > src/hotspot/share/runtime/interfaceSupport.inline.hpp > --- a/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon Jan > 28 13:10:15 2019 +0100 > +++ b/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon Jan > 28 14:10:59 2019 +0100 > @@ -308,2 +308,1 @@ > - thread->block_in_safepoint_check = false; > - SafepointMechanism::block_at_safepoint(thread); > + SafepointMechanism::callback_if_safepoint(thread); > @@ -323,2 +322,1 @@ > - SafepointMechanism::block_at_safepoint(_thread); > - _thread->block_in_safepoint_check = true; > + SafepointMechanism::callback_if_safepoint(_thread); > @@ -335,2 +332,0 @@ > - } else { > - _thread->block_in_safepoint_check = true; > @@ -337,0 +334,1 @@ > + > diff -r e65cc445234c src/hotspot/share/runtime/safepoint.cpp > --- a/src/hotspot/share/runtime/safepoint.cpp Mon Jan 28 13:10:15 > 2019 +0100 > +++ b/src/hotspot/share/runtime/safepoint.cpp Mon Jan 28 14:10:59 > 2019 +0100 > @@ -795,1 +795,1 @@ > -void SafepointSynchronize::block(JavaThread *thread) { > +void SafepointSynchronize::block(JavaThread *thread, bool > block_in_safepoint_check) { > @@ -850,1 +850,1 @@ > - if (thread->block_in_safepoint_check) { > + if (block_in_safepoint_check) { > @@ -880,1 +880,1 @@ > - thread->block_in_safepoint_check) { > +
block_in_safepoint_check) { > diff -r e65cc445234c src/hotspot/share/runtime/safepoint.hpp > --- a/src/hotspot/share/runtime/safepoint.hpp??? Mon Jan 28 13:10:15 > 2019 +0100 > +++ b/src/hotspot/share/runtime/safepoint.hpp??? Mon Jan 28 14:10:59 > 2019 +0100 > @@ -146,1 +146,1 @@ > -? static void?? block(JavaThread *thread); > +? static void?? block(JavaThread *thread, bool > block_in_safepoint_check = true); > diff -r e65cc445234c src/hotspot/share/runtime/safepointMechanism.hpp > --- a/src/hotspot/share/runtime/safepointMechanism.hpp??? Mon Jan 28 > 13:10:15 2019 +0100 > +++ b/src/hotspot/share/runtime/safepointMechanism.hpp??? Mon Jan 28 > 14:10:59 2019 +0100 > @@ -82,1 +82,1 @@ > -? static inline void block_at_safepoint(JavaThread* thread); > +? static inline void callback_if_safepoint(JavaThread* thread); > diff -r e65cc445234c > src/hotspot/share/runtime/safepointMechanism.inline.hpp > --- a/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon Jan > 28 13:10:15 2019 +0100 > +++ b/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon Jan > 28 14:10:59 2019 +0100 > @@ -82,1 +82,1 @@ > -void SafepointMechanism::block_at_safepoint(JavaThread* thread) { > +void SafepointMechanism::callback_if_safepoint(JavaThread* thread) { > @@ -84,1 +84,1 @@ > -??? SafepointSynchronize::block(thread); > +??? SafepointSynchronize::block(thread, false); > diff -r e65cc445234c src/hotspot/share/runtime/thread.cpp > --- a/src/hotspot/share/runtime/thread.cpp??? Mon Jan 28 13:10:15 2019 > +0100 > +++ b/src/hotspot/share/runtime/thread.cpp??? Mon Jan 28 14:10:59 2019 > +0100 > @@ -298,2 +297,0 @@ > -? block_in_safepoint_check = true; > - > diff -r e65cc445234c src/hotspot/share/runtime/thread.hpp > --- a/src/hotspot/share/runtime/thread.hpp??? Mon Jan 28 13:10:15 2019 > +0100 > +++ b/src/hotspot/share/runtime/thread.hpp??? Mon Jan 28 14:10:59 2019 > +0100 > @@ -788,2 +787,0 @@ > -? bool block_in_safepoint_check;????????????? 
// to decide whether to > block in SS::block or not > - > > > On 1/28/19 9:42 AM, Patricio Chilano wrote: >> Hi all, >> >> Please review the following patch: >> >> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8210832 >> Webrev: http://cr.openjdk.java.net/~pchilanomate/8210832/v01/webrev/ >> >> The current implementation of native monitors uses a technique that >> we name "sneaky locking" to prevent possible deadlocks of the JVM >> during safepoints. The implementation of this technique though >> introduces a race when a monitor is shared between the VMThread and >> non-JavaThreads. This patch aims to solve that problem and at the >> same time simplify the code. >> >> The proposal is based on the introduction of the new class >> PlatformMonitor, which serves as a wrapper for the actual >> synchronization primitives in each platform (mutexes and condition >> variables). Most of the API calls can thus be implemented as simple >> wrappers around PlatformMonitor, adding more assertions and very >> little extra metadata. >> To be able to remove the lock sneaking code and at the same time >> avoid deadlocking scenarios, we combine two techniques: >> >> -When a JavaThread that has just acquired the lock, detects there is >> a safepoint request in the ThreadLockBlockInVM destructor, it >> releases the lock before blocking at the safepoint. After resuming >> from it, the JavaThread will have to acquire the lock again. >> >> - In the ThreadLockBlockInVM constructor for the Monitor::wait() >> method, in order to avoid blocking we allow for a possible safepoint >> request to make progress but without letting the JavaThread block for >> it (since we would be stopped by the destructor anyways). We also do >> that for the Monitor::lock() case although no deadlock is being >> prevented there. >> >> The ThreadLockBlockInVM jacket is a new ThreadStateTransition class >> used instead of the ThreadBlockInVM one. 
This allowed more >> flexibility to handle the two techniques mentioned above. Also, >> ThreadBlockInVM calls SafepointMechanism::block_if_requested() which >> creates some problems when trying to allow safepoints to continue >> without stopping, since that method not only checks for safepoints >> but also processes handshakes. >> >> In terms of performance, benchmarks show very similar results to what >> we have now. >> >> So far mach5 hs-tier1-6 on Linux, OS X, Windows and Solaris have been >> tested. >> >> Thanks, >> Patricio >> From zgu at redhat.com Mon Jan 28 19:20:17 2019 From: zgu at redhat.com (zgu at redhat.com) Date: Mon, 28 Jan 2019 14:20:17 -0500 Subject: RFR(XXS) 8217785: Padding ParallelTaskTerminator::_offerred_termination variable Message-ID: <1548703217.31327.58.camel@redhat.com> Hi, Could I have reviews for this small enhancement, that pads _offer_termination variable into a separate cacheline? cause the variable may be highly contended during task termination. Bug: https://bugs.openjdk.java.net/browse/JDK-8217785 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8217785/webrev.00/ Test: hotspot_gc (+/- UseOWSTTaskTerminator) on Linux x64 (fastdebug and release) Thanks, -Zhengyu From shade at redhat.com Mon Jan 28 19:38:52 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 28 Jan 2019 20:38:52 +0100 Subject: RFR(XXS) 8217785: Padding ParallelTaskTerminator::_offerred_termination variable In-Reply-To: <1548703217.31327.58.camel@redhat.com> References: <1548703217.31327.58.camel@redhat.com> Message-ID: <7680c85f-f105-2142-0c76-5ba07f4978e5@redhat.com> On 1/28/19 8:20 PM, zgu at redhat.com wrote: > Hi, > > Could I have reviews for this small enhancement, that pads > _offer_termination variable into a separate cacheline? cause the > variable may be highly contended during task termination. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8217785 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8217785/webrev.00/ This looks fine to me. 
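A minimal sketch of the padding technique under review, assuming a 64-byte cache line and using standard `alignas` rather than HotSpot's own padding macros. `PaddedCounter`, `Terminator`, and `kAssumedCacheLineSize` are hypothetical names for illustration only; they are not the identifiers used in the webrev.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: 64 bytes is a common cache line size; the real value is
// platform dependent.
constexpr std::size_t kAssumedCacheLineSize = 64;

// A counter that occupies a whole cache line by itself, so that writes to it
// do not invalidate the line holding neighbouring fields (false sharing).
struct alignas(kAssumedCacheLineSize) PaddedCounter {
  volatile unsigned value = 0;
};

// Hypothetical terminator layout: the contended counter sits between two
// read-mostly fields but shares a cache line with neither of them.
struct Terminator {
  unsigned n_threads;
  PaddedCounter offered_termination;
  void* queue_set;
};
```

With this layout, threads that spin on `offered_termination` no longer contend with readers of `n_threads` or `queue_set`, at the cost of a larger object.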
-Aleksey From zgu at redhat.com Mon Jan 28 19:48:32 2019 From: zgu at redhat.com (zgu at redhat.com) Date: Mon, 28 Jan 2019 14:48:32 -0500 Subject: RFR(XXS) 8217785: Padding ParallelTaskTerminator::_offerred_termination variable In-Reply-To: <7680c85f-f105-2142-0c76-5ba07f4978e5@redhat.com> References: <1548703217.31327.58.camel@redhat.com> <7680c85f-f105-2142-0c76-5ba07f4978e5@redhat.com> Message-ID: <1548704912.31327.61.camel@redhat.com> Thanks, Aleksey! Could you also help reviewing following two related changes? [1] https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2019-January/024702.html [2] https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2019-January/024707.html -Zhengyu On Mon, 2019-01-28 at 20:38 +0100, Aleksey Shipilev wrote: > On 1/28/19 8:20 PM, zgu at redhat.com wrote: > > Hi, > > > > Could I have reviews for this small enhancement, that pads > > _offer_termination variable into a separate cacheline? cause the > > variable may be highly contended during task termination. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8217785 > > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8217785/webrev.00/ > > This looks fine to me. > > -Aleksey > > From thomas.stuefe at gmail.com Mon Jan 28 19:51:19 2019 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Mon, 28 Jan 2019 20:51:19 +0100 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> Message-ID: Hi Alexey, I agree that a larger dump would be helpful, I had the same thought in the past. I know others prefer slim hs-err files, but they would have to chime in. Bikeshedding: I do not like naming that thing os::print_instructions(), since we do not do that, we just print a hex dump. Especially on x86, where the "unitsize" parameter is confusing because we have a variable sizes instruction set. 
I would prefer something like this: os::print_hex_dump_surrounding(address pivot, size_t len, size_t unitsize) and let that function print the pivot address up front (obviously avoiding to name it "pc"), then the dump (+- len around pivot) and maybe a little ">" marker at the start of the pivot line. -- In addition, I never liked that "os::print_context" even calls this. The function name sounds harmless, like a simple "print the ucontext structure" but is potentially dangerous since it just dereferences whatever it finds in pc. So even though it looks that way it is by no means a general purpose function, but only usable in error reporting. Your patch makes it more dangerous since the printed area now is larger and so we run a larger risk in segfaulting. I have a version of os::print_hex_dump locally somewhere which uses SafeFetch32 to print out the hex dump, printing little "?" for unmapped memory. I think that would be the correct way. Also I think that the print-instructions and print-top-pf-stack parts should be removed from the os::print_context() and called directly from vmError.cpp. But I understand if you want to leave this improvement to some other time. --- Patch looks fine otherwise. Cheers, Thomas On Mon, Jan 28, 2019 at 5:50 PM Aleksey Shipilev wrote: > RFE: > https://bugs.openjdk.java.net/browse/JDK-8217879 > > Fix: > http://cr.openjdk.java.net/~shade/8217879/webrev.01/ > > "Instructions" block is useful when following up on hs_errs that happened > without the disassembler > attached, which is usually the case coming from users. One can use the > disassembler [1] to look > around the code that was crashing, and get extended conjectures why the > error happened, including > rewinding a bit of history. However, current window is sometimes too small > to infer enough context. > I propose we extend it! > > The patch also commons the paths across OS/Arch-specific files to that > current "delta" appears less > of the magic number. 
Plus, it adds cr()-s for consistency across the > arches and within the methods. > > Testing: eyeballing hs_errs from artificial crashes, Linux x86_64 build, > jdk-submit > > Thanks, > -Aleksey > > [1] I use https://onlinedisassembler.com, for example. > > From coleen.phillimore at oracle.com Mon Jan 28 20:37:41 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 28 Jan 2019 15:37:41 -0500 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> Message-ID: <29e645ac-983a-6d69-10a4-1bdd2e09f3f3@oracle.com> Can you attach a version of the hs_err file produced? I prefer a slimmer hs_err file at least in the beginning so I don't have to scroll pages to find the native stack trace. Especially for triaging purposes. Thanks, Coleen On 1/28/19 11:48 AM, Aleksey Shipilev wrote: > RFE: > https://bugs.openjdk.java.net/browse/JDK-8217879 > > Fix: > http://cr.openjdk.java.net/~shade/8217879/webrev.01/ > > "Instructions" block is useful when following up on hs_errs that happened without the disassembler > attached, which is usually the case coming from users. One can use the disassembler [1] to look > around the code that was crashing, and get extended conjectures why the error happened, including > rewinding a bit of history. However, current window is sometimes too small to infer enough context. > I propose we extend it! > > The patch also commons the paths across OS/Arch-specific files to that current "delta" appears less > of the magic number. Plus, it adds cr()-s for consistency across the arches and within the methods. > > Testing: eyeballing hs_errs from artificial crashes, Linux x86_64 build, jdk-submit > > Thanks, > -Aleksey > > [1] I use https://onlinedisassembler.com, for example. 
> From shade at redhat.com Mon Jan 28 20:47:06 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 28 Jan 2019 21:47:06 +0100 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: <29e645ac-983a-6d69-10a4-1bdd2e09f3f3@oracle.com> References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> <29e645ac-983a-6d69-10a4-1bdd2e09f3f3@oracle.com> Message-ID: On 1/28/19 9:37 PM, coleen.phillimore at oracle.com wrote: > Can you attach a version of the hs_err file produced? I prefer a slimmer hs_err file at least in > the beginning so I don't have to scroll pages to find the native stack trace. Especially for > triaging purposes. "Instructions" block is all the way below the stack trace in hs_err. And the block is still dense, it used to be just 4 lines of data, now it's 32 lines of data. See here: http://cr.openjdk.java.net/~shade/8217879/hs_err_sample.log -Aleksey From shade at redhat.com Mon Jan 28 20:55:55 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 28 Jan 2019 21:55:55 +0100 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> Message-ID: <0799f64f-c263-005a-7196-6e244ea37a4a@redhat.com> On 1/28/19 8:51 PM, Thomas Stüfe wrote: > I agree that a larger dump would be helpful, I had the same thought in the past. I know others > prefer slim hs-err files, but they would have to chime in. There is a bespoke rule in Hotspot that choosing between more asserts and fastdebug performance one should choose more asserts. 
I am mentally extending this to hs_err: choosing between more debugging information and slimmer hs_err one should choose more debugging information :) > os::print_hex_dump_surrounding(address pivot, size_t len, size_t unitsize) > > and let that function print the pivot address up front (obviously avoiding to name it "pc"), then > the dump (+- len around pivot) and maybe a little ">" marker at the start of the pivot line. ...except current function outputs "Instructions" header, so it is dumping raw instruction stream. I used to call it os::print_hex_dump_near, but renamed it because of that "Instructions" header. > In addition, I never liked that "os::print_context" even calls this. The function name sounds > harmless, like a simple "print the ucontext structure" but is potentially dangerous since it just > dereferences whatever it finds in pc. So even though it looks that way it is by no means a general > purpose function, but only usable in error reporting. Your patch makes it more dangerous since the > printed area now is larger and so we run a larger risk in segfaulting. > > I have a version of os::print_hex_dump locally somewhere which uses SafeFetch32 to print out the hex > dump, printing little "?" for unmapped memory. I think that would be the correct way. Yes, my early version had something like this: address bottom = pc - delta; address top = pc + delta; while (pc > bottom && !is_readable_pointer(bottom)) bottom++; while (pc < top && !is_readable_pointer(top)) top--; os::print_hex_dump(bottom, top, ...); ...which makes it safe by reducing the dump window to only readable memory. But then I realized that vmError machinery would not trip hard on this failure, and would just print this message: Instructions: (pc=0x0000000000000000) 0x0000000000000000: [error occurred during error reporting (printing registers, top of stack, instructions near pc), id 0xb, SIGSEGV (0xb) at pc=0x00007f9e0a6e59a5] ...so I backed off for simplicity. 
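The two safety nets discussed in this thread can be sketched roughly as follows. This is a standalone illustration, not HotSpot code: `SafeFetchByte` is a hypothetical stand-in for a SafeFetch-style primitive, `clamp_window` mirrors the window-shrinking idea (using an exclusive upper bound, which differs slightly from the snippet above), and `hex_dump` mirrors the idea of printing a placeholder for unmapped bytes. The sketch assumes `pc >= delta`, and `clamp_window` only probes the window edges, so holes inside the window would still need the per-byte check done by `hex_dump`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <functional>
#include <string>

// Hypothetical stand-in for a SafeFetch-style primitive: stores the byte at
// addr into *out and returns true, or returns false if addr is unreadable.
using SafeFetchByte = std::function<bool(uintptr_t addr, uint8_t* out)>;

struct Window {
  uintptr_t bottom;  // first address to dump (inclusive)
  uintptr_t top;     // end of the dump (exclusive)
};

// Variant 1: shrink [pc - delta, pc + delta) toward pc until both edges
// are readable. Only the edges are probed, not the interior.
Window clamp_window(uintptr_t pc, size_t delta, const SafeFetchByte& fetch) {
  uint8_t ignored;
  uintptr_t bottom = pc - delta;
  uintptr_t top = pc + delta;
  while (bottom < pc && !fetch(bottom, &ignored)) bottom++;
  while (top > pc && !fetch(top - 1, &ignored)) top--;
  return Window{bottom, top};
}

// Variant 2: keep the requested window and print "??" for every byte that
// cannot be read, so the dump never dereferences unmapped memory directly.
std::string hex_dump(uintptr_t bottom, uintptr_t top, const SafeFetchByte& fetch) {
  std::string out;
  char buf[4];
  for (uintptr_t a = bottom; a < top; a++) {
    uint8_t b;
    if (fetch(a, &b)) {
      std::snprintf(buf, sizeof(buf), "%02x", static_cast<unsigned>(b));
      out += buf;
    } else {
      out += "??";
    }
    if (a + 1 < top) out += ' ';
  }
  return out;
}
```

Variant 2 keeps the dump aligned around the pivot even when part of the window is unmapped, which may make the output easier to paste into a disassembler; variant 1 produces a shorter dump with no placeholders.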
I can reinstate this safety net. -Aleksey From coleen.phillimore at oracle.com Mon Jan 28 20:59:28 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 28 Jan 2019 15:59:28 -0500 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> <29e645ac-983a-6d69-10a4-1bdd2e09f3f3@oracle.com> Message-ID: <4d84bfd0-02a0-58f8-b8f8-61104df26deb@oracle.com> This seems fine. I was looking at this page: https://onlinedisassembler.com/odaweb/ What would be nice if the instructions looked like: Instructions: (pc=0x00007f911d0d6053, 0x00007f911d0d6143) and then just the hex dump, then I could cut/paste it into the window in that tool. Or is there another way? thanks, Coleen On 1/28/19 3:47 PM, Aleksey Shipilev wrote: > On 1/28/19 9:37 PM, coleen.phillimore at oracle.com wrote: >> Can you attach a version of the hs_err file produced? I prefer a slimmer hs_err file at least in >> the beginning so I don't have to scroll pages to find the native stack trace. Especially for >> triaging purposes. > "Instructions" block is all the way below the stack trace in hs_err. And the block is still dense, > it used to be just 4 lines of data, now it's 32 lines of data. See here: > http://cr.openjdk.java.net/~shade/8217879/hs_err_sample.log > > -Aleksey > > From daniel.daugherty at oracle.com Mon Jan 28 21:29:24 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 28 Jan 2019 16:29:24 -0500 Subject: RFR: 8210832: Remove sneaky locking in class Monitor In-Reply-To: <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> Message-ID: On 1/28/19 2:18 PM, Patricio Chilano wrote: > Hi Robbin, > > Thanks for reviewing this! 
Removing the block_in_safepoint_check > thread local attribute is a great idea, here is v02: > > Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/webrev src/hotspot/os/posix/os_posix.cpp No comments. src/hotspot/os/posix/os_posix.hpp No comments. src/hotspot/os/solaris/os_solaris.cpp No comments. src/hotspot/os/solaris/os_solaris.hpp No comments. src/hotspot/os/windows/os_windows.cpp L5309: else { L5310: DWORD err = GetLastError(); L5311: assert(err == ERROR_TIMEOUT, "SleepConditionVariableCS: %ld:", err); L5312: } nit - please reduce indent by 2 spaces. src/hotspot/os/windows/os_windows.hpp No comments. src/hotspot/share/logging/logTag.hpp No comments. src/hotspot/share/runtime/interfaceSupport.inline.hpp No comments. src/hotspot/share/runtime/mutex.cpp L42: // Clear unhandled oops in JavaThreads so we get a crash right away nit - please add '.' and the end of the sentence. L128: assert(_owner == Thread::current(), "invariant"); L134: assert(_owner == Thread::current(), "invariant"); L139: assert(_owner == Thread::current(), "invariant"); Please consider the following for better diagnostics: assert(_owner == Thread::current(), "should be equal: owner=" INTPTR_FORMAT ", current=" INTPTR_FORMAT, p2i(_owner), p2i(Thread::current())); Sorry I missed this one in the preliminary review. L155: assert(_owner == self, "invariant"); assert(_owner == self, "should be equal: owner=" INTPTR_FORMAT ", self=" INTPTR_FORMAT, p2i(_owner), p2i(self)); L205: jt->java_suspend_self(); L206: } L207: } L208: L209: if (in_flight_monitor != NULL) { L210: // Conceptually reestablish ownership of the lock. L211: assert(_owner == NULL, "should be NULL: owner=" INTPTR_FORMAT, p2i(_owner)); L212: set_owner(self); L213: 
} else { L214: lock(); L215: } The lock reacquire on L214 used to be done after the java_suspend_self() on L205 which is inside the block context for the ThreadLockBlockInVM and OSThreadWaitState helps. If the lock() blocks due to a racing thread, then the calling JavaThread won't have the right thread state of OS thread wait state, etc... Also after lock() on L214, you never call set_owner(self) so the ownership is not complete for that relocated code. You won't need the set_owner(self) call if you move the lock() on L214 back to after java_suspend_self(). I'm not sure how I missed these two things in the preliminary review. I can't go back to that webrev since it looks like the preliminary webrev has been overwritten. So after java_suspend_self(), you have to re-lock() so that a potential block on that re-lock has the right states. This also means that this line has to be deleted: L204 in_flight_monitor = NULL; so that ThreadLockBlockInVM destructor can do the right thing if the thread is suspended and then relocks. src/hotspot/share/runtime/mutex.hpp No comments. src/hotspot/share/runtime/mutexLocker.hpp No comments. src/hotspot/share/runtime/safepoint.cpp No comments. src/hotspot/share/runtime/safepoint.hpp No comments. src/hotspot/share/runtime/safepointMechanism.hpp L75, L78, and L81 - nit - need a period at the end of the sentence. src/hotspot/share/runtime/safepointMechanism.inline.hpp No comments. src/hotspot/share/runtime/thread.cpp No comments. src/hotspot/share/runtime/thread.hpp No comments. Dan > Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/inc/webrev/ > > Running mach5 again. > > Thanks, > Patricio > > On 1/28/19 8:31 AM, Robbin Ehn wrote: >> Hi Patricio, >> >> Mostly looks good! 
>> >> block_at_safepoint is always called with block_in_safepoint_check = >> true. (correct?) >> Changing that to a local state instead of global simplifies the code. >> >> So I'm suggesting something like below. >> >> Thanks, Robbin >> >> diff -r e65cc445234c >> src/hotspot/share/runtime/interfaceSupport.inline.hpp >> --- a/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon Jan >> 28 13:10:15 2019 +0100 >> +++ b/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon Jan >> 28 14:10:59 2019 +0100 >> @@ -308,2 +308,1 @@ >> -??? thread->block_in_safepoint_check = false; >> -??? SafepointMechanism::block_at_safepoint(thread); >> +??? SafepointMechanism::callback_if_safepoint(thread); >> @@ -323,2 +322,1 @@ >> -????? SafepointMechanism::block_at_safepoint(_thread); >> -????? _thread->block_in_safepoint_check = true; >> +????? SafepointMechanism::callback_if_safepoint(_thread); >> @@ -335,2 +332,0 @@ >> -??? } else { >> -????? _thread->block_in_safepoint_check = true; >> @@ -337,0 +334,1 @@ >> + >> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.cpp >> --- a/src/hotspot/share/runtime/safepoint.cpp??? Mon Jan 28 13:10:15 >> 2019 +0100 >> +++ b/src/hotspot/share/runtime/safepoint.cpp??? Mon Jan 28 14:10:59 >> 2019 +0100 >> @@ -795,1 +795,1 @@ >> -void SafepointSynchronize::block(JavaThread *thread) { >> +void SafepointSynchronize::block(JavaThread *thread, bool >> block_in_safepoint_check) { >> @@ -850,1 +850,1 @@ >> -????? if (thread->block_in_safepoint_check) { >> +????? if (block_in_safepoint_check) { >> @@ -880,1 +880,1 @@ >> -????????? thread->block_in_safepoint_check) { >> +????????? block_in_safepoint_check) { >> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.hpp >> --- a/src/hotspot/share/runtime/safepoint.hpp??? Mon Jan 28 13:10:15 >> 2019 +0100 >> +++ b/src/hotspot/share/runtime/safepoint.hpp??? Mon Jan 28 14:10:59 >> 2019 +0100 >> @@ -146,1 +146,1 @@ >> -? static void?? block(JavaThread *thread); >> +? static void?? 
block(JavaThread *thread, bool >> block_in_safepoint_check = true); >> diff -r e65cc445234c src/hotspot/share/runtime/safepointMechanism.hpp >> --- a/src/hotspot/share/runtime/safepointMechanism.hpp??? Mon Jan 28 >> 13:10:15 2019 +0100 >> +++ b/src/hotspot/share/runtime/safepointMechanism.hpp??? Mon Jan 28 >> 14:10:59 2019 +0100 >> @@ -82,1 +82,1 @@ >> -? static inline void block_at_safepoint(JavaThread* thread); >> +? static inline void callback_if_safepoint(JavaThread* thread); >> diff -r e65cc445234c >> src/hotspot/share/runtime/safepointMechanism.inline.hpp >> --- a/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon Jan >> 28 13:10:15 2019 +0100 >> +++ b/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon Jan >> 28 14:10:59 2019 +0100 >> @@ -82,1 +82,1 @@ >> -void SafepointMechanism::block_at_safepoint(JavaThread* thread) { >> +void SafepointMechanism::callback_if_safepoint(JavaThread* thread) { >> @@ -84,1 +84,1 @@ >> -??? SafepointSynchronize::block(thread); >> +??? SafepointSynchronize::block(thread, false); >> diff -r e65cc445234c src/hotspot/share/runtime/thread.cpp >> --- a/src/hotspot/share/runtime/thread.cpp??? Mon Jan 28 13:10:15 >> 2019 +0100 >> +++ b/src/hotspot/share/runtime/thread.cpp??? Mon Jan 28 14:10:59 >> 2019 +0100 >> @@ -298,2 +297,0 @@ >> -? block_in_safepoint_check = true; >> - >> diff -r e65cc445234c src/hotspot/share/runtime/thread.hpp >> --- a/src/hotspot/share/runtime/thread.hpp??? Mon Jan 28 13:10:15 >> 2019 +0100 >> +++ b/src/hotspot/share/runtime/thread.hpp??? Mon Jan 28 14:10:59 >> 2019 +0100 >> @@ -788,2 +787,0 @@ >> -? bool block_in_safepoint_check;????????????? 
// to decide whether >> to block in SS::block or not >> - >> >> >> On 1/28/19 9:42 AM, Patricio Chilano wrote: >>> Hi all, >>> >>> Please review the following patch: >>> >>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8210832 >>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8210832/v01/webrev/ >>> >>> The current implementation of native monitors uses a technique that >>> we name "sneaky locking" to prevent possible deadlocks of the JVM >>> during safepoints. The implementation of this technique though >>> introduces a race when a monitor is shared between the VMThread and >>> non-JavaThreads. This patch aims to solve that problem and at the >>> same time simplify the code. >>> >>> The proposal is based on the introduction of the new class >>> PlatformMonitor, which serves as a wrapper for the actual >>> synchronization primitives in each platform (mutexes and condition >>> variables). Most of the API calls can thus be implemented as simple >>> wrappers around PlatformMonitor, adding more assertions and very >>> little extra metadata. >>> To be able to remove the lock sneaking code and at the same time >>> avoid deadlocking scenarios, we combine two techniques: >>> >>> -When a JavaThread that has just acquired the lock, detects there is >>> a safepoint request in the ThreadLockBlockInVM destructor, it >>> releases the lock before blocking at the safepoint. After resuming >>> from it, the JavaThread will have to acquire the lock again. >>> >>> - In the ThreadLockBlockInVM constructor for the Monitor::wait() >>> method, in order to avoid blocking we allow for a possible safepoint >>> request to make progress but without letting the JavaThread block >>> for it (since we would be stopped by the destructor anyways). We >>> also do that for the Monitor::lock() case although no deadlock is >>> being prevented there. >>> >>> The ThreadLockBlockInVM jacket is a new ThreadStateTransition class >>> used instead of the ThreadBlockInVM one. 
This allowed more >>> flexibility to handle the two techniques mentioned above. Also, >>> ThreadBlockInVM calls SafepointMechanism::block_if_requested() which >>> creates some problems when trying to allow safepoints to continue >>> without stopping, since that method not only checks for safepoints >>> but also processes handshakes. >>> >>> In terms of performance, benchmarks show very similar results to >>> what we have now. >>> >>> So far mach5 hs-tier1-6 on Linux, OS X, Windows and Solaris have >>> been tested. >>> >>> Thanks, >>> Patricio >>> > From daniel.daugherty at oracle.com Mon Jan 28 22:56:08 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 28 Jan 2019 17:56:08 -0500 Subject: RFR(XL): 8203469: Faster safepoints In-Reply-To: <0ba510c1-6a1a-68f1-811c-0f538c3a472b@oracle.com> References: <0ba510c1-6a1a-68f1-811c-0f538c3a472b@oracle.com> Message-ID: On 1/28/19 8:04 AM, Robbin Ehn wrote: > Hi all, here is v05. > > http://cr.openjdk.java.net/~rehn/8203469/v05/ > http://cr.openjdk.java.net/~rehn/8203469/v05/inc/ src/hotspot/share/code/dependencyContext.hpp ??? No comments. src/hotspot/share/jfr/recorder/stacktrace/jfrStackTraceRepository.cpp ??? No comments. src/hotspot/share/runtime/handshake.cpp ??? No comments. src/hotspot/share/runtime/safepoint.cpp ??? No comments. src/hotspot/share/runtime/safepoint.hpp ??? No comments. Thumbs up! It took a couple of re-reads, but I think I now understand the new handshake_safe() (and its call restrictions). Dan > > I have been asked to go on-top-of: > https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-January/036425.html > > With a small grace-period. > There will be a v06 rebase on-top of that. > > Updated after comments and changes regarding safepoint_safe(). > In JFR code path, thread is always current, so it should not be calling > safepoint_safe. It also don't control polls, so even if it returns > true it is > not safe in that case. 
> > Updated to a handshake_safe() private method with a friend for > handshakes. > > Test t1-3, stress testing and JFR. > > Thanks, Robbin > > On 1/15/19 11:39 AM, Robbin Ehn wrote: >> Hi all, please review. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469 >> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/ >> >> Thanks to Dan for pre-reviewing a lot! >> >> Background: >> ZGC often does very short safepoint operations. For a perspective, in a >> specJBB2015 run, G1 can have young collection stops lasting about 170 >> ms. While >> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which >> operation it is. The time it takes to stop and start the JavaThreads >> is relative >> very large to a ZGC safepoint. With an operation that just takes >> 0.2ms the >> overhead of stopping and starting JavaThreads is several times the >> operation. >> >> High-level functionality change: >> Serializing the starting over Threads_lock takes time. >> - Don't wait on Threads_lock use the WaitBarrier. >> Serializing the stopping over Safepoint_lock takes time. >> - Let threads stop in parallel, remove Safepoint_lock. >> >> Details: >> JavaThreads have 2 abstract logical states: unsafe or safe. >> - Safe means the JavaThread will not touch Java heap or VM internal >> structures >> ?? without doing a transition and block before doing so. >> ???????? - The safe states are: >> ???????????????? - When polls armed: _thread_in_native and >> _thread_blocked. >> ???????????????? - When Threads_lock is held: externally suspended >> flag is set. >> ???????? - VM Thread have polls armed and holds the Threads_lock >> during a >> ?????????? safepoint. >> - Unsafe means that either Java heap or VM internal structures can be >> accessed >> ?? by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. >> ???????? - All combination that are not safe are unsafe. >> >> We cannot start a safepoint until all unsafe threads have >> transitioned to a safe >> state. 
To make them safe, we arm polls in compiled code and make sure >> any >> transition to another unsafe state will be blocked. JavaThreads which >> are unsafe >> with state _thread_in_Java may transition to _thread_in_native >> without being >> blocked, since it just became a safe thread and we can proceed. Any >> safe thread >> may try to transition at any time to an unsafe state, thus coming >> into the >> safepoint blocking code at any moment, e.g., after the safepoint is >> over, or >> even at the beginning of next safepoint. >> >> The VMThread cannot tolerate false positives from the JavaThread >> thread state >> because that would mean starting the safepoint without all >> JavaThreads being >> safe. The two locks (Threads_lock and Safepoint_lock) make sure we >> never observe >> false positives from the safepoint blocking code, if we remove them, >> how do we >> handle false positives? >> >> By first publishing which barrier tag (safepoint counter) we will call >> WaitBarrier.wait() with as the threads safepoint id and then change >> the state to >> _thread_blocked, the VMThread can ignore JavaThreads by doing a >> stable load of >> the state. A stable load of the thread state is successful if the thread >> safepoint id is the same both before and after the load of the state and >> safepoint id is current or InactiveSafepointCounter. If the stable >> load fails, >> the thread is considered safepoint unsafe. It's no longer enough that >> thread is >> have state _thread_blocked it must also have correct safepoint id >> before and >> after we read the state. >> >> Performance: >> The result of faster safepoints is that the average CPU time for >> JavaThreads >> between safepoints is higher, thus increasing the allocation rate. >> The thread >> that stops first waits shorter time until it gets started. Even the >> thread that >> stops last also have shorter stop since we start them faster. 
>> If your application is using a concurrent GC it may need re-tuning, since each Java
>> worker thread has an increased CPU time/allocation rate. Often this means max
>> performance is achieved using slightly fewer Java worker threads than before.
>> Also, the increased allocation rate means a shorter time between GC safepoints.
>> - If you are using a non-concurrent GC, you should see improved latency and
>>   throughput.
>> - After re-tuning with a concurrent GC, throughput should be equal or better but
>>   with better latency. But bear in mind this is a latency patch, not a
>>   throughput one.
>> With the current code a Java thread is not guaranteed to run between safepoints (in
>> theory a Java thread can be starved indefinitely), since the VM thread may
>> re-grab the Threads_lock before the Java thread has woken up from the previous
>> safepoint. If the GC/VM doesn't respect MMU (minimum mutator utilization) or if your
>> machine is very over-provisioned this can happen.
>> The current scheme thus re-safepoints quickly if the Java threads have not
>> started yet, at the cost of latency. Since the new code uses the WaitBarrier with
>> the safepoint counter, all threads must roll forward to the next safepoint by
>> getting at least some CPU time between two safepoints, meaning MMU violations
>> are more obvious.
>>
>> Some example numbers:
>> - On a 16-strand machine, synchronization and un-synchronization/starting is at
>>   least 3x faster (in a non-trivial test). Synchronization ~600 -> ~100us and
>>   starting ~400 -> ~100us.
>>   (Semaphore path is a bit slower than futex in the WaitBarrier on Linux).
>> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster
>>   synchronization time on 16 strands and ~5% score increase. In this case the GC
>>   op is 1ms, so we reduce the overhead of synchronization from 100% to 10%.
>> - specJBB2015 ParGC ~9% increase in critical-jops.
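The "stable load" rule described in the quoted RFR (publish the safepoint id first, then change the state; the VMThread trusts a state only if the id is unchanged around the load and is either current or inactive) can be illustrated with a small sketch. This is not the HotSpot code: ToyThread, block_for(), stable_load(), is_safe() and kInactiveSafepointCounter are hypothetical stand-ins, and default sequentially consistent atomics are assumed instead of whatever ordering the real patch uses.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical thread states, loosely mirroring the names used in the mail.
enum ThreadState { thread_in_Java, thread_in_vm, thread_in_native, thread_blocked };

// Stand-in for InactiveSafepointCounter (the value 0 is an assumption).
constexpr uint64_t kInactiveSafepointCounter = 0;

struct ToyThread {
  std::atomic<uint64_t> safepoint_id{kInactiveSafepointCounter};
  std::atomic<ThreadState> state{thread_in_Java};

  // Blocking side: publish the barrier tag first, then change the state.
  void block_for(uint64_t tag) {
    safepoint_id.store(tag);
    state.store(thread_blocked);
  }
};

// Observer side: the state is only trusted if the id is identical before and
// after the state load, and the id is the current one or the inactive one.
bool stable_load(const ToyThread& t, uint64_t current, ThreadState* out) {
  uint64_t before = t.safepoint_id.load();
  ThreadState s = t.state.load();
  uint64_t after = t.safepoint_id.load();
  if (before != after) return false;
  if (before != current && before != kInactiveSafepointCounter) return false;
  *out = s;
  return true;
}

// A failed stable load means the thread is treated as safepoint unsafe.
bool is_safe(const ToyThread& t, uint64_t current) {
  ThreadState s;
  if (!stable_load(t, current, &s)) return false;
  return s == thread_blocked || s == thread_in_native;
}
```

In this sketch a thread that is still marked blocked with the id of an earlier safepoint is reported as unsafe for the next one, which is the false positive the two removed locks used to rule out.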
>> >> Thanks, Robbin From patricio.chilano.mateo at oracle.com Mon Jan 28 23:13:26 2019 From: patricio.chilano.mateo at oracle.com (Patricio Chilano) Date: Mon, 28 Jan 2019 18:13:26 -0500 Subject: RFR: 8210832: Remove sneaky locking in class Monitor In-Reply-To: References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> Message-ID: <60daed05-61d7-053a-35b4-d5b6582ea0a1@oracle.com> Hi Dan, On 1/28/19 4:29 PM, Daniel D. Daugherty wrote: > On 1/28/19 2:18 PM, Patricio Chilano wrote: >> Hi Robbin, >> >> Thanks for reviewing this! Removing the block_in_safepoint_check >> thread local attribute is a great idea, here is v02: >> >> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/webrev > > src/hotspot/os/posix/os_posix.cpp > ??? No comments. > > src/hotspot/os/posix/os_posix.hpp > ??? No comments. > > src/hotspot/os/solaris/os_solaris.cpp > ??? No comments. > > src/hotspot/os/solaris/os_solaris.hpp > ??? No comments. > > src/hotspot/os/windows/os_windows.cpp > ??? L5309: ??? else { > ??? L5310: ????? DWORD err = GetLastError(); > ??? L5311: ????? assert(err == ERROR_TIMEOUT, > "SleepConditionVariableCS: %ld:", err); > ??? L5312: ??? } > ??????? nit - please reduce indent by 2 spaces. Done! > src/hotspot/os/windows/os_windows.hpp > ??? No comments. > > src/hotspot/share/logging/logTag.hpp > ??? No comments. > > src/hotspot/share/runtime/interfaceSupport.inline.hpp > ??? No comments. > > src/hotspot/share/runtime/mutex.cpp > ??? L42: ? // Clear unhandled oops in JavaThreads so we get a crash > right away > ??????? nit - please add '.' and the end of the sentence. Done! > L128: ? assert(_owner == Thread::current(), "invariant"); > ??? L134: ? assert(_owner == Thread::current(), "invariant"); > ??? L139: ? assert(_owner == Thread::current(), "invariant"); > ??????? Please consider the following for better diagnostics: > ??????????? 
assert(_owner == Thread::current(), "should be equal: > owner=" INTPTR_FORMAT > ?????????????????? ", current=" INTPTR_FORMAT, p2i(_owner), > p2i(Thread::current())); > > ??????? Sorry I missed this one in the preliminary review. > > ??? L155: ? assert(_owner == self, "invariant"); > ??????????? assert(_owner == self, "should be equal: owner=" > INTPTR_FORMAT > ?????????????????? ", self=" INTPTR_FORMAT, p2i(_owner), p2i(self)); Done! > L205: ??????? jt->java_suspend_self(); > ??? L206: ????? } > ??? L207: ??? } > ??? L208: > ??? L209: ??? if (in_flight_monitor != NULL) { > ??? L210: ????? // Conceptually reestablish ownership of the lock. > ??? L211: ????? assert(_owner == NULL, "should be NULL: owner=" > INTPTR_FORMAT, p2i(_owner)); > ??? L212: ????? set_owner(self); > ??? L213: ??? } else { > ??? L214: ????? lock(); > ??? L215: ??? } > ??????? The lock reacquire on L214 used to be done after the > ??????? java_suspend_self() on L205 which is inside the block > ??????? context for the ThreadLockBlockInVM and OSThreadWaitState > ??????? helps. If the lock() blocks due to a racing thread, then > ??????? the calling JavaThread won't have the right thread state > ??????? of OS thread wait state, etc... > So after java_suspend_self(), you have to re-lock() so that > ??????? a potential block on that re-lock has the right states. This > ??????? also means that this line has to be deleted: > > ??????? L204???????? in_flight_monitor = NULL; > > ??????? so that ThreadLockBlockInVM destructor can do the right thing > ??????? if the thread is suspended and then relocks. Yes, I see your point. The problem is that after executing the TLBIVM destructor (which executes after the OSThreadWaitState destructor with the current order)? there is always the possibility that we had to release the lock, and so afterwards we will have to re-acquire it with a different state. One simple way of solving this could be moving the OSThreadWaitState object outside the TLBIVM block. 
Based on David's comment about OSThreadWaitState I don't think changing the order should break things, since it seems more like a debugging tool. What do you think then of doing something like this (I also included a re-lock after java_suspend_self() and removed in_flight_monitor=NULL as you suggested):

diff --git a/src/hotspot/share/runtime/mutex.cpp b/src/hotspot/share/runtime/mutex.cpp
--- a/src/hotspot/share/runtime/mutex.cpp
+++ b/src/hotspot/share/runtime/mutex.cpp
@@ -182,9 +186,9 @@
     JavaThread *jt = (JavaThread *)self;
     Monitor* in_flight_monitor = NULL;
+    OSThreadWaitState osts(self->osthread(), false /* not Object.wait() */);
     {
       ThreadLockBlockInVM tlbivm(jt, &in_flight_monitor);
-      OSThreadWaitState osts(self->osthread(), false /* not Object.wait() */);
       if (as_suspend_equivalent) {
         jt->set_suspend_equivalent();
         // cleared by handle_special_suspend_equivalent_condition() or
@@ -201,8 +205,8 @@
         // want to hold the lock while suspended because that
         // would surprise the thread that suspended us.
         _lock.unlock();
-        in_flight_monitor = NULL;
         jt->java_suspend_self();
+        _lock.lock();
       }
     }

>         Also after lock() on L214, you never call set_owner(self)
>         so the ownership is not complete for that relocated code.
>         You won't need the set_owner(self) call if you move the
>         lock() on L214 back to after java_suspend_self().

That lock() is actually Monitor::lock(). But I agree it is confusing; every time I look at it I say the same. Maybe I can rewrite it as Monitor::lock()?

>         I'm not sure how I missed these two things in the preliminary
>         review. I can't go back to that webrev since it looks like the
>         preliminary webrev has been overwritten.

Yes, sorry, I moved it to http://cr.openjdk.java.net/~pchilanomate/8210832/preview/

> src/hotspot/share/runtime/mutex.hpp
>     No comments.
> > src/hotspot/share/runtime/mutexLocker.hpp > ??? No comments. > > src/hotspot/share/runtime/safepoint.cpp > ??? No comments. > > src/hotspot/share/runtime/safepoint.hpp > ??? No comments. > > src/hotspot/share/runtime/safepointMechanism.hpp > ??? L75, L78, and L81 - nit - need a period at the end of the sentence. Done! > src/hotspot/share/runtime/safepointMechanism.inline.hpp > ??? No comments. > > src/hotspot/share/runtime/thread.cpp > ??? No comments. > > src/hotspot/share/runtime/thread.hpp > ??? No comments. Thanks for the review (and the pre-review) Dan! Waiting for you comments to send v03. Thanks, Patricio > Dan > > > >> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/inc/webrev/ >> >> Running mach5 again. >> >> Thanks, >> Patricio >> >> On 1/28/19 8:31 AM, Robbin Ehn wrote: >>> Hi Patricio, >>> >>> Mostly looks good! >>> >>> block_at_safepoint is always called with block_in_safepoint_check = >>> true. (correct?) >>> Changing that to a local state instead of global simplifies the code. >>> >>> So I'm suggesting something like below. >>> >>> Thanks, Robbin >>> >>> diff -r e65cc445234c >>> src/hotspot/share/runtime/interfaceSupport.inline.hpp >>> --- a/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon Jan >>> 28 13:10:15 2019 +0100 >>> +++ b/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon Jan >>> 28 14:10:59 2019 +0100 >>> @@ -308,2 +308,1 @@ >>> -??? thread->block_in_safepoint_check = false; >>> -??? SafepointMechanism::block_at_safepoint(thread); >>> +??? SafepointMechanism::callback_if_safepoint(thread); >>> @@ -323,2 +322,1 @@ >>> -????? SafepointMechanism::block_at_safepoint(_thread); >>> -????? _thread->block_in_safepoint_check = true; >>> +????? SafepointMechanism::callback_if_safepoint(_thread); >>> @@ -335,2 +332,0 @@ >>> -??? } else { >>> -????? 
_thread->block_in_safepoint_check = true; >>> @@ -337,0 +334,1 @@ >>> + >>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.cpp >>> --- a/src/hotspot/share/runtime/safepoint.cpp??? Mon Jan 28 13:10:15 >>> 2019 +0100 >>> +++ b/src/hotspot/share/runtime/safepoint.cpp??? Mon Jan 28 14:10:59 >>> 2019 +0100 >>> @@ -795,1 +795,1 @@ >>> -void SafepointSynchronize::block(JavaThread *thread) { >>> +void SafepointSynchronize::block(JavaThread *thread, bool >>> block_in_safepoint_check) { >>> @@ -850,1 +850,1 @@ >>> -????? if (thread->block_in_safepoint_check) { >>> +????? if (block_in_safepoint_check) { >>> @@ -880,1 +880,1 @@ >>> -????????? thread->block_in_safepoint_check) { >>> +????????? block_in_safepoint_check) { >>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.hpp >>> --- a/src/hotspot/share/runtime/safepoint.hpp??? Mon Jan 28 13:10:15 >>> 2019 +0100 >>> +++ b/src/hotspot/share/runtime/safepoint.hpp??? Mon Jan 28 14:10:59 >>> 2019 +0100 >>> @@ -146,1 +146,1 @@ >>> -? static void?? block(JavaThread *thread); >>> +? static void?? block(JavaThread *thread, bool >>> block_in_safepoint_check = true); >>> diff -r e65cc445234c src/hotspot/share/runtime/safepointMechanism.hpp >>> --- a/src/hotspot/share/runtime/safepointMechanism.hpp??? Mon Jan 28 >>> 13:10:15 2019 +0100 >>> +++ b/src/hotspot/share/runtime/safepointMechanism.hpp??? Mon Jan 28 >>> 14:10:59 2019 +0100 >>> @@ -82,1 +82,1 @@ >>> -? static inline void block_at_safepoint(JavaThread* thread); >>> +? 
static inline void callback_if_safepoint(JavaThread* thread); >>> diff -r e65cc445234c >>> src/hotspot/share/runtime/safepointMechanism.inline.hpp >>> --- a/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>> Jan 28 13:10:15 2019 +0100 >>> +++ b/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>> Jan 28 14:10:59 2019 +0100 >>> @@ -82,1 +82,1 @@ >>> -void SafepointMechanism::block_at_safepoint(JavaThread* thread) { >>> +void SafepointMechanism::callback_if_safepoint(JavaThread* thread) { >>> @@ -84,1 +84,1 @@ >>> -??? SafepointSynchronize::block(thread); >>> +??? SafepointSynchronize::block(thread, false); >>> diff -r e65cc445234c src/hotspot/share/runtime/thread.cpp >>> --- a/src/hotspot/share/runtime/thread.cpp??? Mon Jan 28 13:10:15 >>> 2019 +0100 >>> +++ b/src/hotspot/share/runtime/thread.cpp??? Mon Jan 28 14:10:59 >>> 2019 +0100 >>> @@ -298,2 +297,0 @@ >>> -? block_in_safepoint_check = true; >>> - >>> diff -r e65cc445234c src/hotspot/share/runtime/thread.hpp >>> --- a/src/hotspot/share/runtime/thread.hpp??? Mon Jan 28 13:10:15 >>> 2019 +0100 >>> +++ b/src/hotspot/share/runtime/thread.hpp??? Mon Jan 28 14:10:59 >>> 2019 +0100 >>> @@ -788,2 +787,0 @@ >>> -? bool block_in_safepoint_check;????????????? // to decide whether >>> to block in SS::block or not >>> - >>> >>> >>> On 1/28/19 9:42 AM, Patricio Chilano wrote: >>>> Hi all, >>>> >>>> Please review the following patch: >>>> >>>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8210832 >>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8210832/v01/webrev/ >>>> >>>> The current implementation of native monitors uses a technique that >>>> we name "sneaky locking" to prevent possible deadlocks of the JVM >>>> during safepoints. The implementation of this technique though >>>> introduces a race when a monitor is shared between the VMThread and >>>> non-JavaThreads. This patch aims to solve that problem and at the >>>> same time simplify the code. 
>>>> >>>> The proposal is based on the introduction of the new class >>>> PlatformMonitor, which serves as a wrapper for the actual >>>> synchronization primitives in each platform (mutexes and condition >>>> variables). Most of the API calls can thus be implemented as simple >>>> wrappers around PlatformMonitor, adding more assertions and very >>>> little extra metadata. >>>> To be able to remove the lock sneaking code and at the same time >>>> avoid deadlocking scenarios, we combine two techniques: >>>> >>>> -When a JavaThread that has just acquired the lock, detects there >>>> is a safepoint request in the ThreadLockBlockInVM destructor, it >>>> releases the lock before blocking at the safepoint. After resuming >>>> from it, the JavaThread will have to acquire the lock again. >>>> >>>> - In the ThreadLockBlockInVM constructor for the Monitor::wait() >>>> method, in order to avoid blocking we allow for a possible >>>> safepoint request to make progress but without letting the >>>> JavaThread block for it (since we would be stopped by the >>>> destructor anyways). We also do that for the Monitor::lock() case >>>> although no deadlock is being prevented there. >>>> >>>> The ThreadLockBlockInVM jacket is a new ThreadStateTransition class >>>> used instead of the ThreadBlockInVM one. This allowed more >>>> flexibility to handle the two techniques mentioned above. Also, >>>> ThreadBlockInVM calls SafepointMechanism::block_if_requested() >>>> which creates some problems when trying to allow safepoints to >>>> continue without stopping, since that method not only checks for >>>> safepoints but also processes handshakes. >>>> >>>> In terms of performance, benchmarks show very similar results to >>>> what we have now. >>>> >>>> So far mach5 hs-tier1-6 on Linux, OS X, Windows and Solaris have >>>> been tested. 
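The first of the two techniques in the quoted proposal (release the just-acquired lock if a safepoint is pending, block, then acquire again) can be sketched roughly as follows. This is only an illustration under stated assumptions: ToyLock, ToySafepoint and lock_with_safepoint_check() are hypothetical names, the real change does this work in the ThreadLockBlockInVM destructor rather than in a free function, and the single "if" mirrors the description without claiming anything about how the patch handles a second request.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <mutex>

// Toy lock that remembers whether its owner currently holds it, so the
// behaviour can be observed without calling try_lock() on an owned mutex.
struct ToyLock {
  std::mutex m;
  bool held = false;  // only written while owning m
  void lock()   { m.lock(); held = true; }
  void unlock() { held = false; m.unlock(); }
};

// Hypothetical stand-in for "is a safepoint requested?" plus the blocking call.
struct ToySafepoint {
  std::atomic<bool> requested{false};
  std::function<void()> block = [] {};
};

// Acquire the lock; if a safepoint request is seen afterwards, give the lock
// up before blocking so the thread never sits in the safepoint while owning
// it, then acquire it again after resuming.
void lock_with_safepoint_check(ToyLock& l, ToySafepoint& sp) {
  l.lock();
  if (sp.requested.load()) {
    l.unlock();
    sp.block();
    l.lock();
  }
}
```

The point of the ordering is that whoever runs during the safepoint can take the same lock without deadlocking against a stopped owner, which is the role sneaky locking used to play.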
>>>>
>>>> Thanks,
>>>> Patricio
>>>>
>>
>

From dean.long at oracle.com Tue Jan 29 01:13:21 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Mon, 28 Jan 2019 17:13:21 -0800
Subject: 12 RFR(M) 8195635: [Graal] nsk/jvmti/unit/ForceEarlyReturn/earlyretbase crashes with assertion "compilation level out of bounds"
Message-ID: <9b4a4594-458e-ae18-0606-9b1ecbb400ce@oracle.com>

http://cr.openjdk.java.net/~dlong/8195635/webrev.5/
https://bugs.openjdk.java.net/browse/JDK-8195635

Please see the bug report for all the gory details. Here's the short version:

If we allow any safepoint to be a suspend point, we run into trouble with PopFrame and ForceEarlyReturn, which reasonably expect the top frame not to change between the suspend and when the PopFrame/ForceEarlyReturn is executed. Normally this is not an issue, but certain safepoints cause problems when we are about to call a new Java method. In particular, if we safepoint and suspend in JavaCallWrapper, the top frame will still be the caller, but when we execute the PopFrame/ForceEarlyReturn we will be in the callee.

The solution this patch takes is to block suspend around troublesome VM code using a new "allow_suspend" thread flag. This means JavaThread::java_suspend can't just ask the VMThread to safepoint and be done. Instead it has to wait and allow threads to roll forward to an allowed suspend point.

dl

From david.holmes at oracle.com Tue Jan 29 04:16:55 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 29 Jan 2019 14:16:55 +1000
Subject: RFR : 8217786: Provide virtualization related info in the hs_error file on linux s390x
In-Reply-To:
References: <816bec8b-b9ca-f0b8-f72d-be6ede83d63b@oracle.com>
Message-ID: <73d4509f-09b7-e630-2568-bae0395f6b8d@oracle.com>

On 28/01/2019 10:23 pm, Baesken, Matthias wrote:
>>
>> Can't you include this information in an existing section of the error
>> processing code instead of adding a new function that is empty
>> everywhere except Linux?
>> > > Hi David , do you mean something like > > > #if defined(S390) > > STEP("printing virtualization info") > ... > > #endif No I was thinking more about just adding the virtualization info to an existing step like print_os_info or print_cpu_info. Cheers, David ----- > in vmError.cpp ? > > I thought about doing this. > > > But on the other hand , the now still empty os::pd_print_virtualization_info in platforms != linux > might fill over time ( we could add [at least for some platforms] other virtualization related info ). > > > Best regards, Matthias > > >> -----Original Message----- >> From: David Holmes >> Sent: Montag, 28. Januar 2019 12:35 >> To: Baesken, Matthias ; 'hotspot- >> dev at openjdk.java.net' >> Subject: Re: RFR : 8217786: Provide virtualization related info in the hs_error >> file on linux s390x >> >> Hi Matthias, >> >> On 28/01/2019 6:48 pm, Baesken, Matthias wrote: >>> Hello, please review this change ; it adds virtualization related info in the >> hs_error file on linux s390x . >> >> Can't you include this information in an existing section of the error >> processing code instead of adding a new function that is empty >> everywhere except Linux? >> >> Thanks, >> David >> >>> On linux s390x, we usually (always?) run in virtualized environments >> (LPAR and/or z/VM / KVM ). >>> >>> It is helpful for instance in support cases to get some information about the >> virtualized environment in the hs_error file . >>> A lot of info can be taken from the /proc/sysinfo file on linux s390x . 
>>> >>> >>> Bug/webrev : >>> >>> https://bugs.openjdk.java.net/browse/JDK-8217786 >>> >>> >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.1/ >>> >>> >>> >>> Best regards, Matthias >>> From david.holmes at oracle.com Tue Jan 29 05:13:35 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 29 Jan 2019 15:13:35 +1000 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> Message-ID: <7d85fc77-4e33-364f-3efe-a3757d6cfdbc@oracle.com> Hi Aleksey, On 29/01/2019 2:48 am, Aleksey Shipilev wrote: > RFE: > https://bugs.openjdk.java.net/browse/JDK-8217879 > > Fix: > http://cr.openjdk.java.net/~shade/8217879/webrev.01/ > > "Instructions" block is useful when following up on hs_errs that happened without the disassembler > attached, which is usually the case coming from users. One can use the disassembler [1] to look > around the code that was crashing, and get extended conjectures why the error happened, including > rewinding a bit of history. However, current window is sometimes too small to infer enough context. > I propose we extend it! The existing comment states: // Note: it may be unsafe to inspect memory near pc. For example, pc may // point to garbage if entry point in an nmethod is corrupted. Leave // this at the end, and hope for the best. so by increasing the size of the block you are potentially greatly increasing the risk that you will do something unsafe. Perhaps, as Thomas suggested, if you really want to expand this range it should be done in a more safe manner. I'm also very much in favour of the slimmer hs_err file. At this rate of expansion the hs_err file will look more like a core file, and we already have core files for that. ;-) hs_err files were never intended to be a one-stop debugging shop. 
Cheers,
David
-----

> The patch also commons the paths across OS/Arch-specific files so that the current "delta" appears less
> of a magic number. Plus, it adds cr()-s for consistency across the arches and within the methods.
>
> Testing: eyeballing hs_errs from artificial crashes, Linux x86_64 build, jdk-submit
>
> Thanks,
> -Aleksey
>
> [1] I use https://onlinedisassembler.com, for example.
>

From thomas.stuefe at gmail.com Tue Jan 29 06:57:24 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 29 Jan 2019 07:57:24 +0100
Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump
In-Reply-To: <0799f64f-c263-005a-7196-6e244ea37a4a@redhat.com>
References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> <0799f64f-c263-005a-7196-6e244ea37a4a@redhat.com>
Message-ID:

On Mon, Jan 28, 2019 at 9:55 PM Aleksey Shipilev wrote:

> On 1/28/19 8:51 PM, Thomas Stüfe wrote:
> > I agree that a larger dump would be helpful, I had the same thought in the past. I know others
> > prefer slim hs-err files, but they would have to chime in.
>
> There is a bespoke rule in Hotspot that choosing between more asserts and fastdebug performance one
> should choose more asserts. I am mentally extending this to hs_err: choosing between more debugging
> information and slimmer hs_err one should choose more debugging information :)
>
> > os::print_hex_dump_surrounding(address pivot, size_t len, size_t unitsize)
> >
> > and let that function print the pivot address up front (obviously avoiding to name it "pc"), then
> > the dump (+- len around pivot) and maybe a little ">" marker at the start of the pivot line.
>
> ...except current function outputs "Instructions" header, so it is dumping raw instruction stream. I
> used to call it os::print_hex_dump_near, but renamed it because of that "Instructions" header.
>

It is okay I guess. There are a number of possible code reshufflings there, but that can be left for later.
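For illustration, the "dump around a pivot with a '>' marker" idea suggested above could look roughly like the sketch below. It is not the proposed os:: function: hex_dump_surrounding() is a hypothetical helper that works on offsets into a caller-supplied buffer (so it never touches memory outside that buffer), assumes a unit size of one byte, and uses 16 bytes per line as an arbitrary choice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Dumps the bytes in [pivot - len, pivot + len), clamped to the buffer and
// aligned down to a 16-byte line start. The pivot offset is printed up front
// and the line containing it is prefixed with '>'.
std::string hex_dump_surrounding(const uint8_t* buf, size_t size,
                                 size_t pivot, size_t len) {
  const size_t kBytesPerLine = 16;
  if (size == 0 || pivot >= size) return "";

  size_t start = (pivot > len) ? pivot - len : 0;
  start -= start % kBytesPerLine;
  size_t end = (len > size - pivot) ? size : pivot + len;

  std::string out;
  char tmp[32];
  std::snprintf(tmp, sizeof(tmp), "pivot=0x%04zx\n", pivot);
  out += tmp;

  for (size_t line = start; line < end; line += kBytesPerLine) {
    bool has_pivot = (pivot >= line && pivot < line + kBytesPerLine);
    std::snprintf(tmp, sizeof(tmp), "%c 0x%04zx:", has_pivot ? '>' : ' ', line);
    out += tmp;
    size_t line_end = std::min(end, line + kBytesPerLine);
    for (size_t i = line; i < line_end; i++) {
      std::snprintf(tmp, sizeof(tmp), " %02x", static_cast<unsigned>(buf[i]));
      out += tmp;
    }
    out += '\n';
  }
  return out;
}
```

A real version dumping around a pc would additionally have to deal with unreadable memory, which this sketch sidesteps by construction.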
> > > > In addition, I never liked that "os::print_context" even calls this. The > function name sounds > > harmless, like a simple "print the ucontext structure" but is > potentially dangerous since it just > > dereferences whatever it finds in pc. So even though it looks that way > it is by no means a general > > purpose function, but only usable in error reporting. Your patch makes > it more dangerous since the > > printed area now is larger and so we run a larger risk in segfaulting. > > > > I have a version of os::print_hex_dump locally somewhere which uses > SafeFetch32 to print out the hex > > dump, printing little "?" for unmapped memory. I think that would be the > correct way. > > Yes, my early version had something like this: > > address bottom = pc - delta; > address top = pc + delta; > while (pc > bottom && !is_readable_pointer(bottom)) bottom++; > while (pc < top && !is_readable_pointer(top)) top--; > os::print_hex_dump(bottom, top, ...); > > ...which makes it safe by reducing the dump window to only readable > memory. But then I realized that > vmError machinery would not trip hard on this failure, and would just > print this message: > > Instructions: (pc=0x0000000000000000) > 0x0000000000000000: > [error occurred during error reporting (printing registers, top of stack, > instructions near pc), id > 0xb, SIGSEGV (0xb) at pc=0x00007f9e0a6e59a5] > > ...so I backed off for simplicity. I can reinstate this safety net. > > Sounds reasonable. 
Unless we happen to run on architectures with very small page sizes ;) ..thomas -Aleksey > > From matthias.baesken at sap.com Tue Jan 29 08:02:23 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Tue, 29 Jan 2019 08:02:23 +0000 Subject: RFR : 8217786: Provide virtualization related info in the hs_error file on linux s390x In-Reply-To: <73d4509f-09b7-e630-2568-bae0395f6b8d@oracle.com> References: <816bec8b-b9ca-f0b8-f72d-be6ede83d63b@oracle.com> <73d4509f-09b7-e630-2568-bae0395f6b8d@oracle.com> Message-ID: > > No I was thinking more about just adding the virtualization info to an > existing step like print_os_info or print_cpu_info. > Hi David , print_cpu_info does not sound like a great fit . Some info like LPAR Number: 14 LPAR Characteristics: Shared LPAR Name: VM12 Does not really belong there . print_os_info looks better , it already contains "container_info" on Linux, so I think this might fit . Best regards, Matthias > -----Original Message----- > From: David Holmes > Sent: Dienstag, 29. Januar 2019 05:17 > To: Baesken, Matthias ; 'hotspot- > dev at openjdk.java.net' > Subject: Re: RFR : 8217786: Provide virtualization related info in the hs_error > file on linux s390x > > On 28/01/2019 10:23 pm, Baesken, Matthias wrote: > >> > >> Can't you include this information in an existing section of the error > >> processing code instead of adding a new function that is empty > >> everywhere except Linux? > >> > > > > Hi David , do you mean something like > > > > > > #if defined(S390) > > > > STEP("printing virtualization info") > > ... > > > > #endif > > No I was thinking more about just adding the virtualization info to an > existing step like print_os_info or print_cpu_info. > > Cheers, > David > ----- > > > in vmError.cpp ? > > > > I thought about doing this. 
> > > > > > But on the other hand , the now still empty > os::pd_print_virtualization_info in platforms != linux > > might fill over time ( we could add [at least for some platforms] other > virtualization related info ). > > > > > > Best regards, Matthias > > > > > >> -----Original Message----- > >> From: David Holmes > >> Sent: Montag, 28. Januar 2019 12:35 > >> To: Baesken, Matthias ; 'hotspot- > >> dev at openjdk.java.net' > >> Subject: Re: RFR : 8217786: Provide virtualization related info in the > hs_error > >> file on linux s390x > >> > >> Hi Matthias, > >> > >> On 28/01/2019 6:48 pm, Baesken, Matthias wrote: > >>> Hello, please review this change ; it adds virtualization related info in > the > >> hs_error file on linux s390x . > >> > >> Can't you include this information in an existing section of the error > >> processing code instead of adding a new function that is empty > >> everywhere except Linux? > >> > >> Thanks, > >> David > >> > >>> On linux s390x, we usually (always?) run in virtualized environments > >> (LPAR and/or z/VM / KVM ). > >>> > >>> It is helpful for instance in support cases to get some information about > the > >> virtualized environment in the hs_error file . > >>> A lot of info can be taken from the /proc/sysinfo file on linux s390x . > >>> > >>> > >>> Bug/webrev : > >>> > >>> https://bugs.openjdk.java.net/browse/JDK-8217786 > >>> > >>> > >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.1/ > >>> > >>> > >>> > >>> Best regards, Matthias > >>> From tobias.hartmann at oracle.com Tue Jan 29 08:16:38 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 29 Jan 2019 09:16:38 +0100 Subject: 8216541: CompiledICHolders of VM locked unloaded nmethods are released too late In-Reply-To: <0345dd70-fa19-a7e3-7892-8f4500f8339f@oracle.com> References: <0345dd70-fa19-a7e3-7892-8f4500f8339f@oracle.com> Message-ID: <4195c9a1-e91c-1014-8fa5-2b0d3f6dfc30@oracle.com> Hi Erik, very nice analysis, thanks a lot for investigating! 
On 28.01.19 14:56, Erik Österlund wrote:
> http://cr.openjdk.java.net/~eosterlund/8216541/webrev.00/

Why did you remove the call to thread->set_scanned_compiled_method(NULL) in sweeper.cpp?

> The proposed change has survived 200 rounds of kitchensink, hs-tier1-3 and hs-precheckin-comp.

In the meanwhile, could you please run some more 100x iterations of kitchensink?

Thanks,
Tobias

From thomas.stuefe at gmail.com Tue Jan 29 08:23:26 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 29 Jan 2019 09:23:26 +0100
Subject: RFR : 8217786: Provide virtualization related info in the hs_error file on linux s390x
In-Reply-To:
References: <816bec8b-b9ca-f0b8-f72d-be6ede83d63b@oracle.com> <73d4509f-09b7-e630-2568-bae0395f6b8d@oracle.com>
Message-ID:

I'm still unhappy with that solution, since we have fanned out this coding for all architectures into the architecture independent os_linux.cpp. A generic "Show matching lines from given file" would be a better (slimmer, better reusable) solution IMHO.

Side note: Could you please exchange strstr() .. with strncmp() since you require the start of the string to match. So no reason to parse the whole line if the start does not match.

Cheers, Thomas

On Tue, Jan 29, 2019 at 9:03 AM Baesken, Matthias wrote:

> >
> > No I was thinking more about just adding the virtualization info to an
> > existing step like print_os_info or print_cpu_info.
> >
>
> Hi David , print_cpu_info does not sound like a great fit . Some info like
>
> LPAR Number: 14
> LPAR Characteristics: Shared
> LPAR Name: VM12
>
> Does not really belong there .
>
> print_os_info looks better , it already contains "container_info"
> on Linux, so I think this might fit .
>
>
> Best regards, Matthias
>
>
> > -----Original Message-----
> > From: David Holmes
> > Sent: Dienstag, 29.
Januar 2019 05:17 > > To: Baesken, Matthias ; 'hotspot- > > dev at openjdk.java.net' > > Subject: Re: RFR : 8217786: Provide virtualization related info in the > hs_error > > file on linux s390x > > > > On 28/01/2019 10:23 pm, Baesken, Matthias wrote: > > >> > > >> Can't you include this information in an existing section of the error > > >> processing code instead of adding a new function that is empty > > >> everywhere except Linux? > > >> > > > > > > Hi David , do you mean something like > > > > > > > > > #if defined(S390) > > > > > > STEP("printing virtualization info") > > > ... > > > > > > #endif > > > > No I was thinking more about just adding the virtualization info to an > > existing step like print_os_info or print_cpu_info. > > > > Cheers, > > David > > ----- > > > > > in vmError.cpp ? > > > > > > I thought about doing this. > > > > > > > > > But on the other hand , the now still empty > > os::pd_print_virtualization_info in platforms != linux > > > might fill over time ( we could add [at least for some > platforms] other > > virtualization related info ). > > > > > > > > > Best regards, Matthias > > > > > > > > >> -----Original Message----- > > >> From: David Holmes > > >> Sent: Montag, 28. Januar 2019 12:35 > > >> To: Baesken, Matthias ; 'hotspot- > > >> dev at openjdk.java.net' > > >> Subject: Re: RFR : 8217786: Provide virtualization related info in the > > hs_error > > >> file on linux s390x > > >> > > >> Hi Matthias, > > >> > > >> On 28/01/2019 6:48 pm, Baesken, Matthias wrote: > > >>> Hello, please review this change ; it adds virtualization related > info in > > the > > >> hs_error file on linux s390x . > > >> > > >> Can't you include this information in an existing section of the error > > >> processing code instead of adding a new function that is empty > > >> everywhere except Linux? > > >> > > >> Thanks, > > >> David > > >> > > >>> On linux s390x, we usually (always?) run in virtualized > environments > > >> (LPAR and/or z/VM / KVM ). 
> > >>> > > >>> It is helpful for instance in support cases to get some information > about > > the > > >> virtualized environment in the hs_error file . > > >>> A lot of info can be taken from the /proc/sysinfo file on linux > s390x . > > >>> > > >>> > > >>> Bug/webrev : > > >>> > > >>> https://bugs.openjdk.java.net/browse/JDK-8217786 > > >>> > > >>> > > >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.1/ > > >>> > > >>> > > >>> > > >>> Best regards, Matthias > > >>> > From robbin.ehn at oracle.com Tue Jan 29 08:52:19 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 29 Jan 2019 09:52:19 +0100 Subject: RFR(XL): 8203469: Faster safepoints In-Reply-To: References: <0ba510c1-6a1a-68f1-811c-0f538c3a472b@oracle.com> Message-ID: > Thumbs up! Thanks! > > It took a couple of re-reads, but I think I now understand the > new handshake_safe() (and its call restrictions). Great! /Robbin > > Dan > >> >> I have been asked to go on-top-of: >> https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-January/036425.html >> With a small grace-period. >> There will be a v06 rebase on-top of that. >> >> Updated after comments and changes regarding safepoint_safe(). >> In JFR code path, thread is always current, so it should not be calling >> safepoint_safe. It also don't control polls, so even if it returns true it is >> not safe in that case. >> >> Updated to a handshake_safe() private method with a friend for handshakes. >> >> Test t1-3, stress testing and JFR. >> >> Thanks, Robbin >> >> On 1/15/19 11:39 AM, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469 >>> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/ >>> >>> Thanks to Dan for pre-reviewing a lot! >>> >>> Background: >>> ZGC often does very short safepoint operations. For a perspective, in a >>> specJBB2015 run, G1 can have young collection stops lasting about 170 ms. 
>>> While in the same setup ZGC does 0.2 ms to 1.5 ms operations, depending on which
>>> operation it is. The time it takes to stop and start the JavaThreads is very
>>> large relative to a ZGC safepoint. With an operation that takes just 0.2 ms, the
>>> overhead of stopping and starting JavaThreads is several times the operation.
>>>
>>> High-level functionality change:
>>> Serializing the starting over Threads_lock takes time.
>>> - Don't wait on Threads_lock, use the WaitBarrier.
>>> Serializing the stopping over Safepoint_lock takes time.
>>> - Let threads stop in parallel, remove Safepoint_lock.
>>>
>>> Details:
>>> JavaThreads have 2 abstract logical states: unsafe or safe.
>>> - Safe means the JavaThread will not touch Java heap or VM internal structures
>>>   without doing a transition and blocking before doing so.
>>>     - The safe states are:
>>>         - When polls armed: _thread_in_native and _thread_blocked.
>>>         - When Threads_lock is held: externally suspended flag is set.
>>>     - The VM thread has polls armed and holds the Threads_lock during a
>>>       safepoint.
>>> - Unsafe means that either Java heap or VM internal structures can be accessed
>>>   by the JavaThread, e.g., _thread_in_Java, _thread_in_vm.
>>>     - All combinations that are not safe are unsafe.
>>>
>>> We cannot start a safepoint until all unsafe threads have transitioned to a safe
>>> state. To make them safe, we arm polls in compiled code and make sure any
>>> transition to another unsafe state will be blocked. JavaThreads which are unsafe
>>> with state _thread_in_Java may transition to _thread_in_native without being
>>> blocked, since they just became safe threads and we can proceed. Any safe thread
>>> may try to transition at any time to an unsafe state, thus coming into the
>>> safepoint blocking code at any moment, e.g., after the safepoint is over, or
>>> even at the beginning of the next safepoint.
>>>
>>> The VMThread cannot tolerate false positives from the JavaThread thread state
>>> because that would mean starting the safepoint without all JavaThreads being
>>> safe. The two locks (Threads_lock and Safepoint_lock) make sure we never observe
>>> false positives from the safepoint blocking code. If we remove them, how do we
>>> handle false positives?
>>>
>>> By first publishing which barrier tag (safepoint counter) we will call
>>> WaitBarrier.wait() with as the thread's safepoint id and then changing the state to
>>> _thread_blocked, the VMThread can ignore JavaThreads by doing a stable load of
>>> the state. A stable load of the thread state is successful if the thread
>>> safepoint id is the same both before and after the load of the state and the
>>> safepoint id is current or InactiveSafepointCounter. If the stable load fails,
>>> the thread is considered safepoint unsafe. It is no longer enough that a thread
>>> has state _thread_blocked; it must also have the correct safepoint id before and
>>> after we read the state.
>>>
>>> Performance:
>>> The result of faster safepoints is that the average CPU time for JavaThreads
>>> between safepoints is higher, thus increasing the allocation rate. The thread
>>> that stops first waits a shorter time until it gets started. Even the thread that
>>> stops last also has a shorter stop since we start them faster. If your
>>> application is using a concurrent GC it may need re-tuning since each java
>>> worker thread has an increased CPU time/allocation rate. Often this means max
>>> performance is achieved using slightly fewer java worker threads than before.
>>> Also the increased allocation rate means shorter time between GC safepoints.
>>> - If you are using a non-concurrent GC, you should see improved latency and
>>>   throughput.
>>> - After re-tuning with a concurrent GC throughput should be equal or better but
>>>   with better latency. But bear in mind this is a latency patch, not a
>>>   throughput one.
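The "stable load" described above can be sketched in a few lines of standard C++. This is only an illustration of the idea, not the code in the webrev: the struct, the function names, the use of std::atomic, and the value 0 for InactiveSafepointCounter are all assumptions made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the real HotSpot thread states (names are assumptions).
enum ThreadState : int { thread_in_Java, thread_in_vm, thread_in_native, thread_blocked };

// Assumed sentinel meaning "this thread is not tied to any safepoint".
constexpr uint64_t InactiveSafepointCounter = 0;

struct ThreadSafepointView {
  std::atomic<uint64_t>    safepoint_id{InactiveSafepointCounter};
  std::atomic<ThreadState> state{thread_in_Java};
};

// JavaThread side: publish the barrier tag first, then the blocked state.
inline void publish_blocked(ThreadSafepointView& t, uint64_t barrier_tag) {
  t.safepoint_id.store(barrier_tag);
  t.state.store(thread_blocked);
}

// VMThread side: the load is "stable" only if the safepoint id is unchanged
// across the state load and is either the current id or the inactive sentinel.
inline bool stable_load_state(const ThreadSafepointView& t, uint64_t current_id,
                              ThreadState& out) {
  const uint64_t before = t.safepoint_id.load();
  const ThreadState s   = t.state.load();
  const uint64_t after  = t.safepoint_id.load();
  if (before != after) return false;
  if (before != current_id && before != InactiveSafepointCounter) return false;
  out = s;
  return true;
}

// A failed stable load is treated as "unsafe", as in the description above.
inline bool is_safepoint_safe(const ThreadSafepointView& t, uint64_t current_id) {
  ThreadState s;
  if (!stable_load_state(t, current_id, s)) return false;
  return s == thread_blocked || s == thread_in_native;
}
```

In this sketch a thread that published a stale safepoint id is reported as unsafe even though its state reads _thread_blocked, which is the false positive the two locks used to rule out.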
>>> With current code a java thread is not guaranteed to run between safepoints (in
>>> theory a java thread can be starved indefinitely), since the VM thread may
>>> re-grab the Threads_lock before it woke up from the previous safepoint. If the
>>> GC/VM doesn't respect MMU (minimum mutator utilization) or if your machine is very
>>> over-provisioned this can happen.
>>> The current scheme thus re-safepoints quickly if the java threads have not
>>> started yet, at the cost of latency. Since the new code uses the WaitBarrier with
>>> the safepoint counter, all threads must roll forward to the next safepoint by
>>> getting at least some CPU time between two safepoints. Meaning MMU violations
>>> are more obvious.
>>>
>>> Some example numbers:
>>> - On a 16 strand machine synchronization and un-synchronization/starting is at
>>>   least 3x faster (in non-trivial test). Synchronization ~600 -> ~100us and
>>>   starting ~400->~100us.
>>>   (Semaphore path is a bit slower than futex in the WaitBarrier on Linux).
>>> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster
>>>   synchronization time on 16 strands and ~5% score increase. In this case the GC
>>>   op is 1ms, so we reduce the overhead of synchronization from 100% to 10%.
>>> - specJBB2015 ParGC ~9% increase in critical-jops.
>>>
>>> Thanks, Robbin
>
From per.liden at oracle.com  Tue Jan 29 09:22:06 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 29 Jan 2019 10:22:06 +0100
Subject: RFR: 8210832: Remove sneaky locking in class Monitor
In-Reply-To: <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com>
References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com>
 <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com>
Message-ID:

Hi Patricio,

On 01/28/2019 08:18 PM, Patricio Chilano wrote:
> Hi Robbin,
>
> Thanks for reviewing this!
> Removing the block_in_safepoint_check thread
> local attribute is a great idea, here is v02:
>
> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/webrev

I really like that we're ditching our old locking code in favor of using
pthread_mutex, et al. Nice work!

General comment
----------------
I think Mutex ought to be a plain mutex and not come with the baggage of
having a condition variable. With this new code, it seems we're in a
really good position to make that happen. I.e. something like this:

class PlatformMutex {
protected:
  pthread_mutex_t _mutex;

public:
  PlatformMutex();
  ~PlatformMutex();

  void lock();
  void unlock();
  bool try_lock();
};

class PlatformMonitor : public PlatformMutex {
private:
  pthread_cond_t _cond;

public:
  PlatformMonitor();
  ~PlatformMonitor();

  int wait(jlong millis);
  void notify();
  void notify_all();
};

It might be that we want to do that as a separate step later instead of
including it in this patch. But I think we should try to get there.

src/hotspot/os/posix/os_*.[ch]pp
---------------------------------
* I'd suggest that we place the PlatformMonitor class in a separate file
  (like src/hotspot/os/posix/monitor_posix.cpp), just like we have done
  with Semaphore (in src/hotspot/os/posix/semaphore_posix.cpp).

src/hotspot/os/posix/os_posix.hpp
src/hotspot/os/solaris/os_solaris.hpp
src/hotspot/os/windows/os_windows.hpp
-------------------------------------
* Please make _mutex/_cond plain variables, instead of arrays of 1.
  That's just ugly ;)

src/hotspot/os/posix/os_posix.cpp
---------------------------------
* Destructor missing, to call pthread_(mutex|cond)_destroy().

src/hotspot/os/solaris/os_solaris.hpp
-------------------------------------
* Not sure if there's a good reason to have the constructor be inlined
  here. I'd suggest moving it to the cpp file.

* Destructor missing.
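For readers who want to try the PlatformMutex/PlatformMonitor split and the destructor remarks above, here is a minimal self-contained sketch on top of POSIX threads. It is an assumption-laden illustration, not the webrev's code: jlong is replaced by int64_t, error handling is omitted, and treating millis <= 0 as "wait without timeout" is a choice made for this example only.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <pthread.h>
#include <time.h>

// Plain mutex: no condition variable attached.
class PlatformMutex {
 protected:
  pthread_mutex_t _mutex;

 public:
  PlatformMutex()  { pthread_mutex_init(&_mutex, nullptr); }
  ~PlatformMutex() { pthread_mutex_destroy(&_mutex); }  // must not be locked here
  PlatformMutex(const PlatformMutex&) = delete;
  PlatformMutex& operator=(const PlatformMutex&) = delete;

  void lock()     { pthread_mutex_lock(&_mutex); }
  void unlock()   { pthread_mutex_unlock(&_mutex); }
  bool try_lock() { return pthread_mutex_trylock(&_mutex) == 0; }
};

// Monitor = mutex + condition variable.
class PlatformMonitor : public PlatformMutex {
 private:
  pthread_cond_t _cond;

 public:
  PlatformMonitor()  { pthread_cond_init(&_cond, nullptr); }
  ~PlatformMonitor() { pthread_cond_destroy(&_cond); }  // no waiters may remain

  // Caller must hold the lock. Returns 0 when woken, ETIMEDOUT on timeout.
  // Assumption for this sketch: millis <= 0 means "no timeout".
  int wait(int64_t millis) {
    if (millis <= 0) {
      return pthread_cond_wait(&_cond, &_mutex);
    }
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);  // default condvar clock
    ts.tv_sec  += millis / 1000;
    ts.tv_nsec += (millis % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {
      ts.tv_sec  += 1;
      ts.tv_nsec -= 1000000000L;
    }
    return pthread_cond_timedwait(&_cond, &_mutex, &ts);
  }

  void notify()     { pthread_cond_signal(&_cond); }
  void notify_all() { pthread_cond_broadcast(&_cond); }
};
```

Note that the destructors are only safe if no thread can still be using the object, which is why whether they should exist at all is a design question and not just a missing line.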
src/hotspot/os/windows/os_windows.cpp
-------------------------------------
* Destructor missing (I'm not too familiar with the windows API but I
  assume there's a destroy function we should call here).

src/hotspot/share/runtime/interfaceSupport.inline.hpp
-----------------------------------------------------
* Move "private:" above monitor_adr;

 289 class ThreadLockBlockInVM : public ThreadStateTransition {
 290   Monitor** monitor_adr;
 291  private:
 292   void do_preempted(Monitor** in_flight_monitor_adr) {

* monitor_adr should be _monitor_adr, or maybe even
  _in_flight_monitor_adr to better match the name of the argument.

cheers,
Per

> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/inc/webrev/
>
> Running mach5 again.
>
> Thanks,
> Patricio
>
> On 1/28/19 8:31 AM, Robbin Ehn wrote:
>> Hi Patricio,
>>
>> Mostly looks good!
>>
>> block_at_safepoint is always called with block_in_safepoint_check =
>> true. (correct?)
>> Changing that to a local state instead of global simplifies the code.
>>
>> So I'm suggesting something like below.
>>
>> Thanks, Robbin
>>
>> diff -r e65cc445234c src/hotspot/share/runtime/interfaceSupport.inline.hpp
>> --- a/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon Jan 28 13:10:15 2019 +0100
>> +++ b/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon Jan 28 14:10:59 2019 +0100
>> @@ -308,2 +308,1 @@
>> -    thread->block_in_safepoint_check = false;
>> -    SafepointMechanism::block_at_safepoint(thread);
>> +    SafepointMechanism::callback_if_safepoint(thread);
>> @@ -323,2 +322,1 @@
>> -      SafepointMechanism::block_at_safepoint(_thread);
>> -      _thread->block_in_safepoint_check = true;
>> +      SafepointMechanism::callback_if_safepoint(_thread);
>> @@ -335,2 +332,0 @@
>> -    } else {
>> -      _thread->block_in_safepoint_check = true;
>> @@ -337,0 +334,1 @@
>> +
>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.cpp
>> --- a/src/hotspot/share/runtime/safepoint.cpp Mon Jan 28 13:10:15 2019 +0100
>> +++ b/src/hotspot/share/runtime/safepoint.cpp Mon Jan 28 14:10:59 2019 +0100
>> @@ -795,1 +795,1 @@
>> -void SafepointSynchronize::block(JavaThread *thread) {
>> +void SafepointSynchronize::block(JavaThread *thread, bool block_in_safepoint_check) {
>> @@ -850,1 +850,1 @@
>> -      if (thread->block_in_safepoint_check) {
>> +      if (block_in_safepoint_check) {
>> @@ -880,1 +880,1 @@
>> -          thread->block_in_safepoint_check) {
>> +          block_in_safepoint_check) {
>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.hpp
>> --- a/src/hotspot/share/runtime/safepoint.hpp Mon Jan 28 13:10:15 2019 +0100
>> +++ b/src/hotspot/share/runtime/safepoint.hpp Mon Jan 28 14:10:59 2019 +0100
>> @@ -146,1 +146,1 @@
>> -  static void   block(JavaThread *thread);
>> +  static void   block(JavaThread *thread, bool block_in_safepoint_check = true);
>> diff -r e65cc445234c src/hotspot/share/runtime/safepointMechanism.hpp
>> --- a/src/hotspot/share/runtime/safepointMechanism.hpp Mon Jan 28 13:10:15 2019 +0100
>> +++ b/src/hotspot/share/runtime/safepointMechanism.hpp Mon Jan 28 14:10:59 2019 +0100
>> @@ -82,1 +82,1 @@
>> -  static inline void block_at_safepoint(JavaThread* thread);
>> +  static inline void callback_if_safepoint(JavaThread* thread);
>> diff -r e65cc445234c src/hotspot/share/runtime/safepointMechanism.inline.hpp
>> --- a/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon Jan 28 13:10:15 2019 +0100
>> +++ b/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon Jan 28 14:10:59 2019 +0100
>> @@ -82,1 +82,1 @@
>> -void SafepointMechanism::block_at_safepoint(JavaThread* thread) {
>> +void SafepointMechanism::callback_if_safepoint(JavaThread* thread) {
>> @@ -84,1 +84,1 @@
>> -    SafepointSynchronize::block(thread);
>> +    SafepointSynchronize::block(thread, false);
>> diff -r e65cc445234c src/hotspot/share/runtime/thread.cpp
>> --- a/src/hotspot/share/runtime/thread.cpp Mon Jan 28 13:10:15 2019 +0100
>> +++ b/src/hotspot/share/runtime/thread.cpp Mon Jan 28 14:10:59 2019 +0100
>> @@ -298,2 +297,0 @@
>> -  block_in_safepoint_check = true;
>> -
>> diff -r e65cc445234c src/hotspot/share/runtime/thread.hpp
>> --- a/src/hotspot/share/runtime/thread.hpp Mon Jan 28 13:10:15 2019 +0100
>> +++ b/src/hotspot/share/runtime/thread.hpp Mon Jan 28 14:10:59 2019 +0100
>> @@ -788,2 +787,0 @@
>> -  bool block_in_safepoint_check;              // to decide whether to block in SS::block or not
>> -
>>
>>
>> On 1/28/19 9:42 AM, Patricio Chilano wrote:
>>> Hi all,
>>>
>>> Please review the following patch:
>>>
>>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8210832
>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8210832/v01/webrev/
>>>
>>> The current implementation of native monitors uses a technique that
>>> we name "sneaky locking" to prevent possible deadlocks of the JVM
>>> during safepoints. The implementation of this technique though
>>> introduces a race when a monitor is shared between the VMThread and
>>> non-JavaThreads. This patch aims to solve that problem and at the
>>> same time simplify the code.
>>>
>>> The proposal is based on the introduction of the new class
>>> PlatformMonitor, which serves as a wrapper for the actual
>>> synchronization primitives in each platform (mutexes and condition
>>> variables). Most of the API calls can thus be implemented as simple
>>> wrappers around PlatformMonitor, adding more assertions and very
>>> little extra metadata.
>>> To be able to remove the lock sneaking code and at the same time
>>> avoid deadlocking scenarios, we combine two techniques:
>>>
>>> - When a JavaThread that has just acquired the lock detects there is
>>> a safepoint request in the ThreadLockBlockInVM destructor, it
>>> releases the lock before blocking at the safepoint. After resuming
>>> from it, the JavaThread will have to acquire the lock again.
>>>
>>> - In the ThreadLockBlockInVM constructor for the Monitor::wait()
>>> method, in order to avoid blocking we allow for a possible safepoint
>>> request to make progress but without letting the JavaThread block for
>>> it (since we would be stopped by the destructor anyways). We also do
>>> that for the Monitor::lock() case although no deadlock is being
>>> prevented there.
>>>
>>> The ThreadLockBlockInVM jacket is a new ThreadStateTransition class
>>> used instead of the ThreadBlockInVM one. This allowed more
>>> flexibility to handle the two techniques mentioned above. Also,
>>> ThreadBlockInVM calls SafepointMechanism::block_if_requested() which
>>> creates some problems when trying to allow safepoints to continue
>>> without stopping, since that method not only checks for safepoints
>>> but also processes handshakes.
>>>
>>> In terms of performance, benchmarks show very similar results to what
>>> we have now.
>>>
>>> So far mach5 hs-tier1-6 on Linux, OS X, Windows and Solaris have been
>>> tested.
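The first technique above (drop the freshly acquired lock if a safepoint is pending, block, then retry) can be modelled with standard C++ as follows. Everything here is a hypothetical stand-in: FakeSafepoint, ScopedLockTransition and lock_with_safepoint_check are invented for the example, std::mutex replaces the VM's Monitor, and the "safepoint" ends immediately instead of waiting for the VMThread.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the VM's safepoint machinery.
struct FakeSafepoint {
  std::atomic<bool> requested{false};
  int blocks = 0;  // how many times a thread "blocked" for a safepoint

  void block() {
    ++blocks;
    requested.store(false);  // pretend the safepoint completes right away
  }
};

// Scoped transition object. On scope exit, if a safepoint is pending, it
// releases the lock that was just acquired, blocks for the safepoint, and
// clears the in-flight pointer so the caller knows it must lock again.
class ScopedLockTransition {
  FakeSafepoint& _sp;
  std::mutex**   _in_flight_lock_adr;

 public:
  ScopedLockTransition(FakeSafepoint& sp, std::mutex** in_flight_lock_adr)
    : _sp(sp), _in_flight_lock_adr(in_flight_lock_adr) {}

  ~ScopedLockTransition() {
    if (_sp.requested.load()) {
      if (*_in_flight_lock_adr != nullptr) {
        (*_in_flight_lock_adr)->unlock();  // never block while holding the lock
        *_in_flight_lock_adr = nullptr;
      }
      _sp.block();
    }
  }
};

// Acquires 'm', retrying whenever the lock had to be given up for a
// safepoint. Returns the number of acquisition attempts; 'm' is held on return.
inline int lock_with_safepoint_check(std::mutex& m, FakeSafepoint& sp) {
  int attempts = 0;
  for (;;) {
    std::mutex* in_flight = &m;
    {
      ScopedLockTransition transition(sp, &in_flight);
      m.lock();
      ++attempts;
    }  // destructor runs here and may release the lock again
    if (in_flight != nullptr) {
      return attempts;  // lock is still held by us
    }
  }
}
```

With a safepoint request pending at the first attempt, the call returns after two attempts and exactly one simulated block; without a request it returns after one.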
>>>
>>> Thanks,
>>> Patricio
>>>
>
From david.holmes at oracle.com  Tue Jan 29 09:53:53 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 29 Jan 2019 19:53:53 +1000
Subject: RFR: 8210832: Remove sneaky locking in class Monitor
In-Reply-To:
References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com>
 <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com>
Message-ID: <2e011b1d-afb2-c2ab-d083-efe74bd27b27@oracle.com>

Hi Per,

If I may jump in on one thing you suggest ... destructors. Do we ever
actually destroy Mutex or Monitor instances? There are inherent races
that can make it very dangerous to try and actually delete the low-level
PlatformMonitor and destroy the pthread_mutex or pthread_cond, and even
release the memory used. The related PlatformEvent and PlatformParker
are expected to be immortal and I think that is the same for
PlatformMonitor.

Aside: I don't think distinct PlatformMutex and PlatformMonitor is worth
the effort unless we also rework the Mutex/Monitor relationship as well.

Cheers,
David

From thomas.schatzl at oracle.com  Tue Jan 29 09:56:41 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 29 Jan 2019 10:56:41 +0100
Subject: RFR(XXS) 8217785: Padding ParallelTaskTerminator::_offered_termination variable
In-Reply-To: <1548703217.31327.58.camel@redhat.com>
References: <1548703217.31327.58.camel@redhat.com>
Message-ID: <770a1007b2a3d721baa21a62cce66d77320fc271.camel@oracle.com>

Hi,

On Mon, 2019-01-28 at 14:20 -0500, zgu at redhat.com wrote:
> Hi,
>
> Could I have reviews for this small enhancement, that pads
> _offer_termination variable into a separate cacheline? cause the
> variable may be highly contended during task termination.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8217785
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8217785/webrev.00/
>
> Test:
> hotspot_gc (+/- UseOWSTTaskTerminator) on Linux x64 (fastdebug and
> release)
>

looks good. Please make sure that the title in the push message is
correct :) (s/offerred/offered/ - I fixed it in the CR title).
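As a rough picture of what padding a contended counter into its own cache line means, here is a standalone sketch using standard alignas. It does not reproduce the webrev: the struct names and fields are invented, the 64-byte line size is an assumption, and HotSpot has its own padding helpers that the actual change may use instead.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache line size; the real value is platform dependent.
constexpr std::size_t kCacheLineSize = 64;

// Unpadded layout: the contended counter shares a line with its neighbours,
// so every update to it also invalidates the line holding n_threads.
struct UnpaddedTerminator {
  uint32_t              n_threads;
  std::atomic<uint32_t> offered_termination;
  uint32_t              next_field;
};

// Padded layout: alignas pushes the counter onto its own line, and aligning
// the following field keeps anything else from sharing that line.
struct PaddedTerminator {
  uint32_t                                     n_threads;
  alignas(kCacheLineSize) std::atomic<uint32_t> offered_termination;
  alignas(kCacheLineSize) uint32_t              next_field;
};

// True if two byte offsets within a line-aligned object fall on one line.
inline bool same_cache_line(std::size_t off_a, std::size_t off_b) {
  return off_a / kCacheLineSize == off_b / kCacheLineSize;
}
```

The cost is size: the padded struct in this sketch occupies three full lines instead of twelve bytes, which is usually acceptable for a per-task-set object such as a terminator.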
Thanks,
  Thomas

From robbin.ehn at oracle.com  Tue Jan 29 10:00:12 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Tue, 29 Jan 2019 11:00:12 +0100
Subject: RFR: 8210832: Remove sneaky locking in class Monitor
In-Reply-To:
References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com>
 <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com>
Message-ID: <1a528505-3ccd-a6e8-0920-22e54eb75d37@oracle.com>

Hi Per/Patricio,

On 1/29/19 10:22 AM, Per Liden wrote:
> General comment
> ----------------
> I think Mutex ought to be a plain mutex and not come with the baggage of
> having a condition variable. With this new code, it seems we're in a
> really good position to make that happen. I.e. something like this:

This is a good idea. I suppose it enables us to remove zlock ?

Also comment for callback_if_safepoint should be updated.
  81   // Blocks a thread until safepoint is completed
  82   static inline void callback_if_safepoint(JavaThread* thread);

Thanks, Robbin

From per.liden at oracle.com  Tue Jan 29 10:26:59 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 29 Jan 2019 11:26:59 +0100
Subject: RFR: 8210832: Remove sneaky locking in class Monitor
In-Reply-To: <2e011b1d-afb2-c2ab-d083-efe74bd27b27@oracle.com>
References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com>
 <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com>
 <2e011b1d-afb2-c2ab-d083-efe74bd27b27@oracle.com>
Message-ID: <26e4d98a-62c5-35a8-9cf2-a1d9618b453b@oracle.com>

On 01/29/2019 10:53 AM, David Holmes wrote:
> Hi Per,
>
> If I may jump in on one thing you suggest ... destructors. Do we ever
> actually destroy Mutex or Monitor instances? There are inherent races
> that can make it very dangerous to try and actually delete the low-level
> PlatformMonitor and destroy the pthread_mutex or pthread_cond, and even
> release the memory used. The related PlatformEvent and PlatformParker
> are expected to be immortal and I think that is the same for
> PlatformMonitor.

There are examples of where we do destroy Mutex instances today (like
JVM_RawMonitorDestroy), and I don't think that's something the API
should forbid. As Robbin hinted, I'm hoping ZLock (which is just a
plain pthread_mutex) in ZGC can be converted to be a plain mutex using
PlatformMutex in the future. ZLocks require dynamic creation/destruction
to be supported as they are e.g. attached to nmethods which can come
and go.

>
> Aside: I don't think distinct PlatformMutex and PlatformMonitor is worth
> the effort unless we also rework the Mutex/Monitor relationship as well.

I completely agree, Mutex/Monitor would also need to be reworked a bit.

cheers,
Per

>
> Cheers,
> David
>
> On 29/01/2019 7:22 pm, Per Liden wrote:
>> Hi Patricio,
>>
>> On 01/28/2019 08:18 PM, Patricio Chilano wrote:
>>> Hi Robbin,
>>>
>>> Thanks for reviewing this!
Removing the block_in_safepoint_check >>> thread local attribute is a great idea, here is v02: >>> >>> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/webrev >> >> I really like that we're ditching our old locking code in favor of >> using pthread_mutex, et al. Nice work! >> >> >> General comment >> ---------------- >> I think Mutex to be a plain mutex and not come with the baggage of >> having a conditional variable. With this new code, it seems we're in a >> really good position to make that happen. I.e. something like this: >> >> class PlatformMutex { >> protected: >> pthread_mutex_t _mutex; >> >> public: >> PlatformMutex(); >> ~PlatformMutex(); >> >> void lock(); >> void unlock(); >> bool try_lock(); >> }; >> >> class PlatformMonitor : public PlatformMutex { >> private: >> pthread_cond_t _cond; >> >> public: >> PlatformMonitor(); >> ~PlatformMonitor(); >> >> int wait(jlong millis); >> void notify(); >> void notify_all(); >> }; >> >> It might be that we want to do that as a separate step later instead >> of including it in this patch. But I think we should try to get there. >> >> >> src/hotspot/os/posix/os_*.[ch]pp >> --------------------------------- >> * I'd suggest that we place the PlatformMonitor class in a separate >> file (like src/hotspot/os/posix/monitor_posix.cpp), just like we have >> done with Semaphore (in src/hotspot/os/posix/semaphore_posix.cpp). >> >> >> src/hotspot/os/posix/os_posix.hpp >> src/hotspot/os/solaris/os_solaris.hpp >> src/hotspot/os/windows/os_windows.hpp >> ------------------------------------- >> * Please make _mutex/_cond plain variables, instead of arrays of 1. >> That's just ugly ;) >> >> >> src/hotspot/os/posix/os_posix.cpp >> --------------------------------- >> * Destructor missing, to call pthread_(mutex|cond)_destroy(). >> >> >> src/hotspot/os/solaris/os_solaris.hpp >> ------------------------------------- >> * Not sure if there's a good reason to have the constructor be inlined >> here. 
I'd suggest moving it to the cpp file. >> >> * Destructor missing. >> >> >> src/hotspot/os/windows/os_windows.cpp >> ------------------------------------- >> * Destructor missing (I'm not too familiar with the windows API but I >> assume there's a destroy function we should call here). >> >> >> src/hotspot/share/runtime/interfaceSupport.inline.hpp >> ----------------------------------------------------- >> * Move "private:" above monitor_adr; >> >> 289 class ThreadLockBlockInVM : public ThreadStateTransition { >> 290 Monitor** monitor_adr; >> 291 private: >> 292 void do_preempted(Monitor** in_flight_monitor_adr) { >> >> * monitor_adr should be _monitor_adr, or maybe even >> _in_flight_monitor_adr to better match the name of the argument. >> >> >> cheers, >> Per >> >>> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/inc/webrev/ >>> >>> Running mach5 again. >>> >>> Thanks, >>> Patricio >>> >>> On 1/28/19 8:31 AM, Robbin Ehn wrote: >>>> Hi Patricio, >>>> >>>> Mostly looks good! >>>> >>>> block_at_safepoint is always called with block_in_safepoint_check = >>>> true. (correct?) >>>> Changing that to a local state instead of global simplifies the code. >>>> >>>> So I'm suggesting something like below. 
>>>> >>>> Thanks, Robbin >>>> >>>> diff -r e65cc445234c >>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp >>>> --- a/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon >>>> Jan 28 13:10:15 2019 +0100 >>>> +++ b/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon >>>> Jan 28 14:10:59 2019 +0100 >>>> @@ -308,2 +308,1 @@ >>>> - thread->block_in_safepoint_check = false; >>>> - SafepointMechanism::block_at_safepoint(thread); >>>> + SafepointMechanism::callback_if_safepoint(thread); >>>> @@ -323,2 +322,1 @@ >>>> - SafepointMechanism::block_at_safepoint(_thread); >>>> - _thread->block_in_safepoint_check = true; >>>> + SafepointMechanism::callback_if_safepoint(_thread); >>>> @@ -335,2 +332,0 @@ >>>> - } else { >>>> - _thread->block_in_safepoint_check = true; >>>> @@ -337,0 +334,1 @@ >>>> + >>>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.cpp >>>> --- a/src/hotspot/share/runtime/safepoint.cpp Mon Jan 28 13:10:15 >>>> 2019 +0100 >>>> +++ b/src/hotspot/share/runtime/safepoint.cpp Mon Jan 28 14:10:59 >>>> 2019 +0100 >>>> @@ -795,1 +795,1 @@ >>>> -void SafepointSynchronize::block(JavaThread *thread) { >>>> +void SafepointSynchronize::block(JavaThread *thread, bool >>>> block_in_safepoint_check) { >>>> @@ -850,1 +850,1 @@ >>>> - if (thread->block_in_safepoint_check) { >>>> + if (block_in_safepoint_check) { >>>> @@ -880,1 +880,1 @@ >>>> - thread->block_in_safepoint_check) { >>>> + block_in_safepoint_check) { >>>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.hpp >>>> --- a/src/hotspot/share/runtime/safepoint.hpp Mon Jan 28 13:10:15 >>>> 2019 +0100 >>>> +++ b/src/hotspot/share/runtime/safepoint.hpp Mon Jan 28 14:10:59 >>>> 2019 +0100 >>>> @@ -146,1 +146,1 @@ >>>> - static void block(JavaThread *thread); >>>> + static void block(JavaThread *thread, bool >>>> block_in_safepoint_check = true); >>>> diff -r e65cc445234c src/hotspot/share/runtime/safepointMechanism.hpp >>>> --- a/src/hotspot/share/runtime/safepointMechanism.hpp Mon Jan 28 
>>>> 13:10:15 2019 +0100 >>>> +++ b/src/hotspot/share/runtime/safepointMechanism.hpp Mon Jan 28 >>>> 14:10:59 2019 +0100 >>>> @@ -82,1 +82,1 @@ >>>> - static inline void block_at_safepoint(JavaThread* thread); >>>> + static inline void callback_if_safepoint(JavaThread* thread); >>>> diff -r e65cc445234c >>>> src/hotspot/share/runtime/safepointMechanism.inline.hpp >>>> --- a/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>>> Jan 28 13:10:15 2019 +0100 >>>> +++ b/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>>> Jan 28 14:10:59 2019 +0100 >>>> @@ -82,1 +82,1 @@ >>>> -void SafepointMechanism::block_at_safepoint(JavaThread* thread) { >>>> +void SafepointMechanism::callback_if_safepoint(JavaThread* thread) { >>>> @@ -84,1 +84,1 @@ >>>> - SafepointSynchronize::block(thread); >>>> + SafepointSynchronize::block(thread, false); >>>> diff -r e65cc445234c src/hotspot/share/runtime/thread.cpp >>>> --- a/src/hotspot/share/runtime/thread.cpp Mon Jan 28 13:10:15 >>>> 2019 +0100 >>>> +++ b/src/hotspot/share/runtime/thread.cpp Mon Jan 28 14:10:59 >>>> 2019 +0100 >>>> @@ -298,2 +297,0 @@ >>>> - block_in_safepoint_check = true; >>>> - >>>> diff -r e65cc445234c src/hotspot/share/runtime/thread.hpp >>>> --- a/src/hotspot/share/runtime/thread.hpp Mon Jan 28 13:10:15 >>>> 2019 +0100 >>>> +++ b/src/hotspot/share/runtime/thread.hpp Mon Jan 28 14:10:59 >>>> 2019 +0100 >>>> @@ -788,2 +787,0 @@ >>>> - bool block_in_safepoint_check; // to decide whether >>>> to block in SS::block or not >>>> - >>>> >>>> >>>> On 1/28/19 9:42 AM, Patricio Chilano wrote: >>>>> Hi all, >>>>> >>>>> Please review the following patch: >>>>> >>>>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8210832 >>>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8210832/v01/webrev/ >>>>> >>>>> The current implementation of native monitors uses a technique that >>>>> we name "sneaky locking" to prevent possible deadlocks of the JVM >>>>> during safepoints. 
The implementation of this technique though >>>>> introduces a race when a monitor is shared between the VMThread and >>>>> non-JavaThreads. This patch aims to solve that problem and at the >>>>> same time simplify the code. >>>>> >>>>> The proposal is based on the introduction of the new class >>>>> PlatformMonitor, which serves as a wrapper for the actual >>>>> synchronization primitives in each platform (mutexes and condition >>>>> variables). Most of the API calls can thus be implemented as simple >>>>> wrappers around PlatformMonitor, adding more assertions and very >>>>> little extra metadata. >>>>> To be able to remove the lock sneaking code and at the same time >>>>> avoid deadlocking scenarios, we combine two techniques: >>>>> >>>>> -When a JavaThread that has just acquired the lock, detects there >>>>> is a safepoint request in the ThreadLockBlockInVM destructor, it >>>>> releases the lock before blocking at the safepoint. After resuming >>>>> from it, the JavaThread will have to acquire the lock again. >>>>> >>>>> - In the ThreadLockBlockInVM constructor for the Monitor::wait() >>>>> method, in order to avoid blocking we allow for a possible >>>>> safepoint request to make progress but without letting the >>>>> JavaThread block for it (since we would be stopped by the >>>>> destructor anyways). We also do that for the Monitor::lock() case >>>>> although no deadlock is being prevented there. >>>>> >>>>> The ThreadLockBlockInVM jacket is a new ThreadStateTransition class >>>>> used instead of the ThreadBlockInVM one. This allowed more >>>>> flexibility to handle the two techniques mentioned above. Also, >>>>> ThreadBlockInVM calls SafepointMechanism::block_if_requested() >>>>> which creates some problems when trying to allow safepoints to >>>>> continue without stopping, since that method not only checks for >>>>> safepoints but also processes handshakes. >>>>> >>>>> In terms of performance, benchmarks show very similar results to >>>>> what we have now. 
>>>>> >>>>> So far mach5 hs-tier1-6 on Linux, OS X, Windows and Solaris have >>>>> been tested. >>>>> >>>>> Thanks, >>>>> Patricio >>>>> >>> From shade at redhat.com Tue Jan 29 10:30:41 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 29 Jan 2019 11:30:41 +0100 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: <7d85fc77-4e33-364f-3efe-a3757d6cfdbc@oracle.com> References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> <7d85fc77-4e33-364f-3efe-a3757d6cfdbc@oracle.com> Message-ID: On 1/29/19 6:13 AM, David Holmes wrote: > so by increasing the size of the block you are potentially greatly increasing the risk that you will > do something unsafe. Perhaps, as Thomas suggested, if you really want to expand this range it should > be done in a more safe manner. SafeFetch-based probing implemented here: http://cr.openjdk.java.net/~shade/8217879/webrev.03/ When you hack the code to pass pc=NULL, it would print: Instructions: (pc=0x0000000000000000) Memory is not readable > I'm also very much in favour of the slimmer hs_err file. At this rate of expansion the hs_err file > will look more like a core file, and we already have core files for that. ;-) hs_err files were > never intended to be a one-stop debugging shop. I don't know what your users submit on JVM crashes, but for us the availability of the core dump is an exception rather than the rule. Most of the time all we have is hs_err. Having a little bit more context does help figuring out what went wrong. 
-Aleksey From shade at redhat.com Tue Jan 29 10:34:24 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 29 Jan 2019 11:34:24 +0100 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: <4d84bfd0-02a0-58f8-b8f8-61104df26deb@oracle.com> References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> <29e645ac-983a-6d69-10a4-1bdd2e09f3f3@oracle.com> <4d84bfd0-02a0-58f8-b8f8-61104df26deb@oracle.com> Message-ID: <3e020446-c08c-e318-99a3-b078d76f7c47@redhat.com> On 1/28/19 9:59 PM, coleen.phillimore at oracle.com wrote: > This seems fine. > > I was looking at this page: > https://onlinedisassembler.com/odaweb/ > > What would be nice if the instructions looked like: > > Instructions: (pc=0x00007f911d0d6053, 0x00007f911d0d6143) > > and then just the hex dump, then I could cut/paste it into the window in that tool. Or is there > another way? That's not unreasonable, but you need to have the PC marked in some way to see where it crashed. Right now I can read the hex dump and figure what are the bytes at PC, and cross-reference that with disassembly. Yes, that requires some hand work to remove the offsets when dumping to disassembler. It also requires changing the os::print_hex_dump, which I am not very eager to do. In some sense, this should be the RFE against the disassembler itself, rather than our dumping code :) -Aleksey From erik.osterlund at oracle.com Tue Jan 29 10:38:35 2019 From: erik.osterlund at oracle.com (Erik Österlund) Date: Tue, 29 Jan 2019 11:38:35 +0100 Subject: 8216541: CompiledICHolders of VM locked unloaded nmethods are released too late In-Reply-To: <4195c9a1-e91c-1014-8fa5-2b0d3f6dfc30@oracle.com> References: <0345dd70-fa19-a7e3-7892-8f4500f8339f@oracle.com> <4195c9a1-e91c-1014-8fa5-2b0d3f6dfc30@oracle.com> Message-ID: <44b1aa28-81b0-6e9a-1b13-bb973bb5bd30@oracle.com> Hi Tobias, Thanks for having a look at this.
On 2019-01-29 09:16, Tobias Hartmann wrote: > Hi Erik, > > very nice analysis, thanks a lot for investigating! > > On 28.01.19 14:56, Erik Österlund wrote: >> http://cr.openjdk.java.net/~eosterlund/8216541/webrev.00/ > > Why did you remove the call to thread->set_scanned_compiled_method(NULL) in sweeper.cpp? Because the CompiledMethodMarker destructor already nulls this out, and redundantly nulling it out again offers no extra protection. The idea of nulling it out before calling flush seems to have been to prevent the GC scanning from seeing this flushed nmethod in a safepoint, accidentally resurrecting it from the dead. But that is already impossible, because flush() is called with a never safepoint checking lock (which guarantees we don't have any and can't add any safepoint checks while holding that lock or we will deadlock badly). Therefore such safepoints will happen strictly after the processing of the compiled method is finished, and it is already cleared the normal way. By removing that pointless clearing, I could get rid of the release_compiled_method() function and just call flush directly instead. I get confused by there being two "destroy" functions, one in the sweeper and one in the nmethod, so I wanted it gone. > >> The proposed change has survived 200 rounds of kitchensink, hs-tier1-3 and hs-precheckin-comp. > > In the meanwhile, could you please run some more 100x iterations of kitchensink? Sure, running some more as we speak.
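The redundancy argument made earlier in this message (the marker's destructor already clears the thread's "scanned compiled method" field, so clearing it by hand first adds nothing) can be illustrated with a small scope-guard sketch. This is only a model under stated assumptions: FakeThread and ScanMarker are hypothetical stand-ins invented for the illustration, not the HotSpot JavaThread or CompiledMethodMarker classes, and no locking or safepoint behaviour is modelled.

```cpp
#include <cassert>

// Hypothetical stand-in for the sweeper thread; only the one field matters here.
struct FakeThread {
  const void* scanned_compiled_method = nullptr;
};

// Hypothetical stand-in for a marker object: publishes the method being
// processed on construction and clears the field again on destruction.
class ScanMarker {
  FakeThread* _thread;

 public:
  ScanMarker(FakeThread* thread, const void* method) : _thread(thread) {
    _thread->scanned_compiled_method = method;
  }
  ~ScanMarker() { _thread->scanned_compiled_method = nullptr; }

  ScanMarker(const ScanMarker&) = delete;
  ScanMarker& operator=(const ScanMarker&) = delete;
};

// Returns true if the field was visible while the marker was alive and is
// null once the marker's scope has ended, without any explicit clearing.
bool marker_clears_on_scope_exit() {
  FakeThread thread;
  int fake_method = 0;  // any object address will do for the sketch
  bool seen_inside = false;
  {
    ScanMarker marker(&thread, &fake_method);
    seen_inside = (thread.scanned_compiled_method == &fake_method);
    // Processing (and, in the real code, the flush) would happen here.
  }
  return seen_inside && thread.scanned_compiled_method == nullptr;
}
```

In this model an extra "set to null" just before the closing brace would not change the observable state after the scope ends, which is the point being made about the removed call.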
Thanks, /Erik > > Thanks, > Tobias > From tobias.hartmann at oracle.com Tue Jan 29 10:40:34 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 29 Jan 2019 11:40:34 +0100 Subject: 8216541: CompiledICHolders of VM locked unloaded nmethods are released too late In-Reply-To: <44b1aa28-81b0-6e9a-1b13-bb973bb5bd30@oracle.com> References: <0345dd70-fa19-a7e3-7892-8f4500f8339f@oracle.com> <4195c9a1-e91c-1014-8fa5-2b0d3f6dfc30@oracle.com> <44b1aa28-81b0-6e9a-1b13-bb973bb5bd30@oracle.com> Message-ID: Hi Erik, okay, got it. Thanks for the details, your fix looks good to me! Best regards, Tobias On 29.01.19 11:38, Erik Österlund wrote: > Hi Tobias, > > Thanks for having a look at this. > > On 2019-01-29 09:16, Tobias Hartmann wrote: >> Hi Erik, >> >> very nice analysis, thanks a lot for investigating! >> >> On 28.01.19 14:56, Erik Österlund wrote: >>> http://cr.openjdk.java.net/~eosterlund/8216541/webrev.00/ >> >> Why did you remove the call to thread->set_scanned_compiled_method(NULL) in sweeper.cpp? > > Because the CompiledMethodMarker destructor already nulls this out, and redundantly nulling it out > again offers no extra protection. > > The idea of nulling it out before calling flush seems to have been to prevent the GC scanning from > seeing this flushed nmethod in a safepoint, accidentally resurrecting it from the dead. But that is > already impossible, because flush() is called with a never safepoint checking lock (which guarantees > we don't have any and can't add any safepoint checks while holding that lock or we will deadlock > badly). Therefore such safepoints will happen strictly after the processing of the compiled method > is finished, and it is already cleared the normal way. > > By removing that pointless clearing, I could get rid of the release_compiled_method() function and > just call flush directly instead. I get confused by there being two "destroy" functions, one in the > sweeper and one in the nmethod, so I wanted it gone.
> >> >>> The proposed change has survived 200 rounds of kitchensink, hs-tier1-3 and hs-precheckin-comp. >> >> In the meanwhile, could you please run some more 100x iterations of kitchensink? > > Sure, running some more as we speak. > > Thanks, > /Erik > >> >> Thanks, >> Tobias >> From stefan.karlsson at oracle.com Tue Jan 29 10:44:48 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 29 Jan 2019 11:44:48 +0100 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> Message-ID: <3bfbab0f-712e-1fe8-154b-13389e10c13f@oracle.com> Hi Aleksey, I like this proposal! I looked at the patch and have some comments: https://cr.openjdk.java.net/~shade/8217879/webrev.02/src/hotspot/share/runtime/os.cpp.udiff.html ------------------------------------------------------------------------------ Did you intend to use & instead of && here ?: + while (delta > 0 & pc > low && !is_readable_pointer(low)) { ------------------------------------------------------------------------------Regarding this comment: + // Safe probing relies on the conjecture that the entire pages are either + // readable or unreadable. Therefore, we cannot have delta larger than page + // size, since otherwise we can discover "far" page as readable, while "near" + // page is not readable. The code doesn't seem to prevent this. If 'pc' is somewhere at the beginning of a page, then 'low = pc - delta' could cross over to the previous page. ------------------------------------------------------------------------------ The probing of safe reads only happen at power-of-two offsets from 'pc'. Did you consider using binary search to find the exact break point where the reading fails? It's probably overkill for this feature though. 
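The power-of-two probing described in the comment above can be sketched in isolation as follows. This is a minimal model and not the code from the webrev: probe_low and ReadableFn are names made up for the illustration, the readability check is passed in as a plain predicate instead of a SafeFetch-style probe, and the clamp against wrapping below address zero is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical predicate standing in for a "can this address be read" probe.
using ReadableFn = std::function<bool(std::uintptr_t)>;

// Probes pc - delta, pc - delta/2, pc - delta/4, ... and returns the first
// probed address that is readable. Returns pc itself (unprobed) when none of
// the probed offsets was readable.
std::uintptr_t probe_low(std::uintptr_t pc, std::uintptr_t delta,
                         const ReadableFn& is_readable) {
  assert(static_cast<bool>(is_readable));
  if (delta > pc) {
    delta = pc;  // sketch-only guard: do not wrap around below address zero
  }
  while (delta > 0 && !is_readable(pc - delta)) {
    delta >>= 1;  // halve the window and probe closer to pc
  }
  return pc - delta;
}
```

With a pc that sits 250 bytes into a readable page whose predecessor is unmapped, and a starting delta of 256, the first probe lands in the unmapped page, the second one (128 bytes below pc) succeeds, and the dump would start there. A binary search between the failing and the succeeding offset would recover the remaining bytes at the cost of more probes.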
------------------------------------------------------------------------------ I see that: bool os::is_readable_range(const void* from, const void* to) { for (address p = align_down((address)from, min_page_size()); p < to; p += min_page_size()) { if (!is_readable_pointer(p)) { return false; } } return true; } is using min_page_size(), instead of os::vm_page_size(), which has this comment: // Return a lower bound for page sizes. Also works before os::init completed. static size_t min_page_size() { return 4 * K; } Is this something this code needs to consider as well? os::vm_page_size() can assert / return -1 if called too early. Thanks, StefanK On 2019-01-28 17:48, Aleksey Shipilev wrote: > RFE: > https://bugs.openjdk.java.net/browse/JDK-8217879 > > Fix: > http://cr.openjdk.java.net/~shade/8217879/webrev.01/ > > "Instructions" block is useful when following up on hs_errs that happened without the disassembler > attached, which is usually the case coming from users. One can use the disassembler [1] to look > around the code that was crashing, and get extended conjectures why the error happened, including > rewinding a bit of history. However, current window is sometimes too small to infer enough context. > I propose we extend it! > > The patch also commons the paths across OS/Arch-specific files to that current "delta" appears less > of the magic number. Plus, it adds cr()-s for consistency across the arches and within the methods. > > Testing: eyeballing hs_errs from artificial crashes, Linux x86_64 build, jdk-submit > > Thanks, > -Aleksey > > [1] I use https://onlinedisassembler.com, for example. 
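The page-stepping check quoted in this message (os::is_readable_range walking the range in min_page_size() steps) can be modelled outside the VM roughly as below. This is a sketch under assumptions: kMinPageSize, page_align_down and range_is_readable are names invented for the illustration, the 4 KiB constant mirrors the "4 * K" lower bound quoted above, the per-page probe is a caller-supplied predicate, and the loop assumes 'to' is not within one page of the top of the address space.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Assumed lower bound for the page size, mirroring the quoted "4 * K".
constexpr std::uintptr_t kMinPageSize = 4 * 1024;

// Rounds an address down to the start of its (minimum-sized) page.
inline std::uintptr_t page_align_down(std::uintptr_t addr) {
  return addr & ~(kMinPageSize - 1);
}

// Returns true if every page overlapping [from, to) passes the probe.
// One probe per page is enough under the conjecture that a page is either
// entirely readable or entirely unreadable.
bool range_is_readable(std::uintptr_t from, std::uintptr_t to,
                       const std::function<bool(std::uintptr_t)>& page_readable) {
  assert(static_cast<bool>(page_readable));
  for (std::uintptr_t p = page_align_down(from); p < to; p += kMinPageSize) {
    if (!page_readable(p)) {
      return false;
    }
  }
  return true;
}
```

Stepping by a lower bound of the page size can probe a large page more than once, but it cannot skip a page, which is why a conservative constant is acceptable for this kind of check.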
> From thomas.schatzl at oracle.com Tue Jan 29 10:43:00 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 29 Jan 2019 11:43:00 +0100 Subject: RFR: 8210832: Remove sneaky locking in class Monitor In-Reply-To: <2e011b1d-afb2-c2ab-d083-efe74bd27b27@oracle.com> References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> <2e011b1d-afb2-c2ab-d083-efe74bd27b27@oracle.com> Message-ID: <75b87cc51462f03000e92c7b258e8e1f576676ce.camel@oracle.com> Hi, On Tue, 2019-01-29 at 19:53 +1000, David Holmes wrote: > Hi Per, > > If I may jump in on one thing you suggest ... destructors. Do we > ever actually destroy Mutex or Monitor instances? There are inherent It would be really useful to keep being able to do this: I have been on-and-off working on creating remembered sets for regions on demand (where currently for each region's remembered set there is a Mutex associated to it, allocated at startup at the moment), and free them when no longer necessary. Doing so improves memory footprint to avoid having a big "empty" remembered set around that's actually never used for many regions, particularly at startup. More flexibility with assigning remembered sets also allows other changes that improve overall gc performance quite a bit. > races that can make it very dangerous to try and actually delete the > low-level PlatformMonitor and destroy the pthread_mutex or > pthread_cond, and even release the memory used. 
The related Thanks, Thomas From shade at redhat.com Tue Jan 29 10:54:00 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 29 Jan 2019 11:54:00 +0100 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: <3bfbab0f-712e-1fe8-154b-13389e10c13f@oracle.com> References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> <3bfbab0f-712e-1fe8-154b-13389e10c13f@oracle.com> Message-ID: <0e9a7b25-7c36-1091-27a2-74bd4676ed12@redhat.com> On 1/29/19 11:44 AM, Stefan Karlsson wrote: > ------------------------------------------------------------------------------ > Did you intend to use & instead of && here ?: > > + while (delta > 0 & pc > low && !is_readable_pointer(low)) { Yes! Fixed. > ------------------------------------------------------------------------------Regarding this comment: > > + // Safe probing relies on the conjecture that the entire pages are either + // readable or > unreadable. Therefore, we cannot have delta larger than page + // size, since otherwise we can > discover "far" page as readable, while "near" + // page is not readable. > > The code doesn't seem to prevent this. If 'pc' is somewhere at the beginning of a page, then > > 'low = pc - delta' could cross over to the previous page. Which is exactly what we want. We want to probe that "near" page. If it ends ups being unreadable, delta adjustment should roll "low" up. If it is readable, we would dump the piece of it. Probably the comment is confusing: we "only" need to make sure there are no "un-probed" full pages within the probing range. Updated the comment. > ------------------------------------------------------------------------------ The probing of safe > reads only happen at power-of-two offsets from 'pc'. Did you consider using binary search to find > the exact break point where the reading fails? It's probably overkill for this feature though. Overkill, IMO. 
> ------------------------------------------------------------------------------ I see that: bool > os::is_readable_range(const void* from, const void* to) { for (address p = align_down((address)from, > min_page_size()); p < to; p += min_page_size()) { if (!is_readable_pointer(p)) { return false; } } > return true; } is using min_page_size(), instead of os::vm_page_size(), which has this comment: // > Return a lower bound for page sizes. Also works before os::init completed. static size_t > min_page_size() { return 4 * K; } Is this something this code needs to consider as well? > os::vm_page_size() can assert / return -1 if called too early. Yes, os::min_page_size() is safer here, changed. New webrev: http://cr.openjdk.java.net/~shade/8217879/webrev.04 Thanks, -Aleksey From thomas.stuefe at gmail.com Tue Jan 29 10:57:25 2019 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 29 Jan 2019 11:57:25 +0100 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> <7d85fc77-4e33-364f-3efe-a3757d6cfdbc@oracle.com> Message-ID: On Tue, Jan 29, 2019 at 11:31 AM Aleksey Shipilev wrote: > On 1/29/19 6:13 AM, David Holmes wrote: > > so by increasing the size of the block you are potentially greatly > increasing the risk that you will > > do something unsafe. Perhaps, as Thomas suggested, if you really want to > expand this range it should > > be done in a more safe manner. > > SafeFetch-based probing implemented here: > http://cr.openjdk.java.net/~shade/8217879/webrev.03/ > > When you hack the code to pass pc=NULL, it would print: > > Instructions: (pc=0x0000000000000000) > Memory is not readable > I would bail out right away for bogus/unreadable pc since those may happen more often than not. Is this exponential stepping really necessary? You run a risk of losing information, e.g. for a pc 250 bytes into a readable page preceded by an unmapped one.
This coding runs during error reporting, so no big speed concerns I think. Rest is fine. > > I'm also very much in favour of the slimmer hs_err file. At this rate of > expansion the hs_err file > > will look more like a core file, and we already have core files for > that. ;-) hs_err files were > > never intended to be a one-stop debugging shop. > > I don't know what your users submit on JVM crashes, but for us the > availability of the core dump is > an exception rather than the rule. Most of the time all we have is hs_err. > Having a little bit more > context does help figuring out what went wrong. > +1 ! ..thomas > > -Aleksey > > From shade at redhat.com Tue Jan 29 11:15:24 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 29 Jan 2019 12:15:24 +0100 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> <7d85fc77-4e33-364f-3efe-a3757d6cfdbc@oracle.com> Message-ID: On 1/29/19 11:57 AM, Thomas Stüfe wrote: > On Tue, Jan 29, 2019 at 11:31 AM Aleksey Shipilev > wrote: > I would bail out right away for bogus/unreadable pc since those may happen more often than not. I like the current code, because it exercises the "adjustment" part often, implicitly verifying it. > Is this exponential stepping really necessary? You run a risk of losing information, e.g. for a pc > 250 bytes into a readable page preceded by an unmapped one. This coding runs during error > reporting, so no big speed concerns I think. Yes, that would print "only" -128 bytes from that page; still much better than nothing. I did exponential stepping because delivering hundreds of signals via SafeFetch is probably not a good practice for both performance and debugging. I'd hate to "gdb cont" a hundred times :) The ideal way out would be to have os::print_hex_dump that is page-aware and that can probe once per page.
But there is a bigger fish to fry, and this seems good for the overwhelming majority of cases, and universally safe. -Aleksey From matthias.baesken at sap.com Tue Jan 29 11:22:45 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Tue, 29 Jan 2019 11:22:45 +0000 Subject: RFR : 8217786: Provide virtualization related info in the hs_error file on linux s390x In-Reply-To: References: <816bec8b-b9ca-f0b8-f72d-be6ede83d63b@oracle.com> <73d4509f-09b7-e630-2568-bae0395f6b8d@oracle.com> Message-ID: Hello here is a 2nd webrev : http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.2/ * Introduced static bool print_matching_lines_from_sysinfo_file(outputStream* st, const char* keywords_to_match[]) * Moved call to Linux-only print_os_info Best regards, Matthias From: Thomas Stüfe Sent: Tuesday, 29 January 2019 09:23 To: Baesken, Matthias ; David Holmes Cc: hotspot-dev at openjdk.java.net Subject: Re: RFR : 8217786: Provide virtualization related info in the hs_error file on linux s390x I'm still unhappy with that solution, since we have fanned out this coding for all architectures into the architecture independent os_linux.cpp. A generic "Show matching lines from given file" would be a better (slimmer, better reusable) solution IMHO. Side note: Could you please exchange strstr() .. with strncmp() since you require the start of the string to match. So no reason to parse the whole line if the start does not match. Cheers, Thomas On Tue, Jan 29, 2019 at 9:03 AM Baesken, Matthias > wrote: > > No I was thinking more about just adding the virtualization info to an > existing step like print_os_info or print_cpu_info. > Hi David , print_cpu_info does not sound like a great fit . Some info like LPAR Number: 14 LPAR Characteristics: Shared LPAR Name: VM12 Does not really belong there . print_os_info looks better , it already contains "container_info" on Linux, so I think this might fit .
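The suggestion above (a generic "show matching lines" helper that compares only the start of each line with strncmp() instead of searching the whole line with strstr()) might look roughly like this. It is a sketch, not the function from the webrev: print_matching_lines and filter_text are hypothetical names, standard streams stand in for the file and for HotSpot's outputStream, and the sample text in the usage note is illustrative input rather than real /proc/sysinfo output.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Copies to 'out' every line of 'in' that begins with one of 'keywords'.
// Returns the number of lines copied.
int print_matching_lines(std::istream& in, std::ostream& out,
                         const std::vector<const char*>& keywords) {
  int printed = 0;
  std::string line;
  while (std::getline(in, line)) {
    for (const char* keyword : keywords) {
      // strncmp stops at the first mismatch, so a non-matching line is
      // rejected after looking at only its first few characters.
      if (std::strncmp(line.c_str(), keyword, std::strlen(keyword)) == 0) {
        out << line << '\n';
        ++printed;
        break;  // print each line at most once
      }
    }
  }
  return printed;
}

// Convenience wrapper for trying the helper on an in-memory text.
std::string filter_text(const std::string& text,
                        const std::vector<const char*>& keywords) {
  std::istringstream in(text);
  std::ostringstream out;
  print_matching_lines(in, out, keywords);
  return out.str();
}
```

Called as filter_text(text, {"LPAR", "VM"}), a line such as "LPAR Number: 14" is kept, while an indented "  LPAR Name: VM12" is dropped because the keyword is not at the start of the line; a strstr()-based search would have kept both.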
Best regards, Matthias > -----Original Message----- > From: David Holmes > > Sent: Dienstag, 29. Januar 2019 05:17 > To: Baesken, Matthias >; 'hotspot- > dev at openjdk.java.net' > > Subject: Re: RFR : 8217786: Provide virtualization related info in the hs_error > file on linux s390x > > On 28/01/2019 10:23 pm, Baesken, Matthias wrote: > >> > >> Can't you include this information in an existing section of the error > >> processing code instead of adding a new function that is empty > >> everywhere except Linux? > >> > > > > Hi David , do you mean something like > > > > > > #if defined(S390) > > > > STEP("printing virtualization info") > > ... > > > > #endif > > No I was thinking more about just adding the virtualization info to an > existing step like print_os_info or print_cpu_info. > > Cheers, > David > ----- > > > in vmError.cpp ? > > > > I thought about doing this. > > > > > > But on the other hand , the now still empty > os::pd_print_virtualization_info in platforms != linux > > might fill over time ( we could add [at least for some platforms] other > virtualization related info ). > > > > > > Best regards, Matthias > > > > > >> -----Original Message----- > >> From: David Holmes > > >> Sent: Montag, 28. Januar 2019 12:35 > >> To: Baesken, Matthias >; 'hotspot- > >> dev at openjdk.java.net' > > >> Subject: Re: RFR : 8217786: Provide virtualization related info in the > hs_error > >> file on linux s390x > >> > >> Hi Matthias, > >> > >> On 28/01/2019 6:48 pm, Baesken, Matthias wrote: > >>> Hello, please review this change ; it adds virtualization related info in > the > >> hs_error file on linux s390x . > >> > >> Can't you include this information in an existing section of the error > >> processing code instead of adding a new function that is empty > >> everywhere except Linux? > >> > >> Thanks, > >> David > >> > >>> On linux s390x, we usually (always?) run in virtualized environments > >> (LPAR and/or z/VM / KVM ). 
> >>> > >>> It is helpful for instance in support cases to get some information about > the > >> virtualized environment in the hs_error file . > >>> A lot of info can be taken from the /proc/sysinfo file on linux s390x . > >>> > >>> > >>> Bug/webrev : > >>> > >>> https://bugs.openjdk.java.net/browse/JDK-8217786 > >>> > >>> > >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.1/ > >>> > >>> > >>> > >>> Best regards, Matthias > >>> From stefan.karlsson at oracle.com Tue Jan 29 11:21:36 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 29 Jan 2019 12:21:36 +0100 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: <0e9a7b25-7c36-1091-27a2-74bd4676ed12@redhat.com> References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> <3bfbab0f-712e-1fe8-154b-13389e10c13f@oracle.com> <0e9a7b25-7c36-1091-27a2-74bd4676ed12@redhat.com> Message-ID: On 2019-01-29 11:54, Aleksey Shipilev wrote: > On 1/29/19 11:44 AM, Stefan Karlsson wrote: >> ------------------------------------------------------------------------------ >> Did you intend to use & instead of && here ?: >> >> + while (delta > 0 & pc > low && !is_readable_pointer(low)) { > Yes! Fixed. > >> ------------------------------------------------------------------------------Regarding this comment: >> >> + // Safe probing relies on the conjecture that the entire pages are either + // readable or >> unreadable. Therefore, we cannot have delta larger than page + // size, since otherwise we can >> discover "far" page as readable, while "near" + // page is not readable. >> >> The code doesn't seem to prevent this. If 'pc' is somewhere at the beginning of a page, then >> >> 'low = pc - delta' could cross over to the previous page. > Which is exactly what we want. We want to probe that "near" page. If it ends ups being unreadable, > delta adjustment should roll "low" up. If it is readable, we would dump the piece of it. 
> Probably the comment is confusing: we "only" need to make sure there are no "un-probed" full pages within the
> probing range. Updated the comment.

(not sure if this is going to get garbled, like my previous mail)

For this to work as you intend it, max_delta must not be set to page size. If max_delta is set to page size, the following would end up with low in Page 0 and high in Page 2:

+--------+
| Page 0 |   readable
+--------+ < 'pc' beginning of Page 1
| Page 1 |   unreadable
+--------+
| Page 2 |   readable
+--------+

Therefore it seems a bit off to cap with the page size in:

  MIN2(256, min_page_size());

Given that you use 256 today, this isn't really a problem, but maybe future proof this a bit and use min_page_size / 2 to guarantee that either low or high ends up in Page 1?

Thanks,
StefanK

>
>> ------------------------------------------------------------------------------
>> The probing of safe reads only happens at power-of-two offsets from 'pc'. Did you consider using binary search to find
>> the exact break point where the reading fails? It's probably overkill for this feature though.
> Overkill, IMO.
>
>> ------------------------------------------------------------------------------
>> I see that:
>>
>> bool os::is_readable_range(const void* from, const void* to) {
>>   for (address p = align_down((address)from, min_page_size()); p < to; p += min_page_size()) {
>>     if (!is_readable_pointer(p)) {
>>       return false;
>>     }
>>   }
>>   return true;
>> }
>>
>> is using min_page_size(), instead of os::vm_page_size(), which has this comment:
>>
>> // Return a lower bound for page sizes. Also works before os::init completed.
>> static size_t min_page_size() { return 4 * K; }
>>
>> Is this something this code needs to consider as well?
>> os::vm_page_size() can assert / return -1 if called too early.
> Yes, os::min_page_size() is safer here, changed.
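To make the probing discussion above a bit more concrete, here is a small stand-alone model of the delta-halving idea. It is only a sketch under stated assumptions: readability is simulated with a set of page numbers instead of SafeFetch, the range is treated as half-open [low, high), pc is assumed to be larger than max_delta, and all names (probe_dump_range, page_oracle, capped_delta, kMinPageSize) are invented for this example rather than taken from the webrev.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <set>
#include <utility>

using Addr = std::uintptr_t;
constexpr std::size_t kMinPageSize = 4096;  // assumed lower bound for page sizes

struct DumpRange { Addr low; Addr high; };  // half-open: [low, high)

// Simulated stand-in for is_readable_pointer(): an address is readable
// when its page number is in the given set.
std::function<bool(Addr)> page_oracle(std::set<Addr> readable_pages) {
  return [pages = std::move(readable_pages)](Addr a) {
    return pages.count(a / kMinPageSize) > 0;
  };
}

// Cap suggested in the thread: at most half a page, so that low and high
// cannot both land outside the page that contains pc.
inline std::size_t capped_delta() {
  return std::min<std::size_t>(256, kMinPageSize / 2);
}

// Starts max_delta bytes away from pc on each side and halves the distance
// until the probed end is readable (or the distance reaches zero).
DumpRange probe_dump_range(Addr pc, std::size_t max_delta,
                           const std::function<bool(Addr)>& readable) {
  std::size_t delta = max_delta;
  Addr low = pc - delta;
  while (delta > 0 && !readable(low)) {
    delta >>= 1;
    low = pc - delta;
  }
  delta = max_delta;
  Addr high = pc + delta;
  while (delta > 0 && !readable(high)) {
    delta >>= 1;
    high = pc + delta;
  }
  return {low, high};
}
```

In this model, a pc 250 bytes into a readable page that follows an unmapped one yields low = pc - 128, which is the "-128 bytes" case mentioned earlier in the thread. Passing a full page as max_delta with pc at the start of an unreadable page between two readable ones returns a range that spans the unreadable page, which is the corner case the /2 cap is meant to rule out.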
>
> New webrev:
> http://cr.openjdk.java.net/~shade/8217879/webrev.04
>
>
> Thanks,
> -Aleksey
>

From shade at redhat.com Tue Jan 29 11:36:11 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 29 Jan 2019 12:36:11 +0100
Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump
In-Reply-To: 
References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> <3bfbab0f-712e-1fe8-154b-13389e10c13f@oracle.com> <0e9a7b25-7c36-1091-27a2-74bd4676ed12@redhat.com>
Message-ID: 

On 1/29/19 12:21 PM, Stefan Karlsson wrote:
> For this to work as you intend it, max_delta must not be set to page size. If max_delta is set to
> page size, the following would end up with low in Page 0 and high in Page 2:
>
> +--------+
> | Page 0 |   readable
> +--------+ < 'pc' beginning of Page 1
> | Page 1 |   unreadable
> +--------+
> | Page 2 |   readable
> +--------+
>
> Therefore it seems a bit off to cap with the page size in:
> MIN2(256, min_page_size());
>
> Given that you use 256 today, this isn't really a problem, but maybe future proof this a bit and use
> min_page_size / 2 to guarantee that either low or high ends up in Page 1?

Right, thanks. To handle that corner case, let's do /2 indeed, updated webrev in-place.

-Aleksey

From david.holmes at oracle.com Tue Jan 29 12:13:02 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 29 Jan 2019 22:13:02 +1000
Subject: RFR: 8210832: Remove sneaky locking in class Monitor
In-Reply-To: <26e4d98a-62c5-35a8-9cf2-a1d9618b453b@oracle.com>
References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> <2e011b1d-afb2-c2ab-d083-efe74bd27b27@oracle.com> <26e4d98a-62c5-35a8-9cf2-a1d9618b453b@oracle.com>
Message-ID: <21edd9b0-bce2-732f-6550-6439ba03348d@oracle.com>

On 29/01/2019 8:26 pm, Per Liden wrote:
> On 01/29/2019 10:53 AM, David Holmes wrote:
>> Hi Per,
>>
>> If I may jump in on one thing you suggest ... destructors.
>> Do we ever actually destroy Mutex or Monitor instances? There are inherent races
>> that can make it very dangerous to try and actually delete the
>> low-level PlatformMonitor and destroy the pthread_mutex or
>> pthread_cond, and even release the memory used. The related
>> PlatformEvent and PlatformParker are expected to be immortal and I
>> think that is the same for PlatformMonitor.
>
> There are examples of where we do destroy Mutex instances today (like
> JVM_RawMonitorDestroy), and I don't think that's something the API
> should forbid.

RawMonitors are used very rarely and I would not have much confidence that their destruction is actually done in a safe manner.

>
> As Robbin hinted, I'm hoping ZLock (which is just a plain pthread_mutex)
> in ZGC can be converted to be a plain mutex using PlatformMutex in the
> future. ZLocks require dynamic creation/destruction to be supported as
> they are e.g. attached to nmethods which can come and go.

I think this is something that requires some detailed consideration. You need to have very robust lifecycle management to be able to delete the low-level implementation objects safely. (It's not sufficient for example to test the ability to destroy a pthread_mutex by checking if you can first acquire it (and release it again) then destroy it and deallocate it, as the previous owner could still be executing inside pthread_mutex_unlock!).

In the context of the current changes though we do have an inconsistency between the high-level Mutex/Monitor and the PlatformMonitor. This was an oversight on my part when doing some earlier work on this.

David
-----

>>
>> Aside: I don't think distinct PlatformMutex and PlatformMonitor is
>> worth the effort unless we also rework the Mutex/Monitor relationship
>> as well.
>
> I completely agree, Mutex/Monitor would also need to be reworked a bit.
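As a rough illustration of the PlatformMutex/PlatformMonitor split and the destructor question discussed in this thread, a POSIX-only sketch might look like the following. This is not the code from the webrev: the method bodies, the use of long instead of jlong, the "0 means wait forever" convention and the return values of wait() are all assumptions made for this example.

```cpp
#include <pthread.h>
#include <cerrno>
#include <ctime>

// Plain mutex without the baggage of a condition variable.
class PlatformMutex {
 protected:
  pthread_mutex_t _mutex;

 public:
  PlatformMutex() { pthread_mutex_init(&_mutex, nullptr); }
  // The caller must guarantee that no thread still uses the mutex,
  // including a previous owner that may still be inside unlock().
  ~PlatformMutex() { pthread_mutex_destroy(&_mutex); }

  PlatformMutex(const PlatformMutex&) = delete;
  PlatformMutex& operator=(const PlatformMutex&) = delete;

  void lock()     { pthread_mutex_lock(&_mutex); }
  void unlock()   { pthread_mutex_unlock(&_mutex); }
  bool try_lock() { return pthread_mutex_trylock(&_mutex) == 0; }
};

// Mutex plus condition variable.
class PlatformMonitor : public PlatformMutex {
 private:
  pthread_cond_t _cond;

 public:
  PlatformMonitor() { pthread_cond_init(&_cond, nullptr); }
  ~PlatformMonitor() { pthread_cond_destroy(&_cond); }

  // Assumed contract: the caller holds the lock; millis <= 0 waits without
  // a timeout; returns 0 on wakeup or ETIMEDOUT when the timeout expired.
  int wait(long millis) {
    if (millis <= 0) {
      return pthread_cond_wait(&_cond, &_mutex);
    }
    timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);  // default condvar clock
    ts.tv_sec += millis / 1000;
    ts.tv_nsec += (millis % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {
      ts.tv_sec += 1;
      ts.tv_nsec -= 1000000000L;
    }
    return pthread_cond_timedwait(&_cond, &_mutex, &ts);
  }

  void notify()     { pthread_cond_signal(&_cond); }
  void notify_all() { pthread_cond_broadcast(&_cond); }
};
```

The destructors here only pair init with destroy; they do nothing to address the lifecycle concern raised above, so whoever deletes such an object still has to make sure by other means that nobody is using it concurrently.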
>
> cheers,
> Per
>
>>
>> Cheers,
>> David
>>
>> On 29/01/2019 7:22 pm, Per Liden wrote:
>>> Hi Patricio,
>>>
>>> On 01/28/2019 08:18 PM, Patricio Chilano wrote:
>>>> Hi Robbin,
>>>>
>>>> Thanks for reviewing this! Removing the block_in_safepoint_check
>>>> thread local attribute is a great idea, here is v02:
>>>>
>>>> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/webrev
>>>
>>> I really like that we're ditching our old locking code in favor of
>>> using pthread_mutex, et al. Nice work!
>>>
>>>
>>> General comment
>>> ----------------
>>> I think Mutex should be a plain mutex and not come with the baggage of
>>> having a condition variable. With this new code, it seems we're in
>>> a really good position to make that happen. I.e. something like this:
>>>
>>> class PlatformMutex {
>>> protected:
>>>    pthread_mutex_t _mutex;
>>>
>>> public:
>>>    PlatformMutex();
>>>    ~PlatformMutex();
>>>
>>>    void lock();
>>>    void unlock();
>>>    bool try_lock();
>>> };
>>>
>>> class PlatformMonitor : public PlatformMutex {
>>> private:
>>>    pthread_cond_t _cond;
>>>
>>> public:
>>>    PlatformMonitor();
>>>    ~PlatformMonitor();
>>>
>>>    int wait(jlong millis);
>>>    void notify();
>>>    void notify_all();
>>> };
>>>
>>> It might be that we want to do that as a separate step later instead
>>> of including it in this patch. But I think we should try to get there.
>>>
>>>
>>> src/hotspot/os/posix/os_*.[ch]pp
>>> ---------------------------------
>>> * I'd suggest that we place the PlatformMonitor class in a separate
>>> file (like src/hotspot/os/posix/monitor_posix.cpp), just like we have
>>> done with Semaphore (in src/hotspot/os/posix/semaphore_posix.cpp).
>>>
>>>
>>> src/hotspot/os/posix/os_posix.hpp
>>> src/hotspot/os/solaris/os_solaris.hpp
>>> src/hotspot/os/windows/os_windows.hpp
>>> -------------------------------------
>>> * Please make _mutex/_cond plain variables, instead of arrays of 1.
>>> That's just ugly ;)
>>>
>>>
>>> src/hotspot/os/posix/os_posix.cpp
>>> ---------------------------------
>>> * Destructor missing, to call pthread_(mutex|cond)_destroy().
>>>
>>>
>>> src/hotspot/os/solaris/os_solaris.hpp
>>> -------------------------------------
>>> * Not sure if there's a good reason to have the constructor be
>>> inlined here. I'd suggest moving it to the cpp file.
>>>
>>> * Destructor missing.
>>>
>>>
>>> src/hotspot/os/windows/os_windows.cpp
>>> -------------------------------------
>>> * Destructor missing (I'm not too familiar with the windows API but I
>>> assume there's a destroy function we should call here).
>>>
>>>
>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp
>>> -----------------------------------------------------
>>> * Move "private:" above monitor_adr;
>>>
>>>   289 class ThreadLockBlockInVM : public ThreadStateTransition {
>>>   290   Monitor** monitor_adr;
>>>   291  private:
>>>   292   void do_preempted(Monitor** in_flight_monitor_adr) {
>>>
>>> * monitor_adr should be _monitor_adr, or maybe even
>>> _in_flight_monitor_adr to better match the name of the argument.
>>>
>>>
>>> cheers,
>>> Per
>>>
>>>> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/inc/webrev/
>>>>
>>>> Running mach5 again.
>>>>
>>>> Thanks,
>>>> Patricio
>>>>
>>>> On 1/28/19 8:31 AM, Robbin Ehn wrote:
>>>>> Hi Patricio,
>>>>>
>>>>> Mostly looks good!
>>>>>
>>>>> block_at_safepoint is always called with block_in_safepoint_check = true. (correct?)
>>>>> Changing that to a local state instead of global simplifies the code.
>>>>>
>>>>> So I'm suggesting something like below.
>>>>>
>>>>> Thanks, Robbin
>>>>>
>>>>> diff -r e65cc445234c src/hotspot/share/runtime/interfaceSupport.inline.hpp
>>>>> --- a/src/hotspot/share/runtime/interfaceSupport.inline.hpp    Mon Jan 28 13:10:15 2019 +0100
>>>>> +++ b/src/hotspot/share/runtime/interfaceSupport.inline.hpp    Mon Jan 28 14:10:59 2019 +0100
>>>>> @@ -308,2 +308,1 @@
>>>>> -    thread->block_in_safepoint_check = false;
>>>>> -    SafepointMechanism::block_at_safepoint(thread);
>>>>> +    SafepointMechanism::callback_if_safepoint(thread);
>>>>> @@ -323,2 +322,1 @@
>>>>> -      SafepointMechanism::block_at_safepoint(_thread);
>>>>> -      _thread->block_in_safepoint_check = true;
>>>>> +      SafepointMechanism::callback_if_safepoint(_thread);
>>>>> @@ -335,2 +332,0 @@
>>>>> -    } else {
>>>>> -      _thread->block_in_safepoint_check = true;
>>>>> @@ -337,0 +334,1 @@
>>>>> +
>>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.cpp
>>>>> --- a/src/hotspot/share/runtime/safepoint.cpp    Mon Jan 28 13:10:15 2019 +0100
>>>>> +++ b/src/hotspot/share/runtime/safepoint.cpp    Mon Jan 28 14:10:59 2019 +0100
>>>>> @@ -795,1 +795,1 @@
>>>>> -void SafepointSynchronize::block(JavaThread *thread) {
>>>>> +void SafepointSynchronize::block(JavaThread *thread, bool block_in_safepoint_check) {
>>>>> @@ -850,1 +850,1 @@
>>>>> -      if (thread->block_in_safepoint_check) {
>>>>> +      if (block_in_safepoint_check) {
>>>>> @@ -880,1 +880,1 @@
>>>>> -          thread->block_in_safepoint_check) {
>>>>> +          block_in_safepoint_check) {
>>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.hpp
>>>>> --- a/src/hotspot/share/runtime/safepoint.hpp    Mon Jan 28 13:10:15 2019 +0100
>>>>> +++ b/src/hotspot/share/runtime/safepoint.hpp    Mon Jan 28 14:10:59 2019 +0100
>>>>> @@ -146,1 +146,1 @@
>>>>> -  static void   block(JavaThread *thread);
>>>>> +  static void   block(JavaThread *thread, bool block_in_safepoint_check = true);
>>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepointMechanism.hpp
>>>>> --- a/src/hotspot/share/runtime/safepointMechanism.hpp    Mon Jan 28 13:10:15 2019 +0100
>>>>> +++ b/src/hotspot/share/runtime/safepointMechanism.hpp    Mon Jan 28 14:10:59 2019 +0100
>>>>> @@ -82,1 +82,1 @@
>>>>> -  static inline void block_at_safepoint(JavaThread* thread);
>>>>> +  static inline void callback_if_safepoint(JavaThread* thread);
>>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepointMechanism.inline.hpp
>>>>> --- a/src/hotspot/share/runtime/safepointMechanism.inline.hpp    Mon Jan 28 13:10:15 2019 +0100
>>>>> +++ b/src/hotspot/share/runtime/safepointMechanism.inline.hpp    Mon Jan 28 14:10:59 2019 +0100
>>>>> @@ -82,1 +82,1 @@
>>>>> -void SafepointMechanism::block_at_safepoint(JavaThread* thread) {
>>>>> +void SafepointMechanism::callback_if_safepoint(JavaThread* thread) {
>>>>> @@ -84,1 +84,1 @@
>>>>> -    SafepointSynchronize::block(thread);
>>>>> +    SafepointSynchronize::block(thread, false);
>>>>> diff -r e65cc445234c src/hotspot/share/runtime/thread.cpp
>>>>> --- a/src/hotspot/share/runtime/thread.cpp    Mon Jan 28 13:10:15 2019 +0100
>>>>> +++ b/src/hotspot/share/runtime/thread.cpp    Mon Jan 28 14:10:59 2019 +0100
>>>>> @@ -298,2 +297,0 @@
>>>>> -  block_in_safepoint_check = true;
>>>>> -
>>>>> diff -r e65cc445234c src/hotspot/share/runtime/thread.hpp
>>>>> --- a/src/hotspot/share/runtime/thread.hpp    Mon Jan 28 13:10:15 2019 +0100
>>>>> +++ b/src/hotspot/share/runtime/thread.hpp    Mon Jan 28 14:10:59 2019 +0100
>>>>> @@ -788,2 +787,0 @@
>>>>> -  bool block_in_safepoint_check;              // to decide whether to block in SS::block or not
>>>>> -
>>>>>
>>>>>
>>>>> On 1/28/19 9:42 AM, Patricio Chilano wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Please review the following patch:
>>>>>>
>>>>>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8210832
>>>>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8210832/v01/webrev/
>>>>>>
>>>>>> The current implementation of native monitors uses a technique
>>>>>> that we name "sneaky locking" to prevent possible deadlocks of the
>>>>>> JVM during safepoints.
>>>>>> The implementation of this technique though
>>>>>> introduces a race when a monitor is shared between the VMThread
>>>>>> and non-JavaThreads. This patch aims to solve that problem and at
>>>>>> the same time simplify the code.
>>>>>>
>>>>>> The proposal is based on the introduction of the new class
>>>>>> PlatformMonitor, which serves as a wrapper for the actual
>>>>>> synchronization primitives in each platform (mutexes and condition
>>>>>> variables). Most of the API calls can thus be implemented as
>>>>>> simple wrappers around PlatformMonitor, adding more assertions and
>>>>>> very little extra metadata.
>>>>>> To be able to remove the lock sneaking code and at the same time
>>>>>> avoid deadlocking scenarios, we combine two techniques:
>>>>>>
>>>>>> - When a JavaThread that has just acquired the lock, detects there
>>>>>> is a safepoint request in the ThreadLockBlockInVM destructor, it
>>>>>> releases the lock before blocking at the safepoint. After resuming
>>>>>> from it, the JavaThread will have to acquire the lock again.
>>>>>>
>>>>>> - In the ThreadLockBlockInVM constructor for the Monitor::wait()
>>>>>> method, in order to avoid blocking we allow for a possible
>>>>>> safepoint request to make progress but without letting the
>>>>>> JavaThread block for it (since we would be stopped by the
>>>>>> destructor anyways). We also do that for the Monitor::lock() case
>>>>>> although no deadlock is being prevented there.
>>>>>>
>>>>>> The ThreadLockBlockInVM jacket is a new ThreadStateTransition
>>>>>> class used instead of the ThreadBlockInVM one. This allowed more
>>>>>> flexibility to handle the two techniques mentioned above. Also,
>>>>>> ThreadBlockInVM calls SafepointMechanism::block_if_requested()
>>>>>> which creates some problems when trying to allow safepoints to
>>>>>> continue without stopping, since that method not only checks for
>>>>>> safepoints but also processes handshakes.
>>>>>> >>>>>> In terms of performance, benchmarks show very similar results to >>>>>> what we have now. >>>>>> >>>>>> So far mach5 hs-tier1-6 on Linux, OS X, Windows and Solaris have >>>>>> been tested. >>>>>> >>>>>> Thanks, >>>>>> Patricio >>>>>> >>>> From erik.osterlund at oracle.com Tue Jan 29 13:18:23 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Tue, 29 Jan 2019 14:18:23 +0100 Subject: RFR: 8210832: Remove sneaky locking in class Monitor In-Reply-To: <21edd9b0-bce2-732f-6550-6439ba03348d@oracle.com> References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> <2e011b1d-afb2-c2ab-d083-efe74bd27b27@oracle.com> <26e4d98a-62c5-35a8-9cf2-a1d9618b453b@oracle.com> <21edd9b0-bce2-732f-6550-6439ba03348d@oracle.com> Message-ID: <9bbf5ba9-ac9e-8b1b-8f57-4bd4548a00ff@oracle.com> Hi David, I don't think we can get away with not deleting monitors. Because we do that all over the place. Another example of deleted monitor is the SR_lock. We have one per thread, and delete them when the thread is deleted. SMR makes sure it is deleted when nobody is using it any longer. All sane. The extra_data_lock() of MDOs belongs to MethodData objects, and is deleted when the MethodData is deleted (at which point nobody can be using it any longer). All sane. There is plenty more. I think it's up to the user of the API to make sure locks are deleted only when they are not being used concurrently. We have plenty of tools to ensure this today (GlobalCounter, thread-local handshakes, safepoints, hazard pointers, outer locks, etc) Perhaps a suitable assert could catch errors where locking races with deletion, so the user can choose what tool to use to ensure its safety. Thanks, /Erik On 2019-01-29 13:13, David Holmes wrote: > On 29/01/2019 8:26 pm, Per Liden wrote: >> On 01/29/2019 10:53 AM, David Holmes wrote: >>> Hi Per, >>> >>> If I may jump in on one thing you suggest ... destructors. 
Do we ever >>> actually destroy Mutex or Monitor instances? There are inherent races >>> that can make it very dangerous to try and actually delete the >>> low-level PlatformMonitor and destroy the pthread_mutex or >>> pthread_cond, and even release the memory used. The related >>> PlatformEvent and PlatformParker are expected to be immortal and I >>> think that is the same for PlatformMonitor. >> >> There are examples of where we do destroy Mutex instances today (like >> JVM_RawMonitorDestroy), and I don't think that's something the API >> should forbid. > > RawMonitors are used very rarely and I would not have much confidence > that their destruction is actually done in a safe manner. > >> >> As Robbin hinted, I'm hoping ZLock (which is just a plain >> pthread_mutex) in ZGC can be converted to be a plain mutex using >> PlatforMutex in the future. ZLocks require dynamic >> creation/destruction to be supported as they are e.g. attached to >> nmethods which can come and go. > > I think this is something that requires some detailed consideration. You > need to have very robust lifecycle management to be able to delete the > low-level implementation objects safely. (It's not sufficient for > example to test the ability to destroy a pthread_mutex by checking if > you can first acquire it (and release it again) then destroy it and > deallocate it, as the previous owner could still be executing inside > pthread_mutex_unlock!). > > In the context of the current changes though we do have an inconsistency > between the high-level Mutex/Monitor and the PlatformMonitor. This was > an oversight on my part when doing some earlier work on this. > > David > ----- > >>> >>> Aside: I don't think distinct PlatformMutex and PlatforMonitor is >>> worth the effort unless we also rework the Mutex/Monitor relationship >>> as well. >> >> I completely agree, Mutex/Monitor would also need to be reworked a bit. 
>> >> cheers, >> Per >> >>> >>> Cheers, >>> David >>> >>> On 29/01/2019 7:22 pm, Per Liden wrote: >>>> Hi Patricio, >>>> >>>> On 01/28/2019 08:18 PM, Patricio Chilano wrote: >>>>> Hi Robbin, >>>>> >>>>> Thanks for reviewing this! Removing the block_in_safepoint_check >>>>> thread local attribute is a great idea, here is v02: >>>>> >>>>> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/webrev >>>> >>>> I really like that we're ditching our old locking code in favor of >>>> using pthread_mutex, et al. Nice work! >>>> >>>> >>>> General comment >>>> ---------------- >>>> I think Mutex to be a plain mutex and not come with the baggage of >>>> having a conditional variable. With this new code, it seems we're in >>>> a really good position to make that happen. I.e. something like this: >>>> >>>> class PlatformMutex { >>>> protected: >>>> ?? pthread_mutex_t _mutex; >>>> >>>> public: >>>> ?? PlatformMutex(); >>>> ?? ~PlatformMutex(); >>>> >>>> ?? void lock(); >>>> ?? void unlock(); >>>> ?? bool try_lock(); >>>> }; >>>> >>>> class PlatformMonitor : public PlatformMutex { >>>> private: >>>> ?? pthread_cond_t _cond; >>>> >>>> public: >>>> ?? PlatformMonitor(); >>>> ?? ~PlatformMonitor(); >>>> >>>> ?? int wait(jlong millis); >>>> ?? void notify(); >>>> ?? void notify_all(); >>>> }; >>>> >>>> It might be that we want to do that as a separate step later instead >>>> of including it in this patch. But I think we should try to get there. >>>> >>>> >>>> src/hotspot/os/posix/os_*.[ch]pp >>>> --------------------------------- >>>> * I'd suggest that we place the PlatformMonitor class in a separate >>>> file (like src/hotspot/os/posix/monitor_posix.cpp), just like we >>>> have done with Semaphore (in src/hotspot/os/posix/semaphore_posix.cpp). 
>>>> >>>> >>>> src/hotspot/os/posix/os_posix.hpp >>>> src/hotspot/os/solaris/os_solaris.hpp >>>> src/hotspot/os/windows/os_windows.hpp >>>> ------------------------------------- >>>> * Please make _mutex/_cond plain variables, instead of arrays of 1. >>>> That's just ugly ;) >>>> >>>> >>>> src/hotspot/os/posix/os_posix.cpp >>>> --------------------------------- >>>> * Destructor missing, to call pthread_(mutex|cond)_destroy(). >>>> >>>> >>>> src/hotspot/os/solaris/os_solaris.hpp >>>> ------------------------------------- >>>> * Not sure if there's a good reason to have the constructor be >>>> inlined here. I'd suggest moving it to the cpp file. >>>> >>>> * Destructor missing. >>>> >>>> >>>> src/hotspot/os/windows/os_windows.cpp >>>> ------------------------------------- >>>> * Destructor missing (I'm not too familiar with the windows API but >>>> I assume there's a destroy function we should call here). >>>> >>>> >>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp >>>> ----------------------------------------------------- >>>> * Move "private:" above monitor_adr; >>>> >>>> ? 289 class ThreadLockBlockInVM : public ThreadStateTransition { >>>> ? 290?? Monitor** monitor_adr; >>>> ? 291? private: >>>> ? 292?? void do_preempted(Monitor** in_flight_monitor_adr) { >>>> >>>> * monitor_adr should be _monitor_adr, or maybe even >>>> _in_flight_monitor_adr to better match the name of the argument. >>>> >>>> >>>> cheers, >>>> Per >>>> >>>>> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/inc/webrev/ >>>>> >>>>> Running mach5 again. >>>>> >>>>> Thanks, >>>>> Patricio >>>>> >>>>> On 1/28/19 8:31 AM, Robbin Ehn wrote: >>>>>> Hi Patricio, >>>>>> >>>>>> Mostly looks good! >>>>>> >>>>>> block_at_safepoint is always called with block_in_safepoint_check >>>>>> = true. (correct?) >>>>>> Changing that to a local state instead of global simplifies the code. >>>>>> >>>>>> So I'm suggesting something like below. 
>>>>>> >>>>>> Thanks, Robbin >>>>>> >>>>>> diff -r e65cc445234c >>>>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp >>>>>> --- a/src/hotspot/share/runtime/interfaceSupport.inline.hpp??? Mon >>>>>> Jan 28 13:10:15 2019 +0100 >>>>>> +++ b/src/hotspot/share/runtime/interfaceSupport.inline.hpp??? Mon >>>>>> Jan 28 14:10:59 2019 +0100 >>>>>> @@ -308,2 +308,1 @@ >>>>>> -??? thread->block_in_safepoint_check = false; >>>>>> -??? SafepointMechanism::block_at_safepoint(thread); >>>>>> +??? SafepointMechanism::callback_if_safepoint(thread); >>>>>> @@ -323,2 +322,1 @@ >>>>>> -????? SafepointMechanism::block_at_safepoint(_thread); >>>>>> -????? _thread->block_in_safepoint_check = true; >>>>>> +????? SafepointMechanism::callback_if_safepoint(_thread); >>>>>> @@ -335,2 +332,0 @@ >>>>>> -??? } else { >>>>>> -????? _thread->block_in_safepoint_check = true; >>>>>> @@ -337,0 +334,1 @@ >>>>>> + >>>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.cpp >>>>>> --- a/src/hotspot/share/runtime/safepoint.cpp??? Mon Jan 28 >>>>>> 13:10:15 2019 +0100 >>>>>> +++ b/src/hotspot/share/runtime/safepoint.cpp??? Mon Jan 28 >>>>>> 14:10:59 2019 +0100 >>>>>> @@ -795,1 +795,1 @@ >>>>>> -void SafepointSynchronize::block(JavaThread *thread) { >>>>>> +void SafepointSynchronize::block(JavaThread *thread, bool >>>>>> block_in_safepoint_check) { >>>>>> @@ -850,1 +850,1 @@ >>>>>> -????? if (thread->block_in_safepoint_check) { >>>>>> +????? if (block_in_safepoint_check) { >>>>>> @@ -880,1 +880,1 @@ >>>>>> -????????? thread->block_in_safepoint_check) { >>>>>> +????????? block_in_safepoint_check) { >>>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.hpp >>>>>> --- a/src/hotspot/share/runtime/safepoint.hpp??? Mon Jan 28 >>>>>> 13:10:15 2019 +0100 >>>>>> +++ b/src/hotspot/share/runtime/safepoint.hpp??? Mon Jan 28 >>>>>> 14:10:59 2019 +0100 >>>>>> @@ -146,1 +146,1 @@ >>>>>> -? static void?? block(JavaThread *thread); >>>>>> +? static void?? 
block(JavaThread *thread, bool >>>>>> block_in_safepoint_check = true); >>>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepointMechanism.hpp >>>>>> --- a/src/hotspot/share/runtime/safepointMechanism.hpp??? Mon Jan >>>>>> 28 13:10:15 2019 +0100 >>>>>> +++ b/src/hotspot/share/runtime/safepointMechanism.hpp??? Mon Jan >>>>>> 28 14:10:59 2019 +0100 >>>>>> @@ -82,1 +82,1 @@ >>>>>> -? static inline void block_at_safepoint(JavaThread* thread); >>>>>> +? static inline void callback_if_safepoint(JavaThread* thread); >>>>>> diff -r e65cc445234c >>>>>> src/hotspot/share/runtime/safepointMechanism.inline.hpp >>>>>> --- a/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>>>>> Jan 28 13:10:15 2019 +0100 >>>>>> +++ b/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>>>>> Jan 28 14:10:59 2019 +0100 >>>>>> @@ -82,1 +82,1 @@ >>>>>> -void SafepointMechanism::block_at_safepoint(JavaThread* thread) { >>>>>> +void SafepointMechanism::callback_if_safepoint(JavaThread* thread) { >>>>>> @@ -84,1 +84,1 @@ >>>>>> -??? SafepointSynchronize::block(thread); >>>>>> +??? SafepointSynchronize::block(thread, false); >>>>>> diff -r e65cc445234c src/hotspot/share/runtime/thread.cpp >>>>>> --- a/src/hotspot/share/runtime/thread.cpp??? Mon Jan 28 13:10:15 >>>>>> 2019 +0100 >>>>>> +++ b/src/hotspot/share/runtime/thread.cpp??? Mon Jan 28 14:10:59 >>>>>> 2019 +0100 >>>>>> @@ -298,2 +297,0 @@ >>>>>> -? block_in_safepoint_check = true; >>>>>> - >>>>>> diff -r e65cc445234c src/hotspot/share/runtime/thread.hpp >>>>>> --- a/src/hotspot/share/runtime/thread.hpp??? Mon Jan 28 13:10:15 >>>>>> 2019 +0100 >>>>>> +++ b/src/hotspot/share/runtime/thread.hpp??? Mon Jan 28 14:10:59 >>>>>> 2019 +0100 >>>>>> @@ -788,2 +787,0 @@ >>>>>> -? bool block_in_safepoint_check;????????????? 
// to decide >>>>>> whether to block in SS::block or not >>>>>> - >>>>>> >>>>>> >>>>>> On 1/28/19 9:42 AM, Patricio Chilano wrote: >>>>>>> Hi all, >>>>>>> >>>>>>> Please review the following patch: >>>>>>> >>>>>>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8210832 >>>>>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8210832/v01/webrev/ >>>>>>> >>>>>>> The current implementation of native monitors uses a technique >>>>>>> that we name "sneaky locking" to prevent possible deadlocks of >>>>>>> the JVM during safepoints. The implementation of this technique >>>>>>> though introduces a race when a monitor is shared between the >>>>>>> VMThread and non-JavaThreads. This patch aims to solve that >>>>>>> problem and at the same time simplify the code. >>>>>>> >>>>>>> The proposal is based on the introduction of the new class >>>>>>> PlatformMonitor, which serves as a wrapper for the actual >>>>>>> synchronization primitives in each platform (mutexes and >>>>>>> condition variables). Most of the API calls can thus be >>>>>>> implemented as simple wrappers around PlatformMonitor, adding >>>>>>> more assertions and very little extra metadata. >>>>>>> To be able to remove the lock sneaking code and at the same time >>>>>>> avoid deadlocking scenarios, we combine two techniques: >>>>>>> >>>>>>> -When a JavaThread that has just acquired the lock, detects there >>>>>>> is a safepoint request in the ThreadLockBlockInVM destructor, it >>>>>>> releases the lock before blocking at the safepoint. After >>>>>>> resuming from it, the JavaThread will have to acquire the lock >>>>>>> again. >>>>>>> >>>>>>> - In the ThreadLockBlockInVM constructor for the Monitor::wait() >>>>>>> method, in order to avoid blocking we allow for a possible >>>>>>> safepoint request to make progress but without letting the >>>>>>> JavaThread block for it (since we would be stopped by the >>>>>>> destructor anyways). 
We also do that for the Monitor::lock() case >>>>>>> although no deadlock is being prevented there. >>>>>>> >>>>>>> The ThreadLockBlockInVM jacket is a new ThreadStateTransition >>>>>>> class used instead of the ThreadBlockInVM one. This allowed more >>>>>>> flexibility to handle the two techniques mentioned above. Also, >>>>>>> ThreadBlockInVM calls SafepointMechanism::block_if_requested() >>>>>>> which creates some problems when trying to allow safepoints to >>>>>>> continue without stopping, since that method not only checks for >>>>>>> safepoints but also processes handshakes. >>>>>>> >>>>>>> In terms of performance, benchmarks show very similar results to >>>>>>> what we have now. >>>>>>> >>>>>>> So far mach5 hs-tier1-6 on Linux, OS X, Windows and Solaris have >>>>>>> been tested. >>>>>>> >>>>>>> Thanks, >>>>>>> Patricio >>>>>>> >>>>> From thomas.stuefe at gmail.com Tue Jan 29 13:20:38 2019 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 29 Jan 2019 14:20:38 +0100 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> <7d85fc77-4e33-364f-3efe-a3757d6cfdbc@oracle.com> Message-ID: On Tue, Jan 29, 2019 at 12:15 PM Aleksey Shipilev wrote: > On 1/29/19 11:57 AM, Thomas Stüfe wrote: > > On Tue, Jan 29, 2019 at 11:31 AM Aleksey Shipilev > wrote: > > I would bail out right away for bogus/unreadable pc since those may > happen more often than not. > > I like the current code, because it exercises the "adjustment" part often, > implicitly verifying it. > > > Is this exponential stepping really necessary? You run a risk of losing > information, e.g. for a pc > > 250 bytes into a readable page preceded by an unmapped one. This > coding runs during error > > reporting, so no big speed concerns I think. > > Yes, that would print "only" -128 bytes from that page; still much better > than nothing. 
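To make the trade-off in the exchange above concrete, here is a minimal, self-contained sketch of the halving probe. It assumes a fake readability check in place of the VM's is_readable_pointer(); the names is_readable and probe_low_delta and the addresses used with them are illustrative only, not HotSpot code. For a pc 250 bytes into a readable page that follows an unmapped one, the probe at -256 fails and the probe at -128 succeeds, so the dump would start 128 bytes before the pc and leave out the remaining 122 readable bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the VM's readability probe: every address at or
// above first_readable counts as mapped, everything below it as unmapped.
static bool is_readable(std::uintptr_t addr, std::uintptr_t first_readable) {
  return addr >= first_readable;
}

// Exponential (halving) probe below the pc: start at max_delta and halve the
// distance until pc - delta is readable. Returns the distance that worked,
// or 0 if no probed address below the pc was readable.
static int probe_low_delta(std::uintptr_t pc, std::uintptr_t first_readable,
                           int max_delta) {
  assert(max_delta > 0);
  int delta = max_delta;
  while (delta > 0 && !is_readable(pc - delta, first_readable)) {
    delta >>= 1;
  }
  return delta;
}
```

Calling probe_low_delta(page_start + 250, page_start, 256) with any page_start yields 128, which is the "-128 bytes" case discussed above; at most log2(max_delta) + 1 probes can fault, which is the reason given for preferring halving over byte-by-byte probing.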
> > I did exponential stepping because delivering hundreds of signals via > SafeFetch is probably not a > good practice for both performance and debugging. I'd hate to "gdb cont" a > hundred times :) > > The ideal way out would be to have os::print_hex_dump that is page-aware > and that can probe once per > page. Or, start at the pc and work your way downward byte for byte (or word for word since SafeFetch aligns to word size anyway), squirreling the bytes away temporarily, until -256 or first fault. Then print those bytes. You only have to do this on the first leg, on the second you work upward, the same direction you print, so no need to temporarily store the bytes. But the change is fine for me as it is. ..Thomas But there is a bigger fish to fry, and this seems good for the overwhelming > majority of cases, > and universally safe. > > -Aleksey > > From zgu at redhat.com Tue Jan 29 13:24:49 2019 From: zgu at redhat.com (zgu at redhat.com) Date: Tue, 29 Jan 2019 08:24:49 -0500 Subject: RFR(XXS) 8217785: Padding ParallelTaskTerminator::_offered_termination variable In-Reply-To: <770a1007b2a3d721baa21a62cce66d77320fc271.camel@oracle.com> References: <1548703217.31327.58.camel@redhat.com> <770a1007b2a3d721baa21a62cce66d77320fc271.camel@oracle.com> Message-ID: <1548768289.31327.63.camel@redhat.com> > Please make sure that the title in the push message is correct :) > (s/offerred/offered/ - I fixed it in the CR title). Oops. 
Thank you, Thomas, -Zhengyu > > Thanks, > Thomas > > From per.liden at oracle.com Tue Jan 29 13:36:39 2019 From: per.liden at oracle.com (Per Liden) Date: Tue, 29 Jan 2019 14:36:39 +0100 Subject: RFR: 8210832: Remove sneaky locking in class Monitor In-Reply-To: <21edd9b0-bce2-732f-6550-6439ba03348d@oracle.com> References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> <2e011b1d-afb2-c2ab-d083-efe74bd27b27@oracle.com> <26e4d98a-62c5-35a8-9cf2-a1d9618b453b@oracle.com> <21edd9b0-bce2-732f-6550-6439ba03348d@oracle.com> Message-ID: On 01/29/2019 01:13 PM, David Holmes wrote: > On 29/01/2019 8:26 pm, Per Liden wrote: >> On 01/29/2019 10:53 AM, David Holmes wrote: >>> Hi Per, >>> >>> If I may jump in on one thing you suggest ... destructors. Do we ever >>> actually destroy Mutex or Monitor instances? There are inherent races >>> that can make it very dangerous to try and actually delete the >>> low-level PlatformMonitor and destroy the pthread_mutex or >>> pthread_cond, and even release the memory used. The related >>> PlatformEvent and PlatformParker are expected to be immortal and I >>> think that is the same for PlatformMonitor. >> >> There are examples of where we do destroy Mutex instances today (like >> JVM_RawMonitorDestroy), and I don't think that's something the API >> should forbid. > > RawMonitors are used very rarely and I would not have much confidence > that their destruction is actually done in a safe manner. That was just one example, there are many other places where do to this too. Many locks don't have the same lifecycle of the VM. > >> >> As Robbin hinted, I'm hoping ZLock (which is just a plain >> pthread_mutex) in ZGC can be converted to be a plain mutex using >> PlatforMutex in the future. ZLocks require dynamic >> creation/destruction to be supported as they are e.g. attached to >> nmethods which can come and go. > > I think this is something that requires some detailed consideration. 
You Destroying a mutex is no different from destroying any other shared data structure, and we do that all the time in hotspot. > need to have very robust lifecycle management to be able to delete the > low-level implementation objects safely. (It's not sufficient for > example to test the ability to destroy a pthread_mutex by checking if > you can first acquire it (and release it again) then destroy it and > deallocate it, as the previous owner could still be executing inside > pthread_mutex_unlock!). Well, that would be an absurd and completely broken scheme, to say the least ;) cheers, Per > > In the context of the current changes though we do have an inconsistency > between the high-level Mutex/Monitor and the PlatformMonitor. This was > an oversight on my part when doing some earlier work on this. > > David > ----- > >>> >>> Aside: I don't think distinct PlatformMutex and PlatforMonitor is >>> worth the effort unless we also rework the Mutex/Monitor relationship >>> as well. >> >> I completely agree, Mutex/Monitor would also need to be reworked a bit. >> >> cheers, >> Per >> >>> >>> Cheers, >>> David >>> >>> On 29/01/2019 7:22 pm, Per Liden wrote: >>>> Hi Patricio, >>>> >>>> On 01/28/2019 08:18 PM, Patricio Chilano wrote: >>>>> Hi Robbin, >>>>> >>>>> Thanks for reviewing this! Removing the block_in_safepoint_check >>>>> thread local attribute is a great idea, here is v02: >>>>> >>>>> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/webrev >>>> >>>> I really like that we're ditching our old locking code in favor of >>>> using pthread_mutex, et al. Nice work! >>>> >>>> >>>> General comment >>>> ---------------- >>>> I think Mutex to be a plain mutex and not come with the baggage of >>>> having a conditional variable. With this new code, it seems we're in >>>> a really good position to make that happen. I.e. 
something like this: >>>> >>>> class PlatformMutex { >>>> protected: >>>> pthread_mutex_t _mutex; >>>> >>>> public: >>>> PlatformMutex(); >>>> ~PlatformMutex(); >>>> >>>> void lock(); >>>> void unlock(); >>>> bool try_lock(); >>>> }; >>>> >>>> class PlatformMonitor : public PlatformMutex { >>>> private: >>>> pthread_cond_t _cond; >>>> >>>> public: >>>> PlatformMonitor(); >>>> ~PlatformMonitor(); >>>> >>>> int wait(jlong millis); >>>> void notify(); >>>> void notify_all(); >>>> }; >>>> >>>> It might be that we want to do that as a separate step later instead >>>> of including it in this patch. But I think we should try to get there. >>>> >>>> >>>> src/hotspot/os/posix/os_*.[ch]pp >>>> --------------------------------- >>>> * I'd suggest that we place the PlatformMonitor class in a separate >>>> file (like src/hotspot/os/posix/monitor_posix.cpp), just like we >>>> have done with Semaphore (in src/hotspot/os/posix/semaphore_posix.cpp). >>>> >>>> >>>> src/hotspot/os/posix/os_posix.hpp >>>> src/hotspot/os/solaris/os_solaris.hpp >>>> src/hotspot/os/windows/os_windows.hpp >>>> ------------------------------------- >>>> * Please make _mutex/_cond plain variables, instead of arrays of 1. >>>> That's just ugly ;) >>>> >>>> >>>> src/hotspot/os/posix/os_posix.cpp >>>> --------------------------------- >>>> * Destructor missing, to call pthread_(mutex|cond)_destroy(). >>>> >>>> >>>> src/hotspot/os/solaris/os_solaris.hpp >>>> ------------------------------------- >>>> * Not sure if there's a good reason to have the constructor be >>>> inlined here. I'd suggest moving it to the cpp file. >>>> >>>> * Destructor missing. >>>> >>>> >>>> src/hotspot/os/windows/os_windows.cpp >>>> ------------------------------------- >>>> * Destructor missing (I'm not too familiar with the windows API but >>>> I assume there's a destroy function we should call here). 
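A minimal sketch of the PlatformMutex/PlatformMonitor split outlined above, assuming POSIX threads only. It fills in the pieces the review asks for (plain _mutex/_cond members, and destructors that call pthread_mutex_destroy() and pthread_cond_destroy()). The method bodies, the long long stand-in for jlong, and the return convention of wait() are assumptions made for illustration; this is not the code from the webrev.

```cpp
#include <cassert>
#include <cerrno>
#include <ctime>
#include <pthread.h>

class PlatformMutex {
 protected:
  pthread_mutex_t _mutex;  // plain member rather than an array of 1

 public:
  PlatformMutex()  { int s = pthread_mutex_init(&_mutex, nullptr); assert(s == 0); (void)s; }
  ~PlatformMutex() { int s = pthread_mutex_destroy(&_mutex);       assert(s == 0); (void)s; }

  PlatformMutex(const PlatformMutex&) = delete;
  PlatformMutex& operator=(const PlatformMutex&) = delete;

  void lock()     { int s = pthread_mutex_lock(&_mutex);   assert(s == 0); (void)s; }
  void unlock()   { int s = pthread_mutex_unlock(&_mutex); assert(s == 0); (void)s; }
  bool try_lock() { return pthread_mutex_trylock(&_mutex) == 0; }
};

class PlatformMonitor : public PlatformMutex {
 private:
  pthread_cond_t _cond;

 public:
  PlatformMonitor()  { int s = pthread_cond_init(&_cond, nullptr); assert(s == 0); (void)s; }
  // Runs before ~PlatformMutex(), so the condition variable is destroyed first.
  ~PlatformMonitor() { int s = pthread_cond_destroy(&_cond);       assert(s == 0); (void)s; }

  // The caller must hold the mutex. millis <= 0 waits without a timeout
  // (assumed convention). Returns 0 when woken (possibly spuriously) and
  // ETIMEDOUT when the timeout expires.
  int wait(long long millis) {
    if (millis <= 0) {
      return pthread_cond_wait(&_cond, &_mutex);
    }
    timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);  // default clock for condition variables
    ts.tv_sec  += millis / 1000;
    ts.tv_nsec += (millis % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {     // carry nanoseconds into seconds
      ts.tv_sec  += 1;
      ts.tv_nsec -= 1000000000L;
    }
    return pthread_cond_timedwait(&_cond, &_mutex, &ts);
  }

  void notify()     { int s = pthread_cond_signal(&_cond);    assert(s == 0); (void)s; }
  void notify_all() { int s = pthread_cond_broadcast(&_cond); assert(s == 0); (void)s; }
};
```

The sketch does not address the lifecycle concern raised elsewhere in this thread: destroying the underlying pthread objects is only safe once no other thread can still be executing inside lock(), unlock() or wait() on them.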
>>>> >>>> >>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp >>>> ----------------------------------------------------- >>>> * Move "private:" above monitor_adr; >>>> >>>> 289 class ThreadLockBlockInVM : public ThreadStateTransition { >>>> 290 Monitor** monitor_adr; >>>> 291 private: >>>> 292 void do_preempted(Monitor** in_flight_monitor_adr) { >>>> >>>> * monitor_adr should be _monitor_adr, or maybe even >>>> _in_flight_monitor_adr to better match the name of the argument. >>>> >>>> >>>> cheers, >>>> Per >>>> >>>>> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/inc/webrev/ >>>>> >>>>> Running mach5 again. >>>>> >>>>> Thanks, >>>>> Patricio >>>>> >>>>> On 1/28/19 8:31 AM, Robbin Ehn wrote: >>>>>> Hi Patricio, >>>>>> >>>>>> Mostly looks good! >>>>>> >>>>>> block_at_safepoint is always called with block_in_safepoint_check >>>>>> = true. (correct?) >>>>>> Changing that to a local state instead of global simplifies the code. >>>>>> >>>>>> So I'm suggesting something like below. 
>>>>>> >>>>>> Thanks, Robbin >>>>>> >>>>>> diff -r e65cc445234c >>>>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp >>>>>> --- a/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon >>>>>> Jan 28 13:10:15 2019 +0100 >>>>>> +++ b/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon >>>>>> Jan 28 14:10:59 2019 +0100 >>>>>> @@ -308,2 +308,1 @@ >>>>>> - thread->block_in_safepoint_check = false; >>>>>> - SafepointMechanism::block_at_safepoint(thread); >>>>>> + SafepointMechanism::callback_if_safepoint(thread); >>>>>> @@ -323,2 +322,1 @@ >>>>>> - SafepointMechanism::block_at_safepoint(_thread); >>>>>> - _thread->block_in_safepoint_check = true; >>>>>> + SafepointMechanism::callback_if_safepoint(_thread); >>>>>> @@ -335,2 +332,0 @@ >>>>>> - } else { >>>>>> - _thread->block_in_safepoint_check = true; >>>>>> @@ -337,0 +334,1 @@ >>>>>> + >>>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.cpp >>>>>> --- a/src/hotspot/share/runtime/safepoint.cpp Mon Jan 28 >>>>>> 13:10:15 2019 +0100 >>>>>> +++ b/src/hotspot/share/runtime/safepoint.cpp Mon Jan 28 >>>>>> 14:10:59 2019 +0100 >>>>>> @@ -795,1 +795,1 @@ >>>>>> -void SafepointSynchronize::block(JavaThread *thread) { >>>>>> +void SafepointSynchronize::block(JavaThread *thread, bool >>>>>> block_in_safepoint_check) { >>>>>> @@ -850,1 +850,1 @@ >>>>>> - if (thread->block_in_safepoint_check) { >>>>>> + if (block_in_safepoint_check) { >>>>>> @@ -880,1 +880,1 @@ >>>>>> - thread->block_in_safepoint_check) { >>>>>> + block_in_safepoint_check) { >>>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.hpp >>>>>> --- a/src/hotspot/share/runtime/safepoint.hpp Mon Jan 28 >>>>>> 13:10:15 2019 +0100 >>>>>> +++ b/src/hotspot/share/runtime/safepoint.hpp Mon Jan 28 >>>>>> 14:10:59 2019 +0100 >>>>>> @@ -146,1 +146,1 @@ >>>>>> - static void block(JavaThread *thread); >>>>>> + static void block(JavaThread *thread, bool >>>>>> block_in_safepoint_check = true); >>>>>> diff -r e65cc445234c 
src/hotspot/share/runtime/safepointMechanism.hpp >>>>>> --- a/src/hotspot/share/runtime/safepointMechanism.hpp Mon Jan >>>>>> 28 13:10:15 2019 +0100 >>>>>> +++ b/src/hotspot/share/runtime/safepointMechanism.hpp Mon Jan >>>>>> 28 14:10:59 2019 +0100 >>>>>> @@ -82,1 +82,1 @@ >>>>>> - static inline void block_at_safepoint(JavaThread* thread); >>>>>> + static inline void callback_if_safepoint(JavaThread* thread); >>>>>> diff -r e65cc445234c >>>>>> src/hotspot/share/runtime/safepointMechanism.inline.hpp >>>>>> --- a/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>>>>> Jan 28 13:10:15 2019 +0100 >>>>>> +++ b/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>>>>> Jan 28 14:10:59 2019 +0100 >>>>>> @@ -82,1 +82,1 @@ >>>>>> -void SafepointMechanism::block_at_safepoint(JavaThread* thread) { >>>>>> +void SafepointMechanism::callback_if_safepoint(JavaThread* thread) { >>>>>> @@ -84,1 +84,1 @@ >>>>>> - SafepointSynchronize::block(thread); >>>>>> + SafepointSynchronize::block(thread, false); >>>>>> diff -r e65cc445234c src/hotspot/share/runtime/thread.cpp >>>>>> --- a/src/hotspot/share/runtime/thread.cpp Mon Jan 28 13:10:15 >>>>>> 2019 +0100 >>>>>> +++ b/src/hotspot/share/runtime/thread.cpp Mon Jan 28 14:10:59 >>>>>> 2019 +0100 >>>>>> @@ -298,2 +297,0 @@ >>>>>> - block_in_safepoint_check = true; >>>>>> - >>>>>> diff -r e65cc445234c src/hotspot/share/runtime/thread.hpp >>>>>> --- a/src/hotspot/share/runtime/thread.hpp Mon Jan 28 13:10:15 >>>>>> 2019 +0100 >>>>>> +++ b/src/hotspot/share/runtime/thread.hpp Mon Jan 28 14:10:59 >>>>>> 2019 +0100 >>>>>> @@ -788,2 +787,0 @@ >>>>>> - bool block_in_safepoint_check; // to decide >>>>>> whether to block in SS::block or not >>>>>> - >>>>>> >>>>>> >>>>>> On 1/28/19 9:42 AM, Patricio Chilano wrote: >>>>>>> Hi all, >>>>>>> >>>>>>> Please review the following patch: >>>>>>> >>>>>>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8210832 >>>>>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8210832/v01/webrev/ 
>>>>>>> >>>>>>> The current implementation of native monitors uses a technique >>>>>>> that we name "sneaky locking" to prevent possible deadlocks of >>>>>>> the JVM during safepoints. The implementation of this technique >>>>>>> though introduces a race when a monitor is shared between the >>>>>>> VMThread and non-JavaThreads. This patch aims to solve that >>>>>>> problem and at the same time simplify the code. >>>>>>> >>>>>>> The proposal is based on the introduction of the new class >>>>>>> PlatformMonitor, which serves as a wrapper for the actual >>>>>>> synchronization primitives in each platform (mutexes and >>>>>>> condition variables). Most of the API calls can thus be >>>>>>> implemented as simple wrappers around PlatformMonitor, adding >>>>>>> more assertions and very little extra metadata. >>>>>>> To be able to remove the lock sneaking code and at the same time >>>>>>> avoid deadlocking scenarios, we combine two techniques: >>>>>>> >>>>>>> -When a JavaThread that has just acquired the lock, detects there >>>>>>> is a safepoint request in the ThreadLockBlockInVM destructor, it >>>>>>> releases the lock before blocking at the safepoint. After >>>>>>> resuming from it, the JavaThread will have to acquire the lock >>>>>>> again. >>>>>>> >>>>>>> - In the ThreadLockBlockInVM constructor for the Monitor::wait() >>>>>>> method, in order to avoid blocking we allow for a possible >>>>>>> safepoint request to make progress but without letting the >>>>>>> JavaThread block for it (since we would be stopped by the >>>>>>> destructor anyways). We also do that for the Monitor::lock() case >>>>>>> although no deadlock is being prevented there. >>>>>>> >>>>>>> The ThreadLockBlockInVM jacket is a new ThreadStateTransition >>>>>>> class used instead of the ThreadBlockInVM one. This allowed more >>>>>>> flexibility to handle the two techniques mentioned above. 
Also, >>>>>>> ThreadBlockInVM calls SafepointMechanism::block_if_requested() >>>>>>> which creates some problems when trying to allow safepoints to >>>>>>> continue without stopping, since that method not only checks for >>>>>>> safepoints but also processes handshakes. >>>>>>> >>>>>>> In terms of performance, benchmarks show very similar results to >>>>>>> what we have now. >>>>>>> >>>>>>> So far mach5 hs-tier1-6 on Linux, OS X, Windows and Solaris have >>>>>>> been tested. >>>>>>> >>>>>>> Thanks, >>>>>>> Patricio >>>>>>> >>>>>
From stefan.karlsson at oracle.com Tue Jan 29 13:40:35 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 29 Jan 2019 14:40:35 +0100 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> <3bfbab0f-712e-1fe8-154b-13389e10c13f@oracle.com> <0e9a7b25-7c36-1091-27a2-74bd4676ed12@redhat.com> Message-ID:

While looking at this I also considered if this:

  delta = max_delta;
  address low = pc - delta;
  while (delta > 0 && pc > low && !is_readable_pointer(low)) {
    delta >>= 1;
    low = pc - delta;
  }

  delta = max_delta;
  address high = pc + delta;
  while (delta > 0 && pc < high && !is_readable_pointer(high)) {
    delta >>= 1;
    high = pc + delta;
  }

could be changed to:

  delta = max_delta;
  while (delta > 0 && !is_readable_pointer(pc - delta)) {
    delta >>= 1;
  }
  address low = pc - delta;

  delta = max_delta;
  while (delta > 0 && !is_readable_pointer(pc + delta)) {
    delta >>= 1;
  }
  address high = pc + delta;

to simplify the loops. I realize that this removes the underflow/overflow checks, but I think the compiler might actually reduce them to true, because of the undefined behavior of underflowing/overflowing:

  address low = pc - delta;
  while (delta > 0 && pc > low ...) {

Since the compiler can assume that 'pc - delta' doesn't underflow it can transform 'pc > low' to 'pc > pc - delta' and then to '0 > 0 - delta', which is always true when delta is positive.

StefanK

On 2019-01-29 12:36, Aleksey Shipilev wrote:
> On 1/29/19 12:21 PM, Stefan Karlsson wrote:
>> For this to work as you intend it, max_delta must not be set to page size. If max_delta is set to
>> page size, the following would end up with low in Page 0 and high in Page 2:
>>
>> +--------+
>> | Page 0 |   readable
>> +--------+ < 'pc' beginning of Page 1
>> | Page 1 |   unreadable
>> +--------+
>> | Page 2 |   readable
>> +--------+
>>
>> Therefore it seems a bit off to cap with the page size in:
>> MIN2(256, min_page_size());
>>
>> Given that you use 256 today, this isn't really a problem, but maybe future proof this a bit and use
>> min_page_size / 2 to guarantee that either low or high ends up in Page 1?
> Right, thanks. To handle that corner case, let's do /2 indeed, updated webrev in-place.
>
> -Aleksey
>
From erik.osterlund at oracle.com Tue Jan 29 13:57:20 2019 From: erik.osterlund at oracle.com (Erik Österlund) Date: Tue, 29 Jan 2019 14:57:20 +0100 Subject: 8216541: CompiledICHolders of VM locked unloaded nmethods are released too late In-Reply-To: References: <0345dd70-fa19-a7e3-7892-8f4500f8339f@oracle.com> <4195c9a1-e91c-1014-8fa5-2b0d3f6dfc30@oracle.com> <44b1aa28-81b0-6e9a-1b13-bb973bb5bd30@oracle.com> Message-ID:

Hi Tobias,

Thanks for the review!

/Erik

On 2019-01-29 11:40, Tobias Hartmann wrote:
> Hi Erik,
>
> okay, got it. Thanks for the details, your fix looks good to me!
>
> Best regards,
> Tobias
>
> On 29.01.19 11:38, Erik Österlund wrote:
>> Hi Tobias,
>>
>> Thanks for having a look at this.
>>
>> On 2019-01-29 09:16, Tobias Hartmann wrote:
>>> Hi Erik,
>>>
>>> very nice analysis, thanks a lot for investigating!
>>> >>> On 28.01.19 14:56, Erik ?sterlund wrote: >>>> http://cr.openjdk.java.net/~eosterlund/8216541/webrev.00/ >>> >>> Why did you remove the call to thread->set_scanned_compiled_method(NULL) in sweeper.cpp? >> >> Because the CompiledMethodMarker destructor already nulls this out, and redundantly nulling it out >> again offers no extra protection. >> >> The idea of nulling it out before calling flush seems to have been to prevent the GC scanning from >> seeing this flushed nmethod in a safepoint, accidentally resurrecting it from the dead. But that is >> already impossible, because flush() is called with a never safepoint checking lock (which guarantees >> we don't have any and can't add any safepoint checks while holding that lock or we will deadlock >> badly). Therefore such safepoints will happen strictly after the processing of the compiled method >> is finished, and it is already cleared the normal way. >> >> By removing that pointless clearing, I could get rid of the release_compiled_method() function and >> just call flush directly instead. I get confused by there being two "destroy" functions, one in the >> sweeper and one in the nmethod, so I wanted it gone. >> >>> >>>> The proposed change has survived 200 rounds of kitchensink, hs-tier1-3 and hs-precheckin-comp. >>> >>> In the meanwhile, could you please run some more 100x iterations of kitchensink? >> >> Sure, running some more as we speak. 
>> >> Thanks, >> /Erik >> >>> >>> Thanks, >>> Tobias >>> From thomas.stuefe at gmail.com Tue Jan 29 14:21:55 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 29 Jan 2019 15:21:55 +0100 Subject: RFR : 8217786: Provide virtualization related info in the hs_error file on linux s390x In-Reply-To: References: <816bec8b-b9ca-f0b8-f72d-be6ede83d63b@oracle.com> <73d4509f-09b7-e630-2568-bae0395f6b8d@oracle.com> Message-ID: Hi Matthias, + if (strncmp(line, keywords_to_match[i], strlen(keywords_to_match[i])) == 0) { + st->print("%s", line); + } you should break here otherwise a line containing multiple keywords will be printed multiple times. + // the LPAR / CPUs / VM - related infos usually come in blocks This comment can be removed. +#if defined(S390) I still do not like the arch specific code here, but for now I can live with it. Should this section grow and cover other architectures as well, we should fan out into os_linux_.cpp. Thanks, Thomas On Tue, Jan 29, 2019 at 12:22 PM Baesken, Matthias wrote: > Hello here is a 2nd webrev : > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.2/ > > > > > > - Introduced static bool > print_matching_lines_from_sysinfo_file(outputStream* st, const char* > keywords_to_match[]) > - Moved call to Linux-only print_os_info > > > > > > Best regards, Matthias > > > > > > *From:* Thomas St?fe > *Sent:* Dienstag, 29. Januar 2019 09:23 > *To:* Baesken, Matthias ; David Holmes < > david.holmes at oracle.com> > *Cc:* hotspot-dev at openjdk.java.net > *Subject:* Re: RFR : 8217786: Provide virtualization related info in the > hs_error file on linux s390x > > > > I'm still unhappy with that solution, since we have fanned out this coding > for all architectures into the architecture independent os_linux.cpp. A > generic "Show matching lines from given file" would be a better (slimmer, > better reusable) solution IMHO. > > > > Side note: Could you please exchange strstr() .. 
with strncmp() since you > require the start of the string to match. So no reason to parse the whole > line if the start does not match. > > > > Cheers, Thomas > > > > On Tue, Jan 29, 2019 at 9:03 AM Baesken, Matthias < > matthias.baesken at sap.com> wrote: > > > > > > No I was thinking more about just adding the virtualization info to an > > existing step like print_os_info or print_cpu_info. > > > > Hi David , print_cpu_info does not sound like a great fit . Some info > like > > LPAR Number: 14 > LPAR Characteristics: Shared > LPAR Name: VM12 > > Does not really belong there . > > print_os_info looks better , it already contains "container_info" > on Linux, so I think this might fit . > > > Best regards, Matthias > > > > -----Original Message----- > > From: David Holmes > > Sent: Dienstag, 29. Januar 2019 05:17 > > To: Baesken, Matthias ; 'hotspot- > > dev at openjdk.java.net' > > Subject: Re: RFR : 8217786: Provide virtualization related info in the > hs_error > > file on linux s390x > > > > On 28/01/2019 10:23 pm, Baesken, Matthias wrote: > > >> > > >> Can't you include this information in an existing section of the error > > >> processing code instead of adding a new function that is empty > > >> everywhere except Linux? > > >> > > > > > > Hi David , do you mean something like > > > > > > > > > #if defined(S390) > > > > > > STEP("printing virtualization info") > > > ... > > > > > > #endif > > > > No I was thinking more about just adding the virtualization info to an > > existing step like print_os_info or print_cpu_info. > > > > Cheers, > > David > > ----- > > > > > in vmError.cpp ? > > > > > > I thought about doing this. > > > > > > > > > But on the other hand , the now still empty > > os::pd_print_virtualization_info in platforms != linux > > > might fill over time ( we could add [at least for some > platforms] other > > virtualization related info ). 
> > > > > > > > > Best regards, Matthias > > > > > > > > >> -----Original Message----- > > >> From: David Holmes > > >> Sent: Montag, 28. Januar 2019 12:35 > > >> To: Baesken, Matthias ; 'hotspot- > > >> dev at openjdk.java.net' > > >> Subject: Re: RFR : 8217786: Provide virtualization related info in the > > hs_error > > >> file on linux s390x > > >> > > >> Hi Matthias, > > >> > > >> On 28/01/2019 6:48 pm, Baesken, Matthias wrote: > > >>> Hello, please review this change ; it adds virtualization related > info in > > the > > >> hs_error file on linux s390x . > > >> > > >> Can't you include this information in an existing section of the error > > >> processing code instead of adding a new function that is empty > > >> everywhere except Linux? > > >> > > >> Thanks, > > >> David > > >> > > >>> On linux s390x, we usually (always?) run in virtualized > environments > > >> (LPAR and/or z/VM / KVM ). > > >>> > > >>> It is helpful for instance in support cases to get some information > about > > the > > >> virtualized environment in the hs_error file . > > >>> A lot of info can be taken from the /proc/sysinfo file on linux > s390x . > > >>> > > >>> > > >>> Bug/webrev : > > >>> > > >>> https://bugs.openjdk.java.net/browse/JDK-8217786 > > >>> > > >>> > > >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.1/ > > >>> > > >>> > > >>> > > >>> Best regards, Matthias > > >>> > > From shade at redhat.com Tue Jan 29 15:27:29 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 29 Jan 2019 16:27:29 +0100 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> <7d85fc77-4e33-364f-3efe-a3757d6cfdbc@oracle.com> Message-ID: On 1/29/19 2:20 PM, Thomas St?fe wrote: > Or, start at the pc and work your way downward byte for byte (or word for word since SafeFetch > aligns to word size anyway), squirreling the bytes away temporarily, until -256 or first fault. Then > print those bytes. 
You only have to do this on the first leg, on the second you work upward, the > same direction you print, so no need to temporarily store the bytes.

Actually, we can use this idea for linear probing here as well. This allows us to dispense with the exponential steps, because we only fail twice in the worst case, and it simplifies reliance on page sizes. Like this:

  void os::print_instructions(outputStream* st, address pc, int unitsize) {
    st->print_cr("Instructions: (pc=" PTR_FORMAT ")", p2i(pc));

    // Probe the memory around the PC for readability.
    const int max_delta = 256;
    int delta;
    address low = pc, high = pc;

    delta = 0;
    while (delta <= max_delta && is_readable_pointer(pc - delta)) {
      low = pc - delta;
      delta++;
    }

    delta = 0;
    while (delta <= max_delta && is_readable_pointer(pc + delta)) {
      high = pc + delta;
      delta++;
    }

    if (low < high) {
      print_hex_dump(st, low, high, unitsize);
    } else {
      st->print_cr("Memory is not readable");
    }
  }

WDYT?

-Aleksey

From coleen.phillimore at oracle.com Tue Jan 29 15:59:54 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 29 Jan 2019 10:59:54 -0500 Subject: RFR: 8210832: Remove sneaky locking in class Monitor In-Reply-To: References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> Message-ID: <76976f72-4218-847b-af0e-6ef7a4c41bdd@oracle.com>
On 1/29/19 4:22 AM, Per Liden wrote: > Hi Patricio, > > On 01/28/2019 08:18 PM, Patricio Chilano wrote: >> Hi Robbin, >> >> Thanks for reviewing this! Removing the block_in_safepoint_check >> thread local attribute is a great idea, here is v02: >> >> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/webrev > > I really like that we're ditching our old locking code in favor of > using pthread_mutex, et al. Nice work! > > > General comment > ---------------- > I think Mutex to be a plain mutex and not come with the baggage of > having a conditional variable. 
With this new code, it seems we're in a > really good position to make that happen. I.e. something like this: > > class PlatformMutex { > protected: > ? pthread_mutex_t _mutex; > > public: > ? PlatformMutex(); > ? ~PlatformMutex(); > > ? void lock(); > ? void unlock(); > ? bool try_lock(); > }; > > class PlatformMonitor : public PlatformMutex { > private: > ? pthread_cond_t _cond; > > public: > ? PlatformMonitor(); > ? ~PlatformMonitor(); > > ? int wait(jlong millis); > ? void notify(); > ? void notify_all(); > }; > > It might be that we want to do that as a separate step later instead > of including it in this patch. But I think we should try to get there. > > > src/hotspot/os/posix/os_*.[ch]pp > --------------------------------- > * I'd suggest that we place the PlatformMonitor class in a separate > file (like src/hotspot/os/posix/monitor_posix.cpp), just like we have > done with Semaphore (in src/hotspot/os/posix/semaphore_posix.cpp). I had this same thought, when looking through this. Coleen > > > src/hotspot/os/posix/os_posix.hpp > src/hotspot/os/solaris/os_solaris.hpp > src/hotspot/os/windows/os_windows.hpp > ------------------------------------- > * Please make _mutex/_cond plain variables, instead of arrays of 1. > That's just ugly ;) > > > src/hotspot/os/posix/os_posix.cpp > --------------------------------- > * Destructor missing, to call pthread_(mutex|cond)_destroy(). > > > src/hotspot/os/solaris/os_solaris.hpp > ------------------------------------- > * Not sure if there's a good reason to have the constructor be inlined > here. I'd suggest moving it to the cpp file. > > * Destructor missing. > > > src/hotspot/os/windows/os_windows.cpp > ------------------------------------- > * Destructor missing (I'm not too familiar with the windows API but I > assume there's a destroy function we should call here). 
> > > src/hotspot/share/runtime/interfaceSupport.inline.hpp > ----------------------------------------------------- > * Move "private:" above monitor_adr; > > ?289 class ThreadLockBlockInVM : public ThreadStateTransition { > ?290?? Monitor** monitor_adr; > ?291? private: > ?292?? void do_preempted(Monitor** in_flight_monitor_adr) { > > * monitor_adr should be _monitor_adr, or maybe even > _in_flight_monitor_adr to better match the name of the argument. > > > cheers, > Per > >> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/inc/webrev/ >> >> Running mach5 again. >> >> Thanks, >> Patricio >> >> On 1/28/19 8:31 AM, Robbin Ehn wrote: >>> Hi Patricio, >>> >>> Mostly looks good! >>> >>> block_at_safepoint is always called with block_in_safepoint_check = >>> true. (correct?) >>> Changing that to a local state instead of global simplifies the code. >>> >>> So I'm suggesting something like below. >>> >>> Thanks, Robbin >>> >>> diff -r e65cc445234c >>> src/hotspot/share/runtime/interfaceSupport.inline.hpp >>> --- a/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon Jan >>> 28 13:10:15 2019 +0100 >>> +++ b/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon Jan >>> 28 14:10:59 2019 +0100 >>> @@ -308,2 +308,1 @@ >>> -??? thread->block_in_safepoint_check = false; >>> -??? SafepointMechanism::block_at_safepoint(thread); >>> +??? SafepointMechanism::callback_if_safepoint(thread); >>> @@ -323,2 +322,1 @@ >>> -????? SafepointMechanism::block_at_safepoint(_thread); >>> -????? _thread->block_in_safepoint_check = true; >>> +????? SafepointMechanism::callback_if_safepoint(_thread); >>> @@ -335,2 +332,0 @@ >>> -??? } else { >>> -????? _thread->block_in_safepoint_check = true; >>> @@ -337,0 +334,1 @@ >>> + >>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.cpp >>> --- a/src/hotspot/share/runtime/safepoint.cpp??? Mon Jan 28 13:10:15 >>> 2019 +0100 >>> +++ b/src/hotspot/share/runtime/safepoint.cpp??? 
Mon Jan 28 14:10:59 >>> 2019 +0100 >>> @@ -795,1 +795,1 @@ >>> -void SafepointSynchronize::block(JavaThread *thread) { >>> +void SafepointSynchronize::block(JavaThread *thread, bool >>> block_in_safepoint_check) { >>> @@ -850,1 +850,1 @@ >>> -????? if (thread->block_in_safepoint_check) { >>> +????? if (block_in_safepoint_check) { >>> @@ -880,1 +880,1 @@ >>> -????????? thread->block_in_safepoint_check) { >>> +????????? block_in_safepoint_check) { >>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.hpp >>> --- a/src/hotspot/share/runtime/safepoint.hpp??? Mon Jan 28 13:10:15 >>> 2019 +0100 >>> +++ b/src/hotspot/share/runtime/safepoint.hpp??? Mon Jan 28 14:10:59 >>> 2019 +0100 >>> @@ -146,1 +146,1 @@ >>> -? static void?? block(JavaThread *thread); >>> +? static void?? block(JavaThread *thread, bool >>> block_in_safepoint_check = true); >>> diff -r e65cc445234c src/hotspot/share/runtime/safepointMechanism.hpp >>> --- a/src/hotspot/share/runtime/safepointMechanism.hpp??? Mon Jan 28 >>> 13:10:15 2019 +0100 >>> +++ b/src/hotspot/share/runtime/safepointMechanism.hpp??? Mon Jan 28 >>> 14:10:59 2019 +0100 >>> @@ -82,1 +82,1 @@ >>> -? static inline void block_at_safepoint(JavaThread* thread); >>> +? static inline void callback_if_safepoint(JavaThread* thread); >>> diff -r e65cc445234c >>> src/hotspot/share/runtime/safepointMechanism.inline.hpp >>> --- a/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>> Jan 28 13:10:15 2019 +0100 >>> +++ b/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>> Jan 28 14:10:59 2019 +0100 >>> @@ -82,1 +82,1 @@ >>> -void SafepointMechanism::block_at_safepoint(JavaThread* thread) { >>> +void SafepointMechanism::callback_if_safepoint(JavaThread* thread) { >>> @@ -84,1 +84,1 @@ >>> -??? SafepointSynchronize::block(thread); >>> +??? SafepointSynchronize::block(thread, false); >>> diff -r e65cc445234c src/hotspot/share/runtime/thread.cpp >>> --- a/src/hotspot/share/runtime/thread.cpp??? 
Mon Jan 28 13:10:15 >>> 2019 +0100 >>> +++ b/src/hotspot/share/runtime/thread.cpp??? Mon Jan 28 14:10:59 >>> 2019 +0100 >>> @@ -298,2 +297,0 @@ >>> -? block_in_safepoint_check = true; >>> - >>> diff -r e65cc445234c src/hotspot/share/runtime/thread.hpp >>> --- a/src/hotspot/share/runtime/thread.hpp??? Mon Jan 28 13:10:15 >>> 2019 +0100 >>> +++ b/src/hotspot/share/runtime/thread.hpp??? Mon Jan 28 14:10:59 >>> 2019 +0100 >>> @@ -788,2 +787,0 @@ >>> -? bool block_in_safepoint_check;????????????? // to decide whether >>> to block in SS::block or not >>> - >>> >>> >>> On 1/28/19 9:42 AM, Patricio Chilano wrote: >>>> Hi all, >>>> >>>> Please review the following patch: >>>> >>>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8210832 >>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8210832/v01/webrev/ >>>> >>>> The current implementation of native monitors uses a technique that >>>> we name "sneaky locking" to prevent possible deadlocks of the JVM >>>> during safepoints. The implementation of this technique though >>>> introduces a race when a monitor is shared between the VMThread and >>>> non-JavaThreads. This patch aims to solve that problem and at the >>>> same time simplify the code. >>>> >>>> The proposal is based on the introduction of the new class >>>> PlatformMonitor, which serves as a wrapper for the actual >>>> synchronization primitives in each platform (mutexes and condition >>>> variables). Most of the API calls can thus be implemented as simple >>>> wrappers around PlatformMonitor, adding more assertions and very >>>> little extra metadata. >>>> To be able to remove the lock sneaking code and at the same time >>>> avoid deadlocking scenarios, we combine two techniques: >>>> >>>> -When a JavaThread that has just acquired the lock, detects there >>>> is a safepoint request in the ThreadLockBlockInVM destructor, it >>>> releases the lock before blocking at the safepoint. After resuming >>>> from it, the JavaThread will have to acquire the lock again. 
>>>> >>>> - In the ThreadLockBlockInVM constructor for the Monitor::wait() >>>> method, in order to avoid blocking we allow for a possible >>>> safepoint request to make progress but without letting the >>>> JavaThread block for it (since we would be stopped by the >>>> destructor anyways). We also do that for the Monitor::lock() case >>>> although no deadlock is being prevented there. >>>> >>>> The ThreadLockBlockInVM jacket is a new ThreadStateTransition class >>>> used instead of the ThreadBlockInVM one. This allowed more >>>> flexibility to handle the two techniques mentioned above. Also, >>>> ThreadBlockInVM calls SafepointMechanism::block_if_requested() >>>> which creates some problems when trying to allow safepoints to >>>> continue without stopping, since that method not only checks for >>>> safepoints but also processes handshakes. >>>> >>>> In terms of performance, benchmarks show very similar results to >>>> what we have now. >>>> >>>> So far mach5 hs-tier1-6 on Linux, OS X, Windows and Solaris have >>>> been tested. >>>> >>>> Thanks, >>>> Patricio >>>> >> From daniel.daugherty at oracle.com Tue Jan 29 16:00:16 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 29 Jan 2019 11:00:16 -0500 Subject: RFR: 8210832: Remove sneaky locking in class Monitor In-Reply-To: <60daed05-61d7-053a-35b4-d5b6582ea0a1@oracle.com> References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> <60daed05-61d7-053a-35b4-d5b6582ea0a1@oracle.com> Message-ID: <10c59adf-79b9-ba82-228d-1d87a0668769@oracle.com> Hi Patricio, I'm stripping this down to just the possible issue in mutex.cpp... On 1/28/19 6:13 PM, Patricio Chilano wrote: > On 1/28/19 4:29 PM, Daniel D. Daugherty wrote: >> On 1/28/19 2:18 PM, Patricio Chilano wrote: >>> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/webrev >> src/hotspot/share/runtime/mutex.cpp >> L205: ??????? jt->java_suspend_self(); >> ??? L206: ????? } >> ??? L207: ??? 
} >> ??? L208: >> ??? L209: ??? if (in_flight_monitor != NULL) { >> ??? L210: ????? // Conceptually reestablish ownership of the lock. >> ??? L211: ????? assert(_owner == NULL, "should be NULL: owner=" >> INTPTR_FORMAT, p2i(_owner)); >> ??? L212: ????? set_owner(self); >> ??? L213: ??? } else { >> ??? L214: ????? lock(); >> ??? L215: ??? } >> ??????? The lock reacquire on L214 used to be done after the >> ??????? java_suspend_self() on L205 which is inside the block >> ??????? context for the ThreadLockBlockInVM and OSThreadWaitState >> ??????? helps. If the lock() blocks due to a racing thread, then >> ??????? the calling JavaThread won't have the right thread state >> ??????? of OS thread wait state, etc... >> So after java_suspend_self(), you have to re-lock() so that >> ??????? a potential block on that re-lock has the right states. This >> ??????? also means that this line has to be deleted: >> >> ??????? L204???????? in_flight_monitor = NULL; >> >> ??????? so that ThreadLockBlockInVM destructor can do the right thing >> ??????? if the thread is suspended and then relocks. > Yes, I see your point. The problem is that after executing the TLBIVM > destructor (which executes after the OSThreadWaitState destructor with > the current order)? there is always the possibility that we had to > release the lock, and so afterwards we will have to re-acquire it with > a different state. One simple way of solving this could be moving the > OSThreadWaitState object outside the TLBIVM block. Based on David's > comment about OSThreadWaitState I don't think changing the order > should break things, since it seems more like a debugging tool. What > do you think then of doing something like this: (I also included a > re-lock after java_suspend_self() and removed in_flight_monitor=NULL > as you suggested.) 
>
> diff --git a/src/hotspot/share/runtime/mutex.cpp b/src/hotspot/share/runtime/mutex.cpp
> --- a/src/hotspot/share/runtime/mutex.cpp
> +++ b/src/hotspot/share/runtime/mutex.cpp
> @@ -182,9 +186,9 @@
>      JavaThread *jt = (JavaThread *)self;
>      Monitor* in_flight_monitor = NULL;
>
> +    OSThreadWaitState osts(self->osthread(), false /* not Object.wait() */);
>      {
>        ThreadLockBlockInVM tlbivm(jt, &in_flight_monitor);
> -      OSThreadWaitState osts(self->osthread(), false /* not Object.wait() */);
>        if (as_suspend_equivalent) {
>          jt->set_suspend_equivalent();
>          // cleared by handle_special_suspend_equivalent_condition() or
> @@ -201,8 +205,8 @@
>          // want to hold the lock while suspended because that
>          // would surprise the thread that suspended us.
>          _lock.unlock();
> -        in_flight_monitor = NULL;
>          jt->java_suspend_self();
> +        _lock.lock();
>        }
>      }
>
>> Also after lock() on L214, you never call set_owner(self)
>>         so the ownership is not complete for that relocated code.
>>         You won't need the set_owner(self) call if you move the
>>         lock() on L214 back to after java_suspend_self().
> That lock() is actually Monitor::lock(). But I agree it is confusing,
> every time I look at it I say the same. Maybe I can rewrite it as
> Monitor::lock() ?

The fact that the lock() on L214 is really Monitor::lock() changes
everything with respect to my comment. I'm sorry I missed that crucial
detail. Please ignore the above from me and undo the changes that I
suggested.

Here's a new set of comments:

    L195:       in_flight_monitor = this;
        Please consider adding a comment:

                in_flight_monitor = this;  // save for ~ThreadLockBlockInVM

    L204:         in_flight_monitor = NULL;
        Please consider adding a comment:

                  in_flight_monitor = NULL;  // ~ThreadLockBlockInVM does not need to unlock

    L209:
if (in_flight_monitor != NULL) { ??????? Please consider adding a comment below L209: ??????????????? // Not unlocked by ~ThreadLockBlockInVM or self-suspend above. ??? L214: ????? lock(); ??????? I missed the fact that this is Monitor::lock(). However, the ??????? problem with Monitor::lock() here is that if this call blocks, ??????? then we don't have a ThreadLockBlockInVM to change our state ??????? so we won't be recognized as blocking. ??????? Please consider changing to: ??????????????? Monitor::lock(self); ??????? that will give us full locking semantics. Okay, I think this new set of comments is on the right track, but obviously I need other reviewers to point out my silliness... :-) I am a bit worried about whether our use of a ThreadLockBlockInVM in wait() and another in lock(self) on this code path could cause any confusion by observing threads. I don't think so, but I haven't traced it all out from beginning to end yet. Dan From thomas.stuefe at gmail.com Tue Jan 29 16:34:56 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 29 Jan 2019 17:34:56 +0100 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> <7d85fc77-4e33-364f-3efe-a3757d6cfdbc@oracle.com> Message-ID: On Tue, Jan 29, 2019 at 4:27 PM Aleksey Shipilev wrote: > On 1/29/19 2:20 PM, Thomas St?fe wrote: > > Or, start at the pc and work your way downward byte for byte (or word > for word since SafeFetch > > aligns to word size anyway), squirreling the bytes away temporarily, > until -256 or first fault. Then > > print those bytes. You only have to do this on the first leg, on the > second you work upward, the > > same direction you print, so no need to temporarily store the bytes. > > Actually, we can use this idea for linear probing here as well. 
This > allows to dispense the > exponential steps, because we only fail twice in worst case, and it > simplifies reliance on page > sizes. Like this: > > void os::print_instructions(outputStream* st, address pc, int unitsize) { > st->print_cr("Instructions: (pc=" PTR_FORMAT ")", p2i(pc)); > > // Probe the memory around the PC for readability. > const int max_delta = 256; > int delta; > address low = pc, high = pc; > > delta = 0; > while (delta <= max_delta && is_readable_pointer(pc - delta)) { > low = pc - delta; > delta++; > } > > delta = 0; > while (delta <= max_delta && is_readable_pointer(pc + delta)) { > high = pc + delta; > delta++; > } > > if (low < high) { > print_hex_dump(st, low, high, unitsize); > } else { > st->print_cr("Memory is not readable"); > } > } > > WDYT? > Even better. No need to store those bytes on the first leg. Note that your approach assumes that the readable addresses stay readable, and since we are in error handling all bets are open, someone may unmap the pages concurrently right after we probed them and before printing. But well, this is error handling and we have secondary crash handling and there is a thing like too much paranoia :) ..thomas > > -Aleksey > > From boris.ulasevich at bell-sw.com Tue Jan 29 16:43:59 2019 From: boris.ulasevich at bell-sw.com (Boris Ulasevich) Date: Tue, 29 Jan 2019 19:43:59 +0300 Subject: RFR 8217647: JFR: wrong recording on 32-bit systems Message-ID: Hi, Can I please have a review for the following FlightRecorder fix: http://cr.openjdk.java.net/~bulasevich/8217647/webrev.01 https://bugs.openjdk.java.net/browse/JDK-8217647 The issue is about intptr_t/jlong size mismatch on 32-bit systems. The essence of the fix is is to specify jlong type implicitly when storing data to jfr recording, plus minor types rearrangement was done to avoid unnecessary type conversions. Testing: JFR tests on Linux x64/x32/arm64/arm32, Mach5, Mission Control. 
thanks,
Boris

From shade at redhat.com Tue Jan 29 16:54:15 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 29 Jan 2019 17:54:15 +0100
Subject: RFR (S) 8217994: os::print_hex_dump should be more resilient against unreadable memory
Message-ID: <988cff50-d9d8-29d7-2cce-148dea4fee60@redhat.com>

RFE:
  https://bugs.openjdk.java.net/browse/JDK-8217994

Fix:
  http://cr.openjdk.java.net/~shade/8217994/webrev.01/

This is related to JDK-8217879 (hs_err should print more instructions in hex dump), and this more
generic fix should cover more cases in error handler. New gtest verifies we can call os::p_h_d on
bad memory now. It also implicitly verifies that SafeFetch machinery works fine. Consider running
that gtest (make images run-test TEST=gtest:os) on your platform if you suspect it does not.

Testing: (Linux, Windows) x86_64 build, gtest, eyeballing gtest output

Thanks,
-Aleksey

From matthias.baesken at sap.com Tue Jan 29 16:56:44 2019
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Tue, 29 Jan 2019 16:56:44 +0000
Subject: RFR : 8217786: Provide virtualization related info in the hs_error file on linux s390x
In-Reply-To:
References: <816bec8b-b9ca-f0b8-f72d-be6ede83d63b@oracle.com>
 <73d4509f-09b7-e630-2568-bae0395f6b8d@oracle.com>
Message-ID:

Hello, I added a break to avoid potentially printing lines multiple times, and removed the comment line:

http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.3/

Best regards, Matthias

From: Thomas Stüfe
Sent: Tuesday, 29 January 2019 15:22
To: Baesken, Matthias
Cc: David Holmes; hotspot-dev at openjdk.java.net
Subject: Re: RFR : 8217786: Provide virtualization related info in the hs_error file on linux s390x

Hi Matthias,

+      if (strncmp(line, keywords_to_match[i], strlen(keywords_to_match[i])) == 0) {
+        st->print("%s", line);
+      }

you should break here otherwise a line containing multiple keywords will be printed multiple times.
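
The effect of the requested break can be shown with a small standalone sketch. It is not the HotSpot function from the webrev: print_matching_lines, the stream-based signature, and the NULL-terminated keyword array are assumptions made so the example compiles on its own. Only the strncmp prefix test and the break follow the review comment.

```cpp
#include <cassert>
#include <cstring>
#include <sstream>
#include <string>

// Copies to `out` every line of `in` that starts with one of the keywords.
// `keywords` is assumed to be a NULL-terminated array (hypothetical layout).
// Returns the number of lines written.
static int print_matching_lines(std::istream& in, std::ostream& out,
                                const char* const keywords[]) {
  int printed = 0;
  std::string line;
  while (std::getline(in, line)) {
    for (int i = 0; keywords[i] != nullptr; i++) {
      // Prefix match only: the keyword has to start the line.
      if (strncmp(line.c_str(), keywords[i], strlen(keywords[i])) == 0) {
        out << line << '\n';
        printed++;
        break;  // without this, a line matching two keywords is written twice
      }
    }
  }
  return printed;
}
```

With the keywords "LPAR" and "LPAR Name", the line "LPAR Name: VM12" matches both prefixes. The break ensures it is emitted once.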
+ // the LPAR / CPUs / VM - related infos usually come in blocks This comment can be removed. +#if defined(S390) I still do not like the arch specific code here, but for now I can live with it. Should this section grow and cover other architectures as well, we should fan out into os_linux_.cpp. Thanks, Thomas On Tue, Jan 29, 2019 at 12:22 PM Baesken, Matthias > wrote: Hello here is a 2nd webrev : http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.2/ * Introduced static bool print_matching_lines_from_sysinfo_file(outputStream* st, const char* keywords_to_match[]) * Moved call to Linux-only print_os_info Best regards, Matthias From: Thomas St?fe > Sent: Dienstag, 29. Januar 2019 09:23 To: Baesken, Matthias >; David Holmes > Cc: hotspot-dev at openjdk.java.net Subject: Re: RFR : 8217786: Provide virtualization related info in the hs_error file on linux s390x I'm still unhappy with that solution, since we have fanned out this coding for all architectures into the architecture independent os_linux.cpp. A generic "Show matching lines from given file" would be a better (slimmer, better reusable) solution IMHO. Side note: Could you please exchange strstr() .. with strncmp() since you require the start of the string to match. So no reason to parse the whole line if the start does not match. Cheers, Thomas On Tue, Jan 29, 2019 at 9:03 AM Baesken, Matthias > wrote: > > No I was thinking more about just adding the virtualization info to an > existing step like print_os_info or print_cpu_info. > Hi David , print_cpu_info does not sound like a great fit . Some info like LPAR Number: 14 LPAR Characteristics: Shared LPAR Name: VM12 Does not really belong there . print_os_info looks better , it already contains "container_info" on Linux, so I think this might fit . Best regards, Matthias > -----Original Message----- > From: David Holmes > > Sent: Dienstag, 29. 
Januar 2019 05:17 > To: Baesken, Matthias >; 'hotspot- > dev at openjdk.java.net' > > Subject: Re: RFR : 8217786: Provide virtualization related info in the hs_error > file on linux s390x > > On 28/01/2019 10:23 pm, Baesken, Matthias wrote: > >> > >> Can't you include this information in an existing section of the error > >> processing code instead of adding a new function that is empty > >> everywhere except Linux? > >> > > > > Hi David , do you mean something like > > > > > > #if defined(S390) > > > > STEP("printing virtualization info") > > ... > > > > #endif > > No I was thinking more about just adding the virtualization info to an > existing step like print_os_info or print_cpu_info. > > Cheers, > David > ----- > > > in vmError.cpp ? > > > > I thought about doing this. > > > > > > But on the other hand , the now still empty > os::pd_print_virtualization_info in platforms != linux > > might fill over time ( we could add [at least for some platforms] other > virtualization related info ). > > > > > > Best regards, Matthias > > > > > >> -----Original Message----- > >> From: David Holmes > > >> Sent: Montag, 28. Januar 2019 12:35 > >> To: Baesken, Matthias >; 'hotspot- > >> dev at openjdk.java.net' > > >> Subject: Re: RFR : 8217786: Provide virtualization related info in the > hs_error > >> file on linux s390x > >> > >> Hi Matthias, > >> > >> On 28/01/2019 6:48 pm, Baesken, Matthias wrote: > >>> Hello, please review this change ; it adds virtualization related info in > the > >> hs_error file on linux s390x . > >> > >> Can't you include this information in an existing section of the error > >> processing code instead of adding a new function that is empty > >> everywhere except Linux? > >> > >> Thanks, > >> David > >> > >>> On linux s390x, we usually (always?) run in virtualized environments > >> (LPAR and/or z/VM / KVM ). > >>> > >>> It is helpful for instance in support cases to get some information about > the > >> virtualized environment in the hs_error file . 
> >>> A lot of info can be taken from the /proc/sysinfo file on linux s390x . > >>> > >>> > >>> Bug/webrev : > >>> > >>> https://bugs.openjdk.java.net/browse/JDK-8217786 > >>> > >>> > >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.1/ > >>> > >>> > >>> > >>> Best regards, Matthias > >>> From coleen.phillimore at oracle.com Tue Jan 29 16:59:04 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 29 Jan 2019 11:59:04 -0500 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: <3e020446-c08c-e318-99a3-b078d76f7c47@redhat.com> References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> <29e645ac-983a-6d69-10a4-1bdd2e09f3f3@oracle.com> <4d84bfd0-02a0-58f8-b8f8-61104df26deb@oracle.com> <3e020446-c08c-e318-99a3-b078d76f7c47@redhat.com> Message-ID: On 1/29/19 5:34 AM, Aleksey Shipilev wrote: > On 1/28/19 9:59 PM, coleen.phillimore at oracle.com wrote: >> This seems fine. >> >> I was looking at this page: >> https://onlinedisassembler.com/odaweb/ >> >> What would be nice if the instructions looked like: >> >> Instructions: (pc=0x00007f911d0d6053, 0x00007f911d0d6143) >> >> and then just the hex dump, then I could cut/paste it into the window in that tool.? Or is there >> another way? > That's not unreasonable, but you need to have the PC marked in some way to see where it crashed. > Right now I can read the hex dump and figure what are the bytes at PC, and cross-reference that with > disassembly. Yes, that requires some hand work to remove the offsets when dumping to disassembler. ok, too bad. Coleen > > It also requires changing the os::print_hex_dump, which I am not very eager to do. 
In some sense, > this should be the RFE against the disassembler itself, rather than our dumping code :) > > -Aleksey > From zgu at redhat.com Tue Jan 29 17:01:14 2019 From: zgu at redhat.com (zgu at redhat.com) Date: Tue, 29 Jan 2019 12:01:14 -0500 Subject: RFR (S) 8217994: os::print_hex_dump should be more resilient against unreadable memory In-Reply-To: <988cff50-d9d8-29d7-2cce-148dea4fee60@redhat.com> References: <988cff50-d9d8-29d7-2cce-148dea4fee60@redhat.com> Message-ID: <1548781274.31327.71.camel@redhat.com> Looks good to me. But you may need "default" switch to make some compilers happy. Thanks, -Zhengyu On Tue, 2019-01-29 at 17:54 +0100, Aleksey Shipilev wrote: > RFE: > https://bugs.openjdk.java.net/browse/JDK-8217994 > > Fix: > http://cr.openjdk.java.net/~shade/8217994/webrev.01/ > > This is related to JDK-8217879 (hs_err should print more instructions > in hex dump), and this more > generic fix should cover more cases in error handler. New gtest > verifies we can call os::p_h_d on > bad memory now. It also implicitly verifies that SafeFetch machinery > works fine. Consider running > that gtest (make images run-test TEST=gtest:os) on your platform if > you suspect it does not. > > Testing: (Linux, Windows) x86_64 build, gtest, eyeballing gtest > output > > Thanks, > -Aleksey > From shade at redhat.com Tue Jan 29 17:06:23 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 29 Jan 2019 18:06:23 +0100 Subject: RFR (S) 8217994: os::print_hex_dump should be more resilient against unreadable memory In-Reply-To: <1548781274.31327.71.camel@redhat.com> References: <988cff50-d9d8-29d7-2cce-148dea4fee60@redhat.com> <1548781274.31327.71.camel@redhat.com> Message-ID: <5b8a282c-2f26-c50f-00f0-c40fd4aeb9da@redhat.com> On 1/29/19 6:01 PM, zgu at redhat.com wrote: > Looks good to me. Thanks. > But you may need "default" switch to make some compilers happy. It follows the form of the existing switch. 
I also think we never actually reach any default label
in those switches, because there is an entry switch that returns on default.

-Aleksey

From shade at redhat.com Tue Jan 29 17:12:53 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 29 Jan 2019 18:12:53 +0100
Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump
In-Reply-To:
References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com>
 <7d85fc77-4e33-364f-3efe-a3757d6cfdbc@oracle.com>
Message-ID:

On 1/29/19 5:34 PM, Thomas Stüfe wrote:
> Even better. No need to store those bytes on the first leg.

This would be webrev.05:
  http://cr.openjdk.java.net/~shade/8217879/webrev.05/

But I wonder if we should stop this madness and just fix os::print_hex_dump, as you suggested earlier:
  https://bugs.openjdk.java.net/browse/JDK-8217994

-Aleksey

From shade at redhat.com Tue Jan 29 17:15:03 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 29 Jan 2019 18:15:03 +0100
Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump
In-Reply-To:
References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com>
 <3bfbab0f-712e-1fe8-154b-13389e10c13f@oracle.com>
 <0e9a7b25-7c36-1091-27a2-74bd4676ed12@redhat.com>
Message-ID: <741653d3-9157-bcfe-526a-3d3ed9ba7379@redhat.com>

On 1/29/19 2:40 PM, Stefan Karlsson wrote:
> While looking at this I also considered if this:

I think we can get rid of exponential probing altogether, so we don't need to concern ourselves with
optimizing those loops too much, see the other branch in this thread:
  http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-January/036493.html

-Aleksey

From thomas.stuefe at gmail.com Tue Jan 29 17:27:18 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 29 Jan 2019 18:27:18 +0100
Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump
In-Reply-To:
References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com>
 <7d85fc77-4e33-364f-3efe-a3757d6cfdbc@oracle.com>
Message-ID:

On
Tue, Jan 29, 2019 at 6:12 PM Aleksey Shipilev wrote:

> On 1/29/19 5:34 PM, Thomas Stüfe wrote:
> > Even better. No need to store those bytes on the first leg.
>
> This would be webrev.05:
>   http://cr.openjdk.java.net/~shade/8217879/webrev.05/
>

Looks fine. You could move calculation of low/high out of the loops though.
If you want to go with this one, I do not need another webrev.

> But I wonder if we should stop this madness and just fix
> os::print_hex_dump, as you suggested earlier:
>   https://bugs.openjdk.java.net/browse/JDK-8217994

Sure this would make sense. I have a half finished version somewhere but I
got carried away and wanted it to work with unaligned pointers/lengths and
all unit sizes and be super fast and look nice and then I forgot about the
whole thing and did something different :P

..Thomas

>
> -Aleksey
>
>

From zgu at redhat.com Tue Jan 29 17:31:57 2019
From: zgu at redhat.com (zgu at redhat.com)
Date: Tue, 29 Jan 2019 12:31:57 -0500
Subject: RFR (S) 8217994: os::print_hex_dump should be more resilient against unreadable memory
In-Reply-To: <5b8a282c-2f26-c50f-00f0-c40fd4aeb9da@redhat.com>
References: <988cff50-d9d8-29d7-2cce-148dea4fee60@redhat.com>
 <1548781274.31327.71.camel@redhat.com>
 <5b8a282c-2f26-c50f-00f0-c40fd4aeb9da@redhat.com>
Message-ID: <1548783117.31327.76.camel@redhat.com>

On Tue, 2019-01-29 at 18:06 +0100, Aleksey Shipilev wrote:
> On 1/29/19 6:01 PM, zgu at redhat.com wrote:
> > Looks good to me.
>
> Thanks.
>
> > But you may need "default" switch to make some compilers happy.
>
> It follows the form of the existing switch. I also think we never
> actually reach any default label in those switches, because there is
> an entry switch that returns on default.

Got it.
Thanks,

-Zhengyu

>
> -Aleksey
>

From shade at redhat.com Tue Jan 29 17:33:32 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 29 Jan 2019 18:33:32 +0100
Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump
In-Reply-To:
References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com>
 <7d85fc77-4e33-364f-3efe-a3757d6cfdbc@oracle.com>
Message-ID: <2dff1a27-289f-82b2-0042-1fedd74d0593@redhat.com>

On 1/29/19 6:27 PM, Thomas Stüfe wrote:
> On Tue, Jan 29, 2019 at 6:12 PM Aleksey Shipilev wrote:
>
>     On 1/29/19 5:34 PM, Thomas Stüfe wrote:
>     > Even better. No need to store those bytes on the first leg.
>
>     This would be webrev.05:
>       http://cr.openjdk.java.net/~shade/8217879/webrev.05/
>
> Looks fine. You could move calculation of low/high out of the loops though.
> If you want to go with this one, I do not need another webrev.

You cannot do that easily. Having the calculation in the loop guarantees low/high are definitely readable.
You can do this outside the loop, with +1/-1 to delta, but that sets us up for off-by-one errors...

-Aleksey

From thomas.stuefe at gmail.com Tue Jan 29 17:46:25 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 29 Jan 2019 18:46:25 +0100
Subject: RFR (S) 8217994: os::print_hex_dump should be more resilient against unreadable memory
In-Reply-To: <988cff50-d9d8-29d7-2cce-148dea4fee60@redhat.com>
References: <988cff50-d9d8-29d7-2cce-148dea4fee60@redhat.com>
Message-ID:

Looks good.

Note that this coding assumes (always did) that the input pointer is
aligned to the unitsize, otherwise the printing would not work on platforms
which do not allow unaligned loads. This means that if the pc in the
ucontext is unaligned rubbish we may crash on platforms where we print with
a unitsize > 1 and unaligned access is not allowed.

But your patch does not make the problem worse, so it is fine to me.
Cheers, Thomas

On Tue, Jan 29, 2019 at 5:54 PM Aleksey Shipilev wrote:

> RFE:
>   https://bugs.openjdk.java.net/browse/JDK-8217994
>
> Fix:
>   http://cr.openjdk.java.net/~shade/8217994/webrev.01/
>
> This is related to JDK-8217879 (hs_err should print more instructions in
> hex dump), and this more generic fix should cover more cases in error
> handler. New gtest verifies we can call os::p_h_d on bad memory now. It
> also implicitly verifies that SafeFetch machinery works fine. Consider
> running that gtest (make images run-test TEST=gtest:os) on your platform
> if you suspect it does not.
>
> Testing: (Linux, Windows) x86_64 build, gtest, eyeballing gtest output
>
> Thanks,
> -Aleksey
>
>

From thomas.stuefe at gmail.com Tue Jan 29 17:48:10 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 29 Jan 2019 18:48:10 +0100
Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump
In-Reply-To: <2dff1a27-289f-82b2-0042-1fedd74d0593@redhat.com>
References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com>
 <7d85fc77-4e33-364f-3efe-a3757d6cfdbc@oracle.com>
 <2dff1a27-289f-82b2-0042-1fedd74d0593@redhat.com>
Message-ID:

On Tue, Jan 29, 2019 at 6:33 PM Aleksey Shipilev wrote:

> On 1/29/19 6:27 PM, Thomas Stüfe wrote:
> > On Tue, Jan 29, 2019 at 6:12 PM Aleksey Shipilev wrote:
> >
> >     On 1/29/19 5:34 PM, Thomas Stüfe wrote:
> >     > Even better. No need to store those bytes on the first leg.
> >
> >     This would be webrev.05:
> >       http://cr.openjdk.java.net/~shade/8217879/webrev.05/
> >
> > Looks fine. You could move calculation of low/high out of the loops
> > though.
> > If you want to go with this one, I do not need another webrev.
>
> You cannot do that easily. Having the calculation in the loop guarantees
> low/high are definitely readable.
> You can do this outside the loop, with +1/-1 to delta, but that sets us up
> for off-by-one errors...
>
>
Oh, okay. This is fine to me then.
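
The point about keeping the low/high assignment inside the loops can be seen in a standalone sketch of the linear probing. It is not the webrev code: probe_range, addr_t, and the std::function predicate (standing in for is_readable_pointer) are hypothetical names chosen so the logic runs without a JVM.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

typedef std::uintptr_t addr_t;  // hypothetical stand-in for HotSpot's address

// Probes outward from pc, one byte at a time, at most max_delta bytes in each
// direction. Returns the inclusive range [low, high] of addresses that the
// predicate reported readable. If pc itself is unreadable, low == high == pc,
// which a caller can treat as "memory is not readable".
static std::pair<addr_t, addr_t> probe_range(
    addr_t pc, int max_delta, const std::function<bool(addr_t)>& is_readable) {
  addr_t low = pc;
  addr_t high = pc;
  // low/high are updated only after the probe succeeded, so they never
  // name an address that failed the check.
  for (int delta = 0; delta <= max_delta && is_readable(pc - delta); delta++) {
    low = pc - delta;
  }
  for (int delta = 0; delta <= max_delta && is_readable(pc + delta); delta++) {
    high = pc + delta;
  }
  return std::make_pair(low, high);
}
```

Each direction stops at the first failed probe, so an unreadable neighbourhood costs at most two failing checks. Computing low and high after the loops from the final delta would need a -1 correction in exactly the cases where the loop ended on a failed probe, which is the off-by-one risk mentioned in the thread.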
..Thomas

> -Aleksey
>
>

From stefan.karlsson at oracle.com Tue Jan 29 19:04:53 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Tue, 29 Jan 2019 20:04:53 +0100
Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump
In-Reply-To: <741653d3-9157-bcfe-526a-3d3ed9ba7379@redhat.com>
References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com>
 <3bfbab0f-712e-1fe8-154b-13389e10c13f@oracle.com>
 <0e9a7b25-7c36-1091-27a2-74bd4676ed12@redhat.com>
 <741653d3-9157-bcfe-526a-3d3ed9ba7379@redhat.com>
Message-ID: <38e51f52-b0a6-b969-a237-86a82c700dbf@oracle.com>

On 2019-01-29 18:15, Aleksey Shipilev wrote:
> On 1/29/19 2:40 PM, Stefan Karlsson wrote:
>> While looking at this I also considered if this:
> I think we can get rid of exponential probing altogether, so we don't need to concern ourselves with
> optimizing those loops too much, see the other branch in this thread:
>    http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-January/036493.html

Yeah. That seems to remove some of the (readability) overhead and UB.
Looks good.

StefanK

>
> -Aleksey
>

From coleen.phillimore at oracle.com Tue Jan 29 19:39:15 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 29 Jan 2019 14:39:15 -0500
Subject: RFR (M) 8213753: SymbolTable is double walked during class unloading and clean up table timing in do_unloading
Message-ID: <3cbb2030-4c97-6478-3a02-196a7898d69c@oracle.com>

Summary: remove gc timing for short runtime cleanup triggering; make
symbol table cleaning triggered automatically on unloading

Ran runThese with all Oracle GCs and got similar numbers of symbols
unloaded. Also ran tier1-5.

See bug for more information.
open webrev at http://cr.openjdk.java.net/~coleenp/2019/8213753.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8213753 Thanks, Coleen From david.holmes at oracle.com Tue Jan 29 21:35:36 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 30 Jan 2019 07:35:36 +1000 Subject: RFR: 8210832: Remove sneaky locking in class Monitor In-Reply-To: <9bbf5ba9-ac9e-8b1b-8f57-4bd4548a00ff@oracle.com> References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> <2e011b1d-afb2-c2ab-d083-efe74bd27b27@oracle.com> <26e4d98a-62c5-35a8-9cf2-a1d9618b453b@oracle.com> <21edd9b0-bce2-732f-6550-6439ba03348d@oracle.com> <9bbf5ba9-ac9e-8b1b-8f57-4bd4548a00ff@oracle.com> Message-ID: <09c80d33-92ce-7e2c-799d-ad1828ad779e@oracle.com> Hi Erik, On 29/01/2019 11:18 pm, Erik Österlund wrote: > Hi David, > > I don't think we can get away with not deleting monitors. Because we do > that all over the place. Right. I had a mental model that placed the new PlatformMonitor in the same type-stable-memory basket as PlatformEvent and PlatformParker. That was an oversight on my part. > Another example of deleted monitor is the SR_lock. We have one per > thread, and delete them when the thread is deleted. SMR makes sure it is > deleted when nobody is using it any longer. All sane. Great example as it highlights my point. Deleting the _SR_lock was buggy until you came up with SMR. It's very easy to get things wrong here, and each use case will have different means by which they need to ensure the Mutex/Monitor is actually safe to delete. My concern is that we may have existing races (as we did with _SR_lock) but that deleting the current Monitor/Mutex instances is quite benign and may allow use-after-delete to not crash. The new code is not so benign and may not be so forgiving if there are existing bugs in this area.
> The extra_data_lock() of MDOs belongs to MethodData objects, and is > deleted when the MethodData is deleted (at which point nobody can be > using it any longer). All sane. Forgive my skepticism but having been reading recent bug reports about nmethod cleanup and related data structures, it is very easy to have a scheme that appears "All sane" but is in fact subtly broken. Any way bottom line is yes of course we have to continue to allow Mutex/monitor destruction. But I am concerned we may now trip over existing bugs with such destruction. Cheers, David ----- > > There is plenty more. > > I think it's up to the user of the API to make sure locks are deleted > only when they are not being used concurrently. We have plenty of tools > to ensure this today (GlobalCounter, thread-local handshakes, > safepoints, hazard pointers, outer locks, etc) Perhaps a suitable assert > could catch errors where locking races with deletion, so the user can > choose what tool to use to ensure its safety. > > Thanks, > /Erik > > On 2019-01-29 13:13, David Holmes wrote: >> On 29/01/2019 8:26 pm, Per Liden wrote: >>> On 01/29/2019 10:53 AM, David Holmes wrote: >>>> Hi Per, >>>> >>>> If I may jump in on one thing you suggest ... destructors. Do we >>>> ever actually destroy Mutex or Monitor instances? There are inherent >>>> races that can make it very dangerous to try and actually delete the >>>> low-level PlatformMonitor and destroy the pthread_mutex or >>>> pthread_cond, and even release the memory used. The related >>>> PlatformEvent and PlatformParker are expected to be immortal and I >>>> think that is the same for PlatformMonitor. >>> >>> There are examples of where we do destroy Mutex instances today (like >>> JVM_RawMonitorDestroy), and I don't think that's something the API >>> should forbid. >> >> RawMonitors are used very rarely and I would not have much confidence >> that their destruction is actually done in a safe manner. 
>> >>> >>> As Robbin hinted, I'm hoping ZLock (which is just a plain >>> pthread_mutex) in ZGC can be converted to be a plain mutex using >>> PlatformMutex in the future. ZLocks require dynamic >>> creation/destruction to be supported as they are e.g. attached to >>> nmethods which can come and go. >> >> I think this is something that requires some detailed consideration. >> You need to have very robust lifecycle management to be able to delete >> the low-level implementation objects safely. (It's not sufficient for >> example to test the ability to destroy a pthread_mutex by checking if >> you can first acquire it (and release it again) then destroy it and >> deallocate it, as the previous owner could still be executing inside >> pthread_mutex_unlock!). >> >> In the context of the current changes though we do have an >> inconsistency between the high-level Mutex/Monitor and the >> PlatformMonitor. This was an oversight on my part when doing some >> earlier work on this. >> >> David >> ----- >> >>>> >>>> Aside: I don't think distinct PlatformMutex and PlatformMonitor is >>>> worth the effort unless we also rework the Mutex/Monitor >>>> relationship as well. >>> >>> I completely agree, Mutex/Monitor would also need to be reworked a bit. >>> >>> cheers, >>> Per >>> >>>> >>>> Cheers, >>>> David >>>> >>>> On 29/01/2019 7:22 pm, Per Liden wrote: >>>>> Hi Patricio, >>>>> >>>>> On 01/28/2019 08:18 PM, Patricio Chilano wrote: >>>>>> Hi Robbin, >>>>>> >>>>>> Thanks for reviewing this! Removing the block_in_safepoint_check >>>>>> thread local attribute is a great idea, here is v02: >>>>>> >>>>>> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/webrev >>>>> >>>>> I really like that we're ditching our old locking code in favor of >>>>> using pthread_mutex, et al. Nice work! >>>>> >>>>> >>>>> General comment >>>>> ---------------- >>>>> I think Mutex should be a plain mutex and not come with the baggage of >>>>> having a condition variable.
With this new code, it seems we're >>>>> in a really good position to make that happen. I.e. something like >>>>> this: >>>>> >>>>> class PlatformMutex { >>>>> protected: >>>>>   pthread_mutex_t _mutex; >>>>> >>>>> public: >>>>>   PlatformMutex(); >>>>>   ~PlatformMutex(); >>>>> >>>>>   void lock(); >>>>>   void unlock(); >>>>>   bool try_lock(); >>>>> }; >>>>> >>>>> class PlatformMonitor : public PlatformMutex { >>>>> private: >>>>>   pthread_cond_t _cond; >>>>> >>>>> public: >>>>>   PlatformMonitor(); >>>>>   ~PlatformMonitor(); >>>>> >>>>>   int wait(jlong millis); >>>>>   void notify(); >>>>>   void notify_all(); >>>>> }; >>>>> >>>>> It might be that we want to do that as a separate step later >>>>> instead of including it in this patch. But I think we should try to >>>>> get there. >>>>> >>>>> >>>>> src/hotspot/os/posix/os_*.[ch]pp >>>>> --------------------------------- >>>>> * I'd suggest that we place the PlatformMonitor class in a separate >>>>> file (like src/hotspot/os/posix/monitor_posix.cpp), just like we >>>>> have done with Semaphore (in >>>>> src/hotspot/os/posix/semaphore_posix.cpp). >>>>> >>>>> >>>>> src/hotspot/os/posix/os_posix.hpp >>>>> src/hotspot/os/solaris/os_solaris.hpp >>>>> src/hotspot/os/windows/os_windows.hpp >>>>> ------------------------------------- >>>>> * Please make _mutex/_cond plain variables, instead of arrays of 1. >>>>> That's just ugly ;) >>>>> >>>>> >>>>> src/hotspot/os/posix/os_posix.cpp >>>>> --------------------------------- >>>>> * Destructor missing, to call pthread_(mutex|cond)_destroy(). >>>>> >>>>> >>>>> src/hotspot/os/solaris/os_solaris.hpp >>>>> ------------------------------------- >>>>> * Not sure if there's a good reason to have the constructor be >>>>> inlined here. I'd suggest moving it to the cpp file. >>>>> >>>>> * Destructor missing.
>>>>> >>>>> >>>>> src/hotspot/os/windows/os_windows.cpp >>>>> ------------------------------------- >>>>> * Destructor missing (I'm not too familiar with the windows API but >>>>> I assume there's a destroy function we should call here). >>>>> >>>>> >>>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp >>>>> ----------------------------------------------------- >>>>> * Move "private:" above monitor_adr; >>>>> >>>>>  289 class ThreadLockBlockInVM : public ThreadStateTransition { >>>>>  290   Monitor** monitor_adr; >>>>>  291  private: >>>>>  292   void do_preempted(Monitor** in_flight_monitor_adr) { >>>>> >>>>> * monitor_adr should be _monitor_adr, or maybe even >>>>> _in_flight_monitor_adr to better match the name of the argument. >>>>> >>>>> >>>>> cheers, >>>>> Per >>>>> >>>>>> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/inc/webrev/ >>>>>> >>>>>> Running mach5 again. >>>>>> >>>>>> Thanks, >>>>>> Patricio >>>>>> >>>>>> On 1/28/19 8:31 AM, Robbin Ehn wrote: >>>>>>> Hi Patricio, >>>>>>> >>>>>>> Mostly looks good! >>>>>>> >>>>>>> block_at_safepoint is always called with block_in_safepoint_check >>>>>>> = true. (correct?) >>>>>>> Changing that to a local state instead of global simplifies the >>>>>>> code. >>>>>>> >>>>>>> So I'm suggesting something like below. >>>>>>> >>>>>>> Thanks, Robbin >>>>>>> >>>>>>> diff -r e65cc445234c >>>>>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp >>>>>>> --- a/src/hotspot/share/runtime/interfaceSupport.inline.hpp >>>>>>> Mon Jan 28 13:10:15 2019 +0100 >>>>>>> +++ b/src/hotspot/share/runtime/interfaceSupport.inline.hpp >>>>>>> Mon Jan 28 14:10:59 2019 +0100 >>>>>>> @@ -308,2 +308,1 @@ >>>>>>> -    thread->block_in_safepoint_check = false; >>>>>>> -    SafepointMechanism::block_at_safepoint(thread); >>>>>>> +    SafepointMechanism::callback_if_safepoint(thread); >>>>>>> @@ -323,2 +322,1 @@ >>>>>>> -      SafepointMechanism::block_at_safepoint(_thread); >>>>>>> -     
_thread->block_in_safepoint_check = true; >>>>>>> +      SafepointMechanism::callback_if_safepoint(_thread); >>>>>>> @@ -335,2 +332,0 @@ >>>>>>> -    } else { >>>>>>> -      _thread->block_in_safepoint_check = true; >>>>>>> @@ -337,0 +334,1 @@ >>>>>>> + >>>>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.cpp >>>>>>> --- a/src/hotspot/share/runtime/safepoint.cpp    Mon Jan 28 >>>>>>> 13:10:15 2019 +0100 >>>>>>> +++ b/src/hotspot/share/runtime/safepoint.cpp    Mon Jan 28 >>>>>>> 14:10:59 2019 +0100 >>>>>>> @@ -795,1 +795,1 @@ >>>>>>> -void SafepointSynchronize::block(JavaThread *thread) { >>>>>>> +void SafepointSynchronize::block(JavaThread *thread, bool >>>>>>> block_in_safepoint_check) { >>>>>>> @@ -850,1 +850,1 @@ >>>>>>> -      if (thread->block_in_safepoint_check) { >>>>>>> +      if (block_in_safepoint_check) { >>>>>>> @@ -880,1 +880,1 @@ >>>>>>> -          thread->block_in_safepoint_check) { >>>>>>> +          block_in_safepoint_check) { >>>>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.hpp >>>>>>> --- a/src/hotspot/share/runtime/safepoint.hpp    Mon Jan 28 >>>>>>> 13:10:15 2019 +0100 >>>>>>> +++ b/src/hotspot/share/runtime/safepoint.hpp    Mon Jan 28 >>>>>>> 14:10:59 2019 +0100 >>>>>>> @@ -146,1 +146,1 @@ >>>>>>> -  static void   block(JavaThread *thread); >>>>>>> +  static void   block(JavaThread *thread, bool >>>>>>> block_in_safepoint_check = true); >>>>>>> diff -r e65cc445234c >>>>>>> src/hotspot/share/runtime/safepointMechanism.hpp >>>>>>> --- a/src/hotspot/share/runtime/safepointMechanism.hpp    Mon Jan >>>>>>> 28 13:10:15 2019 +0100 >>>>>>> +++ b/src/hotspot/share/runtime/safepointMechanism.hpp    Mon Jan >>>>>>> 28 14:10:59 2019 +0100 >>>>>>> @@ -82,1 +82,1 @@ >>>>>>> -  static inline void block_at_safepoint(JavaThread* thread); >>>>>>> + 
static inline void callback_if_safepoint(JavaThread* thread); >>>>>>> diff -r e65cc445234c >>>>>>> src/hotspot/share/runtime/safepointMechanism.inline.hpp >>>>>>> --- a/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>>>>>> Jan 28 13:10:15 2019 +0100 >>>>>>> +++ b/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>>>>>> Jan 28 14:10:59 2019 +0100 >>>>>>> @@ -82,1 +82,1 @@ >>>>>>> -void SafepointMechanism::block_at_safepoint(JavaThread* thread) { >>>>>>> +void SafepointMechanism::callback_if_safepoint(JavaThread* >>>>>>> thread) { >>>>>>> @@ -84,1 +84,1 @@ >>>>>>> -    SafepointSynchronize::block(thread); >>>>>>> +    SafepointSynchronize::block(thread, false); >>>>>>> diff -r e65cc445234c src/hotspot/share/runtime/thread.cpp >>>>>>> --- a/src/hotspot/share/runtime/thread.cpp    Mon Jan 28 13:10:15 >>>>>>> 2019 +0100 >>>>>>> +++ b/src/hotspot/share/runtime/thread.cpp    Mon Jan 28 14:10:59 >>>>>>> 2019 +0100 >>>>>>> @@ -298,2 +297,0 @@ >>>>>>> -  block_in_safepoint_check = true; >>>>>>> - >>>>>>> diff -r e65cc445234c src/hotspot/share/runtime/thread.hpp >>>>>>> --- a/src/hotspot/share/runtime/thread.hpp    Mon Jan 28 13:10:15 >>>>>>> 2019 +0100 >>>>>>> +++ b/src/hotspot/share/runtime/thread.hpp    Mon Jan 28 14:10:59 >>>>>>> 2019 +0100 >>>>>>> @@ -788,2 +787,0 @@ >>>>>>> -  bool block_in_safepoint_check;              // to decide >>>>>>> whether to block in SS::block or not >>>>>>> - >>>>>>> >>>>>>> >>>>>>> On 1/28/19 9:42 AM, Patricio Chilano wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> Please review the following patch: >>>>>>>> >>>>>>>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8210832 >>>>>>>> Webrev: >>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8210832/v01/webrev/ >>>>>>>> >>>>>>>> The current implementation of native monitors uses a technique >>>>>>>> that we name "sneaky locking" to prevent possible deadlocks of >>>>>>>> the JVM during safepoints.
The implementation of this technique >>>>>>>> though introduces a race when a monitor is shared between the >>>>>>>> VMThread and non-JavaThreads. This patch aims to solve that >>>>>>>> problem and at the same time simplify the code. >>>>>>>> >>>>>>>> The proposal is based on the introduction of the new class >>>>>>>> PlatformMonitor, which serves as a wrapper for the actual >>>>>>>> synchronization primitives in each platform (mutexes and >>>>>>>> condition variables). Most of the API calls can thus be >>>>>>>> implemented as simple wrappers around PlatformMonitor, adding >>>>>>>> more assertions and very little extra metadata. >>>>>>>> To be able to remove the lock sneaking code and at the same time >>>>>>>> avoid deadlocking scenarios, we combine two techniques: >>>>>>>> >>>>>>>> -When a JavaThread that has just acquired the lock, detects >>>>>>>> there is a safepoint request in the ThreadLockBlockInVM >>>>>>>> destructor, it releases the lock before blocking at the >>>>>>>> safepoint. After resuming from it, the JavaThread will have to >>>>>>>> acquire the lock again. >>>>>>>> >>>>>>>> - In the ThreadLockBlockInVM constructor for the Monitor::wait() >>>>>>>> method, in order to avoid blocking we allow for a possible >>>>>>>> safepoint request to make progress but without letting the >>>>>>>> JavaThread block for it (since we would be stopped by the >>>>>>>> destructor anyways). We also do that for the Monitor::lock() >>>>>>>> case although no deadlock is being prevented there. >>>>>>>> >>>>>>>> The ThreadLockBlockInVM jacket is a new ThreadStateTransition >>>>>>>> class used instead of the ThreadBlockInVM one. This allowed more >>>>>>>> flexibility to handle the two techniques mentioned above. 
Also, >>>>>>>> ThreadBlockInVM calls SafepointMechanism::block_if_requested() >>>>>>>> which creates some problems when trying to allow safepoints to >>>>>>>> continue without stopping, since that method not only checks for >>>>>>>> safepoints but also processes handshakes. >>>>>>>> >>>>>>>> In terms of performance, benchmarks show very similar results to >>>>>>>> what we have now. >>>>>>>> >>>>>>>> So far mach5 hs-tier1-6 on Linux, OS X, Windows and Solaris have >>>>>>>> been tested. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Patricio >>>>>>>> >>>>>> From stefan.karlsson at oracle.com Tue Jan 29 22:10:57 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 29 Jan 2019 23:10:57 +0100 Subject: RFR (M) 8213753: SymbolTable is double walked during class unloading and clean up table timing in do_unloading In-Reply-To: <3cbb2030-4c97-6478-3a02-196a7898d69c@oracle.com> References: <3cbb2030-4c97-6478-3a02-196a7898d69c@oracle.com> Message-ID: GC and SystemDictionary::do_unloading changes look good. https://cr.openjdk.java.net/~coleenp/2019/8213753.01/webrev/src/hotspot/share/classfile/symbolTable.hpp.udiff.html I think you should move the implementation of these to the cpp file: + void reset_has_items_to_clean() { Atomic::store(false, &_has_items_to_clean); } + void mark_has_items_to_clean() { Atomic::store(true, &_has_items_to_clean); } + bool has_items_to_clean() const { return Atomic::load(&_has_items_to_clean); } Thanks, StefanK On 2019-01-29 20:39, coleen.phillimore at oracle.com wrote: > Summary: remove gc timing for short runtime cleanup triggering; make > symbol table cleaning triggered automatically on unloading > > Ran runThese with all Oracle GCs and got similar numbers of symbols > unloaded. Also ran tier1-5. > > See bug for more information.
> > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8213753.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8213753 > > Thanks, > Coleen From coleen.phillimore at oracle.com Tue Jan 29 22:29:13 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 29 Jan 2019 17:29:13 -0500 Subject: RFR (M) 8213753: SymbolTable is double walked during class unloading and clean up table timing in do_unloading In-Reply-To: References: <3cbb2030-4c97-6478-3a02-196a7898d69c@oracle.com> Message-ID: Thank you for reviewing, Stefan. On 1/29/19 5:10 PM, Stefan Karlsson wrote: > GC and SystemDictionary::do_unloading changes look good. > > https://cr.openjdk.java.net/~coleenp/2019/8213753.01/webrev/src/hotspot/share/classfile/symbolTable.hpp.udiff.html > > > I think you should move the implementation of these to the cpp file: > > + void reset_has_items_to_clean() { Atomic::store(false, > &_has_items_to_clean); } > + void mark_has_items_to_clean() { Atomic::store(true, > &_has_items_to_clean); } > + bool has_items_to_clean() const { return > Atomic::load(&_has_items_to_clean); } Sure, I can do that. We moved these atomics to the .hpp file so we could do this now, but since they're used in the .cpp file, it makes sense there. Thanks, Coleen > > Thanks, > StefanK > > On 2019-01-29 20:39, coleen.phillimore at oracle.com wrote: >> Summary: remove gc timing for short runtime cleanup triggering; make >> symbol table cleaning triggered automatically on unloading >> >> Ran runThese with all Oracle GCs and got similar numbers of symbols >> unloaded. Also ran tier1-5. >> >> See bug for more information.
>> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8213753.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8213753 >> >> Thanks, >> Coleen > From hohensee at amazon.com Tue Jan 29 22:31:01 2019 From: hohensee at amazon.com (Hohensee, Paul) Date: Tue, 29 Jan 2019 22:31:01 +0000 Subject: Revive support for SPARC V8 In-Reply-To: References: Message-ID: Moving to hotspot-dev and porters-dev in case anyone has something to add. compiler-dev is the javac list. :) Thanks, Paul From: compiler-dev on behalf of Martijn Verburg Date: Monday, January 28, 2019 at 3:25 PM To: Tiago Nogueira Cc: "compiler-dev at openjdk.java.net" Subject: Re: Revive support for SPARC V8 Hi Tiago, Please see https://openjdk.java.net/contribute/ - I think the reality is that a significant organization would need to commit real engineering time to support this platform for the long haul. If your organisation is able to do so then you may wish to start the work and then propose a porting project or a JEP as appropriate. Cheers, Martijn On Mon, 28 Jan 2019 at 10:00, Tiago Nogueira wrote: Hello, I'm new to the mailing list, and I just hope this is the right place to bring this up. Support for SPARC V8 was removed in 2013 (see https://bugs.openjdk.java.net/browse/JDK-8017756). The argument at the time was that SPARC V8 hardware was not used any more (see https://bugs.openjdk.java.net/browse/JDK-8008407). Well, it turns out that the space industry has finally caught up, and the new generation of radiation hardened SoCs, at least in Europe, will be based on SPARC V8 (see https://www.gaisler.com/doc/gr740/GR740-OVERVIEW.pdf). What would be the effort to revive support for SPARC V8? I think this would be a great opportunity to get Java running on rad-hard hardware in low Earth orbit and beyond...
Thanks Tiago From vladimir.kozlov at oracle.com Tue Jan 29 22:31:26 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 29 Jan 2019 14:31:26 -0800 Subject: 8216541: CompiledICHolders of VM locked unloaded nmethods are released too late In-Reply-To: References: <0345dd70-fa19-a7e3-7892-8f4500f8339f@oracle.com> <4195c9a1-e91c-1014-8fa5-2b0d3f6dfc30@oracle.com> <44b1aa28-81b0-6e9a-1b13-bb973bb5bd30@oracle.com> Message-ID: <4751bbf9-e4a4-a1df-52f8-e339e5e09351@oracle.com> Looks good to me too. Thanks, Vladimir On 1/29/19 2:40 AM, Tobias Hartmann wrote: > Hi Erik, > > okay, got it. Thanks for the details, your fix looks good to me! > > Best regards, > Tobias > > On 29.01.19 11:38, Erik Österlund wrote: >> Hi Tobias, >> >> Thanks for having a look at this. >> >> On 2019-01-29 09:16, Tobias Hartmann wrote: >>> Hi Erik, >>> >>> very nice analysis, thanks a lot for investigating! >>> >>> On 28.01.19 14:56, Erik Österlund wrote: >>>> http://cr.openjdk.java.net/~eosterlund/8216541/webrev.00/ >>> >>> Why did you remove the call to thread->set_scanned_compiled_method(NULL) in sweeper.cpp? >> >> Because the CompiledMethodMarker destructor already nulls this out, and redundantly nulling it out >> again offers no extra protection. >> >> The idea of nulling it out before calling flush seems to have been to prevent the GC scanning from >> seeing this flushed nmethod in a safepoint, accidentally resurrecting it from the dead. But that is >> already impossible, because flush() is called with a never safepoint checking lock (which guarantees >> we don't have any and can't add any safepoint checks while holding that lock or we will deadlock >> badly). Therefore such safepoints will happen strictly after the processing of the compiled method >> is finished, and it is already cleared the normal way. >> >> By removing that pointless clearing, I could get rid of the release_compiled_method() function and >> just call flush directly instead.
I get confused by there being two "destroy" functions, one in the >> sweeper and one in the nmethod, so I wanted it gone. >> >>> >>>> The proposed change has survived 200 rounds of kitchensink, hs-tier1-3 and hs-precheckin-comp. >>> >>> In the meanwhile, could you please run some more 100x iterations of kitchensink? >> >> Sure, running some more as we speak. >> >> Thanks, >> /Erik >> >>> >>> Thanks, >>> Tobias >>> From joe.darcy at oracle.com Wed Jan 30 00:01:11 2019 From: joe.darcy at oracle.com (Joseph D. Darcy) Date: Tue, 29 Jan 2019 16:01:11 -0800 Subject: test/hotspot/jtreg/runtime/CompressedOops/UseCompressedOops.java failing with -XX:+UseShenandoahGC on x86_64 In-Reply-To: References: Message-ID: <5C50E947.1060602@oracle.com> On 1/28/2019 8:16 AM, Roman Kennke wrote: > Isn't anything pushed to 12 (kind of) automatically pushed to 13 too, > eventually? Yes; Jesper regularly syncs changes from 12 into 13. At times this requires addressing merge conflicts, etc. HTH, -Joe From patricio.chilano.mateo at oracle.com Wed Jan 30 00:24:02 2019 From: patricio.chilano.mateo at oracle.com (Patricio Chilano) Date: Tue, 29 Jan 2019 19:24:02 -0500 Subject: RFR: 8210832: Remove sneaky locking in class Monitor In-Reply-To: References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> Message-ID: <4562855b-fa0e-9f70-da0f-00f84984d78f@oracle.com> Hi Per, On 1/29/19 4:22 AM, Per Liden wrote: > Hi Patricio, > > On 01/28/2019 08:18 PM, Patricio Chilano wrote: >> Hi Robbin, >> >> Thanks for reviewing this! Removing the block_in_safepoint_check >> thread local attribute is a great idea, here is v02: >> >> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/webrev > > I really like that we're ditching our old locking code in favor of > using pthread_mutex, et al. Nice work! Thanks! :) > General comment > ---------------- > I think Mutex should be a plain mutex and not come with the baggage of > having a condition variable.
With this new code, it seems we're in a > really good position to make that happen. I.e. something like this: > > class PlatformMutex { > protected: >   pthread_mutex_t _mutex; > > public: >   PlatformMutex(); >   ~PlatformMutex(); > >   void lock(); >   void unlock(); >   bool try_lock(); > }; > > class PlatformMonitor : public PlatformMutex { > private: >   pthread_cond_t _cond; > > public: >   PlatformMonitor(); >   ~PlatformMonitor(); > >   int wait(jlong millis); >   void notify(); >   void notify_all(); > }; > > It might be that we want to do that as a separate step later instead > of including it in this patch. But I think we should try to get there. I agree this is a good idea, but since it would make sense to also rework them at the high-level Monitor/Mutex as David pointed out (this idea is actually also proposed in the comments of mutex.hpp), what do you think if I file this as a separate bugid to be worked on after we have pushed this patch? > src/hotspot/os/posix/os_*.[ch]pp > --------------------------------- > * I'd suggest that we place the PlatformMonitor class in a separate > file (like src/hotspot/os/posix/monitor_posix.cpp), just like we have > done with Semaphore (in src/hotspot/os/posix/semaphore_posix.cpp). I tried to move them but there is a small issue in that PlatformMonitor code needs static methods defined in their current os_*.cpp files (methods that parse timing structs). I can declare them as public (cannot move them since they are also used by PlatformEvent and Parker), but for the Posix version of PlatformMonitor I would also need to do that with _condAttr and _mutexAttr which are also defined static in that file and are needed by PlatformMonitor::PlatformMonitor. So not sure what the right approach is here.
In any case shouldn't we aim to have all synchronization-like classes in the same file for each platform (something like syncro_posix, syncro_windows, etc) instead of a separate file for each of them (semaphore_*, monitor_*, waitbarrier_*, etc)? Otherwise it seems PlatformParker and PlatformEvent should also be in their own file. > src/hotspot/os/posix/os_posix.hpp > src/hotspot/os/solaris/os_solaris.hpp > src/hotspot/os/windows/os_windows.hpp > ------------------------------------- > * Please make _mutex/_cond plain variables, instead of arrays of 1. > That's just ugly ;) Done! > src/hotspot/os/posix/os_posix.cpp > --------------------------------- > * Destructor missing, to call pthread_(mutex|cond)_destroy(). Done! > src/hotspot/os/solaris/os_solaris.hpp > ------------------------------------- > * Not sure if there's a good reason to have the constructor be inlined > here. I'd suggest moving it to the cpp file. > > * Destructor missing. Done! > src/hotspot/os/windows/os_windows.cpp > ------------------------------------- > * Destructor missing (I'm not too familiar with the windows API but I > assume there's a destroy function we should call here). Done! (There is a destroy function for mutexes but not for condition variables which apparently do not need to free anything explicitly). > src/hotspot/share/runtime/interfaceSupport.inline.hpp > ----------------------------------------------------- > * Move "private:" above monitor_adr; > >  289 class ThreadLockBlockInVM : public ThreadStateTransition { >  290   Monitor** monitor_adr; >  291  private: >  292   void do_preempted(Monitor** in_flight_monitor_adr) { > > * monitor_adr should be _monitor_adr, or maybe even > _in_flight_monitor_adr to better match the name of the argument. Done! I realized there is no need for passing a parameter to do_preempted() since we already have the in_flight_monitor_adr so I also made small changes there.
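[Editorial sketch, not part of the original message.] The PlatformMutex/PlatformMonitor split quoted above, together with the destructors that were reported missing, could look roughly like this on a POSIX platform. This is only a minimal illustration under stated assumptions: it uses default mutex and condition attributes instead of HotSpot's _mutexAttr/_condAttr, computes the timeout inline rather than through the os_posix.cpp timing helpers, and takes a plain long for the timeout instead of jlong. It is not the code from the webrev.

```cpp
#include <cassert>
#include <errno.h>
#include <pthread.h>
#include <time.h>

// Plain mutex: owns a pthread_mutex_t and destroys it in the destructor.
class PlatformMutex {
 protected:
  pthread_mutex_t _mutex;

 public:
  PlatformMutex() {
    int status = pthread_mutex_init(&_mutex, nullptr);  // default attributes (assumption)
    assert(status == 0);
    (void)status;
  }
  ~PlatformMutex() {
    // The caller must guarantee that no thread still uses the mutex,
    // including a previous owner that is still inside unlock().
    int status = pthread_mutex_destroy(&_mutex);
    assert(status == 0);
    (void)status;
  }
  PlatformMutex(const PlatformMutex&) = delete;
  PlatformMutex& operator=(const PlatformMutex&) = delete;

  void lock()     { int s = pthread_mutex_lock(&_mutex);   assert(s == 0); (void)s; }
  void unlock()   { int s = pthread_mutex_unlock(&_mutex); assert(s == 0); (void)s; }
  bool try_lock() { return pthread_mutex_trylock(&_mutex) == 0; }
};

// Monitor: a mutex plus a condition variable.
class PlatformMonitor : public PlatformMutex {
 private:
  pthread_cond_t _cond;

 public:
  PlatformMonitor() {
    int status = pthread_cond_init(&_cond, nullptr);  // default attributes (assumption)
    assert(status == 0);
    (void)status;
  }
  ~PlatformMonitor() {
    // Runs before ~PlatformMutex(), so the condvar goes away first.
    int status = pthread_cond_destroy(&_cond);
    assert(status == 0);
    (void)status;
  }

  // Must be called with the mutex held. millis <= 0 waits without timeout.
  // Returns 0 when woken (possibly spuriously) or ETIMEDOUT on timeout.
  int wait(long millis) {
    if (millis <= 0) {
      return pthread_cond_wait(&_cond, &_mutex);
    }
    timespec abstime;
    clock_gettime(CLOCK_REALTIME, &abstime);  // default condvar clock
    abstime.tv_sec  += millis / 1000;
    abstime.tv_nsec += (millis % 1000) * 1000000L;
    if (abstime.tv_nsec >= 1000000000L) {
      abstime.tv_sec  += 1;
      abstime.tv_nsec -= 1000000000L;
    }
    return pthread_cond_timedwait(&_cond, &_mutex, &abstime);
  }

  void notify()     { int s = pthread_cond_signal(&_cond);    assert(s == 0); (void)s; }
  void notify_all() { int s = pthread_cond_broadcast(&_cond); assert(s == 0); (void)s; }
};
```

Note that the destructors only release the pthread objects; they do nothing to prove that no other thread is still touching them, which is the lifecycle concern raised earlier in this thread.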
Here is v03 including also Dan's and Robbin's comments about mutex.cpp and safepointMechanism.hpp: Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/webrev/ Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/inc/webrev/ Running mach tiers1-3. Waiting though on your thoughts about file organization and deferring Mutex/Monitor rework. Thanks for looking into this Per! Thanks, Patricio > cheers, > Per > >> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/inc/webrev/ >> >> Running mach5 again. >> >> Thanks, >> Patricio >> >> On 1/28/19 8:31 AM, Robbin Ehn wrote: >>> Hi Patricio, >>> >>> Mostly looks good! >>> >>> block_at_safepoint is always called with block_in_safepoint_check = >>> true. (correct?) >>> Changing that to a local state instead of global simplifies the code. >>> >>> So I'm suggesting something like below. >>> >>> Thanks, Robbin >>> >>> diff -r e65cc445234c >>> src/hotspot/share/runtime/interfaceSupport.inline.hpp >>> --- a/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon Jan >>> 28 13:10:15 2019 +0100 >>> +++ b/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon Jan >>> 28 14:10:59 2019 +0100 >>> @@ -308,2 +308,1 @@ >>> -    thread->block_in_safepoint_check = false; >>> -    SafepointMechanism::block_at_safepoint(thread); >>> +    SafepointMechanism::callback_if_safepoint(thread); >>> @@ -323,2 +322,1 @@ >>> -      SafepointMechanism::block_at_safepoint(_thread); >>> -      _thread->block_in_safepoint_check = true; >>> +      SafepointMechanism::callback_if_safepoint(_thread); >>> @@ -335,2 +332,0 @@ >>> -    } else { >>> -      _thread->block_in_safepoint_check = true; >>> @@ -337,0 +334,1 @@ >>> + >>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.cpp >>> --- a/src/hotspot/share/runtime/safepoint.cpp    Mon Jan 28 13:10:15 >>> 2019 +0100 >>> +++ b/src/hotspot/share/runtime/safepoint.cpp
Mon Jan 28 14:10:59 >>> 2019 +0100 >>> @@ -795,1 +795,1 @@ >>> -void SafepointSynchronize::block(JavaThread *thread) { >>> +void SafepointSynchronize::block(JavaThread *thread, bool >>> block_in_safepoint_check) { >>> @@ -850,1 +850,1 @@ >>> -      if (thread->block_in_safepoint_check) { >>> +      if (block_in_safepoint_check) { >>> @@ -880,1 +880,1 @@ >>> -          thread->block_in_safepoint_check) { >>> +          block_in_safepoint_check) { >>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.hpp >>> --- a/src/hotspot/share/runtime/safepoint.hpp    Mon Jan 28 13:10:15 >>> 2019 +0100 >>> +++ b/src/hotspot/share/runtime/safepoint.hpp    Mon Jan 28 14:10:59 >>> 2019 +0100 >>> @@ -146,1 +146,1 @@ >>> -  static void   block(JavaThread *thread); >>> +  static void   block(JavaThread *thread, bool >>> block_in_safepoint_check = true); >>> diff -r e65cc445234c src/hotspot/share/runtime/safepointMechanism.hpp >>> --- a/src/hotspot/share/runtime/safepointMechanism.hpp    Mon Jan 28 >>> 13:10:15 2019 +0100 >>> +++ b/src/hotspot/share/runtime/safepointMechanism.hpp    Mon Jan 28 >>> 14:10:59 2019 +0100 >>> @@ -82,1 +82,1 @@ >>> -  static inline void block_at_safepoint(JavaThread* thread); >>> +  static inline void callback_if_safepoint(JavaThread* thread); >>> diff -r e65cc445234c >>> src/hotspot/share/runtime/safepointMechanism.inline.hpp >>> --- a/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>> Jan 28 13:10:15 2019 +0100 >>> +++ b/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>> Jan 28 14:10:59 2019 +0100 >>> @@ -82,1 +82,1 @@ >>> -void SafepointMechanism::block_at_safepoint(JavaThread* thread) { >>> +void SafepointMechanism::callback_if_safepoint(JavaThread* thread) { >>> @@ -84,1 +84,1 @@ >>> -    SafepointSynchronize::block(thread); >>> +    SafepointSynchronize::block(thread, false); >>> diff -r e65cc445234c src/hotspot/share/runtime/thread.cpp >>> --- a/src/hotspot/share/runtime/thread.cpp
Mon Jan 28 13:10:15 >>> 2019 +0100 >>> +++ b/src/hotspot/share/runtime/thread.cpp    Mon Jan 28 14:10:59 >>> 2019 +0100 >>> @@ -298,2 +297,0 @@ >>> -  block_in_safepoint_check = true; >>> - >>> diff -r e65cc445234c src/hotspot/share/runtime/thread.hpp >>> --- a/src/hotspot/share/runtime/thread.hpp    Mon Jan 28 13:10:15 >>> 2019 +0100 >>> +++ b/src/hotspot/share/runtime/thread.hpp    Mon Jan 28 14:10:59 >>> 2019 +0100 >>> @@ -788,2 +787,0 @@ >>> -  bool block_in_safepoint_check;              // to decide whether >>> to block in SS::block or not >>> - >>> >>> >>> On 1/28/19 9:42 AM, Patricio Chilano wrote: >>>> Hi all, >>>> >>>> Please review the following patch: >>>> >>>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8210832 >>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8210832/v01/webrev/ >>>> >>>> The current implementation of native monitors uses a technique that >>>> we name "sneaky locking" to prevent possible deadlocks of the JVM >>>> during safepoints. The implementation of this technique though >>>> introduces a race when a monitor is shared between the VMThread and >>>> non-JavaThreads. This patch aims to solve that problem and at the >>>> same time simplify the code. >>>> >>>> The proposal is based on the introduction of the new class >>>> PlatformMonitor, which serves as a wrapper for the actual >>>> synchronization primitives in each platform (mutexes and condition >>>> variables). Most of the API calls can thus be implemented as simple >>>> wrappers around PlatformMonitor, adding more assertions and very >>>> little extra metadata. >>>> To be able to remove the lock sneaking code and at the same time >>>> avoid deadlocking scenarios, we combine two techniques: >>>> >>>> -When a JavaThread that has just acquired the lock, detects there >>>> is a safepoint request in the ThreadLockBlockInVM destructor, it >>>> releases the lock before blocking at the safepoint. After resuming >>>> from it, the JavaThread will have to acquire the lock again.
>>>> >>>> - In the ThreadLockBlockInVM constructor for the Monitor::wait() >>>> method, in order to avoid blocking we allow for a possible >>>> safepoint request to make progress but without letting the >>>> JavaThread block for it (since we would be stopped by the >>>> destructor anyways). We also do that for the Monitor::lock() case >>>> although no deadlock is being prevented there. >>>> >>>> The ThreadLockBlockInVM jacket is a new ThreadStateTransition class >>>> used instead of the ThreadBlockInVM one. This allowed more >>>> flexibility to handle the two techniques mentioned above. Also, >>>> ThreadBlockInVM calls SafepointMechanism::block_if_requested() >>>> which creates some problems when trying to allow safepoints to >>>> continue without stopping, since that method not only checks for >>>> safepoints but also processes handshakes. >>>> >>>> In terms of performance, benchmarks show very similar results to >>>> what we have now. >>>> >>>> So far mach5 hs-tier1-6 on Linux, OS X, Windows and Solaris have >>>> been tested. >>>> >>>> Thanks, >>>> Patricio >>>> >> From patricio.chilano.mateo at oracle.com Wed Jan 30 00:39:11 2019 From: patricio.chilano.mateo at oracle.com (Patricio Chilano) Date: Tue, 29 Jan 2019 19:39:11 -0500 Subject: RFR: 8210832: Remove sneaky locking in class Monitor In-Reply-To: <10c59adf-79b9-ba82-228d-1d87a0668769@oracle.com> References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> <60daed05-61d7-053a-35b4-d5b6582ea0a1@oracle.com> <10c59adf-79b9-ba82-228d-1d87a0668769@oracle.com> Message-ID: <915407dd-c7c6-a1f3-8e75-43d3836020bb@oracle.com> Hi Dan, On 1/29/19 11:00 AM, Daniel D. Daugherty wrote: > Hi Patricio, > > I'm stripping this down to just the possible issue in mutex.cpp... > > > On 1/28/19 6:13 PM, Patricio Chilano wrote: >> On 1/28/19 4:29 PM, Daniel D. 
Daugherty wrote: >>> On 1/28/19 2:18 PM, Patricio Chilano wrote: >>>> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/webrev >>> src/hotspot/share/runtime/mutex.cpp >>> L205: ??????? jt->java_suspend_self(); >>> ??? L206: ????? } >>> ??? L207: ??? } >>> ??? L208: >>> ??? L209: ??? if (in_flight_monitor != NULL) { >>> ??? L210: ????? // Conceptually reestablish ownership of the lock. >>> ??? L211: ????? assert(_owner == NULL, "should be NULL: owner=" >>> INTPTR_FORMAT, p2i(_owner)); >>> ??? L212: ????? set_owner(self); >>> ??? L213: ??? } else { >>> ??? L214: ????? lock(); >>> ??? L215: ??? } >>> ??????? The lock reacquire on L214 used to be done after the >>> ??????? java_suspend_self() on L205 which is inside the block >>> ??????? context for the ThreadLockBlockInVM and OSThreadWaitState >>> ??????? helps. If the lock() blocks due to a racing thread, then >>> ??????? the calling JavaThread won't have the right thread state >>> ??????? of OS thread wait state, etc... >>> So after java_suspend_self(), you have to re-lock() so that >>> ??????? a potential block on that re-lock has the right states. This >>> ??????? also means that this line has to be deleted: >>> >>> ??????? L204???????? in_flight_monitor = NULL; >>> >>> ??????? so that ThreadLockBlockInVM destructor can do the right thing >>> ??????? if the thread is suspended and then relocks. >> Yes, I see your point. The problem is that after executing the TLBIVM >> destructor (which executes after the OSThreadWaitState destructor >> with the current order)? there is always the possibility that we had >> to release the lock, and so afterwards we will have to re-acquire it >> with a different state. One simple way of solving this could be >> moving the OSThreadWaitState object outside the TLBIVM block. Based >> on David's comment about OSThreadWaitState I don't think changing the >> order should break things, since it seems more like a debugging tool. 
>> What do you think then of doing something like this: (I also included >> a re-lock after java_suspend_self() and removed >> in_flight_monitor=NULL as you suggested.) >> >> diff --git a/src/hotspot/share/runtime/mutex.cpp >> b/src/hotspot/share/runtime/mutex.cpp >> --- a/src/hotspot/share/runtime/mutex.cpp >> +++ b/src/hotspot/share/runtime/mutex.cpp >> @@ -182,9 +186,9 @@ >> ???? JavaThread *jt = (JavaThread *)self; >> ???? Monitor* in_flight_monitor = NULL; >> >> +??? OSThreadWaitState osts(self->osthread(), false /* not >> Object.wait() */); >> ???? { >> ?????? ThreadLockBlockInVM tlbivm(jt, &in_flight_monitor); >> -????? OSThreadWaitState osts(self->osthread(), false /* not >> Object.wait() */); >> ?????? if (as_suspend_equivalent) { >> ???????? jt->set_suspend_equivalent(); >> ???????? // cleared by handle_special_suspend_equivalent_condition() or >> @@ -201,8 +205,8 @@ >> ???????? // want to hold the lock while suspended because that >> ???????? // would surprise the thread that suspended us. >> ???????? _lock.unlock(); >> -??????? in_flight_monitor = NULL; >> ???????? jt->java_suspend_self(); >> +??????? _lock.lock(); >> ?????? } >> ???? } >> >>> Also after lock() on L214, you never call set_owner(self) >>> ??????? so the ownership is not complete for that relocated code. >>> ??????? You won't need the set_owner(self) call if you move the >>> ??????? lock() on L214 back to after java_suspend_self(). >> That lock() is actually Monitor::lock(). But I agree is confusing, >> every time I look at it I say the same. Maybe I can rewrite it as >> Monitor::lock() ? > > The fact that the lock() on L214 is really Monitor::lock() > changes everything with respect to my comment. I'm sorry > I missed that crucial detail. Please ignore the above from > me and undo the changes that I suggested. > > Here's a new set of comments: > > ??? L195: ????? in_flight_monitor = this; > ? ? ? ? Please consider adding a comment: > > ??????????????? in_flight_monitor = this;? 
// save for > ~ThreadLockBlockInVM > > ??? L204: ??????? in_flight_monitor = NULL; > ??????? Please consider adding a comment: > > ????????????????? in_flight_monitor = this;? // ~ThreadLockBlockInVM > does not need to unlock > > ??? L209: ??? if (in_flight_monitor != NULL) { > ??????? Please consider adding a comment below L209: > > ??????????????? // Not unlocked by ~ThreadLockBlockInVM or > self-suspend above. > > ??? L214: ????? lock(); > ??????? I missed the fact that this is Monitor::lock(). However, the > ??????? problem with Monitor::lock() here is that if this call blocks, > ??????? then we don't have a ThreadLockBlockInVM to change our state > ??????? so we won't be recognized as blocking. > > ??????? Please consider changing to: > > ??????????????? Monitor::lock(self); > > ??????? that will give us full locking semantics. > > Okay, I think this new set of comments is on the right track, but > obviously I need other reviewers to point out my silliness... :-) > > I am a bit worried about whether our use of a ThreadLockBlockInVM > in wait() and another in lock(self) on this code path could cause > any confusion by observing threads. I don't think so, but I haven't > traced it all out from beginning to end yet. If you mean observing a state of _thread_blocked and performing a safepoint or handshake by the VMThread, the blocked thread will still stop in the destructor of the TLBIVM anyways. If the JavaThread has to execute in a new TLBIVM jacket then the same logic applies. But that would be the same as in Monitor::lock, since by looping around you can also execute different TLBIVM jackets. Not sure if that's what you meant. I sent v03 adding your fixes and Per suggestions: Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/webrev/ Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/inc/webrev/ Thanks! Patricio > Dan > From daniel.daugherty at oracle.com Wed Jan 30 01:24:59 2019 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Tue, 29 Jan 2019 20:24:59 -0500 Subject: RFR: 8210832: Remove sneaky locking in class Monitor In-Reply-To: <915407dd-c7c6-a1f3-8e75-43d3836020bb@oracle.com> References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> <60daed05-61d7-053a-35b4-d5b6582ea0a1@oracle.com> <10c59adf-79b9-ba82-228d-1d87a0668769@oracle.com> <915407dd-c7c6-a1f3-8e75-43d3836020bb@oracle.com> Message-ID: On 1/29/19 7:39 PM, Patricio Chilano wrote: > Hi Dan, > > On 1/29/19 11:00 AM, Daniel D. Daugherty wrote: >> Hi Patricio, >> >> I'm stripping this down to just the possible issue in mutex.cpp... >> >> >> On 1/28/19 6:13 PM, Patricio Chilano wrote: >>> On 1/28/19 4:29 PM, Daniel D. Daugherty wrote: >>>> On 1/28/19 2:18 PM, Patricio Chilano wrote: >>>>> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/webrev >>>> src/hotspot/share/runtime/mutex.cpp >>>> L205: ??????? jt->java_suspend_self(); >>>> ??? L206: ????? } >>>> ??? L207: ??? } >>>> ??? L208: >>>> ??? L209: ??? if (in_flight_monitor != NULL) { >>>> ??? L210: ????? // Conceptually reestablish ownership of the lock. >>>> ??? L211: ????? assert(_owner == NULL, "should be NULL: owner=" >>>> INTPTR_FORMAT, p2i(_owner)); >>>> ??? L212: ????? set_owner(self); >>>> ??? L213: ??? } else { >>>> ??? L214: ????? lock(); >>>> ??? L215: ??? } >>>> ??????? The lock reacquire on L214 used to be done after the >>>> ??????? java_suspend_self() on L205 which is inside the block >>>> ??????? context for the ThreadLockBlockInVM and OSThreadWaitState >>>> ??????? helps. If the lock() blocks due to a racing thread, then >>>> ??????? the calling JavaThread won't have the right thread state >>>> ??????? of OS thread wait state, etc... >>>> So after java_suspend_self(), you have to re-lock() so that >>>> ??????? a potential block on that re-lock has the right states. This >>>> ??????? also means that this line has to be deleted: >>>> >>>> ??????? L204???????? 
in_flight_monitor = NULL; >>>> >>>> ??????? so that ThreadLockBlockInVM destructor can do the right thing >>>> ??????? if the thread is suspended and then relocks. >>> Yes, I see your point. The problem is that after executing the >>> TLBIVM destructor (which executes after the OSThreadWaitState >>> destructor with the current order)? there is always the possibility >>> that we had to release the lock, and so afterwards we will have to >>> re-acquire it with a different state. One simple way of solving this >>> could be moving the OSThreadWaitState object outside the TLBIVM >>> block. Based on David's comment about OSThreadWaitState I don't >>> think changing the order should break things, since it seems more >>> like a debugging tool. What do you think then of doing something >>> like this: (I also included a re-lock after java_suspend_self() and >>> removed in_flight_monitor=NULL as you suggested.) >>> >>> diff --git a/src/hotspot/share/runtime/mutex.cpp >>> b/src/hotspot/share/runtime/mutex.cpp >>> --- a/src/hotspot/share/runtime/mutex.cpp >>> +++ b/src/hotspot/share/runtime/mutex.cpp >>> @@ -182,9 +186,9 @@ >>> ???? JavaThread *jt = (JavaThread *)self; >>> ???? Monitor* in_flight_monitor = NULL; >>> >>> +??? OSThreadWaitState osts(self->osthread(), false /* not >>> Object.wait() */); >>> ???? { >>> ?????? ThreadLockBlockInVM tlbivm(jt, &in_flight_monitor); >>> -????? OSThreadWaitState osts(self->osthread(), false /* not >>> Object.wait() */); >>> ?????? if (as_suspend_equivalent) { >>> ???????? jt->set_suspend_equivalent(); >>> ???????? // cleared by handle_special_suspend_equivalent_condition() or >>> @@ -201,8 +205,8 @@ >>> ???????? // want to hold the lock while suspended because that >>> ???????? // would surprise the thread that suspended us. >>> ???????? _lock.unlock(); >>> -??????? in_flight_monitor = NULL; >>> ???????? jt->java_suspend_self(); >>> +??????? _lock.lock(); >>> ?????? } >>> ???? 
} >>> >>>> Also after lock() on L214, you never call set_owner(self) >>>> ??????? so the ownership is not complete for that relocated code. >>>> ??????? You won't need the set_owner(self) call if you move the >>>> ??????? lock() on L214 back to after java_suspend_self(). >>> That lock() is actually Monitor::lock(). But I agree is confusing, >>> every time I look at it I say the same. Maybe I can rewrite it as >>> Monitor::lock() ? >> >> The fact that the lock() on L214 is really Monitor::lock() >> changes everything with respect to my comment. I'm sorry >> I missed that crucial detail. Please ignore the above from >> me and undo the changes that I suggested. >> >> Here's a new set of comments: >> >> ??? L195: ????? in_flight_monitor = this; >> ? ? ? ? Please consider adding a comment: >> >> ??????????????? in_flight_monitor = this;? // save for >> ~ThreadLockBlockInVM >> >> ??? L204: ??????? in_flight_monitor = NULL; >> ??????? Please consider adding a comment: >> >> ????????????????? in_flight_monitor = this;? // ~ThreadLockBlockInVM >> does not need to unlock >> >> ??? L209: ??? if (in_flight_monitor != NULL) { >> ??????? Please consider adding a comment below L209: >> >> ??????????????? // Not unlocked by ~ThreadLockBlockInVM or >> self-suspend above. >> >> ??? L214: ????? lock(); >> ??????? I missed the fact that this is Monitor::lock(). However, the >> ??????? problem with Monitor::lock() here is that if this call blocks, >> ??????? then we don't have a ThreadLockBlockInVM to change our state >> ??????? so we won't be recognized as blocking. >> >> ??????? Please consider changing to: >> >> ??????????????? Monitor::lock(self); >> >> ??????? that will give us full locking semantics. >> >> Okay, I think this new set of comments is on the right track, but >> obviously I need other reviewers to point out my silliness... 
:-)
>>
>> I am a bit worried about whether our use of a ThreadLockBlockInVM
>> in wait() and another in lock(self) on this code path could cause
>> any confusion by observing threads. I don't think so, but I haven't
>> traced it all out from beginning to end yet.
> If you mean observing a state of _thread_blocked and performing a
> safepoint or handshake by the VMThread, the blocked thread will still
> stop in the destructor of the TLBIVM anyways. If the JavaThread has to
> execute in a new TLBIVM jacket then the same logic applies. But that
> would be the same as in Monitor::lock, since by looping around you can
> also execute different TLBIVM jackets. Not sure if that's what you meant.
>
> I sent v03 adding your fixes and Per suggestions:
>
> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/webrev/
> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/inc/webrev/

src/hotspot/os/posix/os_posix.cpp
    No comments.

src/hotspot/os/posix/os_posix.hpp
    No comments.

src/hotspot/os/solaris/os_solaris.cpp
    No comments.

src/hotspot/os/solaris/os_solaris.hpp
    No comments.

src/hotspot/os/windows/os_windows.cpp
    No comments.

src/hotspot/os/windows/os_windows.hpp
    No comments.

src/hotspot/share/runtime/interfaceSupport.inline.hpp
    No comments.

src/hotspot/share/runtime/mutex.cpp
    L158:                    ", self=" INTPTR_FORMAT, p2i(_owner), p2i(Thread::current()));
        Please change 'Thread::current()' -> 'self'.

    new L188:     OSThreadWaitState osts(self->osthread(), false /* not Object.wait() */);
    old L187:       OSThreadWaitState osts(self->osthread(), false /* not Object.wait() */);
        Please move the OSThreadWaitState back where it was. It belongs
        after the ThreadLockBlockInVM in the wait() code path.

src/hotspot/share/runtime/safepointMechanism.hpp
    No comments.

Dan

>
>
> Thanks!
> Patricio
>
>> Dan
>>
>

From dean.long at oracle.com Wed Jan 30 02:32:06 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Tue, 29 Jan 2019 18:32:06 -0800
Subject: 12 RFR(M) 8195635: [Graal] nsk/jvmti/unit/ForceEarlyReturn/earlyretbase crashes with assertion "compilation level out of bounds"
In-Reply-To: <9b4a4594-458e-ae18-0606-9b1ecbb400ce@oracle.com>
References: <9b4a4594-458e-ae18-0606-9b1ecbb400ce@oracle.com>
Message-ID:

I'm withdrawing this RFR for 12.

dl

On 1/28/19 5:13 PM, dean.long at oracle.com wrote:
> http://cr.openjdk.java.net/~dlong/8195635/webrev.5/
> https://bugs.openjdk.java.net/browse/JDK-8195635
>
> Please see the bug report for all the gory details. Here's the short
> version:
>
> If we allow any safepoint to be a suspend point, we run into trouble
> with PopFrame and ForceEarlyReturn, which reasonably expect the top
> frame not to change between the suspend and when the
> PopFrame/ForceEarlyReturn is executed. Normally this is not an issue,
> but certain safepoints cause problems, when we are about to call a new
> Java method. In particular, if we safepoint and suspend in
> JavaCallWrapper, the top frame will still be the caller, but when we
> execute the PopFrame/ForceEarlyReturn we will be in the callee.
>
> The solution this patch takes is to block suspend around troublesome
> VM code using a new "allow_suspend" thread flag. This means
> JavaThread::java_suspend can't just ask the VMThread to safepoint and
> be done. Instead it has to wait and allow threads to roll forward to an
> allowed suspend point.
>
> dl

From david.holmes at oracle.com Wed Jan 30 05:29:48 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 30 Jan 2019 15:29:48 +1000
Subject: RFR: 8210832: Remove sneaky locking in class Monitor
In-Reply-To: <4562855b-fa0e-9f70-da0f-00f84984d78f@oracle.com>
References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> <4562855b-fa0e-9f70-da0f-00f84984d78f@oracle.com>
Message-ID: <7b2ebcc9-48a9-635d-5dfd-eb3b3ba8ad12@oracle.com>

Hi Patricio,

Just responding to one part about using a separate file ...

On 30/01/2019 10:24 am, Patricio Chilano wrote:
> Hi Per,
>
> On 1/29/19 4:22 AM, Per Liden wrote:
>> Hi Patricio,
>>
>> On 01/28/2019 08:18 PM, Patricio Chilano wrote:
>>> Hi Robbin,
>>>
>>> Thanks for reviewing this! Removing the block_in_safepoint_check
>>> thread local attribute is a great idea, here is v02:
>>>
>>> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/webrev
>>
>> I really like that we're ditching our old locking code in favor of
>> using pthread_mutex, et al. Nice work!
> Thanks! : )
>
>> General comment
>> ----------------
>> I think Mutex ought to be a plain mutex and not come with the baggage of
>> having a condition variable. With this new code, it seems we're in a
>> really good position to make that happen. I.e. something like this:
>>
>> class PlatformMutex {
>> protected:
>>   pthread_mutex_t _mutex;
>>
>> public:
>>   PlatformMutex();
>>   ~PlatformMutex();
>>
>>   void lock();
>>   void unlock();
>>   bool try_lock();
>> };
>>
>> class PlatformMonitor : public PlatformMutex {
>> private:
>>   pthread_cond_t _cond;
>>
>> public:
>>   PlatformMonitor();
>>   ~PlatformMonitor();
>>
>>   int wait(jlong millis);
>>   void notify();
>>   void notify_all();
>> };
>>
>> It might be that we want to do that as a separate step later instead
>> of including it in this patch. But I think we should try to get there.
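
As an illustration of the split Per proposes above, here is a minimal
sketch of how the two classes might be filled in on top of pthreads.
Everything beyond the quoted declarations is an assumption made for this
example: `long long` stands in for `jlong`, the `kNotified`/`kTimedOut`
return values are invented, `millis <= 0` is treated as an untimed wait,
and timed waits use the default CLOCK_REALTIME clock. It is not the code
from the webrev.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <time.h>

// Plain mutex: no condition variable attached.
class PlatformMutex {
 protected:
  pthread_mutex_t _mutex;

 public:
  PlatformMutex() {
    int status = pthread_mutex_init(&_mutex, nullptr);
    assert(status == 0);
    (void)status;
  }
  ~PlatformMutex() {
    int status = pthread_mutex_destroy(&_mutex);
    assert(status == 0);
    (void)status;
  }
  PlatformMutex(const PlatformMutex&) = delete;
  PlatformMutex& operator=(const PlatformMutex&) = delete;

  void lock() {
    int status = pthread_mutex_lock(&_mutex);
    assert(status == 0);
    (void)status;
  }
  void unlock() {
    int status = pthread_mutex_unlock(&_mutex);
    assert(status == 0);
    (void)status;
  }
  bool try_lock() { return pthread_mutex_trylock(&_mutex) == 0; }
};

// Monitor = mutex + condition variable.
class PlatformMonitor : public PlatformMutex {
 private:
  pthread_cond_t _cond;

 public:
  // Hypothetical return values for wait(); not taken from the patch.
  enum { kNotified = 0, kTimedOut = 1 };

  PlatformMonitor() {
    int status = pthread_cond_init(&_cond, nullptr);
    assert(status == 0);
    (void)status;
  }
  ~PlatformMonitor() {
    int status = pthread_cond_destroy(&_cond);
    assert(status == 0);
    (void)status;
  }

  // Caller must hold the mutex. Assumption: millis <= 0 waits untimed.
  int wait(long long millis) {
    if (millis <= 0) {
      int status = pthread_cond_wait(&_cond, &_mutex);
      assert(status == 0);
      (void)status;
      return kNotified;
    }
    // pthread_cond_timedwait takes an absolute deadline.
    struct timespec abstime;
    clock_gettime(CLOCK_REALTIME, &abstime);
    abstime.tv_sec += static_cast<time_t>(millis / 1000);
    abstime.tv_nsec += static_cast<long>(millis % 1000) * 1000000L;
    if (abstime.tv_nsec >= 1000000000L) {
      abstime.tv_sec += 1;
      abstime.tv_nsec -= 1000000000L;
    }
    int status = pthread_cond_timedwait(&_cond, &_mutex, &abstime);
    assert(status == 0 || status == ETIMEDOUT);
    return status == ETIMEDOUT ? kTimedOut : kNotified;
  }

  void notify() {
    int status = pthread_cond_signal(&_cond);
    assert(status == 0);
    (void)status;
  }
  void notify_all() {
    int status = pthread_cond_broadcast(&_cond);
    assert(status == 0);
    (void)status;
  }
};
```

With this shape, code that only needs mutual exclusion can use
PlatformMutex and never pays for the condition variable. As with any
condition variable, a caller of wait() should re-check its predicate in a
loop, since a wakeup may be spurious.
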
> I agree this is a good idea, but since it would make sense to also
> rework them at the high-level Monitor/Mutex as David pointed out (this
> idea is actually also proposed in the comments of mutex.hpp) what do you
> think if I file this as a separate bugid to be worked on after we push
> this patch?
>
>> src/hotspot/os/posix/os_*.[ch]pp
>> ---------------------------------
>> * I'd suggest that we place the PlatformMonitor class in a separate
>> file (like src/hotspot/os/posix/monitor_posix.cpp), just like we have
>> done with Semaphore (in src/hotspot/os/posix/semaphore_posix.cpp).
> I tried to move them but there is a small issue in that PlatformMonitor
> code needs static methods defined in their current os_*.cpp files
> (methods that parse timing structs). I can declare them as public
> (cannot move them since they are also used by PlatformEvent and Parker),
> but for the Posix version of PlatformMonitor I would also need to do
> that with _condAttr and _mutexAttr which are also defined static in that
> file and are needed by PlatformMonitor::PlatformMonitor. So not sure
> what the right approach is here.

os_posix.cpp currently utilises file statics for utility functions and
shared variables (the attr objects) used by code within that file across
four classes (now five with PlatformMonitor). To move PlatformMonitor to
a separate file will require that those file statics all become static
members of os::Posix instead (and probably provide some inline accessors
for the attr objects to preserve encapsulation). I'm in the process of
partially doing this with JDK-8217843 because I need to move some things
to a new os_posix.inline.hpp file. So I'd suggest doing this also as a
follow up RFE rather than now.

> In any case shouldn't we aim to have all synchronization-like classes in
> the same file for each platform (something like syncro_posix,
> syncro_windows, etc) instead of a separate file for each of them
> (semaphore_*, monitor_*, waitbarrier_*, etc). Otherwise it seems
> PlatformParker and PlatformEvent should also be in their own file.

Right. I'm not a fan of one class per file in this case, as logically
these could be seen as nested classes of os and benefit from some shared
file-static implementation details. Certainly if we think these should be
distinct files then it's an all-or-none proposition IMHO.

>> src/hotspot/os/posix/os_posix.hpp
>> src/hotspot/os/solaris/os_solaris.hpp
>> src/hotspot/os/windows/os_windows.hpp
>> -------------------------------------
>> * Please make _mutex/_cond plain variables, instead of arrays of 1.
>> That's just ugly ;)
> Done!

There was a reason, told to me many years ago, as to why single element
arrays were used, but I can't recall it or locate it unfortunately.

Cheers,
David
-----

>
>> src/hotspot/os/posix/os_posix.cpp
>> ---------------------------------
>> * Destructor missing, to call pthread_(mutex|cond)_destroy().
> Done!
>
>> src/hotspot/os/solaris/os_solaris.hpp
>> -------------------------------------
>> * Not sure if there's a good reason to have the constructor be inlined
>> here. I'd suggest moving it to the cpp file.
>>
>> * Destructor missing.
> Done!
>
>> src/hotspot/os/windows/os_windows.cpp
>> -------------------------------------
>> * Destructor missing (I'm not too familiar with the windows API but I
>> assume there's a destroy function we should call here).
> Done! (There is a destroy function for mutexes but not for condition
> variables which apparently do not need to free anything explicitly).
>
>> src/hotspot/share/runtime/interfaceSupport.inline.hpp
>> -----------------------------------------------------
>> * Move "private:" above monitor_adr;
>>
>>  289 class ThreadLockBlockInVM : public ThreadStateTransition {
>>  290   Monitor** monitor_adr;
>>  291  private:
>>  292   void do_preempted(Monitor** in_flight_monitor_adr) {
>>
>> * monitor_adr should be _monitor_adr, or maybe even
>> _in_flight_monitor_adr to better match the name of the argument.
> Done! I realized there is no need for passing a parameter to > do_preempted() since we already have the in_flight_monitor_adr so I also > made small changes there. > > > Here is v03 including also Dan and Robbin comments about mutex.cpp and > safepointMechanism.hpp: > > Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/webrev/ > Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/inc/webrev/ > > Running mach tiers1-3. Waiting though on you thoughts about file > organization and deferring Mutex/Monitor rework. > > Thanks for looking into this Per! > > Thanks, > Patricio >> cheers, >> Per >> >>> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/inc/webrev/ >>> >>> Running mach5 again. >>> >>> Thanks, >>> Patricio >>> >>> On 1/28/19 8:31 AM, Robbin Ehn wrote: >>>> Hi Patricio, >>>> >>>> Mostly looks good! >>>> >>>> block_at_safepoint is always called with block_in_safepoint_check = >>>> true. (correct?) >>>> Changing that to a local state instead of global simplifies the code. >>>> >>>> So I'm suggesting something like below. >>>> >>>> Thanks, Robbin >>>> >>>> diff -r e65cc445234c >>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp >>>> --- a/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon Jan >>>> 28 13:10:15 2019 +0100 >>>> +++ b/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon Jan >>>> 28 14:10:59 2019 +0100 >>>> @@ -308,2 +308,1 @@ >>>> -??? thread->block_in_safepoint_check = false; >>>> -??? SafepointMechanism::block_at_safepoint(thread); >>>> +??? SafepointMechanism::callback_if_safepoint(thread); >>>> @@ -323,2 +322,1 @@ >>>> -????? SafepointMechanism::block_at_safepoint(_thread); >>>> -????? _thread->block_in_safepoint_check = true; >>>> +????? SafepointMechanism::callback_if_safepoint(_thread); >>>> @@ -335,2 +332,0 @@ >>>> -??? } else { >>>> -????? 
_thread->block_in_safepoint_check = true; >>>> @@ -337,0 +334,1 @@ >>>> + >>>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.cpp >>>> --- a/src/hotspot/share/runtime/safepoint.cpp??? Mon Jan 28 13:10:15 >>>> 2019 +0100 >>>> +++ b/src/hotspot/share/runtime/safepoint.cpp??? Mon Jan 28 14:10:59 >>>> 2019 +0100 >>>> @@ -795,1 +795,1 @@ >>>> -void SafepointSynchronize::block(JavaThread *thread) { >>>> +void SafepointSynchronize::block(JavaThread *thread, bool >>>> block_in_safepoint_check) { >>>> @@ -850,1 +850,1 @@ >>>> -????? if (thread->block_in_safepoint_check) { >>>> +????? if (block_in_safepoint_check) { >>>> @@ -880,1 +880,1 @@ >>>> -????????? thread->block_in_safepoint_check) { >>>> +????????? block_in_safepoint_check) { >>>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.hpp >>>> --- a/src/hotspot/share/runtime/safepoint.hpp??? Mon Jan 28 13:10:15 >>>> 2019 +0100 >>>> +++ b/src/hotspot/share/runtime/safepoint.hpp??? Mon Jan 28 14:10:59 >>>> 2019 +0100 >>>> @@ -146,1 +146,1 @@ >>>> -? static void?? block(JavaThread *thread); >>>> +? static void?? block(JavaThread *thread, bool >>>> block_in_safepoint_check = true); >>>> diff -r e65cc445234c src/hotspot/share/runtime/safepointMechanism.hpp >>>> --- a/src/hotspot/share/runtime/safepointMechanism.hpp??? Mon Jan 28 >>>> 13:10:15 2019 +0100 >>>> +++ b/src/hotspot/share/runtime/safepointMechanism.hpp??? Mon Jan 28 >>>> 14:10:59 2019 +0100 >>>> @@ -82,1 +82,1 @@ >>>> -? static inline void block_at_safepoint(JavaThread* thread); >>>> +? 
static inline void callback_if_safepoint(JavaThread* thread); >>>> diff -r e65cc445234c >>>> src/hotspot/share/runtime/safepointMechanism.inline.hpp >>>> --- a/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>>> Jan 28 13:10:15 2019 +0100 >>>> +++ b/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>>> Jan 28 14:10:59 2019 +0100 >>>> @@ -82,1 +82,1 @@ >>>> -void SafepointMechanism::block_at_safepoint(JavaThread* thread) { >>>> +void SafepointMechanism::callback_if_safepoint(JavaThread* thread) { >>>> @@ -84,1 +84,1 @@ >>>> -??? SafepointSynchronize::block(thread); >>>> +??? SafepointSynchronize::block(thread, false); >>>> diff -r e65cc445234c src/hotspot/share/runtime/thread.cpp >>>> --- a/src/hotspot/share/runtime/thread.cpp??? Mon Jan 28 13:10:15 >>>> 2019 +0100 >>>> +++ b/src/hotspot/share/runtime/thread.cpp??? Mon Jan 28 14:10:59 >>>> 2019 +0100 >>>> @@ -298,2 +297,0 @@ >>>> -? block_in_safepoint_check = true; >>>> - >>>> diff -r e65cc445234c src/hotspot/share/runtime/thread.hpp >>>> --- a/src/hotspot/share/runtime/thread.hpp??? Mon Jan 28 13:10:15 >>>> 2019 +0100 >>>> +++ b/src/hotspot/share/runtime/thread.hpp??? Mon Jan 28 14:10:59 >>>> 2019 +0100 >>>> @@ -788,2 +787,0 @@ >>>> -? bool block_in_safepoint_check;????????????? // to decide whether >>>> to block in SS::block or not >>>> - >>>> >>>> >>>> On 1/28/19 9:42 AM, Patricio Chilano wrote: >>>>> Hi all, >>>>> >>>>> Please review the following patch: >>>>> >>>>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8210832 >>>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8210832/v01/webrev/ >>>>> >>>>> The current implementation of native monitors uses a technique that >>>>> we name "sneaky locking" to prevent possible deadlocks of the JVM >>>>> during safepoints. The implementation of this technique though >>>>> introduces a race when a monitor is shared between the VMThread and >>>>> non-JavaThreads. 
This patch aims to solve that problem and at the >>>>> same time simplify the code. >>>>> >>>>> The proposal is based on the introduction of the new class >>>>> PlatformMonitor, which serves as a wrapper for the actual >>>>> synchronization primitives in each platform (mutexes and condition >>>>> variables). Most of the API calls can thus be implemented as simple >>>>> wrappers around PlatformMonitor, adding more assertions and very >>>>> little extra metadata. >>>>> To be able to remove the lock sneaking code and at the same time >>>>> avoid deadlocking scenarios, we combine two techniques: >>>>> >>>>> -When a JavaThread that has just acquired the lock, detects there >>>>> is a safepoint request in the ThreadLockBlockInVM destructor, it >>>>> releases the lock before blocking at the safepoint. After resuming >>>>> from it, the JavaThread will have to acquire the lock again. >>>>> >>>>> - In the ThreadLockBlockInVM constructor for the Monitor::wait() >>>>> method, in order to avoid blocking we allow for a possible >>>>> safepoint request to make progress but without letting the >>>>> JavaThread block for it (since we would be stopped by the >>>>> destructor anyways). We also do that for the Monitor::lock() case >>>>> although no deadlock is being prevented there. >>>>> >>>>> The ThreadLockBlockInVM jacket is a new ThreadStateTransition class >>>>> used instead of the ThreadBlockInVM one. This allowed more >>>>> flexibility to handle the two techniques mentioned above. Also, >>>>> ThreadBlockInVM calls SafepointMechanism::block_if_requested() >>>>> which creates some problems when trying to allow safepoints to >>>>> continue without stopping, since that method not only checks for >>>>> safepoints but also processes handshakes. >>>>> >>>>> In terms of performance, benchmarks show very similar results to >>>>> what we have now. >>>>> >>>>> So far mach5 hs-tier1-6 on Linux, OS X, Windows and Solaris have >>>>> been tested. 
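
The first technique described above (release the lock in the jacket's
destructor when a safepoint is pending, then acquire it again) can be
illustrated with a small standalone sketch. It is not HotSpot code:
`LockJacket`, `lock_with_jacket`, `safepoint_requested` and
`block_at_safepoint` are hypothetical stand-ins, std::mutex replaces the
platform monitor, and the "safepoint" completes immediately so that the
example stays single-threaded.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a pending safepoint request.
static std::atomic<bool> safepoint_requested{false};

// Hypothetical stand-in for blocking at the safepoint. Here the
// safepoint "completes" at once by clearing the request.
static void block_at_safepoint() { safepoint_requested.store(false); }

// Scope guard playing the role of the jacket: on exit, if a safepoint is
// pending, give up the just-acquired lock before blocking.
class LockJacket {
  std::mutex** _in_flight_adr;

 public:
  explicit LockJacket(std::mutex** in_flight_adr)
      : _in_flight_adr(in_flight_adr) {
    assert(in_flight_adr != nullptr);
  }
  ~LockJacket() {
    if (safepoint_requested.load()) {
      if (*_in_flight_adr != nullptr) {
        (*_in_flight_adr)->unlock();  // never block while holding the lock
        *_in_flight_adr = nullptr;    // tell the caller the lock is gone
      }
      block_at_safepoint();
    }
  }
};

// Acquires m, retrying if the jacket had to release it.
// Returns how many times the underlying mutex was acquired.
int lock_with_jacket(std::mutex& m) {
  int acquisitions = 0;
  for (;;) {
    std::mutex* in_flight = nullptr;
    {
      LockJacket jacket(&in_flight);
      m.lock();
      in_flight = &m;
      ++acquisitions;
    }  // ~LockJacket runs here and may release m
    if (in_flight != nullptr) {
      return acquisitions;  // still holding m
    }
    // Lock was released for the safepoint: loop and acquire it again.
  }
}
```

The loop mirrors the description in the text: after resuming from the
safepoint the thread has to acquire the lock again, possibly passing
through a fresh jacket each time around.
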
>>>>>
>>>>> Thanks,
>>>>> Patricio
>>>>>
>>>
>

From david.holmes at oracle.com Wed Jan 30 05:41:41 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 30 Jan 2019 15:41:41 +1000
Subject: RFR 8217647: JFR: wrong recording on 32-bit systems
In-Reply-To:
References:
Message-ID: <066f7b55-8261-8c0e-eed3-c3a9d578943f@oracle.com>

Hi Boris,

Redirecting to hotspot-jfr-dev.

I can't tell from all the changes exactly where the conflict arises, but
the JFR folk are in the best position to identify what should always be
64-bit and what should follow the pointer size.

Cheers,
David
-----

On 30/01/2019 2:43 am, Boris Ulasevich wrote:
> Hi,
>
> Can I please have a review for the following FlightRecorder fix:
>
> http://cr.openjdk.java.net/~bulasevich/8217647/webrev.01
> https://bugs.openjdk.java.net/browse/JDK-8217647
>
> The issue is about intptr_t/jlong size mismatch on 32-bit systems. The
> essence of the fix is to specify jlong type implicitly when storing
> data to jfr recording, plus minor types rearrangement was done to avoid
> unnecessary type conversions.
>
> Testing: JFR tests on Linux x64/x32/arm64/arm32, Mach5, Mission Control.
>
> thanks,
> Boris

From david.holmes at oracle.com Wed Jan 30 05:49:01 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 30 Jan 2019 15:49:01 +1000
Subject: RFR : 8217786: Provide virtualization related info in the hs_error file on linux s390x
In-Reply-To:
References: <816bec8b-b9ca-f0b8-f72d-be6ede83d63b@oracle.com> <73d4509f-09b7-e630-2568-bae0395f6b8d@oracle.com>
Message-ID: <032db50d-0086-c44d-0655-a2fd100dce31@oracle.com>

Hi Matthias,

Thanks for reworking this.

On 30/01/2019 2:56 am, Baesken, Matthias wrote:
> Hello, I added a break to avoid potentially printing lines multiple times,
> and removed the comment line:
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.3/

A couple of minor comments:

src/hotspot/os/linux/os_linux.cpp

+   while (keywords_to_match[i]) {

Style nit: avoid implicit booleans, explicitly check != NULL

+ void os::Linux::print_virtualization_info(outputStream* st) {

Don't you want an initial print of some introductory text, e.g.
"Virtualization Information"?

No need for updated webrev.

Thanks,
David
-----

> Best regards, Matthias
>
> *From:* Thomas Stüfe
> *Sent:* Dienstag, 29. Januar 2019 15:22
> *To:* Baesken, Matthias
> *Cc:* David Holmes; hotspot-dev at openjdk.java.net
> *Subject:* Re: RFR : 8217786: Provide virtualization related info in the
> hs_error file on linux s390x
>
> Hi Matthias,
> +     if (strncmp(line, keywords_to_match[i],
> strlen(keywords_to_match[i])) == 0) {
> +       st->print("%s", line);
> +     }
>
> you should break here otherwise a line containing multiple keywords will
> be printed multiple times.
>
> +    // the LPAR / CPUs / VM - related infos usually come in blocks
>
> This comment can be removed.
>
> +#if defined(S390)
>
> I still do not like the arch specific code here, but for now I can live
> with it. Should this section grow and cover other architectures as well,
> we should fan out into os_linux_.cpp.
>
> Thanks, Thomas
>
> On Tue, Jan 29, 2019 at 12:22 PM Baesken, Matthias wrote:
>
>     Hello, here is a 2nd webrev:
>
>     http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.2/
>
>       * Introduced static bool
>         print_matching_lines_from_sysinfo_file(outputStream* st, const
>         char* keywords_to_match[])
>       * Moved call to Linux-only print_os_info
>
>     Best regards, Matthias
>
>     *From:* Thomas Stüfe
>     *Sent:* Dienstag, 29.
Januar 2019 09:23 > *To:* Baesken, Matthias >; David Holmes > > > *Cc:* hotspot-dev at openjdk.java.net > *Subject:* Re: RFR : 8217786: Provide virtualization related info in > the hs_error file on linux s390x > > I'm still unhappy with that solution, since we have fanned out this > coding for all architectures into the architecture independent > os_linux.cpp. A generic "Show matching lines from given file" would > be a better (slimmer, better reusable) solution IMHO. > > Side note: Could you please exchange strstr() .. with strncmp() > since you require the start of the string to match. So no reason to > parse the whole line if the start does not match. > > Cheers, Thomas > > On Tue, Jan 29, 2019 at 9:03 AM Baesken, Matthias > > wrote: > > > > > > No I was thinking more about just adding the virtualization > info to an > > existing step like print_os_info or print_cpu_info. > > > > Hi? David ,? print_cpu_info? does not sound like a great fit . > Some info? like > > LPAR Number: 14 > LPAR Characteristics: Shared > LPAR Name: VM12 > > ?Does not really belong there . > > print_os_info? ?looks? ?better ,? it already contains > "container_info"? on Linux, so? I think this might fit . > > > Best regards, Matthias > > > > -----Original Message----- > > From: David Holmes > > > Sent: Dienstag, 29. Januar 2019 05:17 > > To: Baesken, Matthias >; 'hotspot- > > dev at openjdk.java.net ' > > > > Subject: Re: RFR : 8217786: Provide virtualization related > info in the hs_error > > file on linux s390x > > > > On 28/01/2019 10:23 pm, Baesken, Matthias wrote: > > >> > > >> Can't you include this information in an existing section > of the error > > >> processing code instead of adding a new function that is empty > > >> everywhere except Linux? > > >> > > > > > > Hi David ,? ?do you mean? something like > > > > > > > > > #if defined(S390) > > > > > >? ? STEP("printing virtualization info") > > >? ?... 
> > > > > > #endif > > > > No I was thinking more about just adding the virtualization > info to an > > existing step like print_os_info or print_cpu_info. > > > > Cheers, > > David > > ----- > > > > >? ?in? ?vmError.cpp? ? > > > > > > I thought about doing this. > > > > > > > > >? ?But? on the other hand ,? the now? ?still? empty > > os::pd_print_virtualization_info? ? in? ? platforms != linux > > >? ?might fill over time? ?( we could? add? [at least for > some platforms]? ?other > > virtualization? related? info ). > > > > > > > > > Best regards, Matthias > > > > > > > > >> -----Original Message----- > > >> From: David Holmes > > > >> Sent: Montag, 28. Januar 2019 12:35 > > >> To: Baesken, Matthias >; 'hotspot- > > >> dev at openjdk.java.net ' > > > > >> Subject: Re: RFR : 8217786: Provide virtualization related > info in the > > hs_error > > >> file on linux s390x > > >> > > >> Hi Matthias, > > >> > > >> On 28/01/2019 6:48 pm, Baesken, Matthias wrote: > > >>> Hello, please review? this change ; it adds > virtualization related info in > > the > > >> hs_error file on linux s390x . > > >> > > >> Can't you include this information in an existing section > of the error > > >> processing code instead of adding a new function that is empty > > >> everywhere except Linux? > > >> > > >> Thanks, > > >> David > > >> > > >>> On linux s390x, we usually? (always?)? ?run in > virtualized environments > > >> (LPAR and/or z/VM / KVM ). > > >>> > > >>> It is helpful for instance in support cases to get some > information about > > the > > >> virtualized environment in the hs_error file . > > >>> A lot of info can be taken from the /proc/sysinfo file on > linux s390x . 
> > >>> > > >>> > > >>> Bug/webrev : > > >>> > > >>> https://bugs.openjdk.java.net/browse/JDK-8217786 > > >>> > > >>> > > >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.1/ > > >>> > > >>> > > >>> > > >>> Best regards, Matthias > > >>> > From david.holmes at oracle.com Wed Jan 30 05:52:17 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 30 Jan 2019 15:52:17 +1000 Subject: RFR(XL): 8203469: Faster safepoints In-Reply-To: <0ba510c1-6a1a-68f1-811c-0f538c3a472b@oracle.com> References: <0ba510c1-6a1a-68f1-811c-0f538c3a472b@oracle.com> Message-ID: <9501ad2d-9344-0fc9-fdbf-81ba52c9287a@oracle.com> Incremental looks good. Thanks, David On 28/01/2019 11:04 pm, Robbin Ehn wrote: > Hi all, here is v05. > > http://cr.openjdk.java.net/~rehn/8203469/v05/ > http://cr.openjdk.java.net/~rehn/8203469/v05/inc/ > > I have been asked to go on-top-of: > https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-January/036425.html > > With a small grace-period. > There will be a v06 rebase on-top of that. > > Updated after comments and changes regarding safepoint_safe(). > In JFR code path, thread is always current, so it should not be calling > safepoint_safe. It also don't control polls, so even if it returns true > it is > not safe in that case. > > Updated to a handshake_safe() private method with a friend for handshakes. > > Test t1-3, stress testing and JFR. > > Thanks, Robbin > > On 1/15/19 11:39 AM, Robbin Ehn wrote: >> Hi all, please review. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469 >> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/ >> >> Thanks to Dan for pre-reviewing a lot! >> >> Background: >> ZGC often does very short safepoint operations. For a perspective, in a >> specJBB2015 run, G1 can have young collection stops lasting about 170 >> ms. While >> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which >> operation it is. 
The time it takes to stop and start the JavaThreads >> is relative >> very large to a ZGC safepoint. With an operation that just takes 0.2ms >> the >> overhead of stopping and starting JavaThreads is several times the >> operation. >> >> High-level functionality change: >> Serializing the starting over Threads_lock takes time. >> - Don't wait on Threads_lock use the WaitBarrier. >> Serializing the stopping over Safepoint_lock takes time. >> - Let threads stop in parallel, remove Safepoint_lock. >> >> Details: >> JavaThreads have 2 abstract logical states: unsafe or safe. >> - Safe means the JavaThread will not touch Java heap or VM internal >> structures >> ?? without doing a transition and block before doing so. >> ???????? - The safe states are: >> ???????????????? - When polls armed: _thread_in_native and >> _thread_blocked. >> ???????????????? - When Threads_lock is held: externally suspended >> flag is set. >> ???????? - VM Thread have polls armed and holds the Threads_lock during a >> ?????????? safepoint. >> - Unsafe means that either Java heap or VM internal structures can be >> accessed >> ?? by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. >> ???????? - All combination that are not safe are unsafe. >> >> We cannot start a safepoint until all unsafe threads have transitioned >> to a safe >> state. To make them safe, we arm polls in compiled code and make sure any >> transition to another unsafe state will be blocked. JavaThreads which >> are unsafe >> with state _thread_in_Java may transition to _thread_in_native without >> being >> blocked, since it just became a safe thread and we can proceed. Any >> safe thread >> may try to transition at any time to an unsafe state, thus coming into >> the >> safepoint blocking code at any moment, e.g., after the safepoint is >> over, or >> even at the beginning of next safepoint. 
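The safe/unsafe classification described above can be sketched as a small predicate. This is only a minimal illustration under stated assumptions: `ThreadState`, `ThreadView` and `is_safe` are hypothetical stand-ins, not HotSpot types, and the two boolean parameters simply model "polls armed" and "Threads_lock held" as read from the description.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the real HotSpot enum or classes.
enum class ThreadState { in_Java, in_vm, in_native, blocked };

struct ThreadView {
  ThreadState state;          // current thread state
  bool externally_suspended;  // models the "externally suspended" flag
};

// Safe, per the description above, means one of:
//  - polls are armed and the thread is in_native or blocked, or
//  - Threads_lock is held and the externally suspended flag is set.
// Every other combination is treated as unsafe.
inline bool is_safe(const ThreadView& t, bool polls_armed, bool threads_lock_held) {
  if (polls_armed &&
      (t.state == ThreadState::in_native || t.state == ThreadState::blocked)) {
    return true;
  }
  if (threads_lock_held && t.externally_suspended) {
    return true;
  }
  return false;
}
```

During a safepoint the VM thread has polls armed and holds Threads_lock, so both flags would be true in that case; a thread in `in_Java` or `in_vm` without the suspended flag is then the only kind classified as unsafe.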
>> >> The VMThread cannot tolerate false positives from the JavaThread >> thread state >> because that would mean starting the safepoint without all JavaThreads >> being >> safe. The two locks (Threads_lock and Safepoint_lock) make sure we >> never observe >> false positives from the safepoint blocking code, if we remove them, >> how do we >> handle false positives? >> >> By first publishing which barrier tag (safepoint counter) we will call >> WaitBarrier.wait() with as the threads safepoint id and then change >> the state to >> _thread_blocked, the VMThread can ignore JavaThreads by doing a stable >> load of >> the state. A stable load of the thread state is successful if the thread >> safepoint id is the same both before and after the load of the state and >> safepoint id is current or InactiveSafepointCounter. If the stable >> load fails, >> the thread is considered safepoint unsafe. It's no longer enough that >> thread is >> have state _thread_blocked it must also have correct safepoint id >> before and >> after we read the state. >> >> Performance: >> The result of faster safepoints is that the average CPU time for >> JavaThreads >> between safepoints is higher, thus increasing the allocation rate. The >> thread >> that stops first waits shorter time until it gets started. Even the >> thread that >> stops last also have shorter stop since we start them faster. If your >> application is using a concurrent GC it may need re-tunning since each >> java >> worker thread have an increased CPU time/allocation rate. Often this >> means max >> performance is achieved using slightly less java worker threads than >> before. >> Also the increase allocation rate means shorter time between GC >> safepoints. >> - If you are using a non-concurrent GC, you should see improved >> latency and >> ?? throughput. >> - After re-tunning with a concurrent GC throughput should be equal or >> better but >> ?? with better latency. But bear in mind this is a latency patch, not a >> ?? 
throughput one. >> With current code a java thread is not to guarantee to run between >> safepoint (in >> theory a java thread can be starved indefinitely), since the VM thread >> may >> re-grab the Threads_locks before it woke up from previous safepoint. >> If the >> GC/VM don't respect MMU (minimum mutator utilization) or if your >> machine is very >> over-provisioned this can happen. >> The current schema thus re-safepoint quickly if the java threads have not >> started yet at the cost of latency. Since the new code uses the >> WaitBarrier with >> the safepoint counter, all threads must roll forward to next safepoint by >> getting at least some CPU time between two safepoints. Meaning MMU >> violations >> are more obvious. >> >> Some examples on numbers: >> - On a 16 strand machine synchronization and >> un-synchronization/starting is at >> ?? least 3x faster (in non-trivial test). Synchronization ~600 -> >> ~100us and >> ?? starting ~400->~100us. >> ?? (Semaphore path is a bit slower than futex in the WaitBarrier on >> Linux). >> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster >> ?? synchronization time on 16 strands and ~5% score increase. In this >> case the GC >> ?? op is 1ms, so we reduce the overhead of synchronization from 100% >> to 10%. >> - specJBB2015 ParGC ~9% increase in critical-jops. >> >> Thanks, Robbin From robbin.ehn at oracle.com Wed Jan 30 06:08:02 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 30 Jan 2019 07:08:02 +0100 Subject: RFR(XL): 8203469: Faster safepoints In-Reply-To: <9501ad2d-9344-0fc9-fdbf-81ba52c9287a@oracle.com> References: <0ba510c1-6a1a-68f1-811c-0f538c3a472b@oracle.com> <9501ad2d-9344-0fc9-fdbf-81ba52c9287a@oracle.com> Message-ID: <3031b863-eec5-c22c-ca86-e5880bc520f8@oracle.com> Thanks David! /Robbin On 2019-01-30 06:52, David Holmes wrote: > Incremental looks good. > > Thanks, > David > > On 28/01/2019 11:04 pm, Robbin Ehn wrote: >> Hi all, here is v05. 
>> >> http://cr.openjdk.java.net/~rehn/8203469/v05/ >> http://cr.openjdk.java.net/~rehn/8203469/v05/inc/ >> >> I have been asked to go on-top-of: >> https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-January/036425.html >> With a small grace-period. >> There will be a v06 rebase on-top of that. >> >> Updated after comments and changes regarding safepoint_safe(). >> In JFR code path, thread is always current, so it should not be calling >> safepoint_safe. It also don't control polls, so even if it returns true it is >> not safe in that case. >> >> Updated to a handshake_safe() private method with a friend for handshakes. >> >> Test t1-3, stress testing and JFR. >> >> Thanks, Robbin >> >> On 1/15/19 11:39 AM, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203469 >>> Code: http://cr.openjdk.java.net/~rehn/8203469/v00/webrev/ >>> >>> Thanks to Dan for pre-reviewing a lot! >>> >>> Background: >>> ZGC often does very short safepoint operations. For a perspective, in a >>> specJBB2015 run, G1 can have young collection stops lasting about 170 ms. While >>> in the same setup ZGC does 0.2ms to 1.5 ms operations depending on which >>> operation it is. The time it takes to stop and start the JavaThreads is relative >>> very large to a ZGC safepoint. With an operation that just takes 0.2ms the >>> overhead of stopping and starting JavaThreads is several times the operation. >>> >>> High-level functionality change: >>> Serializing the starting over Threads_lock takes time. >>> - Don't wait on Threads_lock use the WaitBarrier. >>> Serializing the stopping over Safepoint_lock takes time. >>> - Let threads stop in parallel, remove Safepoint_lock. >>> >>> Details: >>> JavaThreads have 2 abstract logical states: unsafe or safe. >>> - Safe means the JavaThread will not touch Java heap or VM internal structures >>> ?? without doing a transition and block before doing so. >>> ???????? - The safe states are: >>> ???????????????? 
- When polls armed: _thread_in_native and _thread_blocked. >>> ???????????????? - When Threads_lock is held: externally suspended flag is set. >>> ???????? - VM Thread have polls armed and holds the Threads_lock during a >>> ?????????? safepoint. >>> - Unsafe means that either Java heap or VM internal structures can be accessed >>> ?? by the JavaThread, e.g., _thread_in_Java, _thread_in_vm. >>> ???????? - All combination that are not safe are unsafe. >>> >>> We cannot start a safepoint until all unsafe threads have transitioned to a safe >>> state. To make them safe, we arm polls in compiled code and make sure any >>> transition to another unsafe state will be blocked. JavaThreads which are unsafe >>> with state _thread_in_Java may transition to _thread_in_native without being >>> blocked, since it just became a safe thread and we can proceed. Any safe thread >>> may try to transition at any time to an unsafe state, thus coming into the >>> safepoint blocking code at any moment, e.g., after the safepoint is over, or >>> even at the beginning of next safepoint. >>> >>> The VMThread cannot tolerate false positives from the JavaThread thread state >>> because that would mean starting the safepoint without all JavaThreads being >>> safe. The two locks (Threads_lock and Safepoint_lock) make sure we never observe >>> false positives from the safepoint blocking code, if we remove them, how do we >>> handle false positives? >>> >>> By first publishing which barrier tag (safepoint counter) we will call >>> WaitBarrier.wait() with as the threads safepoint id and then change the state to >>> _thread_blocked, the VMThread can ignore JavaThreads by doing a stable load of >>> the state. A stable load of the thread state is successful if the thread >>> safepoint id is the same both before and after the load of the state and >>> safepoint id is current or InactiveSafepointCounter. If the stable load fails, >>> the thread is considered safepoint unsafe. 
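The stable load described above can be sketched roughly as follows. This is a hedged illustration, not the code in the webrev: the struct, the function name, the value chosen for the inactive counter and the use of sequentially consistent `std::atomic` loads are all assumptions made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical names and values throughout; this is not HotSpot's code.
constexpr uint64_t kInactiveSafepointCounter = 0;  // assumed sentinel value

enum ThreadState : int { in_Java, in_vm, in_native, blocked };

struct ThreadSafepointFields {
  std::atomic<uint64_t> safepoint_id{kInactiveSafepointCounter};
  std::atomic<int> state{in_Java};
};

// Reads the thread state between two reads of the safepoint id.
// The load is "stable" only if the id did not change around the state read
// and the id is either the current safepoint id or the inactive value.
// Returns false (thread must be treated as unsafe) otherwise.
inline bool stable_load_state(const ThreadSafepointFields& t,
                              uint64_t current_id, int* state_out) {
  const uint64_t before = t.safepoint_id.load();  // seq_cst by default
  const int s = t.state.load();
  const uint64_t after = t.safepoint_id.load();
  if (before != after) {
    return false;  // id changed while we were reading the state
  }
  if (before != current_id && before != kInactiveSafepointCounter) {
    return false;  // id belongs to some other safepoint
  }
  *state_out = s;
  return true;
}
```

A caller in the VM thread role would only trust a `blocked` state when `stable_load_state` returns true; a failed stable load is handled like an unsafe thread and retried later.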
It's no longer enough that thread is >>> have state _thread_blocked it must also have correct safepoint id before and >>> after we read the state. >>> >>> Performance: >>> The result of faster safepoints is that the average CPU time for JavaThreads >>> between safepoints is higher, thus increasing the allocation rate. The thread >>> that stops first waits shorter time until it gets started. Even the thread that >>> stops last also have shorter stop since we start them faster. If your >>> application is using a concurrent GC it may need re-tunning since each java >>> worker thread have an increased CPU time/allocation rate. Often this means max >>> performance is achieved using slightly less java worker threads than before. >>> Also the increase allocation rate means shorter time between GC safepoints. >>> - If you are using a non-concurrent GC, you should see improved latency and >>> ?? throughput. >>> - After re-tunning with a concurrent GC throughput should be equal or better but >>> ?? with better latency. But bear in mind this is a latency patch, not a >>> ?? throughput one. >>> With current code a java thread is not to guarantee to run between safepoint (in >>> theory a java thread can be starved indefinitely), since the VM thread may >>> re-grab the Threads_locks before it woke up from previous safepoint. If the >>> GC/VM don't respect MMU (minimum mutator utilization) or if your machine is very >>> over-provisioned this can happen. >>> The current schema thus re-safepoint quickly if the java threads have not >>> started yet at the cost of latency. Since the new code uses the WaitBarrier with >>> the safepoint counter, all threads must roll forward to next safepoint by >>> getting at least some CPU time between two safepoints. Meaning MMU violations >>> are more obvious. >>> >>> Some examples on numbers: >>> - On a 16 strand machine synchronization and un-synchronization/starting is at >>> ?? least 3x faster (in non-trivial test). 
Synchronization ~600 -> ~100us and
>>>    starting ~400->~100us.
>>>    (Semaphore path is a bit slower than futex in the WaitBarrier on Linux).
>>> - SPECjvm2008 serial (untuned G1) gives 10x (1 ms vs 100 us) faster
>>>    synchronization time on 16 strands and ~5% score increase. In this case the GC
>>>    op is 1ms, so we reduce the overhead of synchronization from 100% to 10%.
>>> - specJBB2015 ParGC ~9% increase in critical-jops.
>>>
>>> Thanks, Robbin

From david.holmes at oracle.com Wed Jan 30 07:29:50 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 30 Jan 2019 17:29:50 +1000
Subject: RFR: 8210832: Remove sneaky locking in class Monitor
In-Reply-To: <4562855b-fa0e-9f70-da0f-00f84984d78f@oracle.com>
References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> <4562855b-fa0e-9f70-da0f-00f84984d78f@oracle.com>
Message-ID: <384133a1-c2e0-11f2-6890-798c3646222f@oracle.com>

Hi Patricio,

First, thanks for all the many weeks of work you've put into this,
pulling together a number of ideas from different people to make it all
work!

I've only got a few minor comments/suggestions.

On 30/01/2019 10:24 am, Patricio Chilano wrote:
> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/webrev/
> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/inc/webrev/
>

src/hotspot/share/runtime/interfaceSupport.inline.hpp

I'm very unclear how ThreadLockBlockInVM differs from ThreadBlockInVM.
You've duplicated a lot of complex code which is masking the actual
difference between the two wrappers to me. It seems to me that an extra
arg to transition_and_fence should allow you to handle the new behaviour
without having to duplicate so much of this code. In any case the
semantics of ThreadLockBlockInVM needs to be described.

Also I'm unclear what the "Lock" in ThreadLockBlockInVM actually refers
to. I find the name quite jarring to read.

On the subject of naming, do_preempt and preempt_by_safepoint don't
really convey to me what happens - what is being "preempted" here? I
would suggest a more direct Monitor::release_for_safepoint

---

Logging: why "nativemonitor"? The logging in mutex.cpp doesn't relate to
a "native" monitor?? Actually I'm not even sure if we need bother at all
with the one logging statement that is present.

---

src/hotspot/share/runtime/mutex.cpp

void Monitor::lock_without_safepoint_check(Thread * self) {
  // Ensure that the Monitor does not require or allow safepoint checks.

The comment there should only say "not require".

void Monitor::preempt_by_safepoint() {
  _lock.unlock();
}

Apart from renaming this as suggested above, aren't there any suitable
assertions we should have here? safepoint-in-progress or
handshake-in-progress? _owner == Thread::current?

Nit:

assert(_owner == Thread::current(), "should be equal: owner=" INTPTR_FORMAT
                   ", self=" INTPTR_FORMAT, p2i(_owner), p2i(Thread::current()));

with Dan's enhanced assertions there's an indentation issue. The second
line should indent to the first comma, but that will make the second
line extend way past 80 columns.

Also you could factor that assertion for _owner==Thread::current() into
its own function or macro to avoid the repetition.

 OSThreadWaitState osts(self->osthread(), false /* not Object.wait() */);

This needs to be returned to its original place as per Dan's comments.

    } else {
      Monitor::lock(self);
    }

You don't need Monitor:: here

// Temporary JVM_RawMonitor* support. A raw monitor can just be a
PlatformMonitor now.

This needs to be resolved before committing. Some of the existing
commentary on what raw monitors are needs to be retained. Not clear if
we need to set the _owner field or can just skip it.

Monitor::~Monitor() {
  assert(_owner == NULL, "should be NULL: owner=" INTPTR_FORMAT, p2i(_owner));
}

Will this automatically result in the PlatformMonitor destructor being
called?
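
On the destructor question just above: in standard C++, non-static data members are destroyed automatically, in reverse declaration order, after the body of the enclosing class's destructor has run. Whether that applies here depends on how the lock is held; the `_lock.unlock()` call quoted earlier suggests a by-value member, but that is an assumption. The sketch below uses hypothetical stand-in classes (`PlatformMonitorSketch`, `MonitorSketch`), not the real HotSpot ones, to show the language rule.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the real HotSpot classes.
static int g_platform_monitor_dtor_calls = 0;

class PlatformMonitorSketch {
 public:
  ~PlatformMonitorSketch() { ++g_platform_monitor_dtor_calls; }
  void unlock() {}
};

class MonitorSketch {
  PlatformMonitorSketch _lock;  // assumed to be held by value
  void* _owner = nullptr;

 public:
  ~MonitorSketch() {
    assert(_owner == nullptr);
    // No explicit call is needed: _lock's destructor runs automatically
    // once this destructor body has finished.
  }
};

// Creates and destroys one MonitorSketch and reports how many times the
// member's destructor ran as a result.
int destroy_one_monitor() {
  g_platform_monitor_dtor_calls = 0;
  {
    MonitorSketch m;
  }  // m goes out of scope here
  return g_platform_monitor_dtor_calls;
}
```

If the lock were instead held through a raw pointer, nothing would be destroyed automatically and an explicit delete would be required.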

---

Thanks,
David
-----

From robbin.ehn at oracle.com Wed Jan 30 08:17:24 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 30 Jan 2019 09:17:24 +0100
Subject: RFR: 8210832: Remove sneaky locking in class Monitor
In-Reply-To: <384133a1-c2e0-11f2-6890-798c3646222f@oracle.com>
References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> <4562855b-fa0e-9f70-da0f-00f84984d78f@oracle.com> <384133a1-c2e0-11f2-6890-798c3646222f@oracle.com>
Message-ID: <0c57f059-43bd-371c-9d9b-00358e6d527c@oracle.com>

Hi David,

>
> I'm very unclear how ThreadLockBlockInVM differs from ThreadBlockInVM. You've
> duplicated a lot of complex code which is masking the actual difference between
> the two wrappers to me. It seems to me that an extra arg to transition_and_fence
> should allow you to handle the new behaviour without having to duplicate so much
> of this code. In any case the semantics of ThreadLockBlockInVM needs to be
> described.

The entire file is just a load of duplicated methods, e.g.
transition_and_fence/transition/transition_from_native with slight variations.
Not to mention the ENTRY macros, which we have zillions of, all looking very
much the same :)

IMHO it is much easier to read code without all those branches.
I suspect it was done this way to get it inlined more easily.
Already today transition_from_native is almost never inlined due to being too
large (compared to the methods using it).

For the future my suggestion is to make native and block identical regarding
transition, that would help. But still suspend/async exception should only be
delivered in certain transitions.

Using templates for ThreadStateTransition so we can generate/eliminate code at
compile time would be better regarding the duplication. Essentially just generate
it from a 'table', where we just list state to state and what checks apply.

/Robbin

>
> Also I'm unclear what the "Lock" in ThreadLockBlockInVM actually refers to.
I > find the name quite jarring to read. > > On the subject of naming, do_preempt and preempt_by_safepoint don't really > convey to me what happens - what is being "preempted" here? I would suggest a > more direct Monitor::release_for_safepoint > > --- > > Logging: why "nativemonitor"? The logging in mutex.cpp doesn't relate to a > "native" monitor?? Actually I'm not even sure if we need bother at all with the > one logging statement that is present. > > --- > > src/hotspot/share/runtime/mutex.cpp > > void Monitor::lock_without_safepoint_check(Thread * self) { > ? // Ensure that the Monitor does not require or allow safepoint checks. > > The comment there should only say "not require". > > void Monitor::preempt_by_safepoint() { > ? _lock.unlock(); > } > > Apart from renaming this as suggested above, aren't there any suitable > assertions we should have here? safepoint-in-progress or handshake-in-progress? > _owner == Thread::current? > > Nit: > > assert(_owner == Thread::current(), "should be equal: owner=" INTPTR_FORMAT > ?????????????????? ", self=" INTPTR_FORMAT, p2i(_owner), p2i(Thread::current())); > > with Dan's enhanced assertions there's an indentation issue. The second line > should indent to the first comma, but that will make the second line extend way > past 80 columns. > > Also you could factor that assertion for _owner==Thread::current() into its own > function or macro to avoid the repetition. > > ?OSThreadWaitState osts(self->osthread(), false /* not Object.wait() */); > > This needs to be returned to its original place as per Dan's comments. > > ??? } else { > ????? Monitor::lock(self); > ??? } > > You don't need Monitor:: here > > // Temporary JVM_RawMonitor* support. A raw monitor can just be a > PlatformMonitor now. > > This needs to be resolved before committing. Some of the existing commentary on > what raw monitors are needs to be retained. Not clear if we need to set the > _owner field or can just skip it. > > Monitor::~Monitor() { > ? 
assert(_owner == NULL, "should be NULL: owner=" INTPTR_FORMAT, p2i(_owner));
> }
>
> Will this automatically result in the PlatformMonitor destructor being called?
>
> ---
>
> Thanks,
> David
> -----
>

From robbin.ehn at oracle.com Wed Jan 30 08:21:50 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 30 Jan 2019 09:21:50 +0100
Subject: RFR(s): 8218041: Assorted wrong/missing includes
Message-ID: 

Hi all, please review.

Code:
http://cr.openjdk.java.net/~rehn/8218041/webrev/
Issue:
https://bugs.openjdk.java.net/browse/JDK-8218041

After fixing these includes, there was a circular dependency via shenandoah
code. I moved try_cancel_gc to the cpp file, where its only use was. So it
should never have been in the inline header in the first place.

I listed why each include is needed below.

Testing: tier 1 and a build without precompiled headers.

FYI:
I was investigating why Handle::Handle(Thread*,oop) was not inlined.
gcc complains about there being a local comdat symbol; when forcing it to be
inlined, or when using clang, there is no issue. So it looks like a gcc bug in
both 7.3 and 8.2.

Thanks, Robbin

src/hotspot/share/aot/aotLoader.cpp
  runtime/os.inline.hpp for os::dll_unload

src/hotspot/share/c1/c1_Runtime1.cpp
  runtime/handles.inline.hpp for Handle(Thread*, oop)

src/hotspot/share/gc/z/zFuture.inline.hpp
  runtime/interfaceSupport.inline.hpp not used.

src/hotspot/share/prims/nativeLookup.cpp
  runtime/os.inline.hpp for os::dll_unload

src/hotspot/share/runtime/handles.hpp
  Forward declaration Thread

src/hotspot/share/runtime/handles.inline.hpp
  runtime/thread.hpp for Thread::current
  oops/oop.inline.hpp for oopDesc::is_a
  oops/metadata.hpp for is_valid

src/hotspot/share/runtime/semaphore.inline.hpp
  runtime/thread.hpp for osthread

src/hotspot/share/runtime/vframe.cpp
  runtime/thread.inline.hpp for JavaThread::class_to_be_initialized

From kim.barrett at oracle.com Wed Jan 30 08:50:07 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 30 Jan 2019 03:50:07 -0500
Subject: RFR(s): 8218041: Assorted wrong/missing includes
In-Reply-To: 
References: 
Message-ID: <5C165FCF-9955-4528-BCEF-B67C6315F9B1@oracle.com>

> On Jan 30, 2019, at 3:21 AM, Robbin Ehn wrote:
>
> Hi all, please review.
>
> Code:
> http://cr.openjdk.java.net/~rehn/8218041/webrev/
> Issue:
> https://bugs.openjdk.java.net/browse/JDK-8218041

Looks good.

From erik.osterlund at oracle.com Wed Jan 30 08:54:10 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Wed, 30 Jan 2019 09:54:10 +0100
Subject: 8216541: CompiledICHolders of VM locked unloaded nmethods are released too late
In-Reply-To: <4751bbf9-e4a4-a1df-52f8-e339e5e09351@oracle.com>
References: <0345dd70-fa19-a7e3-7892-8f4500f8339f@oracle.com> <4195c9a1-e91c-1014-8fa5-2b0d3f6dfc30@oracle.com> <44b1aa28-81b0-6e9a-1b13-bb973bb5bd30@oracle.com> <4751bbf9-e4a4-a1df-52f8-e339e5e09351@oracle.com>
Message-ID: 

Hi Kozlov,

Thanks for the review.

/Erik

On 2019-01-29 23:31, Vladimir Kozlov wrote:
> Looks good to me too.
>
> Thanks,
> Vladimir
>
> On 1/29/19 2:40 AM, Tobias Hartmann wrote:
>> Hi Erik,
>>
>> okay, got it. Thanks for the details, your fix looks good to me!
>>
>> Best regards,
>> Tobias
>>
>> On 29.01.19 11:38, Erik Österlund wrote:
>>> Hi Tobias,
>>>
>>> Thanks for having a look at this.
>>> >>> On 2019-01-29 09:16, Tobias Hartmann wrote: >>>> Hi Erik, >>>> >>>> very nice analysis, thanks a lot for investigating! >>>> >>>> On 28.01.19 14:56, Erik ?sterlund wrote: >>>>> http://cr.openjdk.java.net/~eosterlund/8216541/webrev.00/ >>>> >>>> Why did you remove the call to >>>> thread->set_scanned_compiled_method(NULL) in sweeper.cpp? >>> >>> Because the CompiledMethodMarker destructor already nulls this out, >>> and redundantly nulling it out >>> again offers no extra protection. >>> >>> The idea of nulling it out before calling flush seems to have been to >>> prevent the GC scanning from >>> seeing this flushed nmethod in a safepoint, accidentally resurrecting >>> it from the dead. But that is >>> already impossible, because flush() is called with a never safepoint >>> checking lock (which guarantees >>> we don't have any and can't add any safepoint checks while holding >>> that lock or we will deadlock >>> badly). Therefore such safepoints will happen strictly after the >>> processing of the compiled method >>> is finished, and it is already cleared the normal way. >>> >>> By removing that pointless clearing, I could get rid of the >>> release_compiled_method() function and >>> just call flush directly instead. I get confused by there being two >>> "destroy" functions, one in the >>> sweeper and one in the nmethod, so I wanted it gone. >>> >>>> >>>>> The proposed change has survived 200 rounds of kitchensink, >>>>> hs-tier1-3 and hs-precheckin-comp. >>>> >>>> In the meanwhile, could you please run some more 100x iterations of >>>> kitchensink? >>> >>> Sure, running some more as we speak. 
>>>
>>> Thanks,
>>> /Erik
>>>
>>>>
>>>> Thanks,
>>>> Tobias
>>>>

From matthias.baesken at sap.com Wed Jan 30 09:00:44 2019
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Wed, 30 Jan 2019 09:00:44 +0000
Subject: RFR : 8217786: Provide virtualization related info in the hs_error file on linux s390x
In-Reply-To: <032db50d-0086-c44d-0655-a2fd100dce31@oracle.com>
References: <816bec8b-b9ca-f0b8-f72d-be6ede83d63b@oracle.com> <73d4509f-09b7-e630-2568-bae0395f6b8d@oracle.com> <032db50d-0086-c44d-0655-a2fd100dce31@oracle.com>
Message-ID: 

Hi David,

>
> Style nit: avoid implicit booleans, explicitly check != NULL

I added the explicit "!= NULL" check and added a line with an introductory text.

@Thomas - may I add you as reviewer?

Thanks, Matthias

> -----Original Message-----
> From: David Holmes
> Sent: Wednesday, 30 January 2019 06:49
> To: Baesken, Matthias; Thomas Stüfe
> Cc: hotspot-dev at openjdk.java.net
> Subject: Re: RFR : 8217786: Provide virtualization related info in the hs_error
> file on linux s390x
>
> Hi Matthias,
>
> Thanks for reworking this.
>
> On 30/01/2019 2:56 am, Baesken, Matthias wrote:
> > Hello, I added a break to avoid potential printing lines multiple times,
> > and removed the comment line :
> >
> > http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.3/
>
> A couple of minor comments:
>
> src/hotspot/os/linux/os_linux.cpp
>
> + while (keywords_to_match[i]) {
>
> Style nit: avoid implicit booleans, explicitly check != NULL
>
> + void os::Linux::print_virtualization_info(outputStream* st) {
>
> Don't you want an initial print of some introductory text eg:
>
> "Virtualization Information"
>
> No need for updated webrev.
>
> Thanks,
> David
> -----

From thomas.stuefe at gmail.com Wed Jan 30 09:10:18 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Wed, 30 Jan 2019 10:10:18 +0100
Subject: RFR : 8217786: Provide virtualization related info in the hs_error file on linux s390x
In-Reply-To: 
References: <816bec8b-b9ca-f0b8-f72d-be6ede83d63b@oracle.com> <73d4509f-09b7-e630-2568-bae0395f6b8d@oracle.com> <032db50d-0086-c44d-0655-a2fd100dce31@oracle.com>
Message-ID: 

On Wed, Jan 30, 2019 at 10:00 AM Baesken, Matthias wrote:

> Hi David,
> >
> > Style nit: avoid implicit booleans, explicitly check != NULL
>
> I added the explicit "!= NULL" check and an add a line with an
> introductory text .
>
>
> @Thomas - may I add you as reviewer ?
>
>
Yes.

I read up yesterday on /proc/sysinfo and it turns out that it is an s390
specific /proc extension? So maybe this whole function should be even more
generalized - maybe take a filename too as input - or just moved to s390
specific code.

However, since we have already put you through too many iterations with this
patch I am fine with this version. We can improve it later should we add
virtualization info for other architectures as well.

..Thomas

> Thanks, Matthias
>
>
> > -----Original Message-----
> > From: David Holmes
> > Sent: Wednesday, 30 January 2019 06:49
> > To: Baesken, Matthias; Thomas Stüfe
> > Cc: hotspot-dev at openjdk.java.net
> > Subject: Re: RFR : 8217786: Provide virtualization related info in the
> hs_error
> > file on linux s390x
> >
> > Hi Matthias,
> >
> > Thanks for reworking this.
> > > > On 30/01/2019 2:56 am, Baesken, Matthias wrote: > > > Hello, I added a break to avoid potential printing lines multiple > times, > > > and removed the comment line : > > > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.3/ > > > > A couple of minor comments: > > > > src/hotspot/os/linux/os_linux.cpp > > > > + while (keywords_to_match[i]) { > > > > Style nit: avoid implicit booleans, explicitly check != NULL > > > > + void os::Linux::print_virtualization_info(outputStream* st) { > > > > Don't you want an initial print of some introductory text eg: > > > > "Virtualization Information" > > > > No need for updated webrev. > > > > Thanks, > > David > > ----- > > From dawid.weiss at gmail.com Wed Jan 30 09:27:32 2019 From: dawid.weiss at gmail.com (Dawid Weiss) Date: Wed, 30 Jan 2019 10:27:32 +0100 Subject: SIGSEGV on PhaseIdealLoop::split_up? Message-ID: Hello, There's quite a few of those JVM errors that popped up recently on one of Lucene's CI machines: https://issues.apache.org/jira/browse/LUCENE-8668 Happens on various JVMs (see the above issue). Would it be something familiar to any of you? A known issue or should we try to keep digging (for a repro, for example)? Dawid From robbin.ehn at oracle.com Wed Jan 30 09:32:10 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 30 Jan 2019 10:32:10 +0100 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: <5C165FCF-9955-4528-BCEF-B67C6315F9B1@oracle.com> References: <5C165FCF-9955-4528-BCEF-B67C6315F9B1@oracle.com> Message-ID: <06551b6f-57ff-4665-1394-e53b2dc13827@oracle.com> Hi Kim, thanks! Unfortunately I found an issue when building with clang 7. It seems to do much better inlining, so there is a missing symbol at link time (HandleMark()); I'm going over HandleMark constructor usage. I found so far 24 missing includes for handles.inline.hpp. Compile testing this with all sort of compilers and options, now clang works.
But the inline constructor just calls a non-inlined method: inline HandleMark::HandleMark() { initialize(Thread::current()); } At least now Thread::current() is in thread.hpp, so the reason for inline seem not true anymore. I'm thinking that maybe I should change it and just include handles.hpp on the missing places. But since HandleMarkCleaner needs the inline file it's very asymmetric with HandleMark is fine with hpp but if you also use HandleMarkCleaner you need the inline. So maybe I'll just leave the 'dummy' inline constructor. So sending out an v2 of some sort. /Robbin On 2019-01-30 09:50, Kim Barrett wrote: >> On Jan 30, 2019, at 3:21 AM, Robbin Ehn wrote: >> >> Hi all, please review. >> >> Code: >> http://cr.openjdk.java.net/~rehn/8218041/webrev/ >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8218041 > > Looks good. > From tobias.hartmann at oracle.com Wed Jan 30 09:45:07 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 30 Jan 2019 10:45:07 +0100 Subject: SIGSEGV on PhaseIdealLoop::split_up? In-Reply-To: References: Message-ID: <38a0c766-b1c9-f2f1-4611-0872f185a6e8@oracle.com> Hi Dawid, thanks for reporting this issue! I'm not aware of any related bugs that we've fixed lately. A reproducer would be very nice. Did you try to reproduce with Replay Compilation? java -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+ReplayIgnoreInitErrors -XX:+ReplayCompiles -XX:+PrintCompilation -XX:ReplayDataFile=replay_pid10534.log You just need to make sure that the required classes/jars are on the classpath. Best regards, Tobias On 30.01.19 10:27, Dawid Weiss wrote: > Hello, > > There's quite a few of those JVM errors that popped up recently on one > of Lucene's CI machines: > > https://issues.apache.org/jira/browse/LUCENE-8668 > > Happens on various JVMs (see the above issue). Would it be something > familiar to any of you? A known issue or should we try to keep digging > (for a repro, for example)? 
> > Dawid > From nils.eliasson at oracle.com Wed Jan 30 09:57:51 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Wed, 30 Jan 2019 10:57:51 +0100 Subject: SIGSEGV on PhaseIdealLoop::split_up? In-Reply-To: References: Message-ID: <48117f7c-77db-452f-cb7b-f5c266fb78cd@oracle.com> Hi Dawid, The hs_err-file is from a JDK 10 build. Would you mind testing with JDK 11 or JDK 12-ea? What build of Lucene was this run against? Can you point me to the relevant jar? I will try reproducing with 7.6.0. Regards, Nils On 2019-01-30 10:27, Dawid Weiss wrote: > Hello, > > There's quite a few of those JVM errors that popped up recently on one > of Lucene's CI machines: > > https://issues.apache.org/jira/browse/LUCENE-8668 > > Happens on various JVMs (see the above issue). Would it be something > familiar to any of you? A known issue or should we try to keep digging > (for a repro, for example)? > > Dawid From shade at redhat.com Wed Jan 30 10:17:44 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 30 Jan 2019 11:17:44 +0100 Subject: RFR (M) 8213753: SymbolTable is double walked during class unloading and clean up table timing in do_unloading In-Reply-To: <3cbb2030-4c97-6478-3a02-196a7898d69c@oracle.com> References: <3cbb2030-4c97-6478-3a02-196a7898d69c@oracle.com> Message-ID: <5717ffe1-9cdc-2d8b-5ee7-25c6b810ce8e@redhat.com> On 1/29/19 8:39 PM, coleen.phillimore at oracle.com wrote: > Summary: remove gc timing for short runtime cleanup triggering; make symbol table cleaning triggered > automatically on unloading > > Ran runThese with all Oracle GCs and got similar numbers of symbols unloaded. Also ran tier1-5. > > See bug for more information. > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8213753.01/webrev This looks fine. I tested Shenandoah just in case (it should not be affected, because it uses shared/parallelCleaning.* and SystemDictionary::do_unloading, like G1), it is still okay. Minor nits: *) Add spaces before format specifiers here?
698 log_debug(symboltable)("Concurrent work triggered, load factor:%f, items to clean:%s", 699 get_load_factor(), has_items_to_clean() ? "true" : "false"); *) In SystemDictionary::do_unloading, do we think that trigger_cleanup() are cheap? Otherwise it makes sense to retain a single GCTraceTime block around all three trigger_cleanups? -Aleksey From nils.eliasson at oracle.com Wed Jan 30 10:13:19 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Wed, 30 Jan 2019 11:13:19 +0100 Subject: SIGSEGV on PhaseIdealLoop::split_up? In-Reply-To: <48117f7c-77db-452f-cb7b-f5c266fb78cd@oracle.com> References: <48117f7c-77db-452f-cb7b-f5c266fb78cd@oracle.com> Message-ID: Sorry, too fast. You had already tested on various builds. Regards, Nils On 2019-01-30 10:57, Nils Eliasson wrote: > Hi Dawid, > > The hs_err-file is from a JDK 10 build. Would you mind testing with > JDK 11 or JDK 12-ea? > > What build of Lucene was this run against? Can point me to the > relevant jar? I will try reproducing with 7.6.0. > > Regards, > > Nils > > On 2019-01-30 10:27, Dawid Weiss wrote: >> Hello, >> >> There's quite a few of those JVM errors that popped up recently on one >> of Lucene's CI machines: >> >> https://issues.apache.org/jira/browse/LUCENE-8668 >> >> Happens on various JVMs (see the above issue). Would it be something >> familiar to any of you? A known issue or should we try to keep digging >> (for a repro, for example)? >> >> Dawid From dawid.weiss at gmail.com Wed Jan 30 10:43:32 2019 From: dawid.weiss at gmail.com (Dawid Weiss) Date: Wed, 30 Jan 2019 11:43:32 +0100 Subject: SIGSEGV on PhaseIdealLoop::split_up? In-Reply-To: References: <48117f7c-77db-452f-cb7b-f5c266fb78cd@oracle.com> Message-ID: Hi guys, Let me reply to both e-mails at once. > A reproducer would be very nice. Did you try to reproduce with Replay Compilation? 
I haven't tried to reproduce it, but it's popping up quite a bit recently, see here for a backlog: https://lucene.markmail.org/search/%22jenkins+server%22+PhaseIdealLoop::split_up+list:org.apache.lucene.java-dev+order:date-backward For example this one https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/3472/ is: [junit4] # JRE version: OpenJDK Runtime Environment (11.0+28) (build 11+28) [junit4] # Java VM: OpenJDK 64-Bit Server VM (11+28, mixed mode, tiered, g1 gc, linux-amd64) Some of those builds are still on the server (and contain hs logs). What worries me is that this only happens on Uwe's machine -- it may be related to the particular hardware config it happens on. A repro isn't going to be easy (are they ever? ;) as those tests run pretty much at random within a single forked JVM and I bet it's just some unusual pattern that triggers the problem. Looking at where the problem occurs it seems there is a common core related to compiling this method: Current CompileTask: C2:1534619 50541 s! 4 org.apache.lucene.index.ConcurrentMergeScheduler::merge (280 bytes) The path leading to it may differ (when you diff those different hs_err logs against each other), but it seems to be caused by merge compilation in all cases I looked at. I can monitor this and attach new logs to the Jira issue (LUCENE-8668). Uwe will be at FOSDEM so I'm sure he'll be ready to figure it out together with you, should you be there. Dawid On Wed, Jan 30, 2019 at 11:22 AM Nils Eliasson wrote: > > Sorry, too fast. You had already tested on various builds. > > Regards, > > Nils > > On 2019-01-30 10:57, Nils Eliasson wrote: > > Hi Dawid, > > > > The hs_err-file is from a JDK 10 build. Would you mind testing with > > JDK 11 or JDK 12-ea? > > > > What build of Lucene was this run against? Can point me to the > > relevant jar? I will try reproducing with 7.6.0.
> > > > Regards, > > > > Nils > > > > On 2019-01-30 10:27, Dawid Weiss wrote: > >> Hello, > >> > >> There's quite a few of those JVM errors that popped up recently on one > >> of Lucene's CI machines: > >> > >> https://issues.apache.org/jira/browse/LUCENE-8668 > >> > >> Happens on various JVMs (see the above issue). Would it be something > >> familiar to any of you? A known issue or should we try to keep digging > >> (for a repro, for example)? > >> > >> Dawid From lutz.schmidt at sap.com Wed Jan 30 10:48:50 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Wed, 30 Jan 2019 10:48:50 +0000 Subject: RFR (S) 8217994: os::print_hex_dump should be more resilient against unreadable memory In-Reply-To: References: <988cff50-d9d8-29d7-2cce-148dea4fee60@redhat.com> Message-ID: <0034E860-F7DE-4565-B157-93418E952388@sap.com> Hi Aleksey, I like the added resilience! I'm not a reviewer, though. When looking at the newly introduced else branch, I had the idea of replacing the entire switch by st->print("%*s", 2*unitsize, "????????????????"); Like it? I do! Thanks, Lutz On 29.01.19, 18:46, "hotspot-dev on behalf of Thomas Stüfe" wrote: Looks good. Note that this coding assumes (always did) that the input pointer is aligned to the unitsize, otherwise the printing would not work on platforms which do not allow unaligned loads. This means that if the pc in the ucontext is unaligned rubbish we may crash on platforms where we print with a unitsize > 1 and unaligned access is not allowed. But your patch does not make the problem worse, so it is fine to me. Cheers, Thomas On Tue, Jan 29, 2019 at 5:54 PM Aleksey Shipilev wrote: > RFE: > https://bugs.openjdk.java.net/browse/JDK-8217994 > > Fix: > http://cr.openjdk.java.net/~shade/8217994/webrev.01/ > > This is related to JDK-8217879 (hs_err should print more instructions in > hex dump), and this more > generic fix should cover more cases in error handler. New gtest verifies > we can call os::p_h_d on > bad memory now.
It also implicitly verifies that SafeFetch machinery works > fine. Consider running > that gtest (make images run-test TEST=gtest:os) on your platform if you > suspect it does not. > > Testing: (Linux, Windows) x86_64 build, gtest, eyeballing gtest output > > Thanks, > -Aleksey > > From shade at redhat.com Wed Jan 30 10:51:07 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 30 Jan 2019 11:51:07 +0100 Subject: SIGSEGV on PhaseIdealLoop::split_up? In-Reply-To: References: <48117f7c-77db-452f-cb7b-f5c266fb78cd@oracle.com> Message-ID: <8c07e7f9-2e84-9183-7551-e5148917ef5a@redhat.com> On 1/30/19 11:43 AM, Dawid Weiss wrote: > Current CompileTask: > C2:1534619 50541 s! 4 > org.apache.lucene.index.ConcurrentMergeScheduler::merge (280 bytes) > > The path leading to it may differ (when you diff those different > hs_err logs against each other), but it seems to be caused by merge > compilation in all cases I looked at. This is release build, right? fastdebug build probably asserts somewhere? > I can monitor this and attach new logs to the Jira issue > (LUCENE-8668). Uwe will be at Fosdem so I'm sure he'll be ready to > figure it out together with you, should you be there. If Nils is not there, let Uwe find me at FOSDEM? -Aleksey From david.holmes at oracle.com Wed Jan 30 10:52:56 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 30 Jan 2019 20:52:56 +1000 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: <06551b6f-57ff-4665-1394-e53b2dc13827@oracle.com> References: <5C165FCF-9955-4528-BCEF-B67C6315F9B1@oracle.com> <06551b6f-57ff-4665-1394-e53b2dc13827@oracle.com> Message-ID: <9945fe3a-9a24-a6e5-7bbd-4dc4282fa0f5@oracle.com> Hi Robbin, On 30/01/2019 7:32 pm, Robbin Ehn wrote: > Hi Kim, thanks! > > Unfortunately I found an issue when building with clang 7. > It seem to do much better inlning, so missing symbol at link > (HandleMark()); > I'm going over HandleMark constructor useage. 
> I found so far 24 missing includes for handles.inline.hpp. So if it compiles but there's no include of the .inline.hpp file that just means it won't get inlined - right? (ie the .hpp file must still be getting included) David > Compile testing this with all sort of compilers and options, now clang > works. > > But the inline constructor just calls a non-inlined method: > inline HandleMark::HandleMark() { >   initialize(Thread::current()); > } > > At least now Thread::current() is in thread.hpp, so the reason for > inline seem not true anymore. I'm thinking that maybe I should change it > and just include handles.hpp on the missing places. But since > HandleMarkCleaner needs the inline file it's very asymmetric with > HandleMark is fine with hpp but if you also use HandleMarkCleaner you > need the inline. So maybe I'll just leave the 'dummy' inline constructor. > > So sending out an v2 of some sort. > > /Robbin > > On 2019-01-30 09:50, Kim Barrett wrote: >>> On Jan 30, 2019, at 3:21 AM, Robbin Ehn wrote: >>> >>> Hi all, please review. >>> >>> Code: >>> http://cr.openjdk.java.net/~rehn/8218041/webrev/ >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8218041 >> >> Looks good. >> From shade at redhat.com Wed Jan 30 10:54:02 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 30 Jan 2019 11:54:02 +0100 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: References: Message-ID: <6f79c5d4-3da3-20fb-1440-d771f8392de0@redhat.com> On 1/30/19 9:21 AM, Robbin Ehn wrote: > Code: > http://cr.openjdk.java.net/~rehn/8218041/webrev/ > Issue: > https://bugs.openjdk.java.net/browse/JDK-8218041 > > After fixing these includes, there was a circular dependency via shenandoah > code. I moved try_cancel_gc to cpp where the only use was. So it never should > had been in the inline header in the first place. I agree with this move. try_cancel is not very performance-sensitive, and it is called when other, more heavy-weight stuff is happening.
I marked the bug with "gc-shenandoah", so we get it in our backporting queue. Shenandoah still builds fine with/without PCH, and passes tests after this change. Other changes look good too. -Aleksey From dawid.weiss at gmail.com Wed Jan 30 10:55:35 2019 From: dawid.weiss at gmail.com (Dawid Weiss) Date: Wed, 30 Jan 2019 11:55:35 +0100 Subject: SIGSEGV on PhaseIdealLoop::split_up? In-Reply-To: <8c07e7f9-2e84-9183-7551-e5148917ef5a@redhat.com> References: <48117f7c-77db-452f-cb7b-f5c266fb78cd@oracle.com> <8c07e7f9-2e84-9183-7551-e5148917ef5a@redhat.com> Message-ID: > This is release build, right? fastdebug build probably asserts somewhere? I don't think we (or Uwe) runs jobs with fastdebug builds, to be honest. This isn't a bad idea though. > If Nils is not there, let Uwe find me at FOSDEM? CCing: Uwe. D. From shade at redhat.com Wed Jan 30 11:02:58 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 30 Jan 2019 12:02:58 +0100 Subject: RFR (S) 8217994: os::print_hex_dump should be more resilient against unreadable memory In-Reply-To: <0034E860-F7DE-4565-B157-93418E952388@sap.com> References: <988cff50-d9d8-29d7-2cce-148dea4fee60@redhat.com> <0034E860-F7DE-4565-B157-93418E952388@sap.com> Message-ID: <9dff8727-9bfb-41e3-8be6-8e8d6421e3d1@redhat.com> On 1/30/19 11:48 AM, Schmidt, Lutz wrote: > When looking at the newly introduced else branch, I had the idea of replacing the entire switch by > > st->print("%*s", 2*unitsize, "????????????????"); > > Like it? I do! That's a cute trick, but I don't think it works: the format width is the _minimal_ width, and the that 16-wide string argument would not be truncated to 2*unitsize. -Aleksey From nils.eliasson at oracle.com Wed Jan 30 10:55:07 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Wed, 30 Jan 2019 11:55:07 +0100 Subject: SIGSEGV on PhaseIdealLoop::split_up? 
In-Reply-To: References: <48117f7c-77db-452f-cb7b-f5c266fb78cd@oracle.com> <8c07e7f9-2e84-9183-7551-e5148917ef5a@redhat.com> Message-ID: <6ce2d1d2-a3ca-1d8f-56eb-5a76c77754d1@oracle.com> Hi, With the help of the replay-file I managed to compile the right Class, but the inlining doesn't match. The latest release I see is 7.6, but it looks like the crash is from a 9.0? Is that master? Do you have a link to a build that I can download? Regards, Nils On 2019-01-30 11:55, Dawid Weiss wrote: >> This is release build, right? fastdebug build probably asserts somewhere? > I don't think we (or Uwe) runs jobs with fastdebug builds, to be > honest. This isn't a bad idea though. > >> If Nils is not there, let Uwe find me at FOSDEM? > CCing: Uwe. > > D. From robbin.ehn at oracle.com Wed Jan 30 11:06:56 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 30 Jan 2019 12:06:56 +0100 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: <9945fe3a-9a24-a6e5-7bbd-4dc4282fa0f5@oracle.com> References: <5C165FCF-9955-4528-BCEF-B67C6315F9B1@oracle.com> <06551b6f-57ff-4665-1394-e53b2dc13827@oracle.com> <9945fe3a-9a24-a6e5-7bbd-4dc4282fa0f5@oracle.com> Message-ID: <2fbc7967-47fb-e6e5-5e80-87a7f7da391d@oracle.com> Hi David, On 2019-01-30 11:52, David Holmes wrote: > > So if it compiles but there's no include of the .inline.hpp file that just means > it won't get inlined - right? (ie the .hpp file must still be getting included) If the method is inlined everywhere, there will be no symbol at link time. Since we do not generate any object file from the inline.hpp, at least one object file from a cpp file must contain an out-of-line version. Since gcc seems to have issues with inlining, we find the symbol at link time. With clang, or when forcing inlining via an attribute, the link fails unless the inline method's definition is included, either directly or via some other include.
So you must include the inline.hpp in your cpp if you use a method from there since no-one else is guaranteed to generate that symbol. In this case clang managed to inline all the HandleMark() calls. I have now fixed all includes regarding handles.inline.hpp in cpp/inline.hpp files, sending out a v02 in a few minutes. Thanks, Robbin > > David > >> Compile testing this with all sort of compilers and options, now clang works. >> >> But the inline constructor just calls a non-inlined method: >> inline HandleMark::HandleMark() { >>   initialize(Thread::current()); >> } >> >> At least now Thread::current() is in thread.hpp, so the reason for inline seem >> not true anymore. I'm thinking that maybe I should change it and just include >> handles.hpp on the missing places. But since HandleMarkCleaner needs the >> inline file it's very asymmetric with HandleMark is fine with hpp but if you >> also use HandleMarkCleaner you need the inline. So maybe I'll just leave the >> 'dummy' inline constructor. >> >> So sending out an v2 of some sort. >> >> /Robbin >> >> On 2019-01-30 09:50, Kim Barrett wrote: >>>> On Jan 30, 2019, at 3:21 AM, Robbin Ehn wrote: >>>> >>>> Hi all, please review. >>>> >>>> Code: >>>> http://cr.openjdk.java.net/~rehn/8218041/webrev/ >>>> Issue: >>>> https://bugs.openjdk.java.net/browse/JDK-8218041 >>> >>> Looks good. >>> From dawid.weiss at gmail.com Wed Jan 30 11:11:27 2019 From: dawid.weiss at gmail.com (Dawid Weiss) Date: Wed, 30 Jan 2019 12:11:27 +0100 Subject: SIGSEGV on PhaseIdealLoop::split_up? In-Reply-To: <6ce2d1d2-a3ca-1d8f-56eb-5a76c77754d1@oracle.com> References: <48117f7c-77db-452f-cb7b-f5c266fb78cd@oracle.com> <8c07e7f9-2e84-9183-7551-e5148917ef5a@redhat.com> <6ce2d1d2-a3ca-1d8f-56eb-5a76c77754d1@oracle.com> Message-ID: Hi Nils, Those builds are made straight up from git (from various branches).
For example this failure: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/3472/ with these hs_err and replay files: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/3472/artifact/solr/build/solr-core/test/J1/hs_err_pid27685.log https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/3472/artifact/solr/build/solr-core/test/J1/replay_pid27685.log comes from rev db57468242 of git at github.com:apache/lucene-solr.git (the revision is mentioned on jenkins and in the full log), so: git clone git at github.com:apache/lucene-solr.git cd lucene-solr git checkout db57468242 then you can compile with: cd lucene ant jar Dawid On Wed, Jan 30, 2019 at 12:03 PM Nils Eliasson wrote: > > Hi, > > With the help of the replay-file I manange to compile the right Class, > but the inlining doesn't match. The latest release i see is 7.6, but it > looks like the crash is from a 9.0? Is that master? Do you have a link > to a build that I can download? > > Regards, > > Nils > > On 2019-01-30 11:55, Dawid Weiss wrote: > >> This is release build, right? fastdebug build probably asserts somewhere? > > I don't think we (or Uwe) runs jobs with fastdebug builds, to be > > honest. This isn't a bad idea though. > > > >> If Nils is not there, let Uwe find me at FOSDEM? > > CCing: Uwe. > > > > D. From robbin.ehn at oracle.com Wed Jan 30 11:17:26 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 30 Jan 2019 12:17:26 +0100 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: <6f79c5d4-3da3-20fb-1440-d771f8392de0@redhat.com> References: <6f79c5d4-3da3-20fb-1440-d771f8392de0@redhat.com> Message-ID: <9e4f33de-3c05-d5b2-860d-0c17b0e25b66@oracle.com> > I agree with this move. try_cancel is not very performance-sensitive, and it is called when other, > more heavy-weight stuff is happening. I marked the bug with "gc-shenandoah", so we get it in our > backporting queue. Shenandoah still builds fine with/without PCH, and passes tests after this change. 
Great, and it still can be inlined since it's in the same compilation unit and you do not take the address of it. > > Other changes look good too. Thanks, v02 in a sec or two! /Robbin > > -Aleksey > From robbin.ehn at oracle.com Wed Jan 30 11:17:42 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 30 Jan 2019 12:17:42 +0100 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: References: Message-ID: Hi, here is v02. Added includes of handles.inline.hpp in all files where we use a method from it. Compiles on my 7 different configs including gcc 7.3/8.2, clang 7, no pre-compiled headers. And tier-1 which includes our std builds. http://rehn-ws.se.oracle.com/cr_mirror/8218041/v02/inc http://rehn-ws.se.oracle.com/cr_mirror/8218041/v02/ Also this seems to significantly reduce gcc inline warnings about local comdat symbol for Handle(Thread*, oop). Thanks, Robbin On 2019-01-30 09:21, Robbin Ehn wrote: > Hi all, please review. > > Code: > http://cr.openjdk.java.net/~rehn/8218041/webrev/ > Issue: > https://bugs.openjdk.java.net/browse/JDK-8218041 > > After fixing these includes, there was a circular dependency via shenandoah > code. I moved try_cancel_gc to cpp where the only use was. So it never should > had been in the inline header in the first place. > > I listed why the include is needed below. > > Tier 1 and no pre-compiled. > > FYI: I was investigating why Handle::Handle(Thread*,oop) was not inlined. > gcc complains there being a local comdat symbol, forcing it to be inlined or > using clang there is no issue. So it looks like a gcc bug both in 7.3 and 8.2. > > Thanks, Robbin > > src/hotspot/share/aot/aotLoader.cpp > runtime/os.inline.hpp for os::dll_unload > > src/hotspot/share/c1/c1_Runtime1.cpp > runtime/handles.inline.hpp for Handle(Thread*, oop) > > src/hotspot/share/gc/z/zFuture.inline.hpp > runtime/interfaceSupport.inline.hpp not used. > > src/hotspot/share/prims/nativeLookup.cpp > runtime/os.inline.hpp
for os::dll_unload > > src/hotspot/share/runtime/handles.hpp > Forward declaration Thread > > src/hotspot/share/runtime/handles.inline.hpp > runtime/thread.hpp for Thread::current > oops/oop.inline.hpp for oopDesc::is_a > oops/metadata.hpp for is_valid > > src/hotspot/share/runtime/semaphore.inline.hpp > runtime/thread.hpp for osthread > > src/hotspot/share/runtime/vframe.cpp > runtime/thread.inline.hpp for JavaThread::class_to_be_initialized From lutz.schmidt at sap.com Wed Jan 30 11:21:18 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Wed, 30 Jan 2019 11:21:18 +0000 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> <7d85fc77-4e33-364f-3efe-a3757d6cfdbc@oracle.com> <2dff1a27-289f-82b2-0042-1fedd74d0593@redhat.com> Message-ID: <854D8A83-8F58-4B69-B99F-5B78E64EFA87@sap.com> Hi, is it really necessary to probe each and every byte in the range? 1) is_readable_pointer() tests four subsequent bytes. That suggests a stride of +/-4. 2) Assume a and b are two addresses on the same page. Are there platforms where a is accessible and b is not? If the answer to 2) is yes, then os::is_readable_range(const void* from, const void* to) needs to be fixed. Otherwise, at most one is_readable_pointer() call per page is necessary. Thanks for considering! Lutz On 29.01.19, 18:48, "hotspot-dev on behalf of Thomas Stüfe" wrote: On Tue, Jan 29, 2019 at 6:33 PM Aleksey Shipilev wrote: > On 1/29/19 6:27 PM, Thomas Stüfe wrote: > > On Tue, Jan 29, 2019 at 6:12 PM Aleksey Shipilev > wrote: > > > > On 1/29/19 5:34 PM, Thomas Stüfe wrote: > > > Even better. No need to store those bytes on the first leg. > > > > This would be webrev.05: > > http://cr.openjdk.java.net/~shade/8217879/webrev.05/ > > > > Looks fine. You could move calculation of low/high out of the loops > though. > > If you want to go with this one, I do not need another webrev.
> > You cannot that easily? Having calculation is the loop guarantees low/high > are definitely readable. > You can do this outside the loop, with +1/-1 to delta, but that sets us up > for the off-by-one errors... > > Oh, okay. This is fine to me then. ..Thomas > -Aleksey > > From shade at redhat.com Wed Jan 30 11:20:43 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 30 Jan 2019 12:20:43 +0100 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: References: Message-ID: On 1/30/19 12:17 PM, Robbin Ehn wrote: > http://rehn-ws.se.oracle.com/cr_mirror/8218041/v02/inc > http://rehn-ws.se.oracle.com/cr_mirror/8218041/v02/ $ ping rehn-ws.se.oracle.com -Aleksey From robbin.ehn at oracle.com Wed Jan 30 11:22:32 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 30 Jan 2019 12:22:32 +0100 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: References: Message-ID: <0ef461f0-e49c-8088-9018-1a1a943b3ff0@oracle.com> Sorry wrong url: http://cr.openjdk.java.net/~rehn/8218041/v02/ http://cr.openjdk.java.net/~rehn/8218041/v02/inc/ Thanks Aleksey for bringing to my attention! /Robbin On 2019-01-30 12:17, Robbin Ehn wrote: > Hi, here is v02. > > Add includes to handles.inline.hpp in all files we use a method in there. > Compiles on my 7 different configs including gcc 7.3/8.2, clang 7, no > pre-compiled headers. And tier-1 which includes our std builds. > > http://rehn-ws.se.oracle.com/cr_mirror/8218041/v02/inc > http://rehn-ws.se.oracle.com/cr_mirror/8218041/v02/ > > Also this seems to signification reduce gcc inline warnings about local comdat > symbol for Handle(Thread*, oop). > > Thanks, Robbin > > On 2019-01-30 09:21, Robbin Ehn wrote: >> Hi all, please review. >> >> Code: >> http://cr.openjdk.java.net/~rehn/8218041/webrev/ >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8218041 >> >> After fixing these includes, there was a circular dependency via shenandoah >> code. 
I moved try_cancel_gc to cpp where the only use was. So it never should >> had been in the inline header in the first place. >> >> I listed why the include is needed below. >> >> Tier 1 and no pre-compiled. >> >> FYI: I was investigating why Handle::Handle(Thread*,oop) was not inlined. >> gcc complains there being a local comdat symbol, forcing it to be inlined or >> using clang there is no issue. So it looks like a gcc bug both in 7.3 and 8.2. >> >> Thanks, Robbin >> >> src/hotspot/share/aot/aotLoader.cpp >> runtime/os.inline.hpp for os::dll_unload >> >> src/hotspot/share/c1/c1_Runtime1.cpp >> runtime/handles.inline.hpp for Handle(Thread*, oop) >> >> src/hotspot/share/gc/z/zFuture.inline.hpp >> runtime/interfaceSupport.inline.hpp not used. >> >> src/hotspot/share/prims/nativeLookup.cpp >> runtime/os.inline.hpp for os::dll_unload >> >> src/hotspot/share/runtime/handles.hpp >> Forward declaration Thread >> >> src/hotspot/share/runtime/handles.inline.hpp >> runtime/thread.hpp for Thread::current >> oops/oop.inline.hpp for oopDesc::is_a >> oops/metadata.hpp for is_valid >> >> src/hotspot/share/runtime/semaphore.inline.hpp >> runtime/thread.hpp for osthread >> >> src/hotspot/share/runtime/vframe.cpp >> runtime/thread.inline.hpp for JavaThread::class_to_be_initialized From robbin.ehn at oracle.com Wed Jan 30 11:22:55 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 30 Jan 2019 12:22:55 +0100 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: References: Message-ID: <59a148e0-d892-8185-6502-529de94b1c43@oracle.com> Thanks!
On 2019-01-30 12:20, Aleksey Shipilev wrote: > On 1/30/19 12:17 PM, Robbin Ehn wrote: >> http://rehn-ws.se.oracle.com/cr_mirror/8218041/v02/inc >> http://rehn-ws.se.oracle.com/cr_mirror/8218041/v02/ > > $ ping rehn-ws.se.oracle.com > :) /Robbin > > -Aleksey > From per.liden at oracle.com Wed Jan 30 11:26:30 2019 From: per.liden at oracle.com (Per Liden) Date: Wed, 30 Jan 2019 12:26:30 +0100 Subject: RFR: 8210832: Remove sneaky locking in class Monitor In-Reply-To: <4562855b-fa0e-9f70-da0f-00f84984d78f@oracle.com> References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> <4562855b-fa0e-9f70-da0f-00f84984d78f@oracle.com> Message-ID: <780d1589-12a3-d82d-ba29-58f77e5cc840@oracle.com> Hi Patricio, On 01/30/2019 01:24 AM, Patricio Chilano wrote: > Hi Per, > > On 1/29/19 4:22 AM, Per Liden wrote: >> Hi Patricio, >> >> On 01/28/2019 08:18 PM, Patricio Chilano wrote: >>> Hi Robbin, >>> >>> Thanks for reviewing this! Removing the block_in_safepoint_check >>> thread local attribute is a great idea, here is v02: >>> >>> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/webrev >> >> I really like that we're ditching our old locking code in favor of >> using pthread_mutex, et al. Nice work! > Thanks! : ) > >> General comment >> ---------------- >> I think Mutex to be a plain mutex and not come with the baggage of >> having a conditional variable. With this new code, it seems we're in a >> really good position to make that happen. I.e. 
something like this:
>>
>>   class PlatformMutex {
>>   protected:
>>     pthread_mutex_t _mutex;
>>
>>   public:
>>     PlatformMutex();
>>     ~PlatformMutex();
>>
>>     void lock();
>>     void unlock();
>>     bool try_lock();
>>   };
>>
>>   class PlatformMonitor : public PlatformMutex {
>>   private:
>>     pthread_cond_t _cond;
>>
>>   public:
>>     PlatformMonitor();
>>     ~PlatformMonitor();
>>
>>     int wait(jlong millis);
>>     void notify();
>>     void notify_all();
>>   };
>>
>> It might be that we want to do that as a separate step later instead
>> of including it in this patch. But I think we should try to get there.
> I agree this is a good idea, but since it would make sense to also
> rework them at the high-level Monitor/Mutex as David pointed out (this
> idea is actually also proposed in the comments of mutex.hpp) what do you
> think if I file this as a separate bugid to be worked after we pushed
> this patch ?

Sure, that can be done in a separate follow up patch.

>
>> src/hotspot/os/posix/os_*.[ch]pp
>> ---------------------------------
>> * I'd suggest that we place the PlatformMonitor class in a separate
>> file (like src/hotspot/os/posix/monitor_posix.cpp), just like we have
>> done with Semaphore (in src/hotspot/os/posix/semaphore_posix.cpp).
> I tried to moved them but there is a small issue in that PlatformMonitor
> code needs static methods defined in their current os_*.cpp files
> (methods that parse timing structs). I can declare them as public
> (cannot move them since they are also used by PlatformEvent and Parker),
> but for the Posix version of PlatformMonitor I would also need to do
> that with _condAttr and _mutexAttr which are also defined static in that
> file and are needed by PlatformMonitor::PlatformMonitor. So not sure
> what the right approach is here.
> In any case shouldn't we aim to have all synchronization-like classes in > the same file for each platform (something like syncro_posix, > syncro_windows, etc) instead of a separate file for each of them > (semaphore_*, monitor_*, waitbarrier_*, etc). Otherwise seems > PlatformParker and PlatformEvent should also be in their own file. Keeping things in separate files can make sense if these things can be used standalone. A plain mutex (just like the plain semaphore we have) can come handy in many places where you just want that mutex, without having to drag in other classes or the whole os layer. Keeps dependencies under control, reduces compile times, etc. > >> src/hotspot/os/posix/os_posix.hpp >> src/hotspot/os/solaris/os_solaris.hpp >> src/hotspot/os/windows/os_windows.hpp >> ------------------------------------- >> * Please make _mutex/_cond plain variables, instead of arrays of 1. >> That's just ugly ;) > Done! > >> src/hotspot/os/posix/os_posix.cpp >> --------------------------------- >> * Destructor missing, to call pthread_(mutex|cond)_destroy(). > Done! > >> src/hotspot/os/solaris/os_solaris.hpp >> ------------------------------------- >> * Not sure if there's a good reason to have the constructor be inlined >> here. I'd suggest moving it to the cpp file. >> >> * Destructor missing. > Done! > >> src/hotspot/os/windows/os_windows.cpp >> ------------------------------------- >> * Destructor missing (I'm not too familiar with the windows API but I >> assume there's a destroy function we should call here). > Done! (There is a destroy function for mutexes but not for condition > variables which apparently do not need to free anything explicitly). 
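The review items quoted above (a plain mutex without condition-variable baggage, _mutex/_cond as plain members rather than arrays of 1, and destructors that call pthread_(mutex|cond)_destroy()) can be put together in a small standalone sketch. This is only an illustration under assumptions, not the HotSpot code: it assumes a POSIX platform, uses long instead of jlong, treats millis <= 0 as "wait without timeout", and computes the deadline against CLOCK_REALTIME with default attributes (the real patch uses _mutexAttr/_condAttr and the os_posix timing helpers, which are not reproduced here).

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <time.h>

// Hypothetical standalone version of the proposed split; names follow the
// quoted proposal, bodies are assumptions made for illustration.
class PlatformMutex {
 protected:
  pthread_mutex_t _mutex;  // plain member, not an array of 1

 public:
  PlatformMutex() {
    int rc = pthread_mutex_init(&_mutex, nullptr);
    assert(rc == 0); (void)rc;
  }
  ~PlatformMutex() {
    // The destructor the review asked for; the mutex must be unlocked here.
    int rc = pthread_mutex_destroy(&_mutex);
    assert(rc == 0); (void)rc;
  }
  PlatformMutex(const PlatformMutex&) = delete;
  PlatformMutex& operator=(const PlatformMutex&) = delete;

  void lock()     { pthread_mutex_lock(&_mutex); }
  void unlock()   { pthread_mutex_unlock(&_mutex); }
  bool try_lock() { return pthread_mutex_trylock(&_mutex) == 0; }
};

class PlatformMonitor : public PlatformMutex {
 private:
  pthread_cond_t _cond;

 public:
  PlatformMonitor()  { pthread_cond_init(&_cond, nullptr); }
  // Runs before ~PlatformMutex(), so the condition variable goes first.
  ~PlatformMonitor() { pthread_cond_destroy(&_cond); }

  // Caller must hold the lock. Returns 0 on wakeup (possibly spurious)
  // or ETIMEDOUT. Assumption: millis <= 0 means no timeout.
  int wait(long millis) {
    if (millis <= 0) {
      return pthread_cond_wait(&_cond, &_mutex);
    }
    timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);  // default condvar clock
    ts.tv_sec  += millis / 1000;
    ts.tv_nsec += (millis % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {     // normalize nanoseconds
      ts.tv_sec  += 1;
      ts.tv_nsec -= 1000000000L;
    }
    return pthread_cond_timedwait(&_cond, &_mutex, &ts);
  }
  void notify()     { pthread_cond_signal(&_cond); }
  void notify_all() { pthread_cond_broadcast(&_cond); }
};
```

With this shape, code that only needs mutual exclusion can use PlatformMutex alone, which is the dependency argument made for keeping such classes in separate files.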
> >> src/hotspot/share/runtime/interfaceSupport.inline.hpp >> ----------------------------------------------------- >> * Move "private:" above monitor_adr; >> >> 289 class ThreadLockBlockInVM : public ThreadStateTransition { >> 290 Monitor** monitor_adr; >> 291 private: >> 292 void do_preempted(Monitor** in_flight_monitor_adr) { >> >> * monitor_adr should be _monitor_adr, or maybe even >> _in_flight_monitor_adr to better match the name of the argument. > Done! I realized there is no need for passing a parameter to > do_preempted() since we already have the in_flight_monitor_adr so I also > made small changes there. > > > Here is v03 including also Dan and Robbin comments about mutex.cpp and > safepointMechanism.hpp: > > Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/webrev/ > Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/inc/webrev/ What kind of performance measurements have been done on this patch? I took your v03 patch for a spin in SPECjbb2015 (with ZGC enabled) and did not notice any obvious regressions in either throughput nor latency. cheers, Per > > Running mach tiers1-3. Waiting though on you thoughts about file > organization and deferring Mutex/Monitor rework. > > Thanks for looking into this Per! > > Thanks, > Patricio >> cheers, >> Per >> >>> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/inc/webrev/ >>> >>> Running mach5 again. >>> >>> Thanks, >>> Patricio >>> >>> On 1/28/19 8:31 AM, Robbin Ehn wrote: >>>> Hi Patricio, >>>> >>>> Mostly looks good! >>>> >>>> block_at_safepoint is always called with block_in_safepoint_check = >>>> true. (correct?) >>>> Changing that to a local state instead of global simplifies the code. >>>> >>>> So I'm suggesting something like below. 
>>>>
>>>> Thanks, Robbin
>>>>
>>>> diff -r e65cc445234c src/hotspot/share/runtime/interfaceSupport.inline.hpp
>>>> --- a/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon Jan 28 13:10:15 2019 +0100
>>>> +++ b/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon Jan 28 14:10:59 2019 +0100
>>>> @@ -308,2 +308,1 @@
>>>> - thread->block_in_safepoint_check = false;
>>>> - SafepointMechanism::block_at_safepoint(thread);
>>>> + SafepointMechanism::callback_if_safepoint(thread);
>>>> @@ -323,2 +322,1 @@
>>>> - SafepointMechanism::block_at_safepoint(_thread);
>>>> - _thread->block_in_safepoint_check = true;
>>>> + SafepointMechanism::callback_if_safepoint(_thread);
>>>> @@ -335,2 +332,0 @@
>>>> - } else {
>>>> - _thread->block_in_safepoint_check = true;
>>>> @@ -337,0 +334,1 @@
>>>> +
>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.cpp
>>>> --- a/src/hotspot/share/runtime/safepoint.cpp Mon Jan 28 13:10:15 2019 +0100
>>>> +++ b/src/hotspot/share/runtime/safepoint.cpp Mon Jan 28 14:10:59 2019 +0100
>>>> @@ -795,1 +795,1 @@
>>>> -void SafepointSynchronize::block(JavaThread *thread) {
>>>> +void SafepointSynchronize::block(JavaThread *thread, bool block_in_safepoint_check) {
>>>> @@ -850,1 +850,1 @@
>>>> - if (thread->block_in_safepoint_check) {
>>>> + if (block_in_safepoint_check) {
>>>> @@ -880,1 +880,1 @@
>>>> - thread->block_in_safepoint_check) {
>>>> + block_in_safepoint_check) {
>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.hpp
>>>> --- a/src/hotspot/share/runtime/safepoint.hpp Mon Jan 28 13:10:15 2019 +0100
>>>> +++ b/src/hotspot/share/runtime/safepoint.hpp Mon Jan 28 14:10:59 2019 +0100
>>>> @@ -146,1 +146,1 @@
>>>> - static void block(JavaThread *thread);
>>>> + static void block(JavaThread *thread, bool block_in_safepoint_check = true);
>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepointMechanism.hpp
>>>> --- a/src/hotspot/share/runtime/safepointMechanism.hpp Mon Jan 28
>>>> 13:10:15 2019 +0100 >>>> +++ b/src/hotspot/share/runtime/safepointMechanism.hpp Mon Jan 28 >>>> 14:10:59 2019 +0100 >>>> @@ -82,1 +82,1 @@ >>>> - static inline void block_at_safepoint(JavaThread* thread); >>>> + static inline void callback_if_safepoint(JavaThread* thread); >>>> diff -r e65cc445234c >>>> src/hotspot/share/runtime/safepointMechanism.inline.hpp >>>> --- a/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>>> Jan 28 13:10:15 2019 +0100 >>>> +++ b/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>>> Jan 28 14:10:59 2019 +0100 >>>> @@ -82,1 +82,1 @@ >>>> -void SafepointMechanism::block_at_safepoint(JavaThread* thread) { >>>> +void SafepointMechanism::callback_if_safepoint(JavaThread* thread) { >>>> @@ -84,1 +84,1 @@ >>>> - SafepointSynchronize::block(thread); >>>> + SafepointSynchronize::block(thread, false); >>>> diff -r e65cc445234c src/hotspot/share/runtime/thread.cpp >>>> --- a/src/hotspot/share/runtime/thread.cpp Mon Jan 28 13:10:15 >>>> 2019 +0100 >>>> +++ b/src/hotspot/share/runtime/thread.cpp Mon Jan 28 14:10:59 >>>> 2019 +0100 >>>> @@ -298,2 +297,0 @@ >>>> - block_in_safepoint_check = true; >>>> - >>>> diff -r e65cc445234c src/hotspot/share/runtime/thread.hpp >>>> --- a/src/hotspot/share/runtime/thread.hpp Mon Jan 28 13:10:15 >>>> 2019 +0100 >>>> +++ b/src/hotspot/share/runtime/thread.hpp Mon Jan 28 14:10:59 >>>> 2019 +0100 >>>> @@ -788,2 +787,0 @@ >>>> - bool block_in_safepoint_check; // to decide whether >>>> to block in SS::block or not >>>> - >>>> >>>> >>>> On 1/28/19 9:42 AM, Patricio Chilano wrote: >>>>> Hi all, >>>>> >>>>> Please review the following patch: >>>>> >>>>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8210832 >>>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8210832/v01/webrev/ >>>>> >>>>> The current implementation of native monitors uses a technique that >>>>> we name "sneaky locking" to prevent possible deadlocks of the JVM >>>>> during safepoints. 
The implementation of this technique though >>>>> introduces a race when a monitor is shared between the VMThread and >>>>> non-JavaThreads. This patch aims to solve that problem and at the >>>>> same time simplify the code. >>>>> >>>>> The proposal is based on the introduction of the new class >>>>> PlatformMonitor, which serves as a wrapper for the actual >>>>> synchronization primitives in each platform (mutexes and condition >>>>> variables). Most of the API calls can thus be implemented as simple >>>>> wrappers around PlatformMonitor, adding more assertions and very >>>>> little extra metadata. >>>>> To be able to remove the lock sneaking code and at the same time >>>>> avoid deadlocking scenarios, we combine two techniques: >>>>> >>>>> -When a JavaThread that has just acquired the lock, detects there >>>>> is a safepoint request in the ThreadLockBlockInVM destructor, it >>>>> releases the lock before blocking at the safepoint. After resuming >>>>> from it, the JavaThread will have to acquire the lock again. >>>>> >>>>> - In the ThreadLockBlockInVM constructor for the Monitor::wait() >>>>> method, in order to avoid blocking we allow for a possible >>>>> safepoint request to make progress but without letting the >>>>> JavaThread block for it (since we would be stopped by the >>>>> destructor anyways). We also do that for the Monitor::lock() case >>>>> although no deadlock is being prevented there. >>>>> >>>>> The ThreadLockBlockInVM jacket is a new ThreadStateTransition class >>>>> used instead of the ThreadBlockInVM one. This allowed more >>>>> flexibility to handle the two techniques mentioned above. Also, >>>>> ThreadBlockInVM calls SafepointMechanism::block_if_requested() >>>>> which creates some problems when trying to allow safepoints to >>>>> continue without stopping, since that method not only checks for >>>>> safepoints but also processes handshakes. >>>>> >>>>> In terms of performance, benchmarks show very similar results to >>>>> what we have now. 
>>>>> >>>>> So far mach5 hs-tier1-6 on Linux, OS X, Windows and Solaris have >>>>> been tested. >>>>> >>>>> Thanks, >>>>> Patricio >>>>> >>> > From lutz.schmidt at sap.com Wed Jan 30 11:35:05 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Wed, 30 Jan 2019 11:35:05 +0000 Subject: RFR (S) 8217994: os::print_hex_dump should be more resilient against unreadable memory In-Reply-To: <9dff8727-9bfb-41e3-8be6-8e8d6421e3d1@redhat.com> References: <988cff50-d9d8-29d7-2cce-148dea4fee60@redhat.com> <0034E860-F7DE-4565-B157-93418E952388@sap.com> <9dff8727-9bfb-41e3-8be6-8e8d6421e3d1@redhat.com> Message-ID: OK, my fault. Didn't RTFM before typing. How about this version: st->print("%*.*s", 2*unitsize, 2*unitsize, "????????????????"); For strings, the .precision sub-specifier serves as "max. #characters to be printed". (see http://www.cplusplus.com/reference/cstdio/printf/) Regards, Lutz ?On 30.01.19, 12:02, "Aleksey Shipilev" wrote: On 1/30/19 11:48 AM, Schmidt, Lutz wrote: > When looking at the newly introduced else branch, I had the idea of replacing the entire switch by > > st->print("%*s", 2*unitsize, "????????????????"); > > Like it? I do! That's a cute trick, but I don't think it works: the format width is the _minimal_ width, and the that 16-wide string argument would not be truncated to 2*unitsize. -Aleksey From stefan.karlsson at oracle.com Wed Jan 30 12:05:39 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 30 Jan 2019 13:05:39 +0100 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: <0ef461f0-e49c-8088-9018-1a1a943b3ff0@oracle.com> References: <0ef461f0-e49c-8088-9018-1a1a943b3ff0@oracle.com> Message-ID: Hi Robbin, Thanks for cleaning this up! https://cr.openjdk.java.net/~rehn/8218041/v02/webrev/src/hotspot/share/runtime/handles.inline.hpp.udiff.html An alternative would be to somehow call a non-inlined version of oopDesc::is_a, given that it's only used in an assert. 
That way we wouldn't have to include oop.inline.hpp in handles.inline.hpp. https://cr.openjdk.java.net/~rehn/8218041/v02/webrev/src/hotspot/share/ci/ciMethod.cpp.udiff.html https://cr.openjdk.java.net/~rehn/8218041/v02/webrev/src/hotspot/share/jvmci/jvmciCompilerToVM.cpp.udiff.html Incorrect sort order. https://cr.openjdk.java.net/~rehn/8218041/v02/webrev/src/hotspot/share/jvmci/compilerRuntime.cpp.udiff.html Preexisting incorrect sort order for deoptimization.hpp. Maybe fix it in this patch? Thanks, StefanK On 2019-01-30 12:22, Robbin Ehn wrote: > Sorry wrong url: > http://cr.openjdk.java.net/~rehn/8218041/v02/ > http://cr.openjdk.java.net/~rehn/8218041/v02/inc/ > > Thanks Aleksey for bringing to my attention! > > /Robbin > > On 2019-01-30 12:17, Robbin Ehn wrote: >> Hi, here is v02. >> >> Add includes to handles.inline.hpp in all files we use a method in >> there. >> Compiles on my 7 different configs including gcc 7.3/8.2, clang 7, no >> pre-compiled headers. And tier-1 which includes our std builds. >> >> http://rehn-ws.se.oracle.com/cr_mirror/8218041/v02/inc >> http://rehn-ws.se.oracle.com/cr_mirror/8218041/v02/ >> >> Also this seems to signification reduce gcc inline warnings about >> local comdat symbol for Handle(Thread*, oop). >> >> Thanks, Robbin >> >> On 2019-01-30 09:21, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> Code: >>> http://cr.openjdk.java.net/~rehn/8218041/webrev/ >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8218041 >>> >>> After fixing these includes, there was a circular dependency via >>> shenandoah >>> code. I moved try_cancel_gc to cpp where the only use was. So it >>> never should >>> had been in the inline header in the first place. >>> >>> I listed why the include is needed below. >>> >>> Tier 1 and no pre-compiled. >>> >>> FYI: I was investigating why Handle::Handle(Thread*,oop) was not >>> inlined. >>> gcc complains there being a local comdat symbol, forcing it to be >>> inlined or >>> using clang there is no issue. 
So it looks like a gcc bug both in >>> 7.3 and 8.2. >>> >>> Thanks, Robbin >>> >>> src/hotspot/share/aot/aotLoader.cpp >>> runtime/os.inline.hpp????? for os::dll_unload >>> >>> src/hotspot/share/c1/c1_Runtime1.cpp >>> runtime/handles.inline.hpp for Handle(Thread*, oop) >>> >>> src/hotspot/share/gc/z/zFuture.inline.hpp >>> runtime/interfaceSupport.inline.hpp not used. >>> >>> src/hotspot/share/prims/nativeLookup.cpp >>> runtime/os.inline.hpp????? for os::dll_unload >>> >>> src/hotspot/share/runtime/handles.hpp >>> Forward declaration??????????? Thread >>> >>> src/hotspot/share/runtime/handles.inline.hpp >>> runtime/thread.hpp?????? for Thread::current >>> oops/oop.inline.hpp??????? for oopDesc::is_a >>> oops/metadata.hpp????????? for is_valid >>> >>> src/hotspot/share/runtime/semaphore.inline.hpp >>> runtime/thread.hpp???????? for osthread >>> >>> src/hotspot/share/runtime/vframe.cpp >>> runtime/thread.inline.hpp? for JavaThread::class_to_be_initialized From thomas.stuefe at gmail.com Wed Jan 30 12:10:08 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 30 Jan 2019 13:10:08 +0100 Subject: RFR (S) 8217994: os::print_hex_dump should be more resilient against unreadable memory In-Reply-To: References: <988cff50-d9d8-29d7-2cce-148dea4fee60@redhat.com> <0034E860-F7DE-4565-B157-93418E952388@sap.com> <9dff8727-9bfb-41e3-8be6-8e8d6421e3d1@redhat.com> Message-ID: Or, since this thing keeps coming up, add a helper to outputStream: +void outputStream::put(char ch, int repeat_count) { + for (int i = 0; i < repeat_count; i ++) { + put(ch); + } +} + st->put('?', 2*unitsize); ..Thomas On Wed, Jan 30, 2019 at 12:35 PM Schmidt, Lutz wrote: > OK, my fault. > Didn't RTFM before typing. How about this version: > > st->print("%*.*s", 2*unitsize, 2*unitsize, "????????????????"); > > For strings, the .precision sub-specifier serves as "max. #characters to > be printed". 
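The width-versus-precision point in this exchange can be checked outside the VM. The following is a minimal sketch, assuming plain std::snprintf as a stand-in for st->print (outputStream is not used here); the two helper names are hypothetical and exist only for the comparison.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// "%*s": the width argument is only a minimum, so a longer string
// argument is printed in full.
std::string width_only(int unitsize) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%*s", 2 * unitsize, "????????????????");
  return buf;
}

// "%*.*s": the precision argument caps the number of characters taken
// from the string, so the 16-character marker is cut to 2*unitsize.
std::string width_and_precision(int unitsize) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%*.*s", 2 * unitsize, 2 * unitsize,
                "????????????????");
  return buf;
}
```

For unitsize 1 the first helper still yields all 16 marker characters, while the second yields exactly two, which is the behaviour the "%*.*s" variant relies on.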
> (see http://www.cplusplus.com/reference/cstdio/printf/)
>
> Regards, Lutz
>
> On 30.01.19, 12:02, "Aleksey Shipilev" wrote:
>
> On 1/30/19 11:48 AM, Schmidt, Lutz wrote:
> > When looking at the newly introduced else branch, I had the idea of
> > replacing the entire switch by
> >
> > st->print("%*s", 2*unitsize, "????????????????");
> >
> > Like it? I do!
>
> That's a cute trick, but I don't think it works: the format width is
> the _minimal_ width, and that 16-wide string argument would not be
> truncated to 2*unitsize.
>
> -Aleksey
>
>
>

From thomas.stuefe at gmail.com Wed Jan 30 12:11:28 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Wed, 30 Jan 2019 13:11:28 +0100
Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump
In-Reply-To: <854D8A83-8F58-4B69-B99F-5B78E64EFA87@sap.com>
References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> <7d85fc77-4e33-364f-3efe-a3757d6cfdbc@oracle.com> <2dff1a27-289f-82b2-0042-1fedd74d0593@redhat.com> <854D8A83-8F58-4B69-B99F-5B78E64EFA87@sap.com>
Message-ID:

I think this probing is not needed anymore with 8217994, or?

On Wed, Jan 30, 2019 at 12:21 PM Schmidt, Lutz wrote:
> Hi,
>
> is it really necessary to probe each and every byte in the range?
> 1) is_readable_pointer() tests four subsequent bytes. That suggests a
> stride of +/-4.
> 2) Assume a and b are two addresses on the same page. Are there platforms
> where a is accessible and b is not?
>
> If the answer to 2) is yes, then os::is_readable_range(const void* from,
> const void* to) needs to be fixed. Otherwise, at most one
> is_readable_pointer() call per page is necessary.
>
> Thanks for considering!
> Lutz
>
> On 29.01.19, 18:48, "hotspot-dev on behalf of Thomas Stüfe" <
> hotspot-dev-bounces at openjdk.java.net on behalf of thomas.stuefe at gmail.com>
> wrote:
>
> On Tue, Jan 29, 2019 at 6:33 PM Aleksey Shipilev
> wrote:
>
> > On 1/29/19 6:27 PM, Thomas Stüfe wrote:
> > > On Tue, Jan 29, 2019 at 6:12 PM Aleksey Shipilev
> > wrote:
> > >
> > > On 1/29/19 5:34 PM, Thomas Stüfe wrote:
> > > > Even better. No need to store those bytes on the first leg.
> > >
> > > This would be webrev.05:
> > > http://cr.openjdk.java.net/~shade/8217879/webrev.05/
> > >
> > > Looks fine. You could move calculation of low/high out of the loops
> > though.
> > > If you want to go with this one, I do not need another webrev.
> >
> > You cannot that easily? Having calculation is the loop guarantees
> low/high
> > are definitely readable.
> > You can do this outside the loop, with +1/-1 to delta, but that sets
> us up
> > for the off-by-one errors...
> >
> >
> Oh, okay. This is fine to me then.
>
> ..Thomas
>
> >
> > -Aleksey
> >
> >
>
>
>

From shade at redhat.com Wed Jan 30 12:20:39 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 30 Jan 2019 13:20:39 +0100
Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump
In-Reply-To:
References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> <7d85fc77-4e33-364f-3efe-a3757d6cfdbc@oracle.com> <2dff1a27-289f-82b2-0042-1fedd74d0593@redhat.com> <854D8A83-8F58-4B69-B99F-5B78E64EFA87@sap.com>
Message-ID: <433cf341-90c2-d3c4-d3b2-8b2da7b84ff4@redhat.com>

On 1/30/19 1:11 PM, Thomas Stüfe wrote:
> I think this probing is not needed anymore with 8217994, or?

It probably does not, but 8217994 should get in first. I implore everyone to stop with bikeshedding patches to
death, meanwhile.
-Aleksey From shade at redhat.com Wed Jan 30 12:31:36 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 30 Jan 2019 13:31:36 +0100 Subject: RFR (S) 8217994: os::print_hex_dump should be more resilient against unreadable memory In-Reply-To: References: <988cff50-d9d8-29d7-2cce-148dea4fee60@redhat.com> <0034E860-F7DE-4565-B157-93418E952388@sap.com> <9dff8727-9bfb-41e3-8be6-8e8d6421e3d1@redhat.com> Message-ID: <542945f5-9c38-bea5-8a64-0f8d6ab32a13@redhat.com> On 1/30/19 1:10 PM, Thomas St?fe wrote: > Or, since this thing keeps coming up, add a helper to outputStream: > > +void outputStream::put(char ch, int repeat_count) { > + for (int i = 0; i < repeat_count; i ++) { > + put(ch); > + } > +} > + > > st->put('?', 2*unitsize); No. Let's stop here: http://cr.openjdk.java.net/~shade/8217994/webrev.02/ There are thousand ways to do things, and there are only 8 work hours per day. Current patch with unusual format specifier is already good, to my taste. -Aleksey From robbin.ehn at oracle.com Wed Jan 30 12:41:04 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 30 Jan 2019 13:41:04 +0100 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: References: <0ef461f0-e49c-8088-9018-1a1a943b3ff0@oracle.com> Message-ID: <3e6b968f-dd5c-901c-25e9-6339d25ffd2a@oracle.com> Hi Stefan, On 2019-01-30 13:05, Stefan Karlsson wrote: > Hi Robbin, > > Thanks for cleaning this up! > > https://cr.openjdk.java.net/~rehn/8218041/v02/webrev/src/hotspot/share/runtime/handles.inline.hpp.udiff.html > > > An alternative would be to somehow call a non-inlined version of oopDesc::is_a, > given that it's only used in an assert. That way we wouldn't have to include > oop.inline.hpp in handles.inline.hpp. My bad, we actually never use that method since it's a macro argument, and macro only generates the special ones: is_instance_noinline, is_array_noinline, is_objArray_noinline, is_typeArray_noinline. 
So I just changed to include oop.hpp :) > > https://cr.openjdk.java.net/~rehn/8218041/v02/webrev/src/hotspot/share/ci/ciMethod.cpp.udiff.html > > https://cr.openjdk.java.net/~rehn/8218041/v02/webrev/src/hotspot/share/jvmci/jvmciCompilerToVM.cpp.udiff.html > > > Incorrect sort order. Fixed. > > https://cr.openjdk.java.net/~rehn/8218041/v02/webrev/src/hotspot/share/jvmci/compilerRuntime.cpp.udiff.html > > > Preexisting incorrect sort order for deoptimization.hpp. Maybe fix it in this > patch? Fixed. Compile testing, sending the v3 to rfr mail if all passes. Thanks! /Robbin > > Thanks, > StefanK > > On 2019-01-30 12:22, Robbin Ehn wrote: >> Sorry wrong url: >> http://cr.openjdk.java.net/~rehn/8218041/v02/ >> http://cr.openjdk.java.net/~rehn/8218041/v02/inc/ >> >> Thanks Aleksey for bringing to my attention! >> >> /Robbin >> >> On 2019-01-30 12:17, Robbin Ehn wrote: >>> Hi, here is v02. >>> >>> Add includes to handles.inline.hpp in all files we use a method in there. >>> Compiles on my 7 different configs including gcc 7.3/8.2, clang 7, no >>> pre-compiled headers. And tier-1 which includes our std builds. >>> >>> http://rehn-ws.se.oracle.com/cr_mirror/8218041/v02/inc >>> http://rehn-ws.se.oracle.com/cr_mirror/8218041/v02/ >>> >>> Also this seems to signification reduce gcc inline warnings about local >>> comdat symbol for Handle(Thread*, oop). >>> >>> Thanks, Robbin >>> >>> On 2019-01-30 09:21, Robbin Ehn wrote: >>>> Hi all, please review. >>>> >>>> Code: >>>> http://cr.openjdk.java.net/~rehn/8218041/webrev/ >>>> Issue: >>>> https://bugs.openjdk.java.net/browse/JDK-8218041 >>>> >>>> After fixing these includes, there was a circular dependency via shenandoah >>>> code. I moved try_cancel_gc to cpp where the only use was. So it never should >>>> had been in the inline header in the first place. >>>> >>>> I listed why the include is needed below. >>>> >>>> Tier 1 and no pre-compiled. >>>> >>>> FYI: I was investigating why Handle::Handle(Thread*,oop) was not inlined. 
>>>> gcc complains there being a local comdat symbol, forcing it to be inlined or >>>> using clang there is no issue. So it looks like a gcc bug both in 7.3 and 8.2. >>>> >>>> Thanks, Robbin >>>> >>>> src/hotspot/share/aot/aotLoader.cpp >>>> runtime/os.inline.hpp????? for os::dll_unload >>>> >>>> src/hotspot/share/c1/c1_Runtime1.cpp >>>> runtime/handles.inline.hpp for Handle(Thread*, oop) >>>> >>>> src/hotspot/share/gc/z/zFuture.inline.hpp >>>> runtime/interfaceSupport.inline.hpp not used. >>>> >>>> src/hotspot/share/prims/nativeLookup.cpp >>>> runtime/os.inline.hpp????? for os::dll_unload >>>> >>>> src/hotspot/share/runtime/handles.hpp >>>> Forward declaration??????????? Thread >>>> >>>> src/hotspot/share/runtime/handles.inline.hpp >>>> runtime/thread.hpp?????? for Thread::current >>>> oops/oop.inline.hpp??????? for oopDesc::is_a >>>> oops/metadata.hpp????????? for is_valid >>>> >>>> src/hotspot/share/runtime/semaphore.inline.hpp >>>> runtime/thread.hpp???????? for osthread >>>> >>>> src/hotspot/share/runtime/vframe.cpp >>>> runtime/thread.inline.hpp? for JavaThread::class_to_be_initialized > From david.holmes at oracle.com Wed Jan 30 12:54:23 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 30 Jan 2019 22:54:23 +1000 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: <0ef461f0-e49c-8088-9018-1a1a943b3ff0@oracle.com> References: <0ef461f0-e49c-8088-9018-1a1a943b3ff0@oracle.com> Message-ID: Seems okay to me. Thanks, David On 30/01/2019 9:22 pm, Robbin Ehn wrote: > Sorry wrong url: > http://cr.openjdk.java.net/~rehn/8218041/v02/ > http://cr.openjdk.java.net/~rehn/8218041/v02/inc/ > > Thanks Aleksey for bringing to my attention! > > /Robbin > > On 2019-01-30 12:17, Robbin Ehn wrote: >> Hi, here is v02. >> >> Add includes to handles.inline.hpp in all files we use a method in there. >> Compiles on my 7 different configs including gcc 7.3/8.2, clang 7, no >> pre-compiled headers. And tier-1 which includes our std builds. 
>> >> http://rehn-ws.se.oracle.com/cr_mirror/8218041/v02/inc >> http://rehn-ws.se.oracle.com/cr_mirror/8218041/v02/ >> >> Also this seems to signification reduce gcc inline warnings about >> local comdat symbol for Handle(Thread*, oop). >> >> Thanks, Robbin >> >> On 2019-01-30 09:21, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> Code: >>> http://cr.openjdk.java.net/~rehn/8218041/webrev/ >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8218041 >>> >>> After fixing these includes, there was a circular dependency via >>> shenandoah >>> code. I moved try_cancel_gc to cpp where the only use was. So it >>> never should >>> had been in the inline header in the first place. >>> >>> I listed why the include is needed below. >>> >>> Tier 1 and no pre-compiled. >>> >>> FYI: I was investigating why Handle::Handle(Thread*,oop) was not >>> inlined. >>> gcc complains there being a local comdat symbol, forcing it to be >>> inlined or >>> using clang there is no issue. So it looks like a gcc bug both in 7.3 >>> and 8.2. >>> >>> Thanks, Robbin >>> >>> src/hotspot/share/aot/aotLoader.cpp >>> runtime/os.inline.hpp????? for os::dll_unload >>> >>> src/hotspot/share/c1/c1_Runtime1.cpp >>> runtime/handles.inline.hpp for Handle(Thread*, oop) >>> >>> src/hotspot/share/gc/z/zFuture.inline.hpp >>> runtime/interfaceSupport.inline.hpp not used. >>> >>> src/hotspot/share/prims/nativeLookup.cpp >>> runtime/os.inline.hpp????? for os::dll_unload >>> >>> src/hotspot/share/runtime/handles.hpp >>> Forward declaration??????????? Thread >>> >>> src/hotspot/share/runtime/handles.inline.hpp >>> runtime/thread.hpp?????? for Thread::current >>> oops/oop.inline.hpp??????? for oopDesc::is_a >>> oops/metadata.hpp????????? for is_valid >>> >>> src/hotspot/share/runtime/semaphore.inline.hpp >>> runtime/thread.hpp???????? for osthread >>> >>> src/hotspot/share/runtime/vframe.cpp >>> runtime/thread.inline.hpp? 
for JavaThread::class_to_be_initialized From robbin.ehn at oracle.com Wed Jan 30 13:07:29 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 30 Jan 2019 14:07:29 +0100 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: References: <0ef461f0-e49c-8088-9018-1a1a943b3ff0@oracle.com> Message-ID: <7fc52fe7-3560-332c-e3ae-e21d6d5854a2@oracle.com> Thanks David, Robbin. On 2019-01-30 13:54, David Holmes wrote: > Seems okay to me. > > Thanks, > David > > On 30/01/2019 9:22 pm, Robbin Ehn wrote: >> Sorry wrong url: >> http://cr.openjdk.java.net/~rehn/8218041/v02/ >> http://cr.openjdk.java.net/~rehn/8218041/v02/inc/ >> >> Thanks Aleksey for bringing to my attention! >> >> /Robbin >> >> On 2019-01-30 12:17, Robbin Ehn wrote: >>> Hi, here is v02. >>> >>> Add includes to handles.inline.hpp in all files we use a method in there. >>> Compiles on my 7 different configs including gcc 7.3/8.2, clang 7, no >>> pre-compiled headers. And tier-1 which includes our std builds. >>> >>> http://rehn-ws.se.oracle.com/cr_mirror/8218041/v02/inc >>> http://rehn-ws.se.oracle.com/cr_mirror/8218041/v02/ >>> >>> Also this seems to signification reduce gcc inline warnings about local >>> comdat symbol for Handle(Thread*, oop). >>> >>> Thanks, Robbin >>> >>> On 2019-01-30 09:21, Robbin Ehn wrote: >>>> Hi all, please review. >>>> >>>> Code: >>>> http://cr.openjdk.java.net/~rehn/8218041/webrev/ >>>> Issue: >>>> https://bugs.openjdk.java.net/browse/JDK-8218041 >>>> >>>> After fixing these includes, there was a circular dependency via shenandoah >>>> code. I moved try_cancel_gc to cpp where the only use was. So it never should >>>> had been in the inline header in the first place. >>>> >>>> I listed why the include is needed below. >>>> >>>> Tier 1 and no pre-compiled. >>>> >>>> FYI: I was investigating why Handle::Handle(Thread*,oop) was not inlined. >>>> gcc complains there being a local comdat symbol, forcing it to be inlined or >>>> using clang there is no issue. 
So it looks like a gcc bug both in 7.3 and 8.2.
>>>>
>>>> Thanks, Robbin
>>>>
>>>> src/hotspot/share/aot/aotLoader.cpp
>>>> runtime/os.inline.hpp      for os::dll_unload
>>>>
>>>> src/hotspot/share/c1/c1_Runtime1.cpp
>>>> runtime/handles.inline.hpp for Handle(Thread*, oop)
>>>>
>>>> src/hotspot/share/gc/z/zFuture.inline.hpp
>>>> runtime/interfaceSupport.inline.hpp not used.
>>>>
>>>> src/hotspot/share/prims/nativeLookup.cpp
>>>> runtime/os.inline.hpp      for os::dll_unload
>>>>
>>>> src/hotspot/share/runtime/handles.hpp
>>>> Forward declaration        Thread
>>>>
>>>> src/hotspot/share/runtime/handles.inline.hpp
>>>> runtime/thread.hpp         for Thread::current
>>>> oops/oop.inline.hpp        for oopDesc::is_a
>>>> oops/metadata.hpp          for is_valid
>>>>
>>>> src/hotspot/share/runtime/semaphore.inline.hpp
>>>> runtime/thread.hpp         for osthread
>>>>
>>>> src/hotspot/share/runtime/vframe.cpp
>>>> runtime/thread.inline.hpp  for JavaThread::class_to_be_initialized

From coleen.phillimore at oracle.com Wed Jan 30 13:27:26 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 30 Jan 2019 08:27:26 -0500
Subject: RFR (M) 8213753: SymbolTable is double walked during class unloading and clean up table timing in do_unloading
In-Reply-To: <5717ffe1-9cdc-2d8b-5ee7-25c6b810ce8e@redhat.com>
References: <3cbb2030-4c97-6478-3a02-196a7898d69c@oracle.com> <5717ffe1-9cdc-2d8b-5ee7-25c6b810ce8e@redhat.com>
Message-ID:

On 1/30/19 5:17 AM, Aleksey Shipilev wrote:
> On 1/29/19 8:39 PM, coleen.phillimore at oracle.com wrote:
>> Summary: remove gc timing for short runtime cleanup triggering; make symbol table cleaning triggered
>> automatically on unloading
>>
>> Ran runThese with all Oracle GCs and got similar numbers of symbols unloaded. Also ran tier1-5.
>>
>> See bug for more information.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8213753.01/webrev
> This looks fine.
I tested Shenandoah just in case (it should not be affected, because it uses > shared/parallelCleaning.* and SystemDictionary::do_unloading, like G1), it is still okay. Thanks Aleksey for the testing. I noticed Shenandoah didn't have the duplicate walk. > > Minor nits: > > *) Add spaces before format specifiers here? > > 698 log_debug(symboltable)("Concurrent work triggered, load factor:%f, items to clean:%s", > 699 get_load_factor(), has_items_to_clean() ? "true" : "false"); sure. > > *) In SystemDictionary::do_unloading, do we think that trigger_cleanup() are cheap? Otherwise it > makes sense to retain a single GCTraceTime block around all three trigger_cleanups? The three trigger cleanups aren't specifically ClassLoaderData, so I didn't include them in the timing. I think people/tools might look for ClassLoaderData so I didn't want to change the name of the timing. Checking now... GCTraceTime(Debug, gc, phases) t("ClassLoaderData", gc_timer); I thought of removing the timing completely, but the enclosing timers (except in Shenandoah) also include the timing of CodeCache::do_unloading and clean_weak_klass_links, which are more expensive than ClassLoaderData. My original intention was to move the timers out of SystemDictionary completely though. Maybe I should do that. Thanks, Coleen > -Aleksey > From robbin.ehn at oracle.com Wed Jan 30 13:30:41 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 30 Jan 2019 14:30:41 +0100 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: References: Message-ID: <06885e05-0db7-64fd-1080-af7545c54434@oracle.com> Hi, here is v3. http://cr.openjdk.java.net/~rehn/8218041/v03/inc/ http://cr.openjdk.java.net/~rehn/8218041/v03/ Passes same compilations and t1. /Robbin On 2019-01-30 09:21, Robbin Ehn wrote: > Hi all, please review.
> > Code: > http://cr.openjdk.java.net/~rehn/8218041/webrev/ > Issue: > https://bugs.openjdk.java.net/browse/JDK-8218041 > > After fixing these includes, there was a circular dependency via shenandoah > code. I moved try_cancel_gc to cpp where the only use was. So it never should > have been in the inline header in the first place. > > I listed why the include is needed below. > > Tier 1 and no pre-compiled. > > FYI: I was investigating why Handle::Handle(Thread*,oop) was not inlined. > gcc complains there being a local comdat symbol, forcing it to be inlined or > using clang there is no issue. So it looks like a gcc bug both in 7.3 and 8.2. > > Thanks, Robbin > > src/hotspot/share/aot/aotLoader.cpp > runtime/os.inline.hpp for os::dll_unload > > src/hotspot/share/c1/c1_Runtime1.cpp > runtime/handles.inline.hpp for Handle(Thread*, oop) > > src/hotspot/share/gc/z/zFuture.inline.hpp > runtime/interfaceSupport.inline.hpp not used. > > src/hotspot/share/prims/nativeLookup.cpp > runtime/os.inline.hpp for os::dll_unload > > src/hotspot/share/runtime/handles.hpp > Forward declaration Thread > > src/hotspot/share/runtime/handles.inline.hpp > runtime/thread.hpp for Thread::current > oops/oop.inline.hpp for oopDesc::is_a > oops/metadata.hpp for is_valid > > src/hotspot/share/runtime/semaphore.inline.hpp > runtime/thread.hpp for osthread > > src/hotspot/share/runtime/vframe.cpp > runtime/thread.inline.hpp for JavaThread::class_to_be_initialized From uschindler at apache.org Wed Jan 30 13:38:09 2019 From: uschindler at apache.org (Uwe Schindler) Date: Wed, 30 Jan 2019 14:38:09 +0100 Subject: SIGSEGV on PhaseIdealLoop::split_up? In-Reply-To: <8c07e7f9-2e84-9183-7551-e5148917ef5a@redhat.com> References: <48117f7c-77db-452f-cb7b-f5c266fb78cd@oracle.com> <8c07e7f9-2e84-9183-7551-e5148917ef5a@redhat.com> Message-ID: <078d01d4b8a1$0b977df0$22c679d0$@apache.org> Hi Alexey, I will be at FOSDEM and we can for sure meet on Saturday.
At the latest at dinner! I am also in Brussels on Monday, so if we need more time we can maybe meet at the Monday after-FOSDEM unconference. Uwe ----- Uwe Schindler Achterdiek 19, D-28357 Bremen http://www.thetaphi.de eMail: uwe at thetaphi.de > -----Original Message----- > From: hotspot-dev On Behalf Of > Aleksey Shipilev > Sent: Wednesday, January 30, 2019 11:51 AM > To: Dawid Weiss ; Nils Eliasson > > Cc: hotspot-dev > Subject: Re: SIGSEGV on PhaseIdealLoop::split_up? > > On 1/30/19 11:43 AM, Dawid Weiss wrote: > > Current CompileTask: > > C2:1534619 50541 s! 4 > > org.apache.lucene.index.ConcurrentMergeScheduler::merge (280 bytes) > > > > The path leading to it may differ (when you diff those different > > hs_err logs against each other), but it seems to be caused by merge > > compilation in all cases I looked at. > > This is release build, right? fastdebug build probably asserts somewhere? > > > I can monitor this and attach new logs to the Jira issue > > (LUCENE-8668). Uwe will be at Fosdem so I'm sure he'll be ready to > > figure it out together with you, should you be there. > > If Nils is not there, let Uwe find me at FOSDEM? > > -Aleksey From shade at redhat.com Wed Jan 30 14:01:16 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 30 Jan 2019 15:01:16 +0100 Subject: RFR (M) 8213753: SymbolTable is double walked during class unloading and clean up table timing in do_unloading In-Reply-To: References: <3cbb2030-4c97-6478-3a02-196a7898d69c@oracle.com> <5717ffe1-9cdc-2d8b-5ee7-25c6b810ce8e@redhat.com> Message-ID: <09dff7f0-54b9-1812-09ca-69666a2f2abe@redhat.com> On 1/30/19 2:27 PM, coleen.phillimore at oracle.com wrote: >> *) In SystemDictionary::do_unloading, do we think that trigger_cleanup() are cheap? Otherwise it >> makes sense to retain a single GCTraceTime block around all three trigger_cleanups? > > The three trigger cleanups aren't specifically ClassLoaderData, so I didn't include them in the > timing.
I think people/tools might look for ClassLoaderData so I didn't want to change the name of > the timing. Checking now... > > GCTraceTime(Debug, gc, phases) t("ClassLoaderData", gc_timer); > > I thought of removing the timing completely, but the enclosing timers (except in > Shenandoah) also include the timing of CodeCache::do_unloading and clean_weak_klass_links, which are more > expensive than ClassLoaderData. My only (little) concern was that SD::do_unloading now has only one GCTraceTime("ClassLoaderData"). Without knowing beforehand if ::trigger_cleanup-s are cheap, it seems odd to drop GCTraceTime from them. Something like: GCTraceTime(Debug, gc, phases) t("Trigger cleanups", gc_timer); ResolvedMethodTable::trigger_cleanup(); if (unloading_occurred) { SymbolTable::trigger_cleanup(); // Oops referenced by the protection domain cache table may get unreachable independently // of the class loader (eg. cached protection domain oops). So we need to // explicitly unlink them here. // All protection domain oops are linked to the caller class, so if nothing // unloads, this is not needed. _pd_cache_table->trigger_cleanup(); } ...but that can be indeed skipped if we know that those triggers cost much less than the timer itself. -Aleksey From shade at redhat.com Wed Jan 30 14:04:47 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 30 Jan 2019 15:04:47 +0100 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: <06885e05-0db7-64fd-1080-af7545c54434@oracle.com> References: <06885e05-0db7-64fd-1080-af7545c54434@oracle.com> Message-ID: <53d0664c-ffb2-d983-471a-cecb94e9edb2@redhat.com> On 1/30/19 2:30 PM, Robbin Ehn wrote: > http://cr.openjdk.java.net/~rehn/8218041/v03/inc/ > http://cr.openjdk.java.net/~rehn/8218041/v03/ Still good. Shenandoah tests still good.
-Aleksey From uschindler at apache.org Wed Jan 30 14:06:47 2019 From: uschindler at apache.org (Uwe Schindler) Date: Wed, 30 Jan 2019 15:06:47 +0100 Subject: SIGSEGV on PhaseIdealLoop::split_up? In-Reply-To: References: <48117f7c-77db-452f-cb7b-f5c266fb78cd@oracle.com> <8c07e7f9-2e84-9183-7551-e5148917ef5a@redhat.com> <6ce2d1d2-a3ca-1d8f-56eb-5a76c77754d1@oracle.com> Message-ID: <078e01d4b8a5$0b445810$21cd0830$@apache.org> Hi, thanks Dawid. The builds on the Jenkins servers don't run with downloadable releases of Lucene. To reproduce a build, you need a direct Git checkout of the Lucene branch at the commit hash that was used. I have to add to Dawid's explanation: When you run the test suite of Lucene to reproduce, you should also use the "seed" of the test run. Whenever Lucene runs tests, a so-called random seed is used, which configures the test suite to inject random data or random components during the test run. This allows us to test Lucene with many configurations -- and of course this also finds bugs in the JVM, because every run of the test suite is "unique". If you find a test failure or JVM crash, it's important to reproduce this with the same seed. The Lucene test framework prints the hash used before it runs tests. To run it again, you can tell the build script to use the same seed. Type "ant test-help" to get usage help. Uwe ----- Uwe Schindler uschindler at apache.org ASF Member, Apache Lucene PMC / Committer Bremen, Germany http://lucene.apache.org/ > -----Original Message----- > From: hotspot-dev On Behalf Of > Dawid Weiss > Sent: Wednesday, January 30, 2019 12:11 PM > To: Nils Eliasson > Cc: hotspot-dev ; Uwe Schindler (SD > DataSolutions GmbH) > Subject: Re: SIGSEGV on PhaseIdealLoop::split_up? > > Hi Nils, > > Those builds are made straight up from git (from various branches).
> For example this failure: > > https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/3472/ > > with these hs_err and replay files: > > https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/3472/artifact/solr/build/solr-core/test/J1/hs_err_pid27685.log > https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/3472/artifact/solr/build/solr-core/test/J1/replay_pid27685.log > > comes from rev db57468242 of git@github.com:apache/lucene-solr.git > (the revision is mentioned on jenkins and in the full log), so: > > git clone git@github.com:apache/lucene-solr.git > cd lucene-solr > git checkout db57468242 > > then you can compile with: > > cd lucene > ant jar > > Dawid > > On Wed, Jan 30, 2019 at 12:03 PM Nils Eliasson > wrote: > > > > Hi, > > > > With the help of the replay-file I managed to compile the right Class, > > but the inlining doesn't match. The latest release I see is 7.6, but it > > looks like the crash is from a 9.0? Is that master? Do you have a link > > to a build that I can download? > > > > Regards, > > > > Nils > > > > On 2019-01-30 11:55, Dawid Weiss wrote: > > >> This is release build, right? fastdebug build probably asserts > somewhere? > > > I don't think we (or Uwe) run jobs with fastdebug builds, to be > > > honest. This isn't a bad idea though. > > > > > >> If Nils is not there, let Uwe find me at FOSDEM? > > > CCing: Uwe. > > > > > > D. From robbin.ehn at oracle.com Wed Jan 30 14:10:16 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 30 Jan 2019 15:10:16 +0100 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: <53d0664c-ffb2-d983-471a-cecb94e9edb2@redhat.com> References: <06885e05-0db7-64fd-1080-af7545c54434@oracle.com> <53d0664c-ffb2-d983-471a-cecb94e9edb2@redhat.com> Message-ID: <218a8306-0911-9e92-2d94-dab7ac7534ff@oracle.com> Thanks!
/Robbin On 2019-01-30 15:04, Aleksey Shipilev wrote: > On 1/30/19 2:30 PM, Robbin Ehn wrote: >> http://cr.openjdk.java.net/~rehn/8218041/v03/inc/ >> http://cr.openjdk.java.net/~rehn/8218041/v03/ > > Still good. Shenandoah tests still good. > > -Aleksey > From stefan.karlsson at oracle.com Wed Jan 30 14:12:59 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 30 Jan 2019 15:12:59 +0100 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: <06885e05-0db7-64fd-1080-af7545c54434@oracle.com> References: <06885e05-0db7-64fd-1080-af7545c54434@oracle.com> Message-ID: Looks good. StefanK On 2019-01-30 14:30, Robbin Ehn wrote: > Hi, here is v3. > > http://cr.openjdk.java.net/~rehn/8218041/v03/inc/ > http://cr.openjdk.java.net/~rehn/8218041/v03/ > > Passes same compilations and t1. > > /Robbin > > On 2019-01-30 09:21, Robbin Ehn wrote: >> Hi all, please review. >> >> Code: >> http://cr.openjdk.java.net/~rehn/8218041/webrev/ >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8218041 >> >> After fixing these includes, there was a circular dependency via >> shenandoah >> code. I moved try_cancel_gc to cpp where the only use was. So it >> never should >> have been in the inline header in the first place. >> >> I listed why the include is needed below. >> >> Tier 1 and no pre-compiled. >> >> FYI: I was investigating why Handle::Handle(Thread*,oop) was not >> inlined. >> gcc complains there being a local comdat symbol, forcing it to be >> inlined or >> using clang there is no issue. So it looks like a gcc bug both in 7.3 >> and 8.2. >> >> Thanks, Robbin >> >> src/hotspot/share/aot/aotLoader.cpp >> runtime/os.inline.hpp for os::dll_unload >> >> src/hotspot/share/c1/c1_Runtime1.cpp >> runtime/handles.inline.hpp for Handle(Thread*, oop) >> >> src/hotspot/share/gc/z/zFuture.inline.hpp >> runtime/interfaceSupport.inline.hpp not used. >> >> src/hotspot/share/prims/nativeLookup.cpp >> runtime/os.inline.hpp
for os::dll_unload >> >> src/hotspot/share/runtime/handles.hpp >> Forward declaration Thread >> >> src/hotspot/share/runtime/handles.inline.hpp >> runtime/thread.hpp for Thread::current >> oops/oop.inline.hpp for oopDesc::is_a >> oops/metadata.hpp for is_valid >> >> src/hotspot/share/runtime/semaphore.inline.hpp >> runtime/thread.hpp for osthread >> >> src/hotspot/share/runtime/vframe.cpp >> runtime/thread.inline.hpp for JavaThread::class_to_be_initialized From robbin.ehn at oracle.com Wed Jan 30 14:14:58 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 30 Jan 2019 15:14:58 +0100 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: References: <06885e05-0db7-64fd-1080-af7545c54434@oracle.com> Message-ID: Thanks StefanK! /Robbin On 2019-01-30 15:12, Stefan Karlsson wrote: > Looks good. > > StefanK > > On 2019-01-30 14:30, Robbin Ehn wrote: >> Hi, here is v3. >> >> http://cr.openjdk.java.net/~rehn/8218041/v03/inc/ >> http://cr.openjdk.java.net/~rehn/8218041/v03/ >> >> Passes same compilations and t1. >> >> /Robbin >> >> On 2019-01-30 09:21, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> Code: >>> http://cr.openjdk.java.net/~rehn/8218041/webrev/ >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8218041 >>> >>> After fixing these includes, there was a circular dependency via shenandoah >>> code. I moved try_cancel_gc to cpp where the only use was. So it never should >>> have been in the inline header in the first place. >>> >>> I listed why the include is needed below. >>> >>> Tier 1 and no pre-compiled. >>> >>> FYI: I was investigating why Handle::Handle(Thread*,oop) was not inlined. >>> gcc complains there being a local comdat symbol, forcing it to be inlined or >>> using clang there is no issue. So it looks like a gcc bug both in 7.3 and 8.2. >>> >>> Thanks, Robbin >>> >>> src/hotspot/share/aot/aotLoader.cpp >>> runtime/os.inline.hpp
for os::dll_unload >>> >>> src/hotspot/share/c1/c1_Runtime1.cpp >>> runtime/handles.inline.hpp for Handle(Thread*, oop) >>> >>> src/hotspot/share/gc/z/zFuture.inline.hpp >>> runtime/interfaceSupport.inline.hpp not used. >>> >>> src/hotspot/share/prims/nativeLookup.cpp >>> runtime/os.inline.hpp for os::dll_unload >>> >>> src/hotspot/share/runtime/handles.hpp >>> Forward declaration Thread >>> >>> src/hotspot/share/runtime/handles.inline.hpp >>> runtime/thread.hpp for Thread::current >>> oops/oop.inline.hpp for oopDesc::is_a >>> oops/metadata.hpp for is_valid >>> >>> src/hotspot/share/runtime/semaphore.inline.hpp >>> runtime/thread.hpp for osthread >>> >>> src/hotspot/share/runtime/vframe.cpp >>> runtime/thread.inline.hpp for JavaThread::class_to_be_initialized > From uschindler at apache.org Wed Jan 30 14:21:58 2019 From: uschindler at apache.org (Uwe Schindler) Date: Wed, 30 Jan 2019 15:21:58 +0100 Subject: SIGSEGV on PhaseIdealLoop::split_up? In-Reply-To: References: <48117f7c-77db-452f-cb7b-f5c266fb78cd@oracle.com> Message-ID: <079501d4b8a7$2a831570$7f894050$@apache.org> Hi, > > A reproducer would be very nice. Did you try to reproduce with Replay > Compilation? > > I haven't tried to reproduce it, but it's popping up quite a bit > recently, see here for a backlog: > > https://lucene.markmail.org/search/%22jenkins+server%22+PhaseIdealLoop::split_up+list:org.apache.lucene.java-dev+order:date-backward > > For example this one > https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/3472/ > > is: > > [junit4] # JRE version: OpenJDK Runtime Environment (11.0+28) (build > 11+28) > [junit4] # Java VM: OpenJDK 64-Bit Server VM (11+28, mixed mode, > tiered, g1 gc, linux-amd64) > > Some of those builds are still on the server (and contain hs logs). > What worries me is that this only happens on Uwe's machine -- may be > related to particular hardware config it happens on. I don't think there is a hardware fault.
The machine is quite stable and it's also running virtual machines (Oracle VirtualBox) with Windows, MacOSX, and Solaris to test Lucene on those platforms as well. If there were a hardware issue, this would hardly work correctly. But as the issues we see don't happen on the inner virtual machines, which remove some advanced CPU features, maybe that's what is special here. So it could be caused by some special CPU feature of this machine that is not used on the VBOX machines that are also testing. Another important thing here is: this machine is the only Lucene test machine that checks recent JDK versions. The other Jenkins machines only run with JDK 8 (the minimum requirement of Lucene/Solr). From the statistics: the recent failures don't happen with Java 8 and Java 9, but started with Java 10 or later (we only see the bug on runs using those versions). > A repro isn't going to be easy (are they ever? ;) as those tests run > pretty much at random within a single forked JVM and I bet it's just > some unusual pattern that triggers the problem. Looking at where the > problem occurs it seems there is a common core related to compiling > this method: > > Current CompileTask: > C2:1534619 50541 s! 4 > org.apache.lucene.index.ConcurrentMergeScheduler::merge (280 bytes) It's also easy to reproduce, because the exact JDK version and test params are printed at the beginning of the build log, for the example mentioned before: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/3472/consoleText -print-java-info: [java-info] java version "11" [java-info] OpenJDK Runtime Environment (11+28, Oracle Corporation) [java-info] OpenJDK 64-Bit Server VM (11+28, Oracle Corporation) [java-info] Test args: [-XX:-UseCompressedOops -XX:+UseG1GC] I think that's all that is needed to reproduce. Test args are some additional JVM options we use when running the test suite (here we disable compressed oops and use G1GC). To run the test suite with them, you can pass them on ant's command line.
> The path leading to it may differ (when you diff those different > hs_err logs against each other), but it seems to be caused by merge > compilation in all cases I looked at. > > I can monitor this and attach new logs to the Jira issue > (LUCENE-8668). Uwe will be at Fosdem so I'm sure he'll be ready to > figure it out together with you, should you be there. > > Dawid > > > On Wed, Jan 30, 2019 at 11:22 AM Nils Eliasson > wrote: > > > > Sorry, too fast. You had already tested on various builds. > > > > Regards, > > > > Nils > > > > On 2019-01-30 10:57, Nils Eliasson wrote: > > > Hi Dawid, > > > > > > The hs_err-file is from a JDK 10 build. Would you mind testing with > > > JDK 11 or JDK 12-ea? > > > > > > What build of Lucene was this run against? Can point me to the > > > relevant jar? I will try reproducing with 7.6.0. > > > > > > Regards, > > > > > > Nils > > > > > > On 2019-01-30 10:27, Dawid Weiss wrote: > > >> Hello, > > >> > > >> There's quite a few of those JVM errors that popped up recently on one > > >> of Lucene's CI machines: > > >> > > >> https://issues.apache.org/jira/browse/LUCENE-8668 > > >> > > >> Happens on various JVMs (see the above issue). Would it be something > > >> familiar to any of you? A known issue or should we try to keep digging > > >> (for a repro, for example)? > > >> > > >> Dawid From lutz.schmidt at sap.com Wed Jan 30 14:25:13 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Wed, 30 Jan 2019 14:25:13 +0000 Subject: RFR (S) 8217994: os::print_hex_dump should be more resilient against unreadable memory In-Reply-To: <542945f5-9c38-bea5-8a64-0f8d6ab32a13@redhat.com> References: <988cff50-d9d8-29d7-2cce-148dea4fee60@redhat.com> <0034E860-F7DE-4565-B157-93418E952388@sap.com> <9dff8727-9bfb-41e3-8be6-8e8d6421e3d1@redhat.com> <542945f5-9c38-bea5-8a64-0f8d6ab32a13@redhat.com> Message-ID: <42A9BA8C-755E-4D09-93CE-BD470285A76C@sap.com> Aleksey, I like your revision 02. Very compact. 
I do not like re-inventing language features with our own code. Sorry for spamming you with suggestions/comments. Regards, Lutz On 30.01.19, 13:31, "Aleksey Shipilev" wrote: On 1/30/19 1:10 PM, Thomas Stüfe wrote: > Or, since this thing keeps coming up, add a helper to outputStream: > > +void outputStream::put(char ch, int repeat_count) { > + for (int i = 0; i < repeat_count; i ++) { > + put(ch); > + } > +} > + > > st->put('?', 2*unitsize); No. Let's stop here: http://cr.openjdk.java.net/~shade/8217994/webrev.02/ There are a thousand ways to do things, and there are only 8 work hours per day. Current patch with unusual format specifier is already good, to my taste. -Aleksey From daniel.daugherty at oracle.com Wed Jan 30 14:46:33 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 30 Jan 2019 09:46:33 -0500 Subject: RFR: 8210832: Remove sneaky locking in class Monitor In-Reply-To: <384133a1-c2e0-11f2-6890-798c3646222f@oracle.com> References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> <4562855b-fa0e-9f70-da0f-00f84984d78f@oracle.com> <384133a1-c2e0-11f2-6890-798c3646222f@oracle.com> Message-ID: <2d2bba26-78b4-61ef-0610-9f8ea5b489ac@oracle.com> Trimming to respond to one comment... On 1/30/19 2:29 AM, David Holmes wrote: > Nit: > > assert(_owner == Thread::current(), "should be equal: owner=" > INTPTR_FORMAT >                    ", self=" INTPTR_FORMAT, p2i(_owner), > p2i(Thread::current())); > > with Dan's enhanced assertions there's an indentation issue. The > second line should indent to the first comma, but that will make the > second line extend way past 80 columns. I don't think I've seen an indent to the first comma before... I would indent this as:   assert(_owner == Thread::current(), "should be equal: owner=" INTPTR_FORMAT          ", self=" INTPTR_FORMAT, p2i(_owner), p2i(Thread::current())); or   assert(_owner == Thread::current(),
         "should be equal: owner=" INTPTR_FORMAT ", self=" INTPTR_FORMAT,          p2i(_owner), p2i(Thread::current())); if you want all of the format string together... Dan From patricio.chilano.mateo at oracle.com Wed Jan 30 14:49:56 2019 From: patricio.chilano.mateo at oracle.com (Patricio Chilano) Date: Wed, 30 Jan 2019 09:49:56 -0500 Subject: RFR: 8210832: Remove sneaky locking in class Monitor In-Reply-To: <780d1589-12a3-d82d-ba29-58f77e5cc840@oracle.com> References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> <4562855b-fa0e-9f70-da0f-00f84984d78f@oracle.com> <780d1589-12a3-d82d-ba29-58f77e5cc840@oracle.com> Message-ID: <4b4b845b-e235-7f42-a933-6a9a73ef798b@oracle.com> Hi Per, On 1/30/19 6:26 AM, Per Liden wrote: > Hi Patricio, > > On 01/30/2019 01:24 AM, Patricio Chilano wrote: >> Hi Per, >> >> On 1/29/19 4:22 AM, Per Liden wrote: >>> Hi Patricio, >>> >>> On 01/28/2019 08:18 PM, Patricio Chilano wrote: >>>> Hi Robbin, >>>> >>>> Thanks for reviewing this! Removing the block_in_safepoint_check >>>> thread local attribute is a great idea, here is v02: >>>> >>>> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/webrev >>> >>> I really like that we're ditching our old locking code in favor of >>> using pthread_mutex, et al. Nice work! >> Thanks! : ) >> >>> General comment >>> ---------------- >>> I think Mutex should be a plain mutex and not come with the baggage of >>> having a condition variable. With this new code, it seems we're in >>> a really good position to make that happen. I.e. something like this: >>> >>> class PlatformMutex { >>> protected: >>>   pthread_mutex_t _mutex; >>> >>> public: >>>   PlatformMutex(); >>>   ~PlatformMutex(); >>> >>>   void lock(); >>>   void unlock(); >>>   bool try_lock(); >>> }; >>> >>> class PlatformMonitor : public PlatformMutex { >>> private: >>>   pthread_cond_t _cond; >>> >>> public: >>>   PlatformMonitor(); >>>   ~PlatformMonitor(); >>> >>>
int wait(jlong millis); >>>   void notify(); >>>   void notify_all(); >>> }; >>> >>> It might be that we want to do that as a separate step later instead >>> of including it in this patch. But I think we should try to get there. >> I agree this is a good idea, but since it would make sense to also >> rework them at the high-level Monitor/Mutex as David pointed out >> (this idea is actually also proposed in the comments of mutex.hpp) >> what do you think if I file this as a separate bugid to be worked on >> after we have pushed this patch? > > Sure, that can be done in a separate follow up patch. > >> >>> src/hotspot/os/posix/os_*.[ch]pp >>> --------------------------------- >>> * I'd suggest that we place the PlatformMonitor class in a separate >>> file (like src/hotspot/os/posix/monitor_posix.cpp), just like we >>> have done with Semaphore (in src/hotspot/os/posix/semaphore_posix.cpp). >> I tried to move them but there is a small issue in that >> PlatformMonitor code needs static methods defined in their current >> os_*.cpp files (methods that parse timing structs). I can declare >> them as public (cannot move them since they are also used by >> PlatformEvent and Parker), but for the Posix version of >> PlatformMonitor I would also need to do that with _condAttr and >> _mutexAttr which are also defined static in that file and are needed >> by PlatformMonitor::PlatformMonitor. So not sure what the right >> approach is here. >> In any case shouldn't we aim to have all synchronization-like classes >> in the same file for each platform (something like syncro_posix, >> syncro_windows, etc) instead of a separate file for each of them >> (semaphore_*, monitor_*, waitbarrier_*, etc). Otherwise it seems >> PlatformParker and PlatformEvent should also be in their own file. > > Keeping things in separate files can make sense if these things can be > used standalone.
A plain mutex (just like the plain semaphore we have) > can come in handy in many places where you just want that mutex, without > having to drag in other classes or the whole os layer. Keeps > dependencies under control, reduces compile times, etc. Ok. If you don't mind then I can do that in the follow up RFE for the Monitor/Mutex rework after JDK-8217843 since David mentioned he is working on moving things too. >>> src/hotspot/os/posix/os_posix.hpp >>> src/hotspot/os/solaris/os_solaris.hpp >>> src/hotspot/os/windows/os_windows.hpp >>> ------------------------------------- >>> * Please make _mutex/_cond plain variables, instead of arrays of 1. >>> That's just ugly ;) >> Done! >> >>> src/hotspot/os/posix/os_posix.cpp >>> --------------------------------- >>> * Destructor missing, to call pthread_(mutex|cond)_destroy(). >> Done! >> >>> src/hotspot/os/solaris/os_solaris.hpp >>> ------------------------------------- >>> * Not sure if there's a good reason to have the constructor be >>> inlined here. I'd suggest moving it to the cpp file. >>> >>> * Destructor missing. >> Done! >> >>> src/hotspot/os/windows/os_windows.cpp >>> ------------------------------------- >>> * Destructor missing (I'm not too familiar with the windows API but >>> I assume there's a destroy function we should call here). >> Done! (There is a destroy function for mutexes but not for condition >> variables which apparently do not need to free anything explicitly). >> >>> src/hotspot/share/runtime/interfaceSupport.inline.hpp >>> ----------------------------------------------------- >>> * Move "private:" above monitor_adr; >>> >>>  289 class ThreadLockBlockInVM : public ThreadStateTransition { >>>  290   Monitor** monitor_adr; >>>  291  private: >>>  292   void do_preempted(Monitor** in_flight_monitor_adr) { >>> >>> * monitor_adr should be _monitor_adr, or maybe even >>> _in_flight_monitor_adr to better match the name of the argument. >> Done!
I realized there is no need for passing a parameter to >> do_preempted() since we already have the in_flight_monitor_adr so I >> also made small changes there. >> >> >> Here is v03 including also Dan's and Robbin's comments about mutex.cpp >> and safepointMechanism.hpp: >> >> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/webrev/ >> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/inc/webrev/ > > What kind of performance measurements have been done on this patch? > > I took your v03 patch for a spin in SPECjbb2015 (with ZGC enabled) and > did not notice any obvious regressions in either throughput or latency. I ran SPECjbb2005, SPECjvm2008, SPECjvm98, SPECjbb2015 and a couple of other small benchmarks in Linux_x64 and Windows_x64 with the version I had before the reviews and as you said performance overall seems to stay the same. Thanks, Patricio > cheers, > Per > >> >> Running mach tiers1-3. Waiting though on your thoughts about file >> organization and deferring Mutex/Monitor rework. >> >> Thanks for looking into this Per! >> >> Thanks, >> Patricio >>> cheers, >>> Per >>> >>>> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/inc/webrev/ >>>> >>>> Running mach5 again. >>>> >>>> Thanks, >>>> Patricio >>>> >>>> On 1/28/19 8:31 AM, Robbin Ehn wrote: >>>>> Hi Patricio, >>>>> >>>>> Mostly looks good! >>>>> >>>>> block_at_safepoint is always called with block_in_safepoint_check >>>>> = true. (correct?) >>>>> Changing that to a local state instead of global simplifies the code. >>>>> >>>>> So I'm suggesting something like below. >>>>> >>>>> Thanks, Robbin >>>>> >>>>> diff -r e65cc445234c >>>>> src/hotspot/share/runtime/interfaceSupport.inline.hpp >>>>> --- a/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon >>>>> Jan 28 13:10:15 2019 +0100 >>>>> +++ b/src/hotspot/share/runtime/interfaceSupport.inline.hpp Mon >>>>> Jan 28 14:10:59 2019 +0100 >>>>> @@ -308,2 +308,1 @@ >>>>> -    thread->block_in_safepoint_check = false; >>>>> -
SafepointMechanism::block_at_safepoint(thread); >>>>> +    SafepointMechanism::callback_if_safepoint(thread); >>>>> @@ -323,2 +322,1 @@ >>>>> -      SafepointMechanism::block_at_safepoint(_thread); >>>>> -      _thread->block_in_safepoint_check = true; >>>>> +      SafepointMechanism::callback_if_safepoint(_thread); >>>>> @@ -335,2 +332,0 @@ >>>>> -    } else { >>>>> -      _thread->block_in_safepoint_check = true; >>>>> @@ -337,0 +334,1 @@ >>>>> + >>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.cpp >>>>> --- a/src/hotspot/share/runtime/safepoint.cpp    Mon Jan 28 >>>>> 13:10:15 2019 +0100 >>>>> +++ b/src/hotspot/share/runtime/safepoint.cpp    Mon Jan 28 >>>>> 14:10:59 2019 +0100 >>>>> @@ -795,1 +795,1 @@ >>>>> -void SafepointSynchronize::block(JavaThread *thread) { >>>>> +void SafepointSynchronize::block(JavaThread *thread, bool >>>>> block_in_safepoint_check) { >>>>> @@ -850,1 +850,1 @@ >>>>> -      if (thread->block_in_safepoint_check) { >>>>> +      if (block_in_safepoint_check) { >>>>> @@ -880,1 +880,1 @@ >>>>> -          thread->block_in_safepoint_check) { >>>>> +          block_in_safepoint_check) { >>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepoint.hpp >>>>> --- a/src/hotspot/share/runtime/safepoint.hpp    Mon Jan 28 >>>>> 13:10:15 2019 +0100 >>>>> +++ b/src/hotspot/share/runtime/safepoint.hpp    Mon Jan 28 >>>>> 14:10:59 2019 +0100 >>>>> @@ -146,1 +146,1 @@ >>>>> -  static void   block(JavaThread *thread); >>>>> +  static void   block(JavaThread *thread, bool >>>>> block_in_safepoint_check = true); >>>>> diff -r e65cc445234c src/hotspot/share/runtime/safepointMechanism.hpp >>>>> --- a/src/hotspot/share/runtime/safepointMechanism.hpp Mon Jan 28 >>>>> 13:10:15 2019 +0100 >>>>> +++ b/src/hotspot/share/runtime/safepointMechanism.hpp Mon Jan 28 >>>>> 14:10:59 2019 +0100 >>>>> @@ -82,1 +82,1 @@ >>>>> -  static inline void block_at_safepoint(JavaThread* thread); >>>>> +
static inline void callback_if_safepoint(JavaThread* thread); >>>>> diff -r e65cc445234c >>>>> src/hotspot/share/runtime/safepointMechanism.inline.hpp >>>>> --- a/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>>>> Jan 28 13:10:15 2019 +0100 >>>>> +++ b/src/hotspot/share/runtime/safepointMechanism.inline.hpp Mon >>>>> Jan 28 14:10:59 2019 +0100 >>>>> @@ -82,1 +82,1 @@ >>>>> -void SafepointMechanism::block_at_safepoint(JavaThread* thread) { >>>>> +void SafepointMechanism::callback_if_safepoint(JavaThread* thread) { >>>>> @@ -84,1 +84,1 @@ >>>>> -??? SafepointSynchronize::block(thread); >>>>> +??? SafepointSynchronize::block(thread, false); >>>>> diff -r e65cc445234c src/hotspot/share/runtime/thread.cpp >>>>> --- a/src/hotspot/share/runtime/thread.cpp??? Mon Jan 28 13:10:15 >>>>> 2019 +0100 >>>>> +++ b/src/hotspot/share/runtime/thread.cpp??? Mon Jan 28 14:10:59 >>>>> 2019 +0100 >>>>> @@ -298,2 +297,0 @@ >>>>> -? block_in_safepoint_check = true; >>>>> - >>>>> diff -r e65cc445234c src/hotspot/share/runtime/thread.hpp >>>>> --- a/src/hotspot/share/runtime/thread.hpp??? Mon Jan 28 13:10:15 >>>>> 2019 +0100 >>>>> +++ b/src/hotspot/share/runtime/thread.hpp??? Mon Jan 28 14:10:59 >>>>> 2019 +0100 >>>>> @@ -788,2 +787,0 @@ >>>>> -? bool block_in_safepoint_check;????????????? // to decide >>>>> whether to block in SS::block or not >>>>> - >>>>> >>>>> >>>>> On 1/28/19 9:42 AM, Patricio Chilano wrote: >>>>>> Hi all, >>>>>> >>>>>> Please review the following patch: >>>>>> >>>>>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8210832 >>>>>> Webrev: http://cr.openjdk.java.net/~pchilanomate/8210832/v01/webrev/ >>>>>> >>>>>> The current implementation of native monitors uses a technique >>>>>> that we name "sneaky locking" to prevent possible deadlocks of >>>>>> the JVM during safepoints. The implementation of this technique >>>>>> though introduces a race when a monitor is shared between the >>>>>> VMThread and non-JavaThreads. 
This patch aims to solve that >>>>>> problem and at the same time simplify the code. >>>>>> >>>>>> The proposal is based on the introduction of the new class >>>>>> PlatformMonitor, which serves as a wrapper for the actual >>>>>> synchronization primitives in each platform (mutexes and >>>>>> condition variables). Most of the API calls can thus be >>>>>> implemented as simple wrappers around PlatformMonitor, adding >>>>>> more assertions and very little extra metadata. >>>>>> To be able to remove the lock sneaking code and at the same time >>>>>> avoid deadlocking scenarios, we combine two techniques: >>>>>> >>>>>> -When a JavaThread that has just acquired the lock, detects there >>>>>> is a safepoint request in the ThreadLockBlockInVM destructor, it >>>>>> releases the lock before blocking at the safepoint. After >>>>>> resuming from it, the JavaThread will have to acquire the lock >>>>>> again. >>>>>> >>>>>> - In the ThreadLockBlockInVM constructor for the Monitor::wait() >>>>>> method, in order to avoid blocking we allow for a possible >>>>>> safepoint request to make progress but without letting the >>>>>> JavaThread block for it (since we would be stopped by the >>>>>> destructor anyways). We also do that for the Monitor::lock() case >>>>>> although no deadlock is being prevented there. >>>>>> >>>>>> The ThreadLockBlockInVM jacket is a new ThreadStateTransition >>>>>> class used instead of the ThreadBlockInVM one. This allowed more >>>>>> flexibility to handle the two techniques mentioned above. Also, >>>>>> ThreadBlockInVM calls SafepointMechanism::block_if_requested() >>>>>> which creates some problems when trying to allow safepoints to >>>>>> continue without stopping, since that method not only checks for >>>>>> safepoints but also processes handshakes. >>>>>> >>>>>> In terms of performance, benchmarks show very similar results to >>>>>> what we have now. >>>>>> >>>>>> So far mach5 hs-tier1-6 on Linux, OS X, Windows and Solaris have >>>>>> been tested. 
>>>>>> >>>>>> Thanks, >>>>>> Patricio >>>>>> >>>> >> From uschindler at apache.org Wed Jan 30 14:51:14 2019 From: uschindler at apache.org (Uwe Schindler) Date: Wed, 30 Jan 2019 15:51:14 +0100 Subject: SIGSEGV on PhaseIdealLoop::split_up? In-Reply-To: References: Message-ID: <079a01d4b8ab$40895f10$c19c1d30$@apache.org> Hi, as many of you know: I have no access to the OpenJDK bugtracker. I am currently on travel, but later today, I'd take the current problem and I may open a new bug in the OpenJDK issue tracker about this issue. The stack traces all look very similar, so there should be enough information to figure out where exactly the SIGSEGV is happening. To my knowledge of development, this should help to get the condition, why this is happening. Together with the Lucene Source code, you may figure out the conditions. Because this bug is quite new, there must be some change in Lucene that triggers this bug. With the information about where the SIGSEGV is happening in OpenJDK code and the commit logs in Lucene, we may figure out how the Lucene code changed so it causes that issue. I will take the time and try to collect all information an OpenJDK developer needs to (hopefully) reproduce the issue. Does this sound like a plan? Uwe ----- Uwe Schindler uschindler at apache.org ASF Member, Apache Lucene PMC / Committer Bremen, Germany http://lucene.apache.org/ > -----Original Message----- > From: hotspot-dev On Behalf Of > Dawid Weiss > Sent: Wednesday, January 30, 2019 10:28 AM > To: hotspot-dev > Subject: SIGSEGV on PhaseIdealLoop::split_up? > > Hello, > > There's quite a few of those JVM errors that popped up recently on one > of Lucene's CI machines: > > https://issues.apache.org/jira/browse/LUCENE-8668 > > Happens on various JVMs (see the above issue). Would it be something > familiar to any of you? A known issue or should we try to keep digging > (for a repro, for example)? 
> > Dawid From thomas.schatzl at oracle.com Wed Jan 30 15:06:51 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 30 Jan 2019 16:06:51 +0100 Subject: RFR (T): 8218060: JDK-8217786 breaks build due to remaining unused function Message-ID: <7ff0838f3e0b8d4cf96806e5b446cdf526cee7cb.camel@oracle.com> Hi all, can I have quick reviews for this change that fixes a compilation error due to an unused function? I.e. Compiling the repo after JDK-8217786 gives the following error: .../vmshare/jdk10/hs/open/src/hotspot/os/linux/os_linux.cpp:1860:13: error: 'bool print_matching_lines_from_sysinfo_file(outputStream*, const char**)' defined but not used [-Werror=unused-function] static bool print_matching_lines_from_sysinfo_file(outputStream* st, const char* keywords_to_match[]) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Compiling 305 files for jdk.javadoc cc1plus: all warnings being treated as errors make[3]: *** [[...]variant-server/libjvm/objs/os_linux.o] Error 1 lib/CompileJvm.gmk:172: recipe for target '[...]/linux- x64/hotspot/variant-server/libjvm/objs/os_linux.o' failed The change simply removes that unused function. 
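For readers who have not hit this warning before, here is a minimal sketch of the failure mode and one way around it. It assumes the helper is still needed on some platform: in that case the definition can be guarded with the same preprocessor condition as its only caller, instead of being deleted. TOY_S390, matches_any_keyword and virtualization_info are made-up names for illustration, not the os_linux.cpp code.

```cpp
#include <string>
#include <vector>

// TOY_S390 is a hypothetical stand-in for a real platform macro;
// leave it undefined to model every other platform.

#if defined(TOY_S390)
// File-local helper. Because it is static, a build that never calls it
// reports "defined but not used" under -Werror=unused-function, so it is
// compiled only where its single caller is compiled.
static bool matches_any_keyword(const std::string& line,
                                const std::vector<std::string>& keywords) {
  for (const std::string& kw : keywords) {
    if (line.find(kw) != std::string::npos) {
      return true;
    }
  }
  return false;
}
#endif

// Returns the line if it is worth printing on this platform, else "".
std::string virtualization_info(const std::string& line) {
#if defined(TOY_S390)
  const std::vector<std::string> kw = { "LPAR", "CPUs", "VM" };
  return matches_any_keyword(line, kw) ? line : std::string();
#else
  (void)line;  // nothing to report on other platforms
  return std::string();
#endif
}
```

With TOY_S390 undefined neither the helper nor its call is compiled, so no unused-function diagnostic is expected; with it defined both are present.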
CR: https://bugs.openjdk.java.net/browse/JDK-8218060 Webrev: http://cr.openjdk.java.net/~tschatzl/8218060/webrev/ Testing: local compilation Thanks, Thomas From coleen.phillimore at oracle.com Wed Jan 30 15:14:53 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 30 Jan 2019 10:14:53 -0500 Subject: RFR (M) 8213753: SymbolTable is double walked during class unloading and clean up table timing in do_unloading In-Reply-To: <09dff7f0-54b9-1812-09ca-69666a2f2abe@redhat.com> References: <3cbb2030-4c97-6478-3a02-196a7898d69c@oracle.com> <5717ffe1-9cdc-2d8b-5ee7-25c6b810ce8e@redhat.com> <09dff7f0-54b9-1812-09ca-69666a2f2abe@redhat.com> Message-ID: <8417d7a7-409e-7705-3d21-560cd4998b47@oracle.com> On 1/30/19 9:01 AM, Aleksey Shipilev wrote: > On 1/30/19 2:27 PM, coleen.phillimore at oracle.com wrote: >>> *) In SystemDictionary::do_unloading, do we think that trigger_cleanup() are cheap? Otherwise it >>> makes sense to retain a single GCTraceTime block around all three trigger_cleanups? >> The three trigger cleanups aren't specifically ClassLoaderData, so I didn't include them in the >> timing.? I think people/tools might look for ClassLoaderData so I didn't want to change the name of >> the timing.?? Checking now... >> >> ??? GCTraceTime(Debug, gc, phases) t("ClassLoaderData", gc_timer); >> >> I thought of removing it completely but the timing but the enclosing timers (exception in >> shenandoah) also include timing CodeCache::do_unloading and clean_weak_klass_links which are more >> expensive than ClassLoaderData. > My only (little) concern was that SD::do_unloading now has only one GCTraceTime("ClassLoaderData"). > Without knowing beforehand if ::trigger_cleanup-s are cheap, it seems odd to drop GCTraceTime from > them. 
Something like: > > GCTraceTime(Debug, gc, phases) t("Trigger cleanups", gc_timer); > > ResolvedMethodTable::trigger_cleanup(); > > if (unloading_occurred) { > SymbolTable::trigger_cleanup(); > > // Oops referenced by the protection domain cache table may get unreachable independently > // of the class loader (eg. cached protection domain oops). So we need to > // explicitly unlink them here. > // All protection domain oops are linked to the caller class, so if nothing > // unloads, this is not needed. > _pd_cache_table->trigger_cleanup(); > } > > ...but that can be indeed skipped if we know that those triggers cost much less than the timer itself. yeah, I can add that timer for now, and remove it if it's always zero later. In shenandoah, it appears that you have your own specific timer for SystemDictionary::do_unloading.? Did I read that correctly? Coleen > > -Aleksey > From erik.osterlund at oracle.com Wed Jan 30 15:18:02 2019 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Wed, 30 Jan 2019 16:18:02 +0100 Subject: RFR (T): 8218060: JDK-8217786 breaks build due to remaining unused function In-Reply-To: <7ff0838f3e0b8d4cf96806e5b446cdf526cee7cb.camel@oracle.com> References: <7ff0838f3e0b8d4cf96806e5b446cdf526cee7cb.camel@oracle.com> Message-ID: <537419F9-A767-4F4A-B5DC-538979B64023@oracle.com> Hi Thomas, Looks good and trivial. Ship it! Thanks, /Erik > On 30 Jan 2019, at 16:06, Thomas Schatzl wrote: > > Hi all, > > can I have quick reviews for this change that fixes a compilation > error due to an unused function? > > I.e. 
> > Compiling the repo after JDK-8217786 gives the following error: > > .../vmshare/jdk10/hs/open/src/hotspot/os/linux/os_linux.cpp:1860:13: > error: 'bool print_matching_lines_from_sysinfo_file(outputStream*, > const char**)' defined but not used [-Werror=unused-function] > static bool print_matching_lines_from_sysinfo_file(outputStream* st, > const char* keywords_to_match[]) { > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > Compiling 305 files for jdk.javadoc > cc1plus: all warnings being treated as errors > make[3]: *** [[...]variant-server/libjvm/objs/os_linux.o] Error 1 > lib/CompileJvm.gmk:172: recipe for target '[...]/linux- > x64/hotspot/variant-server/libjvm/objs/os_linux.o' failed > > The change simply removes that unused function. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8218060 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8218060/webrev/ > Testing: > local compilation > > Thanks, > Thomas > > From thomas.schatzl at oracle.com Wed Jan 30 15:20:16 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 30 Jan 2019 16:20:16 +0100 Subject: RFR (T): 8218060: JDK-8217786 breaks build due to remaining unused function In-Reply-To: <537419F9-A767-4F4A-B5DC-538979B64023@oracle.com> References: <7ff0838f3e0b8d4cf96806e5b446cdf526cee7cb.camel@oracle.com> <537419F9-A767-4F4A-B5DC-538979B64023@oracle.com> Message-ID: <62f8be776b233cb9636e4224629686f3c9644393.camel@oracle.com> Hi Erik, On Wed, 2019-01-30 at 16:18 +0100, Erik Osterlund wrote: > Hi Thomas, > > Looks good and trivial. Ship it! done, thanks. 
Thomas

From stefan.karlsson at oracle.com Wed Jan 30 15:22:07 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Wed, 30 Jan 2019 16:22:07 +0100
Subject: RFR (T): 8218060: JDK-8217786 breaks build due to remaining unused function
In-Reply-To: <7ff0838f3e0b8d4cf96806e5b446cdf526cee7cb.camel@oracle.com>
References: <7ff0838f3e0b8d4cf96806e5b446cdf526cee7cb.camel@oracle.com>
Message-ID: <458eaa70-9a7b-4a32-9748-f95c27f09f48@oracle.com>

I think this is the wrong fix. It's used here on s390:

void os::Linux::print_virtualization_info(outputStream* st) {
#if defined(S390)
  // /proc/sysinfo contains interesting information about
  //  - LPAR
  //  - whole "Box" (CPUs )
  //  - z/VM / KVM (VM); this is not available in an LPAR-only setup
  const char* kw[] = { "LPAR", "CPUs", "VM", NULL };

  if (! print_matching_lines_from_sysinfo_file(st, kw)) {
    st->print_cr(" ");
  }
#endif
}

StefanK

On 2019-01-30 16:06, Thomas Schatzl wrote:
> Hi all,
>
> can I have quick reviews for this change that fixes a compilation
> error due to an unused function?
>
> I.e.
>
> Compiling the repo after JDK-8217786 gives the following error:
>
> .../vmshare/jdk10/hs/open/src/hotspot/os/linux/os_linux.cpp:1860:13:
> error: 'bool print_matching_lines_from_sysinfo_file(outputStream*,
> const char**)' defined but not used [-Werror=unused-function]
> static bool print_matching_lines_from_sysinfo_file(outputStream* st,
> const char* keywords_to_match[]) {
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Compiling 305 files for jdk.javadoc
> cc1plus: all warnings being treated as errors
> make[3]: *** [[...]variant-server/libjvm/objs/os_linux.o] Error 1
> lib/CompileJvm.gmk:172: recipe for target '[...]/linux-
> x64/hotspot/variant-server/libjvm/objs/os_linux.o' failed
>
> The change simply removes that unused function.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8218060
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8218060/webrev/
> Testing:
> local compilation
>
> Thanks,
> Thomas
>
>

From thomas.schatzl at oracle.com Wed Jan 30 15:34:18 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 30 Jan 2019 16:34:18 +0100
Subject: RFR: 8218063: JDK-8218060 breaks build for S390
Message-ID:

Hi all,

as StefanK noticed in the review thread, the patch for JDK-8218060
was not correct: while it fixed the Oracle platform builds, it
broke the S390 build.

Unfortunately I had already pushed the change.

So I reinstated the method and guarded it with #if defined(S390), the
same way as the method call is guarded.

I hope it is okay to fix this with a single fix-CR and not back out
JDK-8218060 first.

CR:
https://bugs.openjdk.java.net/browse/JDK-8218063
Webrev:
http://cr.openjdk.java.net/~tschatzl/8218063/webrev/
Testing:
local compilation - I can't test the S390 build

Sorry for the mess-up,
Thomas

From shade at redhat.com Wed Jan 30 15:35:50 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 30 Jan 2019 16:35:50 +0100
Subject: RFR (M) 8213753: SymbolTable is double walked during class unloading and clean up table timing in do_unloading
In-Reply-To: <8417d7a7-409e-7705-3d21-560cd4998b47@oracle.com>
References: <3cbb2030-4c97-6478-3a02-196a7898d69c@oracle.com> <5717ffe1-9cdc-2d8b-5ee7-25c6b810ce8e@redhat.com> <09dff7f0-54b9-1812-09ca-69666a2f2abe@redhat.com> <8417d7a7-409e-7705-3d21-560cd4998b47@oracle.com>
Message-ID: <7ef4177a-024c-bc84-fa43-4dfe39557985@redhat.com>

On 1/30/19 4:14 PM, coleen.phillimore at oracle.com wrote:
> In shenandoah, it appears that you have your own specific timer for SystemDictionary::do_unloading.
> Did I read that correctly?

Yes, we have Shenandoah-specific timers for different parts of the runtime, so we can have the global
tabulated profile. SystemDictionary::do_unloading is one of the points that is measured.
It should not conflict with your patch. -Aleksey From shade at redhat.com Wed Jan 30 15:54:38 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 30 Jan 2019 16:54:38 +0100 Subject: RFR: 8218063: JDK-8218060 breaks build for S390 In-Reply-To: References: Message-ID: <0be60a09-4786-3b1c-9bd2-ca7f87bcc5b4@redhat.com> On 1/30/19 4:34 PM, Thomas Schatzl wrote: > I hope it is okay to fix this with a single fix-CR and not backing out > JDK-8218060 first. Yes, I think so. > CR: > https://bugs.openjdk.java.net/browse/JDK-8218063 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8218063/webrev/ > Testing: > local compilation - I can't test the S390 build My CI servers are able to cross-compile to s390x. Current build is indeed broken. The patch in this webrev indeed fixes it. Looks good to me. -Aleksey From erik.osterlund at oracle.com Wed Jan 30 15:53:57 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Wed, 30 Jan 2019 16:53:57 +0100 Subject: RFR: 8218063: JDK-8218060 breaks build for S390 In-Reply-To: References: Message-ID: Hi Thomas, Looks as good and trivial as the last one. Hmm... Ship it! Thanks, /Erik On 2019-01-30 16:34, Thomas Schatzl wrote: > Hi all, > > as StefanK in the review thread noticed, the patch for JDK-8218060 > was not correct in that while it fixed the Oracle platform builds, it > broke the S390 build. > > Unfortunately I already had the change pushed. > > So I reinstated the method and guarded it with #if defined(S390) the > same way as the method call is guarded. > > I hope it is okay to fix this with a single fix-CR and not backing out > JDK-8218060 first. 
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8218063 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8218063/webrev/ > Testing: > local compilation - I can't test the S390 build > > Sorry for the mess-up, > Thomas > > From shade at redhat.com Wed Jan 30 16:00:39 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 30 Jan 2019 17:00:39 +0100 Subject: RFR (S) 8217994: os::print_hex_dump should be more resilient against unreadable memory In-Reply-To: <42A9BA8C-755E-4D09-93CE-BD470285A76C@sap.com> References: <988cff50-d9d8-29d7-2cce-148dea4fee60@redhat.com> <0034E860-F7DE-4565-B157-93418E952388@sap.com> <9dff8727-9bfb-41e3-8be6-8e8d6421e3d1@redhat.com> <542945f5-9c38-bea5-8a64-0f8d6ab32a13@redhat.com> <42A9BA8C-755E-4D09-93CE-BD470285A76C@sap.com> Message-ID: On 1/30/19 3:25 PM, Schmidt, Lutz wrote: > I like your revision 02. Very compact. > I do not like re-inventing language features by own code. > Sorry for spamming you with suggestions/comments. No problem! > ?On 30.01.19, 13:31, "Aleksey Shipilev" wrote: > No. Let's stop here: > http://cr.openjdk.java.net/~shade/8217994/webrev.02/ This passes jdk-submit. Unless there are other comments, I shall push it in a few hours. -Aleksey From thomas.schatzl at oracle.com Wed Jan 30 16:03:40 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 30 Jan 2019 17:03:40 +0100 Subject: RFR: 8218063: JDK-8218060 breaks build for S390 In-Reply-To: <0be60a09-4786-3b1c-9bd2-ca7f87bcc5b4@redhat.com> References: <0be60a09-4786-3b1c-9bd2-ca7f87bcc5b4@redhat.com> Message-ID: <0a3e7a51b5b9007a8fc7fe39e5f5dd6cafb70b09.camel@oracle.com> Hi Aleksey, Erik, On Wed, 2019-01-30 at 16:54 +0100, Aleksey Shipilev wrote: > On 1/30/19 4:34 PM, Thomas Schatzl wrote: > > I hope it is okay to fix this with a single fix-CR and not backing > > out > > JDK-8218060 first. > > Yes, I think so. 
> > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8218063 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8218063/webrev/ > > Testing: > > local compilation - I can't test the S390 build > > My CI servers are able to cross-compile to s390x. Current build is > indeed broken. The patch in this > webrev indeed fixes it. Looks good to me. > thanks for verifying and reviewing. Apologies again for breaking S390. Pushed. Thanks, Thomas From tobias.hartmann at oracle.com Wed Jan 30 16:04:09 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 30 Jan 2019 17:04:09 +0100 Subject: SIGSEGV on PhaseIdealLoop::split_up? In-Reply-To: <079a01d4b8ab$40895f10$c19c1d30$@apache.org> References: <079a01d4b8ab$40895f10$c19c1d30$@apache.org> Message-ID: <673b046a-b295-f773-9742-6c4e9516a348@oracle.com> Hi Uwe, I've filed https://bugs.openjdk.java.net/browse/JDK-8218067 for this issue. Thanks, Tobias On 30.01.19 15:51, Uwe Schindler wrote: > Hi, > > as many of you know: I have no access to the OpenJDK bugtracker. I am currently on travel, but later today, I'd take the current problem and I may open a new bug in the OpenJDK issue tracker about this issue. The stack traces all look very similar, so there should be enough information to figure out where exactly the SIGSEGV is happening. To my knowledge of development, this should help to get the condition, why this is happening. Together with the Lucene Source code, you may figure out the conditions. Because this bug is quite new, there must be some change in Lucene that triggers this bug. With the information about where the SIGSEGV is happening in OpenJDK code and the commit logs in Lucene, we may figure out how the Lucene code changed so it causes that issue. > > I will take the time and try to collect all information an OpenJDK developer needs to (hopefully) reproduce the issue. > > Does this sound like a plan? 
> Uwe > > ----- > Uwe Schindler > uschindler at apache.org > ASF Member, Apache Lucene PMC / Committer > Bremen, Germany > http://lucene.apache.org/ > >> -----Original Message----- >> From: hotspot-dev On Behalf Of >> Dawid Weiss >> Sent: Wednesday, January 30, 2019 10:28 AM >> To: hotspot-dev >> Subject: SIGSEGV on PhaseIdealLoop::split_up? >> >> Hello, >> >> There's quite a few of those JVM errors that popped up recently on one >> of Lucene's CI machines: >> >> https://issues.apache.org/jira/browse/LUCENE-8668 >> >> Happens on various JVMs (see the above issue). Would it be something >> familiar to any of you? A known issue or should we try to keep digging >> (for a repro, for example)? >> >> Dawid > From shade at redhat.com Wed Jan 30 16:09:43 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 30 Jan 2019 17:09:43 +0100 Subject: SIGSEGV on PhaseIdealLoop::split_up? In-Reply-To: <079a01d4b8ab$40895f10$c19c1d30$@apache.org> References: <079a01d4b8ab$40895f10$c19c1d30$@apache.org> Message-ID: On 1/30/19 3:51 PM, Uwe Schindler wrote: > as many of you know: I have no access to the OpenJDK bugtracker. I am currently on travel, but > later today, I'd take the current problem and I may open a new bug in the OpenJDK issue tracker > about this issue. The stack traces all look very similar, so there should be enough information > to figure out where exactly the SIGSEGV is happening. To my knowledge of development, this should > help to get the condition, why this is happening. Together with the Lucene Source code, you may > figure out the conditions. Because this bug is quite new, there must be some change in Lucene > that triggers this bug. With the information about where the SIGSEGV is happening in OpenJDK code > and the commit logs in Lucene, we may figure out how the Lucene code changed so it causes that > issue. > > I will take the time and try to collect all information an OpenJDK developer needs to (hopefully) > reproduce the issue. 
> > Does this sound like a plan? Uwe Yes. And try to run with fastdebug build, it might meaningfully assert, which may point to existing bugreports or even fixes. You can build it yourself, or pull one from here: https://builds.shipilev.net/ -Aleksey From dawid.weiss at gmail.com Wed Jan 30 16:55:23 2019 From: dawid.weiss at gmail.com (Dawid Weiss) Date: Wed, 30 Jan 2019 17:55:23 +0100 Subject: SIGSEGV on PhaseIdealLoop::split_up? In-Reply-To: <079501d4b8a7$2a831570$7f894050$@apache.org> References: <48117f7c-77db-452f-cb7b-f5c266fb78cd@oracle.com> <079501d4b8a7$2a831570$7f894050$@apache.org> Message-ID: > [...] But as the issues we see don't happen on the inner virtual machines that remove some advanced CPU features, maybe that's special here. Yes - I meant a particular hardware configuration your setup runs on. I think we're safe to rule out hardware failure as all stack traces are (nearly) the same. > The recent failures don't happen with Java 8 and Java 9, but started with Java 10 or later! (because we only see the bug on runs using those versions). They do occur with Java 9 (see mailing list archives) and stretch as far back as July 10th, 2018. They are rare though, given how many test runs we've had since then, so it'll be a tricky one. ;) My point about not being able to reproduce the issue has more to do with the fact that the problem may not be repeatable in isolation from other, previously executed, tests that prime execution statistics for the optimizer. But I haven't tried, so it's just a gut feeling. D. 
From coleen.phillimore at oracle.com Wed Jan 30 19:41:58 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 30 Jan 2019 14:41:58 -0500
Subject: RFR (M) 8213753: SymbolTable is double walked during class unloading and clean up table timing in do_unloading
In-Reply-To: <09dff7f0-54b9-1812-09ca-69666a2f2abe@redhat.com>
References: <3cbb2030-4c97-6478-3a02-196a7898d69c@oracle.com> <5717ffe1-9cdc-2d8b-5ee7-25c6b810ce8e@redhat.com> <09dff7f0-54b9-1812-09ca-69666a2f2abe@redhat.com>
Message-ID: <8851bcbe-78be-6c32-130c-a87b556dcaf7@oracle.com>

I added the timing for triggering cleanup. It's likely not interesting. The main cleanup time is code cache cleaning still.

This is my fastdebug timing:

[4.992s][debug][gc,phases] GC(2) ClassLoaderData 0.016ms
[4.992s][debug][gc,phases] GC(2) Trigger cleanups 0.010ms
[5.039s][debug][gc,phases] GC(2) Class Unloading 47.290ms

open webrev at http://cr.openjdk.java.net/~coleenp/2019/8213753.02/webrev

Retested with tier1, 2, kitchensink sanity, and specjbb2015.

Thanks for the code reviews.

Thanks,
Coleen

On 1/30/19 9:01 AM, Aleksey Shipilev wrote:
> On 1/30/19 2:27 PM, coleen.phillimore at oracle.com wrote:
>>> *) In SystemDictionary::do_unloading, do we think that trigger_cleanup() are cheap? Otherwise it
>>> makes sense to retain a single GCTraceTime block around all three trigger_cleanups?
>> The three trigger cleanups aren't specifically ClassLoaderData, so I didn't include them in the
>> timing. I think people/tools might look for ClassLoaderData so I didn't want to change the name of
>> the timing. Checking now...
>>
>>     GCTraceTime(Debug, gc, phases) t("ClassLoaderData", gc_timer);
>>
>> I thought of removing it completely but the timing but the enclosing timers (exception in
>> shenandoah) also include timing CodeCache::do_unloading and clean_weak_klass_links which are more
>> expensive than ClassLoaderData.
> My only (little) concern was that SD::do_unloading now has only one GCTraceTime("ClassLoaderData"). > Without knowing beforehand if ::trigger_cleanup-s are cheap, it seems odd to drop GCTraceTime from > them. Something like: > > GCTraceTime(Debug, gc, phases) t("Trigger cleanups", gc_timer); > > ResolvedMethodTable::trigger_cleanup(); > > if (unloading_occurred) { > SymbolTable::trigger_cleanup(); > > // Oops referenced by the protection domain cache table may get unreachable independently > // of the class loader (eg. cached protection domain oops). So we need to > // explicitly unlink them here. > // All protection domain oops are linked to the caller class, so if nothing > // unloads, this is not needed. > _pd_cache_table->trigger_cleanup(); > } > > ...but that can be indeed skipped if we know that those triggers cost much less than the timer itself. > > -Aleksey > From igor.ignatyev at oracle.com Wed Jan 30 19:46:04 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 30 Jan 2019 11:46:04 -0800 Subject: RFR(T) [12] : 8218079 : cleanup hotspot ProblemList files Message-ID: <180D2064-D4DA-4C0B-BE0B-1F6561DAA05B@oracle.com> http://cr.openjdk.java.net/~iignatyev//8218079/webrev.00/index.html > 7 lines changed: 1 ins; 1 del; 5 mod; Hi all, could you please review this small and trivial fix which cleans up hotspot problem list? 
the lines associated w/ the following bugs need to be updated: - JDK-7013634 -- CNR, - JDK-8217851 -- dup of JDK-8189604 - JDK-8129886 -- dup of JDK-8218049 - JDK-8177765 -- dup of JDK-8218049 webrev: http://cr.openjdk.java.net/~iignatyev//8218079/webrev.00/index.html JBS: https://bugs.openjdk.java.net/browse/JDK-8218079 Thanks, -- Igor From vladimir.kozlov at oracle.com Wed Jan 30 19:59:12 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 30 Jan 2019 11:59:12 -0800 Subject: RFR(T) [12] : 8218079 : cleanup hotspot ProblemList files In-Reply-To: <180D2064-D4DA-4C0B-BE0B-1F6561DAA05B@oracle.com> References: <180D2064-D4DA-4C0B-BE0B-1F6561DAA05B@oracle.com> Message-ID: <545aec4b-ca61-c3ee-acf6-4f39b4bd8bf8@oracle.com> Good Thanks, Vladimir On 1/30/19 11:46 AM, Igor Ignatyev wrote: > http://cr.openjdk.java.net/~iignatyev//8218079/webrev.00/index.html >> 7 lines changed: 1 ins; 1 del; 5 mod; > > Hi all, > > could you please review this small and trivial fix which cleans up hotspot problem list? 
the lines associated w/ the following bugs need to be updated: > - JDK-7013634 -- CNR, > - JDK-8217851 -- dup of JDK-8189604 > - JDK-8129886 -- dup of JDK-8218049 > - JDK-8177765 -- dup of JDK-8218049 > > webrev: http://cr.openjdk.java.net/~iignatyev//8218079/webrev.00/index.html > JBS: https://bugs.openjdk.java.net/browse/JDK-8218079 > > Thanks, > -- Igor > From igor.ignatyev at oracle.com Wed Jan 30 21:00:10 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 30 Jan 2019 13:00:10 -0800 Subject: RFR(T) [12] : 8218079 : cleanup hotspot ProblemList files In-Reply-To: <545aec4b-ca61-c3ee-acf6-4f39b4bd8bf8@oracle.com> References: <180D2064-D4DA-4C0B-BE0B-1F6561DAA05B@oracle.com> <545aec4b-ca61-c3ee-acf6-4f39b4bd8bf8@oracle.com> Message-ID: Thanks Vladimir, -- Igor > On Jan 30, 2019, at 11:59 AM, Vladimir Kozlov wrote: > > Good > > Thanks, > Vladimir > > On 1/30/19 11:46 AM, Igor Ignatyev wrote: >> http://cr.openjdk.java.net/~iignatyev//8218079/webrev.00/index.html >>> 7 lines changed: 1 ins; 1 del; 5 mod; >> Hi all, >> could you please review this small and trivial fix which cleans up hotspot problem list? the lines associated w/ the following bugs need to be updated: >> - JDK-7013634 -- CNR, >> - JDK-8217851 -- dup of JDK-8189604 >> - JDK-8129886 -- dup of JDK-8218049 >> - JDK-8177765 -- dup of JDK-8218049 >> webrev: http://cr.openjdk.java.net/~iignatyev//8218079/webrev.00/index.html >> JBS: https://bugs.openjdk.java.net/browse/JDK-8218079 >> Thanks, >> -- Igor From jesper.wilhelmsson at oracle.com Wed Jan 30 21:22:20 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Wed, 30 Jan 2019 22:22:20 +0100 Subject: RFR(xxs): JDK-7013634 was closed and should not cause a test to be problemListed Message-ID: <4B04A037-CEA9-4812-B936-7F242B76F86B@oracle.com> Hi, Please review this tiny change to remove 7013634 from the problemList. 
There is another bug that causes the same test to be problemlisted so the test will still be on the list. Bug: https://bugs.openjdk.java.net/browse/JDK-8218085 Patch: --- a/test/hotspot/jtreg/ProblemList.txt +++ b/test/hotspot/jtreg/ProblemList.txt @@ -167,7 +167,7 @@ vmTestbase/metaspace/gc/firstGC_default/TestDescription.java 8208250 generic-all vmTestbase/nsk/jvmti/ResourceExhausted/resexhausted003/TestDescription.java 6606767 generic-all -vmTestbase/nsk/jvmti/ResourceExhausted/resexhausted004/TestDescription.java 7013634,6606767 generic-all +vmTestbase/nsk/jvmti/ResourceExhausted/resexhausted004/TestDescription.java 6606767 generic-all vmTestbase/nsk/jvmti/scenarios/extension/EX03/ex03t001/TestDescription.java 8173658 generic-all vmTestbase/nsk/jvmti/AttachOnDemand/attach045/TestDescription.java 8202971 generic-all Thanks, /Jesper From jesper.wilhelmsson at oracle.com Wed Jan 30 21:26:47 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Wed, 30 Jan 2019 22:26:47 +0100 Subject: RFR(xxs): JDK-7013634 was closed and should not cause a test to be problemListed In-Reply-To: <4B04A037-CEA9-4812-B936-7F242B76F86B@oracle.com> References: <4B04A037-CEA9-4812-B936-7F242B76F86B@oracle.com> Message-ID: <889B5F0F-BFF5-44B0-B779-B032A47C1173@oracle.com> Igor beat me to it: JDK-8218079 Withdrawn. /Jesper > On 30 Jan 2019, at 22:22, jesper.wilhelmsson at oracle.com wrote: > > Hi, > > Please review this tiny change to remove 7013634 from the problemList. There is another bug that causes the same test to be problemlisted so the test will still be on the list. 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8218085 > Patch: > > --- a/test/hotspot/jtreg/ProblemList.txt > +++ b/test/hotspot/jtreg/ProblemList.txt > @@ -167,7 +167,7 @@ > vmTestbase/metaspace/gc/firstGC_default/TestDescription.java 8208250 generic-all > > vmTestbase/nsk/jvmti/ResourceExhausted/resexhausted003/TestDescription.java 6606767 generic-all > -vmTestbase/nsk/jvmti/ResourceExhausted/resexhausted004/TestDescription.java 7013634,6606767 generic-all > +vmTestbase/nsk/jvmti/ResourceExhausted/resexhausted004/TestDescription.java 6606767 generic-all > vmTestbase/nsk/jvmti/scenarios/extension/EX03/ex03t001/TestDescription.java 8173658 generic-all > vmTestbase/nsk/jvmti/AttachOnDemand/attach045/TestDescription.java 8202971 generic-all > > > Thanks, > /Jesper > From patricio.chilano.mateo at oracle.com Wed Jan 30 21:37:21 2019 From: patricio.chilano.mateo at oracle.com (Patricio Chilano) Date: Wed, 30 Jan 2019 16:37:21 -0500 Subject: RFR: 8210832: Remove sneaky locking in class Monitor In-Reply-To: <384133a1-c2e0-11f2-6890-798c3646222f@oracle.com> References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> <4562855b-fa0e-9f70-da0f-00f84984d78f@oracle.com> <384133a1-c2e0-11f2-6890-798c3646222f@oracle.com> Message-ID: <6654f80f-9932-28f4-63f6-eece68966a56@oracle.com> Hi David, On 1/30/19 2:29 AM, David Holmes wrote: > Hi Patricio, > > > > First, thanks for all the many weeks of work you've put into this, > pulling together a number of ideas from different people to make it > all work! Thanks! Credit to you for the PlatformMonitor implementation? : ) > I've only got a few minor comments/suggestions. 
> > On 30/01/2019 10:24 am, Patricio Chilano wrote: >> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/webrev/ >> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/inc/webrev/ >> > > src/hotspot/share/runtime/interfaceSupport.inline.hpp > > I'm very unclear how ThreadLockBlockInVM differs from ThreadBlockInVM. > You've duplicated a lot of complex code which is masking the actual > difference between the two wrappers to me. It seems to me that an > extra arg to transition_and_fence should allow you to handle the new > behaviour without having to duplicate so much of this code. In any > case the semantics of ThreadLockBlockInVM needs to be described. I could do it with one extra argument, but I would need to add two extra branches in transition_and_fence(), one to decide if I'm in the Monitor case to avoid calling SafepointMechanism::block_if_requested() directly and another one to actually decide if I'm transitioning in or out, since the actions to perform are different. I think it is easier to read without adding new conditionals, and also we will save those extra branches, but if you think it's better this way I can change it. > Also I'm unclear what the "Lock" in ThreadLockBlockInVM actually > refers to. I find the name quite jarring to read. What about changing it to ThreadBlockinMonitor? > On the subject of naming, do_preempt and preempt_by_safepoint don't > really convey to me what happens - what is being "preempted" here? I > would suggest a more direct Monitor::release_for_safepoint Changed. > --- > > Logging: why "nativemonitor"? The logging in mutex.cpp doesn't relate > to a "native" monitor?? Actually I'm not even sure if we need bother > at all with the one logging statement that is present. I added it to eventually track unbounded try locks. Not sure I follow you with the name, isn't that how we name this monitors? I tried to differentiate them from Java monitors. What about just "monitor"? 
> --- > > src/hotspot/share/runtime/mutex.cpp > > void Monitor::lock_without_safepoint_check(Thread * self) { > ? // Ensure that the Monitor does not require or allow safepoint checks. > > The comment there should only say "not require". Done. > void Monitor::preempt_by_safepoint() { > ? _lock.unlock(); > } > > Apart from renaming this as suggested above, aren't there any suitable > assertions we should have here? safepoint-in-progress or > handshake-in-progress? _owner == Thread::current? Ok, I added an assertion that owner should be NULL. Asserting safepoint-in-progress does not really work because _state could change to _not_synchronized right after you checked for it in TLBIVM. > Nit: > > assert(_owner == Thread::current(), "should be equal: owner=" > INTPTR_FORMAT > ?????????????????? ", self=" INTPTR_FORMAT, p2i(_owner), > p2i(Thread::current())); > > with Dan's enhanced assertions there's an indentation issue. The > second line should indent to the first comma, but that will make the > second line extend way past 80 columns. > > Also you could factor that assertion for _owner==Thread::current() > into its own function or macro to avoid the repetition. Corrected indentation based on Dan's reply to align with _owner. > ?OSThreadWaitState osts(self->osthread(), false /* not Object.wait() */); > > This needs to be returned to its original place as per Dan's comments. Done. > ??? } else { > ????? Monitor::lock(self); > ??? } > > You don't need Monitor:: here Removed. > // Temporary JVM_RawMonitor* support. A raw monitor can just be a > PlatformMonitor now. > > This needs to be resolved before committing. Some of the existing > commentary on what raw monitors are needs to be retained. Not clear if > we need to set the _owner field or can just skip it. Is it okay if I keep the following comments? 
// Yet another degenerate version of Monitor::lock() or lock_without_safepoint_check() // jvm_raw_lock() and _unlock() can be called by non-Java threads via JVM_RawMonitorEnter. // // There's no expectation that JVM_RawMonitors will interoperate properly with the native // Mutex-Monitor constructs.  We happen to implement JVM_RawMonitors in terms of // native Mutex-Monitors simply as a matter of convenience. I could keep setting the owner as _owner = Thread::current_or_null() in jvm_raw_lock(), at least it wouldn't hurt. > Monitor::~Monitor() { >   assert(_owner == NULL, "should be NULL: owner=" INTPTR_FORMAT, > p2i(_owner)); > } > > Will this automatically result in the PlatformMonitor destructor being > called? Yes, should I add a comment to make it clear that ~PlatformMonitor() will be executed? Thanks for looking into this! Waiting on your comments to send v04. Thanks, Patricio > --- > > Thanks, > David > ----- > From david.holmes at oracle.com Wed Jan 30 22:40:06 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 31 Jan 2019 08:40:06 +1000 Subject: RFR (T): 8218060: JDK-8217786 breaks build due to remaining unused function In-Reply-To: <458eaa70-9a7b-4a32-9748-f95c27f09f48@oracle.com> References: <7ff0838f3e0b8d4cf96806e5b446cdf526cee7cb.camel@oracle.com> <458eaa70-9a7b-4a32-9748-f95c27f09f48@oracle.com> Message-ID: <0f13540f-b752-060b-bc8e-f05471bc7339@oracle.com> Yes the fix is wrong. :( I'll restore the S390 fix if no one else has gotten to it yet. David On 31/01/2019 1:22 am, Stefan Karlsson wrote: > I think this is the wrong fix. It's used here on s390: > > 2185 void os::Linux::print_virtualization_info(outputStream* st) { > 2186 #if defined(S390) > 2187   // /proc/sysinfo contains interesting information about > 2188   // - LPAR > 2189   // - whole "Box" (CPUs ) > 2190   // - z/VM / KVM (VM); this is not available in an LPAR-only > setup > 2191   const char* kw[] = { "LPAR", "CPUs", "VM", NULL }; > 2192 > 2193   if (!
print_matching_lines_from_sysinfo_file(st, kw)) { > 2194     st->print_cr("  "); > 2195   } > 2196 #endif > 2197 } > > StefanK > > > > On 2019-01-30 16:06, Thomas Schatzl wrote: >> Hi all, >> >>    can I have quick reviews for this change that fixes a compilation >> error due to an unused function? >> >> I.e. >> >> Compiling the repo after JDK-8217786 gives the following error: >> >> .../vmshare/jdk10/hs/open/src/hotspot/os/linux/os_linux.cpp:1860:13: >> error: 'bool print_matching_lines_from_sysinfo_file(outputStream*, >> const char**)' defined but not used [-Werror=unused-function] >>   static bool print_matching_lines_from_sysinfo_file(outputStream* st, >> const char* keywords_to_match[]) { >>               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> Compiling 305 files for jdk.javadoc >> cc1plus: all warnings being treated as errors >> make[3]: *** [[...]variant-server/libjvm/objs/os_linux.o] Error 1 >> lib/CompileJvm.gmk:172: recipe for target '[...]/linux- >> x64/hotspot/variant-server/libjvm/objs/os_linux.o' failed >> >> The change simply removes that unused function. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8218060 >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8218060/webrev/ >> Testing: >> local compilation >> >> Thanks, >>    Thomas >> >> > From daniel.daugherty at oracle.com Wed Jan 30 22:42:18 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 30 Jan 2019 17:42:18 -0500 Subject: RFR (T): 8218060: JDK-8217786 breaks build due to remaining unused function In-Reply-To: <0f13540f-b752-060b-bc8e-f05471bc7339@oracle.com> References: <7ff0838f3e0b8d4cf96806e5b446cdf526cee7cb.camel@oracle.com> <458eaa70-9a7b-4a32-9748-f95c27f09f48@oracle.com> <0f13540f-b752-060b-bc8e-f05471bc7339@oracle.com> Message-ID: <84208277-9b2f-c128-e2a0-b4a06a90e5ad@oracle.com> Already done by Thomas via JDK-8218063. Dan On 1/30/19 5:40 PM, David Holmes wrote: > Yes the fix is wrong.
:( > > I'll restore the S390 fix if no one else has gotten to it yet. > > David > > On 31/01/2019 1:22 am, Stefan Karlsson wrote: >> I think this is the wrong fix. It's used here on s390: >> >> 2185 void os::Linux::print_virtualization_info(outputStream* st) { >> 2186 #if defined(S390) >> 2187   // /proc/sysinfo contains interesting information about >> 2188   // - LPAR >> 2189   // - whole "Box" (CPUs ) >> 2190   // - z/VM / KVM (VM); this is not available in an >> LPAR-only setup >> 2191   const char* kw[] = { "LPAR", "CPUs", "VM", NULL }; >> 2192 >> 2193   if (! print_matching_lines_from_sysinfo_file(st, kw)) { >> 2194     st->print_cr("  "); >> 2195   } >> 2196 #endif >> 2197 } >> >> StefanK >> >> >> >> On 2019-01-30 16:06, Thomas Schatzl wrote: >>> Hi all, >>> >>>    can I have quick reviews for this change that fixes a compilation >>> error due to an unused function? >>> >>> I.e. >>> >>> Compiling the repo after JDK-8217786 gives the following error: >>> >>> .../vmshare/jdk10/hs/open/src/hotspot/os/linux/os_linux.cpp:1860:13: >>> error: 'bool print_matching_lines_from_sysinfo_file(outputStream*, >>> const char**)' defined but not used [-Werror=unused-function] >>>   static bool print_matching_lines_from_sysinfo_file(outputStream* st, >>> const char* keywords_to_match[]) { >>>               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> Compiling 305 files for jdk.javadoc >>> cc1plus: all warnings being treated as errors >>> make[3]: *** [[...]variant-server/libjvm/objs/os_linux.o] Error 1 >>> lib/CompileJvm.gmk:172: recipe for target '[...]/linux- >>> x64/hotspot/variant-server/libjvm/objs/os_linux.o' failed >>> >>> The change simply removes that unused function. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8218060 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8218060/webrev/ >>> Testing: >>> local compilation >>> >>> Thanks, >>>
Thomas >>> >>> >> > From david.holmes at oracle.com Wed Jan 30 22:47:51 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 31 Jan 2019 08:47:51 +1000 Subject: RFR : 8217786: Provide virtualization related info in the hs_error file on linux s390x In-Reply-To: References: <816bec8b-b9ca-f0b8-f72d-be6ede83d63b@oracle.com> <73d4509f-09b7-e630-2568-bae0395f6b8d@oracle.com> <032db50d-0086-c44d-0655-a2fd100dce31@oracle.com> Message-ID: <8ab96b97-ba38-76ff-9f28-27f29ca19567@oracle.com> Matthias, It turned out this broke the build on non-S390 platforms, as the compiler complained about the unused function print_matching_lines_from_sysinfo_file. In the rush to fix that, 8218060 incorrectly removed the function completely. But then 8218063 put it back in an ifdef. So all should be well again. Please ensure all changes go through jdk-submit before pushing. Thanks, David On 30/01/2019 7:00 pm, Baesken, Matthias wrote: > Hi David, > >>> Style nit: avoid implicit booleans, explicitly check != NULL > > I added the explicit "!= NULL" check and added a line with introductory text. > > > @Thomas - may I add you as reviewer? > > > Thanks, Matthias > > >> -----Original Message----- >> From: David Holmes >> Sent: Wednesday, 30 January 2019 06:49 >> To: Baesken, Matthias ; Thomas Stüfe >> >> Cc: hotspot-dev at openjdk.java.net >> Subject: Re: RFR : 8217786: Provide virtualization related info in the hs_error >> file on linux s390x >> >> Hi Matthias, >> >> Thanks for reworking this.
>> >> On 30/01/2019 2:56 am, Baesken, Matthias wrote: >>> Hello, I added a break to avoid potential printing lines multiple times, >>> and removed the comment line : >>> >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8217786.3/ >> >> A couple of minor comments: >> >> src/hotspot/os/linux/os_linux.cpp >> >> + while (keywords_to_match[i]) { >> >> Style nit: avoid implicit booleans, explicitly check != NULL >> >> + void os::Linux::print_virtualization_info(outputStream* st) { >> >> Don't you want an initial print of some introductory text eg: >> >> "Virtualization Information" >> >> No need for updated webrev. >> >> Thanks, >> David >> ----- > From uschindler at apache.org Wed Jan 30 22:52:03 2019 From: uschindler at apache.org (Uwe Schindler) Date: Wed, 30 Jan 2019 23:52:03 +0100 Subject: SIGSEGV on PhaseIdealLoop::split_up? In-Reply-To: References: <079a01d4b8ab$40895f10$c19c1d30$@apache.org> Message-ID: <083b01d4b8ee$6c930a50$45b91ef0$@apache.org> OK, will try tomorrow. The big problem is that it triggers only from time to time. So maybe I can start some loop to run until it fails. Uwe ----- Uwe Schindler uschindler at apache.org ASF Member, Apache Lucene PMC / Committer Bremen, Germany http://lucene.apache.org/ > -----Original Message----- > From: Aleksey Shipilev > Sent: Wednesday, January 30, 2019 5:10 PM > To: Uwe Schindler ; 'Dawid Weiss' > ; 'hotspot-dev' > Subject: Re: SIGSEGV on PhaseIdealLoop::split_up? > > On 1/30/19 3:51 PM, Uwe Schindler wrote: > > as many of you know: I have no access to the OpenJDK bugtracker. I am > currently on travel, but > > later today, I'd take the current problem and I may open a new bug in the > OpenJDK issue tracker > > about this issue. The stack traces all look very similar, so there should be > enough information > > to figure out where exactly the SIGSEGV is happening. To my knowledge of > development, this should > > help to get the condition, why this is happening. 
Together with the Lucene > Source code, you may > > figure out the conditions. Because this bug is quite new, there must be > some change in Lucene > > that triggers this bug. With the information about where the SIGSEGV is > happening in OpenJDK code > > and the commit logs in Lucene, we may figure out how the Lucene code > changed so it causes that > > issue. > > > > I will take the time and try to collect all information an OpenJDK developer > needs to (hopefully) > > reproduce the issue. > > > > Does this sound like a plan? Uwe > > Yes. And try to run with fastdebug build, it might meaningfully assert, which > may point to existing > bugreports or even fixes. You can build it yourself, or pull one from here: > https://builds.shipilev.net/ > > -Aleksey > From david.holmes at oracle.com Thu Jan 31 01:31:45 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 31 Jan 2019 11:31:45 +1000 Subject: RFR (T): 8218060: JDK-8217786 breaks build due to remaining unused function In-Reply-To: <84208277-9b2f-c128-e2a0-b4a06a90e5ad@oracle.com> References: <7ff0838f3e0b8d4cf96806e5b446cdf526cee7cb.camel@oracle.com> <458eaa70-9a7b-4a32-9748-f95c27f09f48@oracle.com> <0f13540f-b752-060b-bc8e-f05471bc7339@oracle.com> <84208277-9b2f-c128-e2a0-b4a06a90e5ad@oracle.com> Message-ID: <71ba7a45-2cd9-680d-6fa2-c030eb4b1bff@oracle.com> Thanks Dan! I eventually caught up with things. :) David On 31/01/2019 8:42 am, Daniel D. Daugherty wrote: > Already done by Thomas via JDK-8218063. > > Dan > > > On 1/30/19 5:40 PM, David Holmes wrote: >> Yes the fix is wrong. :( >> >> I'll restore the S390 fix if no one else has gotten to it yet. >> >> David >> >> On 31/01/2019 1:22 am, Stefan Karlsson wrote: >>> I think this is the wrong fix. It's used here on s390: >>> >>> 2185 void os::Linux::print_virtualization_info(outputStream* st) { >>> 2186 #if defined(S390) >>> 2187   // /proc/sysinfo contains interesting information about >>> 2188   // - LPAR >>> 2189  
// - whole "Box" (CPUs ) >>> 2190   // - z/VM / KVM (VM); this is not available in an >>> LPAR-only setup >>> 2191   const char* kw[] = { "LPAR", "CPUs", "VM", NULL }; >>> 2192 >>> 2193   if (! print_matching_lines_from_sysinfo_file(st, kw)) { >>> 2194     st->print_cr("  "); >>> 2195   } >>> 2196 #endif >>> 2197 } >>> >>> StefanK >>> >>> >>> >>> On 2019-01-30 16:06, Thomas Schatzl wrote: >>>> Hi all, >>>> >>>>    can I have quick reviews for this change that fixes a compilation >>>> error due to an unused function? >>>> >>>> I.e. >>>> >>>> Compiling the repo after JDK-8217786 gives the following error: >>>> >>>> .../vmshare/jdk10/hs/open/src/hotspot/os/linux/os_linux.cpp:1860:13: >>>> error: 'bool print_matching_lines_from_sysinfo_file(outputStream*, >>>> const char**)' defined but not used [-Werror=unused-function] >>>>   static bool print_matching_lines_from_sysinfo_file(outputStream* st, >>>> const char* keywords_to_match[]) { >>>>               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>> Compiling 305 files for jdk.javadoc >>>> cc1plus: all warnings being treated as errors >>>> make[3]: *** [[...]variant-server/libjvm/objs/os_linux.o] Error 1 >>>> lib/CompileJvm.gmk:172: recipe for target '[...]/linux- >>>> x64/hotspot/variant-server/libjvm/objs/os_linux.o' failed >>>> >>>> The change simply removes that unused function. >>>> >>>> CR: >>>> https://bugs.openjdk.java.net/browse/JDK-8218060 >>>> Webrev: >>>> http://cr.openjdk.java.net/~tschatzl/8218060/webrev/ >>>> Testing: >>>> local compilation >>>> >>>> Thanks, >>>>    Thomas >>>> >>>> >>> >> > From david.holmes at oracle.com Thu Jan 31 02:18:04 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 31 Jan 2019 12:18:04 +1000 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: <06885e05-0db7-64fd-1080-af7545c54434@oracle.com> References: <06885e05-0db7-64fd-1080-af7545c54434@oracle.com> Message-ID: Still good to me too.
:) David On 30/01/2019 11:30 pm, Robbin Ehn wrote: > Hi, here is v3. > > http://cr.openjdk.java.net/~rehn/8218041/v03/inc/ > http://cr.openjdk.java.net/~rehn/8218041/v03/ > > Passes same compilations and t1. > > /Robbin > > On 2019-01-30 09:21, Robbin Ehn wrote: >> Hi all, please review. >> >> Code: >> http://cr.openjdk.java.net/~rehn/8218041/webrev/ >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8218041 >> >> After fixing these includes, there was a circular dependency via >> shenandoah >> code. I moved try_cancel_gc to cpp where the only use was. So it never >> should >> have been in the inline header in the first place. >> >> I listed why the include is needed below. >> >> Tier 1 and no pre-compiled. >> >> FYI: I was investigating why Handle::Handle(Thread*,oop) was not inlined. >> gcc complains there being a local comdat symbol, forcing it to be >> inlined or >> using clang there is no issue. So it looks like a gcc bug both in 7.3 >> and 8.2. >> >> Thanks, Robbin >> >> src/hotspot/share/aot/aotLoader.cpp >> runtime/os.inline.hpp      for os::dll_unload >> >> src/hotspot/share/c1/c1_Runtime1.cpp >> runtime/handles.inline.hpp for Handle(Thread*, oop) >> >> src/hotspot/share/gc/z/zFuture.inline.hpp >> runtime/interfaceSupport.inline.hpp not used. >> >> src/hotspot/share/prims/nativeLookup.cpp >> runtime/os.inline.hpp      for os::dll_unload >> >> src/hotspot/share/runtime/handles.hpp >> Forward declaration            Thread >> >> src/hotspot/share/runtime/handles.inline.hpp >> runtime/thread.hpp       for Thread::current >> oops/oop.inline.hpp        for oopDesc::is_a >> oops/metadata.hpp          for is_valid >> >> src/hotspot/share/runtime/semaphore.inline.hpp >> runtime/thread.hpp         for osthread >> >> src/hotspot/share/runtime/vframe.cpp >> runtime/thread.inline.hpp 
for JavaThread::class_to_be_initialized From david.holmes at oracle.com Thu Jan 31 05:54:22 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 31 Jan 2019 15:54:22 +1000 Subject: RFR: 8210832: Remove sneaky locking in class Monitor In-Reply-To: <6654f80f-9932-28f4-63f6-eece68966a56@oracle.com> References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> <4562855b-fa0e-9f70-da0f-00f84984d78f@oracle.com> <384133a1-c2e0-11f2-6890-798c3646222f@oracle.com> <6654f80f-9932-28f4-63f6-eece68966a56@oracle.com> Message-ID: <45718944-1e85-e979-2843-f93b9e6bb9b6@oracle.com> On 31/01/2019 7:37 am, Patricio Chilano wrote: > Hi David, > > On 1/30/19 2:29 AM, David Holmes wrote: >> Hi Patricio, >> >> >> >> First, thanks for all the many weeks of work you've put into this, >> pulling together a number of ideas from different people to make it >> all work! > Thanks! Credit to you for the PlatformMonitor implementation? : ) :) Nothing innovative there. >> I've only got a few minor comments/suggestions. >> >> On 30/01/2019 10:24 am, Patricio Chilano wrote: >>> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/webrev/ >>> Inc: http://cr.openjdk.java.net/~pchilanomate/8210832/v03/inc/webrev/ >>> >> >> src/hotspot/share/runtime/interfaceSupport.inline.hpp >> >> I'm very unclear how ThreadLockBlockInVM differs from ThreadBlockInVM. >> You've duplicated a lot of complex code which is masking the actual >> difference between the two wrappers to me. It seems to me that an >> extra arg to transition_and_fence should allow you to handle the new >> behaviour without having to duplicate so much of this code. In any >> case the semantics of ThreadLockBlockInVM needs to be described. 
> I could do it with one extra argument, but I would need to add two extra > branches in transition_and_fence(), one to decide if I'm in the Monitor > case to avoid calling SafepointMechanism::block_if_requested() directly > and another one to actually decide if I'm transitioning in or out, since > the actions to perform are different. I think it is easier to read > without adding new conditionals, and also we will save those extra > branches, but if you think it's better this way I can change it. I would like something that tells me more clearly how this new transition helper differs from the existing TBIVM. Sharing the code between them and using different args would be one way. Documenting the difference in comments would be another. Your choice. >> Also I'm unclear what the "Lock" in ThreadLockBlockInVM actually >> refers to. I find the name quite jarring to read. > What about changing it to ThreadBlockinMonitor? That's not quite conveying the semantics. The problem is that the semantics we are changing compared to TBIVM are not evident in the TBIVM name. If TBIVM was actually ThreadBlockInVMWithSafepointBlocking, then this new transition would obviously be ThreadBlockInVMWithoutSafepointBlocking - but perhaps that lengthy, but clear name would be okay anyway? >> On the subject of naming, do_preempt and preempt_by_safepoint don't >> really convey to me what happens - what is being "preempted" here? I >> would suggest a more direct Monitor::release_for_safepoint > Changed. > >> --- >> >> Logging: why "nativemonitor"? The logging in mutex.cpp doesn't relate >> to a "native" monitor?? Actually I'm not even sure if we need bother >> at all with the one logging statement that is present. > I added it to eventually track unbounded try locks. Not sure I follow > you with the name, isn't that how we name this monitors? I tried to > differentiate them from Java monitors. What about just "monitor"? How about vmmonitor ? 
>> --- >> >> src/hotspot/share/runtime/mutex.cpp >> >> void Monitor::lock_without_safepoint_check(Thread * self) { >> ? // Ensure that the Monitor does not require or allow safepoint checks. >> >> The comment there should only say "not require". > Done. > >> void Monitor::preempt_by_safepoint() { >> ? _lock.unlock(); >> } >> >> Apart from renaming this as suggested above, aren't there any suitable >> assertions we should have here? safepoint-in-progress or >> handshake-in-progress? _owner == Thread::current? > Ok, I added an assertion that owner should be NULL. Asserting > safepoint-in-progress does not really work because _state could change > to _not_synchronized right after you checked for it in TLBIVM. Okay. >> Nit: >> >> assert(_owner == Thread::current(), "should be equal: owner=" >> INTPTR_FORMAT >> ?????????????????? ", self=" INTPTR_FORMAT, p2i(_owner), >> p2i(Thread::current())); >> >> with Dan's enhanced assertions there's an indentation issue. The >> second line should indent to the first comma, but that will make the >> second line extend way past 80 columns. >> >> Also you could factor that assertion for _owner==Thread::current() >> into its own function or macro to avoid the repetition. > Corrected indentation based on Dan's reply to align with _owner. I though it should indent to the comma because it is a continuation of the same argument being passed to the assert "function". But I'm okay with Dan's suggestion. Factoring it into its own little function or macro would still be good to avoid the repetition. > >> ?OSThreadWaitState osts(self->osthread(), false /* not Object.wait() */); >> >> This needs to be returned to its original place as per Dan's comments. > Done. > >> ??? } else { >> ????? Monitor::lock(self); >> ??? } >> >> You don't need Monitor:: here > Removed. > >> // Temporary JVM_RawMonitor* support. A raw monitor can just be a >> PlatformMonitor now. >> >> This needs to be resolved before committing. 
Some of the existing >> commentary on what raw monitors are needs to be retained. Not clear if >> we need to set the _owner field or can just skip it. > Is it okay if I keep the following comments? > > // Yet another degenerate version of Monitor::lock() or > lock_without_safepoint_check() > // jvm_raw_lock() and _unlock() can be called by non-Java threads via > JVM_RawMonitorEnter. > // > // There's no expectation that JVM_RawMonitors will interoperate > properly with the native > // Mutex-Monitor constructs.? We happen to implement JVM_RawMonitors in > terms of > // native Mutex-Monitors simply as a matter of convenience. Yep that's perfect. And as a future RFE we can replace them with direct use of PlatformMonitor (or PlatformMutex). > > I could keep setting the owner as _owner = Thread::current_or_null() in > jvm_raw_lock(), at least it wouldn't hurt. It's useful for checking usage errors, but we won't have that if we replace with PlatformMonitor, so may as well drop it now IMO. >> Monitor::~Monitor() { >> ? assert(_owner == NULL, "should be NULL: owner=" INTPTR_FORMAT, >> p2i(_owner)); >> } >> >> Will this automatically result in the PlatformMonitor destructor being >> called? > Yes, should I add a comment to make it clear that ~PlatformMonitor() > will be executed? No need - assume other people have a better understanding of C++ than I do :) Thanks, David > > > Thanks for looking into this! Waiting on your comments to send v04. > > Thanks, > Patricio >> --- >> >> Thanks, >> David >> ----- >> > From robbin.ehn at oracle.com Thu Jan 31 07:29:43 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 31 Jan 2019 08:29:43 +0100 Subject: RFR(s): 8218041: Assorted wrong/missing includes In-Reply-To: References: <06885e05-0db7-64fd-1080-af7545c54434@oracle.com> Message-ID: <0f2ef599-96d1-d045-939e-1d4a283c9b1d@oracle.com> Great, thanks! /Robbin On 1/31/19 3:18 AM, David Holmes wrote: > Still good to me too. 
:) > > David > > On 30/01/2019 11:30 pm, Robbin Ehn wrote: >> Hi, here is v3. >> >> http://cr.openjdk.java.net/~rehn/8218041/v03/inc/ >> http://cr.openjdk.java.net/~rehn/8218041/v03/ >> >> Passes same compilations and t1. >> >> /Robbin >> >> On 2019-01-30 09:21, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> Code: >>> http://cr.openjdk.java.net/~rehn/8218041/webrev/ >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8218041 >>> >>> After fixing these includes, there was a circular dependency via shenandoah >>> code. I moved try_cancel_gc to cpp where the only use was. So it never should >>> have been in the inline header in the first place. >>> >>> I listed why the include is needed below. >>> >>> Tier 1 and no pre-compiled. >>> >>> FYI: I was investigating why Handle::Handle(Thread*,oop) was not inlined. >>> gcc complains there being a local comdat symbol, forcing it to be inlined or >>> using clang there is no issue. So it looks like a gcc bug both in 7.3 and 8.2. >>> >>> Thanks, Robbin >>> >>> src/hotspot/share/aot/aotLoader.cpp >>> runtime/os.inline.hpp      for os::dll_unload >>> >>> src/hotspot/share/c1/c1_Runtime1.cpp >>> runtime/handles.inline.hpp for Handle(Thread*, oop) >>> >>> src/hotspot/share/gc/z/zFuture.inline.hpp >>> runtime/interfaceSupport.inline.hpp not used. >>> >>> src/hotspot/share/prims/nativeLookup.cpp >>> runtime/os.inline.hpp      for os::dll_unload >>> >>> src/hotspot/share/runtime/handles.hpp >>> Forward declaration            Thread >>> >>> src/hotspot/share/runtime/handles.inline.hpp >>> runtime/thread.hpp       for Thread::current >>> oops/oop.inline.hpp        for oopDesc::is_a >>> oops/metadata.hpp          for is_valid >>> >>> src/hotspot/share/runtime/semaphore.inline.hpp >>> runtime/thread.hpp         for osthread >>> >>> src/hotspot/share/runtime/vframe.cpp >>> runtime/thread.inline.hpp 
for JavaThread::class_to_be_initialized From per.liden at oracle.com Thu Jan 31 11:44:50 2019 From: per.liden at oracle.com (Per Liden) Date: Thu, 31 Jan 2019 12:44:50 +0100 Subject: RFR: 8210832: Remove sneaky locking in class Monitor In-Reply-To: <4b4b845b-e235-7f42-a933-6a9a73ef798b@oracle.com> References: <7e111d06-a29a-67cc-4967-958da08766a4@oracle.com> <85a81ed8-da43-5ab1-1f62-8be4aca3c54d@oracle.com> <4562855b-fa0e-9f70-da0f-00f84984d78f@oracle.com> <780d1589-12a3-d82d-ba29-58f77e5cc840@oracle.com> <4b4b845b-e235-7f42-a933-6a9a73ef798b@oracle.com> Message-ID: <00f7001f-9c93-8408-58e0-129bffe04f32@oracle.com> On 01/30/2019 03:49 PM, Patricio Chilano wrote: [...] >>>> src/hotspot/os/posix/os_*.[ch]pp >>>> --------------------------------- >>>> * I'd suggest that we place the PlatformMonitor class in a separate >>>> file (like src/hotspot/os/posix/monitor_posix.cpp), just like we >>>> have done with Semaphore (in src/hotspot/os/posix/semaphore_posix.cpp). >>> I tried to moved them but there is a small issue in that >>> PlatformMonitor code needs static methods defined in their current >>> os_*.cpp files (methods that parse timing structs). I can declare >>> them as public (cannot move them since they are also used by >>> PlatformEvent and Parker), but for the Posix version of >>> PlatformMonitor I would also need to do that with _condAttr and >>> _mutexAttr which are also defined static in that file and are needed >>> by PlatformMonitor::PlatformMonitor. So not sure what the right >>> approach is here. >>> In any case shouldn't we aim to have all synchronization-like classes >>> in the same file for each platform (something like syncro_posix, >>> syncro_windows, etc) instead of a separate file for each of them >>> (semaphore_*, monitor_*, waitbarrier_*, etc). Otherwise seems >>> PlatformParker and PlatformEvent should also be in their own file. >> >> Keeping things in separate files can make sense if these things can be >> used standalone. 
A plain mutex (just like the plain semaphore we have) >> can come handy in many places where you just want that mutex, without >> having to drag in other classes or the whole os layer. Keeps >> dependencies under control, reduces compile times, etc. > Ok. If you don't mind then I can do that in the follow up RFE for the > Monitor/Mutex rework after JDK-8217843 since David mentioned is working > on moving things too. Sounds good to me. /Per From shade at redhat.com Thu Jan 31 12:08:27 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 31 Jan 2019 13:08:27 +0100 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> Message-ID: <5a5bc910-54d2-e7de-e50a-895a6b9ec38d@redhat.com> On 1/28/19 5:48 PM, Aleksey Shipilev wrote: > RFE: > https://bugs.openjdk.java.net/browse/JDK-8217879 > > Fix: > http://cr.openjdk.java.net/~shade/8217879/webrev.01/ > JDK-8217994 [1] is in, so we can simplify the safety logic: just print whatever around the PC, and let os::print_hex_dump handle it itself: http://cr.openjdk.java.net/~shade/8217879/webrev.06/ Testing: local build, eyeballing hs_errs, jdk-submit (running) [1] https://bugs.openjdk.java.net/browse/JDK-8217994 Thanks, -Aleksey From shade at redhat.com Thu Jan 31 12:26:02 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 31 Jan 2019 13:26:02 +0100 Subject: RFR (S) 8218140: Build failures after JDK-8218041 (Assorted wrong/missing includes) Message-ID: <6dd0ada4-22bd-69ad-8118-18b805d9ed5f@redhat.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8218140 Fix: http://cr.openjdk.java.net/~shade/8218140/webrev.01/ Testing: Linux {x86_64, aarch64} compilation Maybe some other platforms are failing too? I would be happy to fold their fixes into this patch. So far I see only AArch64 is broken. 
-Aleksey From david.holmes at oracle.com Thu Jan 31 12:32:20 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 31 Jan 2019 22:32:20 +1000 Subject: RFR (S) 8218140: Build failures after JDK-8218041 (Assorted wrong/missing includes) In-Reply-To: <6dd0ada4-22bd-69ad-8118-18b805d9ed5f@redhat.com> References: <6dd0ada4-22bd-69ad-8118-18b805d9ed5f@redhat.com> Message-ID: <4e6056ad-cd9c-166c-3918-26933a5c77bd@oracle.com> cc'ing Robbin. I can understand the Aaarch64 specific file may have an issue but I don't see how we can still have shared files that need changing. ??? David On 31/01/2019 10:26 pm, Aleksey Shipilev wrote: > Bug: > https://bugs.openjdk.java.net/browse/JDK-8218140 > > Fix: > http://cr.openjdk.java.net/~shade/8218140/webrev.01/ > > Testing: Linux {x86_64, aarch64} compilation > > Maybe some other platforms are failing too? I would be happy to fold their fixes into this patch. So > far I see only AArch64 is broken. > > -Aleksey > From shade at redhat.com Thu Jan 31 12:40:24 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 31 Jan 2019 13:40:24 +0100 Subject: RFR (S) 8218140: Build failures after JDK-8218041 (Assorted wrong/missing includes) In-Reply-To: <4e6056ad-cd9c-166c-3918-26933a5c77bd@oracle.com> References: <6dd0ada4-22bd-69ad-8118-18b805d9ed5f@redhat.com> <4e6056ad-cd9c-166c-3918-26933a5c77bd@oracle.com> Message-ID: AArch64 has "special" relationship with thread.inline.hpp -- JavaThread::thread_state() is defined as such: src/hotspot/share/runtime/thread.inline.hpp: #if defined(PPC64) || defined (AARCH64) inline JavaThreadState JavaThread::thread_state() const { return (JavaThreadState) OrderAccess::load_acquire((volatile jint*)&_thread_state); } inline void JavaThread::set_thread_state(JavaThreadState s) { OrderAccess::release_store((volatile jint*)&_thread_state, (jint)s); } #endif Which does break aarch64 every once in a while: https://bugs.openjdk.java.net/browse/JDK-8216591 
https://bugs.openjdk.java.net/browse/JDK-8203278 https://bugs.openjdk.java.net/browse/JDK-8201799 ...and shared files have to include that thread.inline.hpp then. -Aleksey On 1/31/19 1:32 PM, David Holmes wrote: > cc'ing Robbin. > > I can understand the AArch64 specific file may have an issue but I don't see how we can still have > shared files that need changing. > > ??? > > David > > On 31/01/2019 10:26 pm, Aleksey Shipilev wrote: >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8218140 >> >> Fix: >> http://cr.openjdk.java.net/~shade/8218140/webrev.01/ >> >> Testing: Linux {x86_64, aarch64} compilation >> >> Maybe some other platforms are failing too? I would be happy to fold their fixes into this patch. So >> far I see only AArch64 is broken. >> >> -Aleksey >> From robbin.ehn at oracle.com Thu Jan 31 13:09:53 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Thu, 31 Jan 2019 14:09:53 +0100 Subject: RFR (S) 8218140: Build failures after JDK-8218041 (Assorted wrong/missing includes) In-Reply-To: References: <6dd0ada4-22bd-69ad-8118-18b805d9ed5f@redhat.com> <4e6056ad-cd9c-166c-3918-26933a5c77bd@oracle.com> Message-ID: <99330653-3685-e6ae-c092-b732480dc66d@oracle.com> Looks good, thanks. In this case I would not blame 8218041, since these files didn't even have thread.hpp included; the 'bug' is pre-existing.
/Robbin On 1/31/19 1:40 PM, Aleksey Shipilev wrote: > AArch64 has "special" relationship with thread.inline.hpp -- JavaThread::thread_state() is defined > as such: > > src/hotspot/share/runtime/thread.inline.hpp: > > #if defined(PPC64) || defined (AARCH64) > inline JavaThreadState JavaThread::thread_state() const { > return (JavaThreadState) OrderAccess::load_acquire((volatile jint*)&_thread_state); > } > > inline void JavaThread::set_thread_state(JavaThreadState s) { > OrderAccess::release_store((volatile jint*)&_thread_state, (jint)s); > } > #endif > > Which does break aarch64 every once in a while: > https://bugs.openjdk.java.net/browse/JDK-8216591 > https://bugs.openjdk.java.net/browse/JDK-8203278 > https://bugs.openjdk.java.net/browse/JDK-8201799 > > ...and shared files have to include that thread.inline.hpp then. > > -Aleksey > > On 1/31/19 1:32 PM, David Holmes wrote: >> cc'ing Robbin. >> >> I can understand the Aaarch64 specific file may have an issue but I don't see how we can still have >> shared files that need changing. >> >> ??? >> >> David >> >> On 31/01/2019 10:26 pm, Aleksey Shipilev wrote: >>> Bug: >>> ?? https://bugs.openjdk.java.net/browse/JDK-8218140 >>> >>> Fix: >>> ? http://cr.openjdk.java.net/~shade/8218140/webrev.01/ >>> >>> Testing: Linux {x86_64, aarch64} compilation >>> >>> Maybe some other platforms are failing too? I would be happy to fold their fixes into this patch. So >>> far I see only AArch64 is broken. 
>>> >>> -Aleksey >>> > > From david.holmes at oracle.com Thu Jan 31 13:09:08 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 31 Jan 2019 23:09:08 +1000 Subject: RFR (S) 8218140: Build failures after JDK-8218041 (Assorted wrong/missing includes) In-Reply-To: References: <6dd0ada4-22bd-69ad-8118-18b805d9ed5f@redhat.com> <4e6056ad-cd9c-166c-3918-26933a5c77bd@oracle.com> Message-ID: <836ec4cf-58e6-2dc1-4ce5-0fe3e1bbeac7@oracle.com> On 31/01/2019 10:40 pm, Aleksey Shipilev wrote: > AArch64 has "special" relationship with thread.inline.hpp -- JavaThread::thread_state() is defined > as such: > > src/hotspot/share/runtime/thread.inline.hpp: > > #if defined(PPC64) || defined (AARCH64) > inline JavaThreadState JavaThread::thread_state() const { > return (JavaThreadState) OrderAccess::load_acquire((volatile jint*)&_thread_state); > } > > inline void JavaThread::set_thread_state(JavaThreadState s) { > OrderAccess::release_store((volatile jint*)&_thread_state, (jint)s); > } > #endif > > Which does break aarch64 every once in a while: > https://bugs.openjdk.java.net/browse/JDK-8216591 > https://bugs.openjdk.java.net/browse/JDK-8203278 > https://bugs.openjdk.java.net/browse/JDK-8201799 > > ...and shared files have to include that thread.inline.hpp then. Ah I see. That's unfortunate ... I must look into why those functions are needed only for those architectures. So the removal of thread.inline.hpp from handles.inline.hpp caused thread.inline.hpp to be missing from those files. Okay. Reviewed. Thanks, David > -Aleksey > > On 1/31/19 1:32 PM, David Holmes wrote: >> cc'ing Robbin. >> >> I can understand the Aaarch64 specific file may have an issue but I don't see how we can still have >> shared files that need changing. >> >> ??? >> >> David >> >> On 31/01/2019 10:26 pm, Aleksey Shipilev wrote: >>> Bug: >>> ?? https://bugs.openjdk.java.net/browse/JDK-8218140 >>> >>> Fix: >>> ? 
http://cr.openjdk.java.net/~shade/8218140/webrev.01/ >>> >>> Testing: Linux {x86_64, aarch64} compilation >>> >>> Maybe some other platforms are failing too? I would be happy to fold their fixes into this patch. So >>> far I see only AArch64 is broken. >>> >>> -Aleksey >>> > > From stefan.karlsson at oracle.com Thu Jan 31 13:30:21 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 31 Jan 2019 14:30:21 +0100 Subject: RFR (S) 8218140: Build failures after JDK-8218041 (Assorted wrong/missing includes) In-Reply-To: References: <6dd0ada4-22bd-69ad-8118-18b805d9ed5f@redhat.com> <4e6056ad-cd9c-166c-3918-26933a5c77bd@oracle.com> Message-ID: On 2019-01-31 13:40, Aleksey Shipilev wrote: > AArch64 has "special" relationship with thread.inline.hpp -- JavaThread::thread_state() is defined > as such: > > src/hotspot/share/runtime/thread.inline.hpp: > > #if defined(PPC64) || defined (AARCH64) > inline JavaThreadState JavaThread::thread_state() const { > return (JavaThreadState) OrderAccess::load_acquire((volatile jint*)&_thread_state); > } > > inline void JavaThread::set_thread_state(JavaThreadState s) { > OrderAccess::release_store((volatile jint*)&_thread_state, (jint)s); > } > #endif > > Which does break aarch64 every once in a while: > https://bugs.openjdk.java.net/browse/JDK-8216591 > https://bugs.openjdk.java.net/browse/JDK-8203278 > https://bugs.openjdk.java.net/browse/JDK-8201799 > > ...and shared files have to include that thread.inline.hpp then. This is really awkward. Some platforms require thread.hpp while others require thread.inline.hpp. I think we should enforce the same include requirement for all platforms, otherwise these kind of problems will happen again. We "recently" removed the orderAccess_.inline.hpp to allow OrderAcess to be used in headers. Maybe it's time to simply move the code above to thread.hpp? That would remove this source of compile errors. 
Thanks, StefanK > > -Aleksey > > On 1/31/19 1:32 PM, David Holmes wrote: >> cc'ing Robbin. >> >> I can understand the Aaarch64 specific file may have an issue but I don't see how we can still have >> shared files that need changing. >> >> ??? >> >> David >> >> On 31/01/2019 10:26 pm, Aleksey Shipilev wrote: >>> Bug: >>> ?? https://bugs.openjdk.java.net/browse/JDK-8218140 >>> >>> Fix: >>> ? http://cr.openjdk.java.net/~shade/8218140/webrev.01/ >>> >>> Testing: Linux {x86_64, aarch64} compilation >>> >>> Maybe some other platforms are failing too? I would be happy to fold their fixes into this patch. So >>> far I see only AArch64 is broken. >>> >>> -Aleksey >>> > > From stefan.karlsson at oracle.com Thu Jan 31 13:59:40 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 31 Jan 2019 14:59:40 +0100 Subject: RFR (S) 8217879: hs_err should print more instructions in hex dump In-Reply-To: <5a5bc910-54d2-e7de-e50a-895a6b9ec38d@redhat.com> References: <58f93cfa-0c8e-d5cb-c4c4-d52b41e433f0@redhat.com> <5a5bc910-54d2-e7de-e50a-895a6b9ec38d@redhat.com> Message-ID: <6312e66e-2023-99a1-90f7-cf282bc96217@oracle.com> Looks good. Thanks, StefanK On 2019-01-31 13:08, Aleksey Shipilev wrote: > On 1/28/19 5:48 PM, Aleksey Shipilev wrote: >> RFE: >> https://bugs.openjdk.java.net/browse/JDK-8217879 >> >> Fix: >> http://cr.openjdk.java.net/~shade/8217879/webrev.01/ >> > > JDK-8217994 [1] is in, so we can simplify the safety logic: just print whatever around the PC, and > let os::print_hex_dump handle it itself: > http://cr.openjdk.java.net/~shade/8217879/webrev.06/ > > Testing: local build, eyeballing hs_errs, jdk-submit (running) > > [1] https://bugs.openjdk.java.net/browse/JDK-8217994 > > Thanks, > -Aleksey > From nils.eliasson at oracle.com Thu Jan 31 14:31:41 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Thu, 31 Jan 2019 15:31:41 +0100 Subject: SIGSEGV on PhaseIdealLoop::split_up? 
In-Reply-To: References: <48117f7c-77db-452f-cb7b-f5c266fb78cd@oracle.com> <8c07e7f9-2e84-9183-7551-e5148917ef5a@redhat.com> <6ce2d1d2-a3ca-1d8f-56eb-5a76c77754d1@oracle.com> Message-ID: <76169968-8c9c-30d6-c51b-d71cf4093fe9@oracle.com> Hi, I have tried reproducing the crash using the replay_pid27685.log, JDK 11 and solr build from checkout db57468242 together with the appropriate commandline. The java and solr versions match, no profile data is missing, the inlining matches, but the compile completes successfully anyway. // Nils On 2019-01-30 12:11, Dawid Weiss wrote: > Hi Nils, > > Those builds are made straight up from git (from various branches). > For example this failure: > > https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/3472/ > > with these hs_err and replay files: > > https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/3472/artifact/solr/build/solr-core/test/J1/hs_err_pid27685.log > https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/3472/artifact/solr/build/solr-core/test/J1/replay_pid27685.log > > comes from rev db57468242 of git at github.com:apache/lucene-solr.git > (the revision is mentioned on jenkins and in the full log), so: > > git clone git at github.com:apache/lucene-solr.git > cd lucene-solr > git checkout db57468242 > > then you can compile with: > > cd lucene > ant jar > > Dawid > > On Wed, Jan 30, 2019 at 12:03 PM Nils Eliasson wrote: >> Hi, >> >> With the help of the replay-file I manange to compile the right Class, >> but the inlining doesn't match. The latest release i see is 7.6, but it >> looks like the crash is from a 9.0? Is that master? Do you have a link >> to a build that I can download? >> >> Regards, >> >> Nils >> >> On 2019-01-30 11:55, Dawid Weiss wrote: >>>> This is release build, right? fastdebug build probably asserts somewhere? >>> I don't think we (or Uwe) runs jobs with fastdebug builds, to be >>> honest. This isn't a bad idea though. >>> >>>> If Nils is not there, let Uwe find me at FOSDEM? >>> CCing: Uwe. 
>>> >>> D. From matthias.baesken at sap.com Thu Jan 31 14:50:43 2019 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Thu, 31 Jan 2019 14:50:43 +0000 Subject: RFR : 8218136: minor hotspot adjustments for xlclang++ from xlc16 on AIX Message-ID: Please review this small webrev. It contains a few changes for building hotspot on AIX with xlclang++ / xlc16. (Most likely, switching to xlclang++ / xlc16 will be a must once we introduce C++11/14 features.) Some comments on the changes: - porting_aix.cpp: workaround for demangle.h (does not work with xlclang++) - arguments.cpp/hpp: the UNSUPPORTED_OPTION macro led to assigning false to AllocateHeapAt, which is a bad idea (and does not work with xlclang++) - globalDefinitions_xlc.hpp: xlclang++ sets __GNUC__, so we must not have #error ... in this case Bug/webrev: https://bugs.openjdk.java.net/browse/JDK-8218136 http://cr.openjdk.java.net/~mbaesken/webrevs/8218136.0/ Thanks, Matthias From shade at redhat.com Thu Jan 31 15:03:20 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 31 Jan 2019 16:03:20 +0100 Subject: RFR (M) 8213753: SymbolTable is double walked during class unloading and clean up table timing in do_unloading In-Reply-To: <8851bcbe-78be-6c32-130c-a87b556dcaf7@oracle.com> References: <3cbb2030-4c97-6478-3a02-196a7898d69c@oracle.com> <5717ffe1-9cdc-2d8b-5ee7-25c6b810ce8e@redhat.com> <09dff7f0-54b9-1812-09ca-69666a2f2abe@redhat.com> <8851bcbe-78be-6c32-130c-a87b556dcaf7@oracle.com> Message-ID: <295c04b5-1638-06d0-e316-31db5cdca676@redhat.com> On 1/30/19 8:41 PM, coleen.phillimore at oracle.com wrote: > I added the timing for triggering cleanup. It's likely not interesting. The main cleanup time is > code cache cleaning still.
> > This is my fastdebug timing: > [4.992s][debug][gc,phases] GC(2) ClassLoaderData 0.016ms > [4.992s][debug][gc,phases] GC(2) Trigger cleanups 0.010ms > [5.039s][debug][gc,phases] GC(2) Class Unloading 47.290ms > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8213753.02/webrev Looks good. -Aleksey From shade at redhat.com Thu Jan 31 15:23:43 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 31 Jan 2019 16:23:43 +0100 Subject: RFR (S) 8218140: Build failures after JDK-8218041 (Assorted wrong/missing includes) In-Reply-To: References: <6dd0ada4-22bd-69ad-8118-18b805d9ed5f@redhat.com> <4e6056ad-cd9c-166c-3918-26933a5c77bd@oracle.com> Message-ID: On 1/31/19 2:30 PM, Stefan Karlsson wrote: > On 2019-01-31 13:40, Aleksey Shipilev wrote: >> ...and shared files have to include that thread.inline.hpp then. > > This is really awkward. Some platforms require thread.hpp while others require thread.inline.hpp. I > think we should enforce the same include requirement for all platforms, otherwise these kind of > problems will happen again. Yes, it is awkward. > We "recently" removed the orderAccess_.inline.hpp to allow OrderAcess to be used in > headers. Maybe it's time to simply move the code above to thread.hpp? That would remove this source > of compile errors. I am thinking the reverse: push #ifs around the definition into the method body instead. This would ensure we use thread.inline.hpp where it makes sense to. But let's make current repository buildable first, and it would also put thread.inline.hpp includes where appropriate. 
-Aleksey From shade at redhat.com Thu Jan 31 15:34:37 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 31 Jan 2019 16:34:37 +0100 Subject: RFR (S) 8218140: Build failures after JDK-8218041 (Assorted wrong/missing includes) In-Reply-To: <6dd0ada4-22bd-69ad-8118-18b805d9ed5f@redhat.com> References: <6dd0ada4-22bd-69ad-8118-18b805d9ed5f@redhat.com> Message-ID: <402426c8-4ee1-4893-abfe-080bf330342a@redhat.com> On 1/31/19 1:26 PM, Aleksey Shipilev wrote: > Bug: > https://bugs.openjdk.java.net/browse/JDK-8218140 > > Fix: > http://cr.openjdk.java.net/~shade/8218140/webrev.01/ > > Testing: Linux {x86_64, aarch64} compilation > > Maybe some other platforms are failing too? I would be happy to fold their fixes into this patch. So > far I see only AArch64 is broken. Actually, aarch64, arm32, ppc64, s390x are broken. This is the new fix: http://cr.openjdk.java.net/~shade/8218140/webrev.02/ It looks like frame_*.cpp files need to include os.inline.hpp to gain access to os::uses_stack_guard_pages(). Testing: Linux {x86_64, aarch64, arm32, ppc64, s390x} compilation -Aleksey From shade at redhat.com Thu Jan 31 15:38:37 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 31 Jan 2019 16:38:37 +0100 Subject: RFR (S) 8218140: Build failures after JDK-8218041 (Assorted wrong/missing includes) In-Reply-To: References: <6dd0ada4-22bd-69ad-8118-18b805d9ed5f@redhat.com> <4e6056ad-cd9c-166c-3918-26933a5c77bd@oracle.com> Message-ID: <7bc6c87e-30ab-7810-2aee-073ba79815f4@redhat.com> On 1/31/19 4:23 PM, Aleksey Shipilev wrote: > On 1/31/19 2:30 PM, Stefan Karlsson wrote: >> On 2019-01-31 13:40, Aleksey Shipilev wrote: >>> ...and shared files have to include that thread.inline.hpp then. >> >> This is really awkward. Some platforms require thread.hpp while others require thread.inline.hpp. I >> think we should enforce the same include requirement for all platforms, otherwise these kind of >> problems will happen again. > > Yes, it is awkward. 
> >> We "recently" removed the orderAccess_.inline.hpp to allow OrderAcess to be used in >> headers. Maybe it's time to simply move the code above to thread.hpp? That would remove this source >> of compile errors. > > I am thinking the reverse: push #ifs around the definition into the method body instead. This would > ensure we use thread.inline.hpp where it makes sense to. But let's make current repository buildable > first, and it would also put thread.inline.hpp includes where appropriate. https://bugs.openjdk.java.net/browse/JDK-8218151 -Aleksey From stefan.karlsson at oracle.com Thu Jan 31 15:43:02 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 31 Jan 2019 16:43:02 +0100 Subject: RFR (S) 8218140: Build failures after JDK-8218041 (Assorted wrong/missing includes) In-Reply-To: References: <6dd0ada4-22bd-69ad-8118-18b805d9ed5f@redhat.com> <4e6056ad-cd9c-166c-3918-26933a5c77bd@oracle.com> Message-ID: On 2019-01-31 16:23, Aleksey Shipilev wrote: > On 1/31/19 2:30 PM, Stefan Karlsson wrote: >> On 2019-01-31 13:40, Aleksey Shipilev wrote: >>> ...and shared files have to include that thread.inline.hpp then. >> >> This is really awkward. Some platforms require thread.hpp while others require thread.inline.hpp. I >> think we should enforce the same include requirement for all platforms, otherwise these kind of >> problems will happen again. > > Yes, it is awkward. > >> We "recently" removed the orderAccess_.inline.hpp to allow OrderAcess to be used in >> headers. Maybe it's time to simply move the code above to thread.hpp? That would remove this source >> of compile errors. > > I am thinking the reverse: push #ifs around the definition into the method body instead. This would > ensure we use thread.inline.hpp where it makes sense to. Not sure what you mean. Maybe we're saying the same thing. 
My proposal is this (untested): http://cr.openjdk.java.net/~stefank/8218140/webrev.alt.01/ > But let's make current repository buildable > first, and it would also put thread.inline.hpp includes where appropriate. Sure. StefanK > > -Aleksey > > From coleen.phillimore at oracle.com Thu Jan 31 15:56:32 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Thu, 31 Jan 2019 10:56:32 -0500 Subject: RFR (M) 8213753: SymbolTable is double walked during class unloading and clean up table timing in do_unloading In-Reply-To: <295c04b5-1638-06d0-e316-31db5cdca676@redhat.com> References: <3cbb2030-4c97-6478-3a02-196a7898d69c@oracle.com> <5717ffe1-9cdc-2d8b-5ee7-25c6b810ce8e@redhat.com> <09dff7f0-54b9-1812-09ca-69666a2f2abe@redhat.com> <8851bcbe-78be-6c32-130c-a87b556dcaf7@oracle.com> <295c04b5-1638-06d0-e316-31db5cdca676@redhat.com> Message-ID: Thanks for the code review! Coleen On 1/31/19 10:03 AM, Aleksey Shipilev wrote: > On 1/30/19 8:41 PM, coleen.phillimore at oracle.com wrote: >> I added the timing for triggering cleanup. It's likely not interesting. The main cleanup time is >> code cache cleaning still. >> >> This is my fastdebug timing: >> [4.992s][debug][gc,phases] GC(2) ClassLoaderData 0.016ms >> [4.992s][debug][gc,phases] GC(2) Trigger cleanups 0.010ms >> [5.039s][debug][gc,phases] GC(2) Class Unloading 47.290ms >> >> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8213753.02/webrev > Looks good.
> > -Aleksey > From shade at redhat.com Thu Jan 31 15:56:00 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 31 Jan 2019 16:56:00 +0100 Subject: RFR (S) 8218140: Build failures after JDK-8218041 (Assorted wrong/missing includes) In-Reply-To: References: <6dd0ada4-22bd-69ad-8118-18b805d9ed5f@redhat.com> <4e6056ad-cd9c-166c-3918-26933a5c77bd@oracle.com> Message-ID: <34c9dd07-9d7e-ba6d-8bc7-e83c199da6e0@redhat.com> On 1/31/19 4:43 PM, Stefan Karlsson wrote: >>> We "recently" removed the orderAccess_.inline.hpp to allow OrderAcess to be used in >>> headers. Maybe it's time to simply move the code above to thread.hpp? That would remove this source >>> of compile errors. >> >> I am thinking the reverse: push #ifs around the definition into the method body instead. This would >> ensure we use thread.inline.hpp where it makes sense to. > > Not sure what you mean. Maybe we're saying the same thing. My proposal is this (untested): > http://cr.openjdk.java.net/~stefank/8218140/webrev.alt.01/ Yes, the same thing, but in reverse :) Not a big fan of having non-trivial declarations in the header. See the patch here: https://bugs.openjdk.java.net/browse/JDK-8218151 This naturally builds up on having thread.inline.hpp included where needed by this build fix. 
-Aleksey From stefan.karlsson at oracle.com Thu Jan 31 16:32:43 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 31 Jan 2019 17:32:43 +0100 Subject: RFR (S) 8218140: Build failures after JDK-8218041 (Assorted wrong/missing includes) In-Reply-To: <34c9dd07-9d7e-ba6d-8bc7-e83c199da6e0@redhat.com> References: <6dd0ada4-22bd-69ad-8118-18b805d9ed5f@redhat.com> <4e6056ad-cd9c-166c-3918-26933a5c77bd@oracle.com> <34c9dd07-9d7e-ba6d-8bc7-e83c199da6e0@redhat.com> Message-ID: <92ed8128-34c1-9255-b9fa-0d8cd8514f8b@oracle.com> On 2019-01-31 16:56, Aleksey Shipilev wrote: > On 1/31/19 4:43 PM, Stefan Karlsson wrote: >>>> We "recently" removed the orderAccess_.inline.hpp to allow OrderAcess to be used in >>>> headers. Maybe it's time to simply move the code above to thread.hpp? That would remove this source >>>> of compile errors. >>> I am thinking the reverse: push #ifs around the definition into the method body instead. This would >>> ensure we use thread.inline.hpp where it makes sense to. >> Not sure what you mean. Maybe we're saying the same thing. My proposal is this (untested): >> http://cr.openjdk.java.net/~stefank/8218140/webrev.alt.01/ > Yes, the same thing, but in reverse :) Not a big fan of having non-trivial declarations in the > header. See the patch here: https://bugs.openjdk.java.net/browse/JDK-8218151 > > This naturally builds up on having thread.inline.hpp included where needed by this build fix. Yes, this is cleaner. StefanK > -Aleksey > From stefan.karlsson at oracle.com Thu Jan 31 16:35:04 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 31 Jan 2019 17:35:04 +0100 Subject: RFR (S) 8218140: Build failures after JDK-8218041 (Assorted wrong/missing includes) In-Reply-To: <402426c8-4ee1-4893-abfe-080bf330342a@redhat.com> References: <6dd0ada4-22bd-69ad-8118-18b805d9ed5f@redhat.com> <402426c8-4ee1-4893-abfe-080bf330342a@redhat.com> Message-ID: Looks good. 
StefanK On 2019-01-31 16:34, Aleksey Shipilev wrote: > On 1/31/19 1:26 PM, Aleksey Shipilev wrote: >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8218140 >> >> Fix: >> http://cr.openjdk.java.net/~shade/8218140/webrev.01/ >> >> Testing: Linux {x86_64, aarch64} compilation >> >> Maybe some other platforms are failing too? I would be happy to fold their fixes into this patch. So >> far I see only AArch64 is broken. > Actually, aarch64, arm32, ppc64, s390x are broken. This is the new fix: > http://cr.openjdk.java.net/~shade/8218140/webrev.02/ > > It looks like frame_*.cpp files need to include os.inline.hpp to gain access to > os::uses_stack_guard_pages(). > > Testing: Linux {x86_64, aarch64, arm32, ppc64, s390x} compilation > > -Aleksey > From shade at redhat.com Thu Jan 31 17:11:31 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 31 Jan 2019 18:11:31 +0100 Subject: RFR (S) 8218140: Build failures after JDK-8218041 (Assorted wrong/missing includes) In-Reply-To: References: <6dd0ada4-22bd-69ad-8118-18b805d9ed5f@redhat.com> <402426c8-4ee1-4893-abfe-080bf330342a@redhat.com> Message-ID: <6bdb5705-6ac0-c0b7-3b0d-480abb5f7dc3@redhat.com> Thanks! No further comments from, say, SAP folks? I am going to push this to unbreak jdk/jdk then. -Aleksey On 1/31/19 5:35 PM, Stefan Karlsson wrote: > Looks good. > > StefanK > > On 2019-01-31 16:34, Aleksey Shipilev wrote: >> On 1/31/19 1:26 PM, Aleksey Shipilev wrote: >> Actually, aarch64, arm32, ppc64, s390x are broken. This is the new fix: >> http://cr.openjdk.java.net/~shade/8218140/webrev.02/ >> >> It looks like frame_*.cpp files need to include os.inline.hpp to gain access to >> os::uses_stack_guard_pages(). >> >> Testing: Linux {x86_64, aarch64, arm32, ppc64, s390x} compilation >> >> -Aleksey From dawid.weiss at gmail.com Thu Jan 31 17:27:52 2019 From: dawid.weiss at gmail.com (Dawid Weiss) Date: Thu, 31 Jan 2019 18:27:52 +0100 Subject: SIGSEGV on PhaseIdealLoop::split_up?
In-Reply-To: <76169968-8c9c-30d6-c51b-d71cf4093fe9@oracle.com> References: <48117f7c-77db-452f-cb7b-f5c266fb78cd@oracle.com> <8c07e7f9-2e84-9183-7551-e5148917ef5a@redhat.com> <6ce2d1d2-a3ca-1d8f-56eb-5a76c77754d1@oracle.com> <76169968-8c9c-30d6-c51b-d71cf4093fe9@oracle.com> Message-ID: Thanks Nils. We'll keep monitoring. Maybe Uwe can add a fastbuild version to the mix and something will pop up. This has been a recurring issue, although very infrequently. At first I thought it was a random bit flip, but there has to be something to it as it does happen from time to time. Dawid On Thu, Jan 31, 2019 at 3:40 PM Nils Eliasson wrote: > > Hi, > > I have tried reproducing the crash using the replay_pid27685.log, JDK 11 > and solr build from checkout db57468242 together with the appropriate > commandline. The java and solr versions match, no profile data is > missing, the inlining matches, but the compile completes successfully > anyway. > > // Nils > > On 2019-01-30 12:11, Dawid Weiss wrote: > > Hi Nils, > > > > Those builds are made straight up from git (from various branches). > > For example this failure: > > > > https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/3472/ > > > > with these hs_err and replay files: > > > > https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/3472/artifact/solr/build/solr-core/test/J1/hs_err_pid27685.log > > https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/3472/artifact/solr/build/solr-core/test/J1/replay_pid27685.log > > > > comes from rev db57468242 of git at github.com:apache/lucene-solr.git > > (the revision is mentioned on jenkins and in the full log), so: > > > > git clone git at github.com:apache/lucene-solr.git > > cd lucene-solr > > git checkout db57468242 > > > > then you can compile with: > > > > cd lucene > > ant jar > > > > Dawid > > > > On Wed, Jan 30, 2019 at 12:03 PM Nils Eliasson wrote: > >> Hi, > >> > >> With the help of the replay-file I managed to compile the right Class, > >> but the inlining doesn't match.
The latest release i see is 7.6, but it > >> looks like the crash is from a 9.0? Is that master? Do you have a link > >> to a build that I can download? > >> > >> Regards, > >> > >> Nils > >> > >> On 2019-01-30 11:55, Dawid Weiss wrote: > >>>> This is release build, right? fastdebug build probably asserts somewhere? > >>> I don't think we (or Uwe) runs jobs with fastdebug builds, to be > >>> honest. This isn't a bad idea though. > >>> > >>>> If Nils is not there, let Uwe find me at FOSDEM? > >>> CCing: Uwe. > >>> > >>> D. From shade at redhat.com Thu Jan 31 18:34:23 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 31 Jan 2019 19:34:23 +0100 Subject: RFR (S) 8218140: Build failures after JDK-8218041 (Assorted wrong/missing includes) In-Reply-To: <6bdb5705-6ac0-c0b7-3b0d-480abb5f7dc3@redhat.com> References: <6dd0ada4-22bd-69ad-8118-18b805d9ed5f@redhat.com> <402426c8-4ee1-4893-abfe-080bf330342a@redhat.com> <6bdb5705-6ac0-c0b7-3b0d-480abb5f7dc3@redhat.com> Message-ID: On 1/31/19 6:11 PM, Aleksey Shipilev wrote: > I am going to push this to unbreak jdk/jdk then. Pushed. -Aleksey From shade at redhat.com Thu Jan 31 21:13:55 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 31 Jan 2019 22:13:55 +0100 Subject: RFR (S) 8218151: Simplify JavaThread::thread_state definition Message-ID: RFE: https://bugs.openjdk.java.net/browse/JDK-8218151 Fix: http://cr.openjdk.java.net/~shade/8218151/webrev.01/ The difference in definitions in JavaThread::thread_state in AArch64 and PPC64 is the source of frequent build failures when someone tests only x86_64, see the linked issues for a taste. The way out of this is to push the #if-s inside the method body, so that definition is always in one place, and would not accidentally break the build. 
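For readers without the webrev at hand, the idea of pushing the #if-s inside the method body can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual patch: SketchThread, SketchState and the use of std::atomic are hypothetical stand-ins for JavaThread, JavaThreadState and OrderAccess, and the #else branch is a placeholder for whatever the plain accessor does on the other platforms. The point is that the declaration and the single definition are visible on every platform, so a missing include shows up in an x86_64-only build as well.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for JavaThreadState; names and values are illustrative only.
enum SketchState : int32_t { state_new = 2, state_in_native = 4, state_in_vm = 6 };

// Hypothetical stand-in for JavaThread, reduced to the state field.
class SketchThread {
  std::atomic<int32_t> _state{state_new};
 public:
  // Declared unconditionally: every platform sees the same declarations.
  inline SketchState thread_state() const;
  inline void set_thread_state(SketchState s);
};

// One definition for all platforms; only the memory ordering is conditional.
inline SketchState SketchThread::thread_state() const {
#if defined(PPC64) || defined(AARCH64)
  // Weakly ordered platforms: acquire load (assumption: mirrors load_acquire).
  return static_cast<SketchState>(_state.load(std::memory_order_acquire));
#else
  // Placeholder for the plain accessor used elsewhere.
  return static_cast<SketchState>(_state.load(std::memory_order_relaxed));
#endif
}

inline void SketchThread::set_thread_state(SketchState s) {
#if defined(PPC64) || defined(AARCH64)
  // Weakly ordered platforms: release store (assumption: mirrors release_store).
  _state.store(static_cast<int32_t>(s), std::memory_order_release);
#else
  _state.store(static_cast<int32_t>(s), std::memory_order_relaxed);
#endif
}
```

With this shape, a file that calls thread_state() needs the same include on every platform, which is the property the RFE is after.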
Testing: Linux {x86_64, x86_32, aarch64, arm32, ppc64el, s390x} builds, Mac OS X {x86_64} build, Windows {x86_64} build, jdk-submit (failed windows test and macos build, but I suspect there are infra problems, will wait and re-run; see for example mach5-one-shade-JDK-8218151-20190131-1631-233989). Thanks, -Aleksey From stefan.karlsson at oracle.com Thu Jan 31 21:28:04 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 31 Jan 2019 22:28:04 +0100 Subject: RFR (S) 8218151: Simplify JavaThread::thread_state definition In-Reply-To: References: Message-ID: Looks good. Thanks for fixing this! StefanK On 2019-01-31 22:13, Aleksey Shipilev wrote: > RFE: > https://bugs.openjdk.java.net/browse/JDK-8218151 > > Fix: > http://cr.openjdk.java.net/~shade/8218151/webrev.01/ > > The difference in definitions in JavaThread::thread_state in AArch64 and PPC64 is the source of > frequent build failures when someone tests only x86_64, see the linked issues for a taste. The way > out of this is to push the #if-s inside the method body, so that definition is always in one place, > and would not accidentally break the build. > > Testing: Linux {x86_64, x86_32, aarch64, arm32, ppc64el, s390x} builds, Mac OS X {x86_64} build, > Windows {x86_64} build, jdk-submit (failed windows test and macos build, but I suspect there are > infra problems, will wait and re-run; see for example mach5-one-shade-JDK-8218151-20190131-1631-233989). > > Thanks, > -Aleksey > From igor.ignatyev at oracle.com Thu Jan 31 21:42:53 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Thu, 31 Jan 2019 13:42:53 -0800 Subject: RFR(T)[12] : 8218168 : clean up hotspot ProblemList Message-ID: http://cr.openjdk.java.net/~iignatyev//8218168/webrev.00/index.html > 10 lines changed: 0 ins; 0 del; 10 mod; Hi all, JDK-8208255 and JDK-8208235 got closed as a duplicate of JDK-8058176 but still referenced in the problem list, this trivial patch is to fix that. 
webrev: http://cr.openjdk.java.net/~iignatyev//8218168/webrev.00/index.html JBS: https://bugs.openjdk.java.net/browse/JDK-8218168 Thanks, -- Igor