From christoph.langer at sap.com Thu Aug 1 02:54:35 2019
From: christoph.langer at sap.com (Langer, Christoph)
Date: Thu, 1 Aug 2019 02:54:35 +0000
Subject: [11u] 8205654: serviceability/dcmd/framework/HelpTest.java timed out
In-Reply-To: <8B42E32B-52A4-4391-A13C-8D20BCEC3295@oracle.com>
References: <534514DB-A591-4DC9-AE45-2C8F2BE10186@oracle.com>
 <8B42E32B-52A4-4391-A13C-8D20BCEC3295@oracle.com>
Message-ID:

Thank you, Daniil and Serguei.

I'll look into backporting the 2 suggested items and will try to push them in one go.

Best
Christoph

> -----Original Message-----
> From: Daniil Titov
> Sent: Wednesday, 31 July 2019 11:55
> To: serguei.spitsyn at oracle.com; Langer, Christoph; jdk-updates-dev at openjdk.java.net
> Cc: OpenJDK Serviceability
> Subject: Re: [11u] 8205654: serviceability/dcmd/framework/HelpTest.java timed out
>
> I think either way is fine, but if backporting [1] and [2] separately, we need to
> ensure that they will also be approved for 11u. Currently neither [1] nor [2]
> has the jdk11u-fix-request and jdk11u-fix-yes labels.
>
> [1]: JDK-8225543 - https://bugs.openjdk.java.net/browse/JDK-8225543
> [2]: JDK-8221730 - https://bugs.openjdk.java.net/browse/JDK-8221730
>
> Best regards,
> Daniil
>
> On 7/31/19, 10:42 AM, "serguei.spitsyn at oracle.com" wrote:
>
>     On 7/31/19 10:32 AM, Daniil Titov wrote:
>     > Hi Christoph,
>     >
>     > There were several issues that the original change introduced. These issues were
>     > solved in [1] and [2] and they need to be included in the backport.
>
>     You probably wanted to say that 8225543 and 8221730 have to be backported as well,
>     but not in the same backport.
>
>     Thanks,
>     Serguei
>
>     > [1]: JDK-8225543 - https://bugs.openjdk.java.net/browse/JDK-8225543
>     > [2]: JDK-8221730 - https://bugs.openjdk.java.net/browse/JDK-8221730
>     >
>     > Thanks,
>     > Daniil
>     >
>     > On 7/31/19, 9:41 AM, "serguei.spitsyn at oracle.com" wrote:
>     >
>     >     On 7/23/19 07:58, Langer, Christoph wrote:
>     >     > Hi,
>     >     >
>     >     > please review the backport of "8205654: serviceability/dcmd/framework/HelpTest.java
>     >     > timed out" to OpenJDK 11u. We're seeing the mentioned test issue intermittently
>     >     > in our nightlies.
>     >     >
>     >     > Bug: https://bugs.openjdk.java.net/browse/JDK-8205654
>     >     > Webrev: http://cr.openjdk.java.net/~clanger/webrevs/8205654.11u-dev.0/
>     >     > Original Change: http://hg.openjdk.java.net/jdk/jdk/rev/67537bbafd7f
>     >     > Original review discussion: https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-February/026883.html
>     >     >
>     >     > The change improves how jcmd discovers running Java processes on Linux. It will
>     >     > first try to use the proc file system to get the necessary information before
>     >     > trying to attach via the attach framework.
>     >     >
>     >     > The original patch needs 2 modifications:
>     >     > 1. In src/jdk.jcmd/share/classes/sun/tools/common/ProcessArgumentMatcher.java
>     >     >    I had to add the fix of JDK-8218705. The bug is unfortunately not visible,
>     >     >    but it seems something was forgotten in the original commit. The change for
>     >     >    JDK-8218705 is: http://hg.openjdk.java.net/jdk/jdk/rev/50c1b0a0f1e8
>     >     > 2. In test/lib/jdk/test/lib/util/JarUtils.java I had to take over some upstream
>     >     >    code from jdk/jdk to provide the JarUtils support that is needed by the new
>     >     >    test case "TestProcessHelper".
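To illustrate the idea described in the quoted review request (consult the proc file system first, and only then fall back to attaching), here is a minimal sketch. It is not the code from the 8205654 change: the class name `ProcCmdline`, the method `readCmdline` and the `procRoot` parameter are hypothetical, and the only assumption about Linux is that `/proc/<pid>/cmdline` holds the NUL-separated command line.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class ProcCmdline {
    /**
     * Reads <procRoot>/<pid>/cmdline and splits it on NUL bytes.
     * Returns an empty list if the file is missing or unreadable,
     * so that a caller can fall back to another mechanism.
     */
    static List<String> readCmdline(Path procRoot, long pid) {
        Path cmdline = procRoot.resolve(Long.toString(pid)).resolve("cmdline");
        byte[] raw;
        try {
            raw = Files.readAllBytes(cmdline);
        } catch (IOException e) {
            return new ArrayList<>();
        }
        List<String> arguments = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < raw.length; i++) {
            if (raw[i] == 0) {
                // One argument ends at each NUL byte.
                arguments.add(new String(raw, start, i - start, StandardCharsets.UTF_8));
                start = i + 1;
            }
        }
        if (start < raw.length) {
            // Tolerate a final argument without a trailing NUL.
            arguments.add(new String(raw, start, raw.length - start, StandardCharsets.UTF_8));
        }
        return arguments;
    }
}
```

On a Linux machine one would pass `Paths.get("/proc")` as `procRoot`; an empty result is the signal to try the attach framework instead.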
>     >     >
>     >     > Thanks
>     >     > Christoph

From david.holmes at oracle.com Thu Aug 1 06:50:46 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 1 Aug 2019 16:50:46 +1000
Subject: RFR: 8170299: Debugger does not stop inside the low memory notifications code
In-Reply-To:
References: <75B5F778-DC49-494B-AC12-270F301677CA@oracle.com>
 <60639d41-735a-00d3-c9db-1955f581b89a@oracle.com>
Message-ID: <9783ca89-0af8-2167-436a-e5ff2db631a3@oracle.com>

Hi Daniil,

On 25/07/2019 3:34 am, Daniil Titov wrote:
> Hi David,
>
> Hope you had a great vacation!

I did, thank you. Apologies again for taking so long to get back to this work.

> Please find below the latest version of the change. The only difference from version 01 is
> the corrected ordering of include statements, as Serguei suggested.
>
> Webrev: https://cr.openjdk.java.net/~dtitov/8170299/webrev.02/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8170299

I remain concerned about introducing yet another thread to the system. The
potential interactions with other threads are not at all clear. I'm also
concerned that this thread has to be visible so that you can debug the
notification code, yet at the same time being visible makes it vulnerable
to application-level actions that don't impact the service thread. In
particular, if we suspend all threads then this thread will be suspended
too; if we then resume a thread that triggers a notification, the
notification thread won't be able to respond to it because it is still
suspended. The user won't know that they need to explicitly resume this
internal system thread.

Also note in serviceThread.cpp we have:

    // This ThreadBlockInVM object is not also considered to be
    // suspend-equivalent because ServiceThread is not visible to
    // external suspension.

    ThreadBlockInVM tbivm(jt);

and you copied that across to notificationThread.cpp as:

    // Need state transition ThreadBlockInVM so that this thread
    // will be handled by safepoint correctly when this thread is
    // notified at a safepoint.

    ThreadBlockInVM tbivm(jt);

so this will continue to not be a suspend-equivalent condition even though
this thread is visible and suspendible! So something seems wrong there.

It is unclear to me why we need the ThreadBlockInVM instead of defining the
NotificationLock as a safepoint-check-always lock rather than a
safepoint-check-never lock. In fact, with some recent changes to locks I'm
not even sure it is legal for the notification thread to use a
safepoint-check-never lock - have you re-based this recently?

Thanks,
David

> Thanks!
> --Daniil
>
> On 7/3/19, 11:47 PM, "David Holmes" wrote:
>
>     Hi Daniil,
>
>     On 4/07/2019 1:04 pm, Daniil Titov wrote:
>     > Please review the change that fixes the problem with the debugger not stopping in the low memory notification code.
>     >
>     > The problem here is that the ServiceThread that calls these MXBean listeners is hidden from the external view, which prevents the debugger from stopping in it.
>     >
>     > The fix introduces a new NotificationThread that is visible to the external view and offloads from the ServiceThread the sending of low memory and other notifications that could result in Java calls (GC and diagnostic command notifications) by moving these activities into this new NotificationThread.
>
>     There is a long and unfortunate history with this bug.
>
>     The original incarnation of this fix was introducing a new thread at the
>     Java library level, and I had some concerns about that:
>
>     http://mail.openjdk.java.net/pipermail/serviceability-dev/2017-December/022612.html
>
>     That effort was resurrected at:
>
>     http://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024466.html
>
>     and
>
>     http://mail.openjdk.java.net/pipermail/serviceability-dev/2018-August/024849.html
>
>     but was left somewhat in limbo. There was a lot of doubt about the right
>     way to fix this bug and whether introducing a new thread was too disruptive.
>
>     But introducing a new thread in the VM also has the same set of
>     concerns! This needs consideration by the runtime team before going
>     ahead. Introducing a new thread like this needs to be examined in
>     detail - particularly the synchronization interactions with other
>     threads. It also introduces another monitor designated safepoint-never
>     at a time when we are in the process of cleaning up monitors so that
>     JavaThreads will only use safepoint-check-always monitors.
>
>     Unfortunately I'm about to head out for two weeks vacation, and a number
>     of other key runtime folk are also on vacation, but I'd ask that you
>     hold off on this until we can look at it in more detail.
>
>     Thanks,
>     David
>     -----
>
>     > Testing: Mach5 tier1, tier2 and tier3 tests succeeded.
>     >
>     > Webrev: https://cr.openjdk.java.net/~dtitov/8170299/webrev.01/
>     > Bug: https://bugs.openjdk.java.net/browse/JDK-8170299
>     >
>     > Thanks!
>     > --Daniil

From matthias.baesken at sap.com Thu Aug 1 07:13:08 2019
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Thu, 1 Aug 2019 07:13:08 +0000
Subject: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast Linux machines with Total safepoint time 0 ms
In-Reply-To: <9868e92d-398b-be7c-5d45-020c19a61052@oracle.com>
References: <82938477-ce6d-fdeb-02ab-60809541d9e4@oracle.com>
 <1e303f06-3933-ba73-f34b-081b827a725d@oracle.com>
 <470c58c5-c364-aca0-62fe-ef469c5cb390@oracle.com>
 <9868e92d-398b-be7c-5d45-020c19a61052@oracle.com>
Message-ID:

Hi David + JC, thanks for the reviews.

David - I added the suggested print outputs, and also the parameter to executeThreadDumps.

Best regards, Matthias

> -----Original Message-----
> From: David Holmes
> Sent: Wednesday, 31 July 2019 23:57
> To: Baesken, Matthias; Jean Christophe Beyler
> Cc: hotspot-dev at openjdk.java.net; serviceability-dev
> Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast
> Linux machines with Total safepoint time 0 ms
>
> On 1/08/2019 12:01 am, Baesken, Matthias wrote:
> >
> > Hi upload works again, now with webrev:
> >
> > http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.2/
>
> Could you please add, for diagnostic purposes:
>
> System.out.println("Total safepoint time (ms): " + value);
>
> after:
>
> long value = executeThreadDumps();
>
> and
>
> long value2 = executeThreadDumps();
>
> that way if the test fails we can check logs to see what kind of
> safepoint times have been observed previously. No need to see an updated
> webrev just for that.
>
> I have one further suggestion, take it or leave it, that
> executeThreadDumps() takes a parameter to specify the initial value, so
> we'd have:
>
> long value = executeThreadDumps(0);
>
> and
>
> long value2 = executeThreadDumps(value);
>
> This might help detect getTotalSafepointTime() going backwards slightly
> better than current code.
>
> Thanks,
> David
>
> > Best regards, Matthias
> >
> >
> >> -----Original Message-----
> >> From: Baesken, Matthias
> >> Sent: Wednesday, 31 July 2019 14:05
> >> To: 'David Holmes'; Jean Christophe Beyler
> >> Cc: hotspot-dev at openjdk.java.net; serviceability-dev
> >> Subject: RE: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast
> >> Linux machines with Total safepoint time 0 ms
> >>
> >> Hello, here is a version following the latest proposal of JC.
> >>
> >> Unfortunately attached as patch, sorry for that - the uploads / pushes
> >> currently do not work from here.
> >>
> >> Best regards, Matthias
> >>
> >>
> >>> -----Original Message-----
> >>> From: David Holmes
> >>> Sent: Wednesday, 31 July 2019 05:04
> >>> To: Jean Christophe Beyler
> >>> Cc: Baesken, Matthias; hotspot-dev at openjdk.java.net; serviceability-dev
> >>> Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast
> >>> Linux machines with Total safepoint time 0 ms
> >>>
> >>> On 31/07/2019 9:08 am, Jean Christophe Beyler wrote:
> >>>> FWIW, I would have done something like what David was suggesting, just
> >>>> slightly tweaked:
> >>>>
> >>>> public static long executeThreadDumps() {
> >>>>   long value;
> >>>>   long initial_value = mbean.getTotalSafepointTime();
> >>>>   do {
> >>>>       Thread.getAllStackTraces();
> >>>>       value = mbean.getTotalSafepointTime();
> >>>>   } while (value == initial_value);
> >>>>   return value;
> >>>> }
> >>>>
> >>>> This ensures that the value is a new value as opposed to the current
> >>>> value and if something goes wrong, as David said, it will timeout; which
> >>>> is ok.
> >>>
> >>> Works for me.
> >>>
> >>>> But I come back to not really understanding why we are doing this at
> >>>> this point of relaxing (just get a new value of safepoint time).
> >>>> Because, if we accept timeouts now as a failure here, then really the
> >>>> whole test becomes:
> >>>>
> >>>> executeThreadDumps();
> >>>> executeThreadDumps();
> >>>>
> >>>> Since the first call will return when value > 0 and the second call will
> >>>> return when value2 > value (I still wonder why we want to ensure it
> >>>> works twice...).
> >>>
> >>> The test is trying to sanity check that we are actually recording the
> >>> time used by safepoints. So first check is that we can get a non-zero
> >>> value; second check is we get a greater non-zero value. It's just a
> >>> sanity test to try and catch if something gets unexpectedly broken in
> >>> the time tracking code.
> >>>
> >>>> So both failures and even testing for it is kind of redundant, once you
> >>>> have a do/while until a change?
> >>>
> >>> Yes - the problem with the tests that try to check internal VM behaviour
> >>> is that we have no specified way to do something, in this case execute
> >>> safepoints, that relates to internal VM behaviour, so we have to do
> >>> something we know will currently work even if not specified to do so -
> >>> e.g. dumping all thread stacks uses a global safepoint. The second
> >>> problem is that the timer granularity is so coarse that we then have to
> >>> guess how many times we need to do that something before seeing a
> >>> change. To make the test robust we can keep doing stuff until we see a
> >>> change and so the only way that will fail is if the overall timeout of
> >>> the test kicks in. Or we can try and second guess how long it should
> >>> take by introducing our own internal timeout - either directly or by
> >>> limiting the number of loops in this case. That has its own problems and
> >>> in general we have tried to reduce internal test timeouts (by removing
> >>> them) and let overall timeouts take charge.
> >>>
> >>> No ideal solution. And this has already consumed way too much of
> >>> everyone's time.
> >>>
> >>> Cheers,
> >>> David
> >>>
> >>>> Thanks,
> >>>> Jc
> >>>>
> >>>>
> >>>> On Tue, Jul 30, 2019 at 2:35 PM David Holmes wrote:
> >>>>
> >>>>     On 30/07/2019 10:39 pm, Baesken, Matthias wrote:
> >>>>      > Hi David, "put that whole code (the while loop) in a helper
> >>>>     method." was JC's idea, and I like the idea.
> >>>>
> >>>>     Regardless I think the way you are using NUM_THREAD_DUMPS is really
> >>>>     confusing. As an all-caps static you'd expect it to be a constant.
> >>>>
> >>>>     Thanks,
> >>>>     David
> >>>>
> >>>>      > Let's see what others think.
> >>>>      >
> >>>>      >>
> >>>>      >> Overall tests like this are not very useful, yet very fragile.
> >>>>      >>
> >>>>      >
> >>>>      > I am also fine with putting the test on the exclude list.
> >>>>      >
> >>>>      > Best regards, Matthias
> >>>>      >
> >>>>      >
> >>>>      >> -----Original Message-----
> >>>>      >> From: David Holmes
> >>>>      >> Sent: Tuesday, 30 July 2019 14:12
> >>>>      >> To: Baesken, Matthias; Jean Christophe Beyler
> >>>>      >> Cc: hotspot-dev at openjdk.java.net; serviceability-dev
> >>>>      >> Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast
> >>>>      >> Linux machines with Total safepoint time 0 ms
> >>>>      >>
> >>>>      >> Hi Matthias,
> >>>>      >>
> >>>>      >> On 30/07/2019 9:25 pm, Baesken, Matthias wrote:
> >>>>      >>> Hello JC / David, here is a second webrev:
> >>>>      >>>
> >>>>      >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.1/
> >>>>      >>>
> >>>>      >>> It moves the thread dump execution into a method
> >>>>      >>> executeThreadDumps(long), and also adds while loops (but with a
> >>>>      >>> limitation for the number of thread dumps, really don't
> >>>>      >>> want to cause timeouts etc.). I removed a check for
> >>>>      >>> MAX_VALUE_FOR_PASS because we cannot go over Long.MAX_VALUE.
> >>>>      >>
> >>>>      >> I don't think executeThreadDumps is worth factoring out like that.
> >>>>      >>
> >>>>      >> The handling of NUM_THREAD_DUMPS is a bit confusing. I'd rather it
> >>>>      >> remains a constant 100, and then you set a simple loop iteration count
> >>>>      >> limit. Further with the proposed code when you get here:
> >>>>      >>
> >>>>      >>     NUM_THREAD_DUMPS = NUM_THREAD_DUMPS * 2;
> >>>>      >>
> >>>>      >> you don't even know what value you may be starting with.
> >>>>      >>
> >>>>      >> But I was thinking of simply:
> >>>>      >>
> >>>>      >> long value = 0;
> >>>>      >> do {
> >>>>      >>     Thread.getAllStackTraces();
> >>>>      >>     value = mbean.getTotalSafepointTime();
> >>>>      >> } while (value == 0);
> >>>>      >>
> >>>>      >> We'd only hit a timeout if something is completely broken -
> >>>>      >> which is fine.
> >>>>      >>
> >>>>      >> Overall tests like this are not very useful, yet very fragile.
> >>>>      >>
> >>>>      >> Thanks,
> >>>>      >> David
> >>>>      >>
> >>>>      >>> Hope you like this version better.
> >>>>      >>>
> >>>>      >>> Best regards, Matthias
> >>>>      >>>
> >>>>      >>> From: Jean Christophe Beyler
> >>>>      >>> Sent: Tuesday, 30 July 2019 05:39
> >>>>      >>> To: David Holmes
> >>>>      >>> Cc: Baesken, Matthias; hotspot-dev at openjdk.java.net; serviceability-dev
> >>>>      >>> Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails
> >>>>      >>> on fast Linux machines with Total safepoint time 0 ms
> >>>>      >>>
> >>>>      >>> Hi Matthias,
> >>>>      >>>
> >>>>      >>> I wonder if you should not do what David is suggesting and then put that
> >>>>      >>> whole code (the while loop) in a helper method. Below you have a
> >>>>      >>> calculation again using value2 (which I wonder what the added value of
> >>>>      >>> it is though) but anyway, that value2 could also be 0 at some point, no?
> >>>>      >>>
> >>>>      >>> So would it not be best to just refactor the getAllStackTraces and
> >>>>      >>> calculate safepoint time in a helper method for both value / value2
> >>>>      >>> variables?
> >>>>      >>>
> >>>>      >>> Thanks,
> >>>>      >>>
> >>>>      >>> Jc
> >>>>      >>>
> >>>>      >>> On Mon, Jul 29, 2019 at 7:50 PM David Holmes wrote:
> >>>>      >>>
> >>>>      >>>     Hi Matthias,
> >>>>      >>>
> >>>>      >>>     On 29/07/2019 8:20 pm, Baesken, Matthias wrote:
> >>>>      >>>      > Hello, please review this small test fix.
> >>>>      >>>      >
> >>>>      >>>      > The test
> >>>>      >>>      > test/jdk/sun/management/HotspotRuntimeMBean/GetTotalSafepointTime.java
> >>>>      >>>      > fails sometimes on fast Linux machines with this error message:
> >>>>      >>>      >
> >>>>      >>>      > java.lang.RuntimeException: Total safepoint time illegal value: 0 ms
> >>>>      >>>      > (MIN = 1; MAX = 9223372036854775807)
> >>>>      >>>      >
> >>>>      >>>      > It looks like the total safepoint time is currently too low on these
> >>>>      >>>      > machines; it is < 1 ms.
> >>>>      >>>      >
> >>>>      >>>      > There might be several ways to handle this:
> >>>>      >>>      >
> >>>>      >>>      >   * Change the test in a way that it might generate higher
> >>>>      >>>      >     safepoint times
> >>>>      >>>      >   * Allow safepoint time == 0 ms
> >>>>      >>>      >   * Offer an additional interface that gives safepoint times
> >>>>      >>>      >     with finer granularity (currently the HS has safepoint time
> >>>>      >>>      >     values in ns, see jdk/src/hotspot/share/runtime/safepoint.cpp
> >>>>      >>>      >     SafepointTracing::end)
> >>>>      >>>      >
> >>>>      >>>      > But it is converted to ms in this code:
> >>>>      >>>      >
> >>>>      >>>      > jlong RuntimeService::safepoint_time_ms() {
> >>>>      >>>      >   return UsePerfData ?
> >>>>      >>>      >     Management::ticks_to_ms(_safepoint_time_ticks->get_value()) : -1;
> >>>>      >>>      > }
> >>>>      >>>      >
> >>>>      >>>      > jlong Management::ticks_to_ms(jlong ticks) {
> >>>>      >>>      >   assert(os::elapsed_frequency() > 0, "Must be non-zero");
> >>>>      >>>      >   return (jlong)(((double)ticks / (double)os::elapsed_frequency())
> >>>>      >>>      >                  * (double)1000.0);
> >>>>      >>>      > }
> >>>>      >>>      >
> >>>>      >>>      > Currently I go for the first option (and try to generate
> >>>>      >>>      > higher safepoint times in my patch).
> >>>>      >>>
> >>>>      >>>     Yes that's probably best. Coarse-grained timing on very fast machines
> >>>>      >>>     was bound to eventually lead to problems.
> >>>>      >>>
> >>>>      >>>     But perhaps a more future-proof approach is to just add a do-while loop
> >>>>      >>>     around the stack dumps and only exit when we have a non-zero safepoint
> >>>>      >>>     time?
> >>>>      >>>
> >>>>      >>>     Thanks,
> >>>>      >>>     David
> >>>>      >>>     -----
> >>>>      >>>
> >>>>      >>>      > Bug/webrev:
> >>>>      >>>      >
> >>>>      >>>      > https://bugs.openjdk.java.net/browse/JDK-8228658
> >>>>      >>>      >
> >>>>      >>>      > http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.0/
> >>>>      >>>      >
> >>>>      >>>      > Thanks, Matthias
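A small sketch of why the `ticks_to_ms` conversion quoted in this thread can report 0 ms on a fast machine: the final cast to an integer type discards any fraction of a millisecond. The Java method below only mirrors the quoted arithmetic; the class name and the tick frequency of one tick per nanosecond are assumptions chosen for illustration, not values taken from HotSpot.

```java
// Hypothetical class, for illustration only.
class TicksToMs {
    /** Same arithmetic as the quoted C++: ticks / frequency * 1000, then truncated. */
    static long ticksToMs(long ticks, long frequency) {
        if (frequency <= 0) {
            throw new IllegalArgumentException("frequency must be positive");
        }
        return (long) (((double) ticks / (double) frequency) * 1000.0);
    }

    public static void main(String[] args) {
        long frequency = 1_000_000_000L; // assumed: one tick per nanosecond
        // Just under one millisecond of accumulated safepoint time.
        System.out.println(ticksToMs(999_999L, frequency));
        // Exactly one millisecond of accumulated safepoint time.
        System.out.println(ticksToMs(1_000_000L, frequency));
    }
}
```

Under these assumptions, any accumulated safepoint time below one millisecond is reported as 0, which is the value the test rejected.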
> >>>>      >>>
> >>>>      >>> --
> >>>>      >>>
> >>>>      >>> Thanks,
> >>>>      >>>
> >>>>      >>> Jc
> >>>>
> >>>> --
> >>>>
> >>>> Thanks,
> >>>> Jc

From david.holmes at oracle.com Thu Aug 1 07:56:21 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 1 Aug 2019 17:56:21 +1000
Subject: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast Linux machines with Total safepoint time 0 ms
In-Reply-To:
References: <82938477-ce6d-fdeb-02ab-60809541d9e4@oracle.com>
 <1e303f06-3933-ba73-f34b-081b827a725d@oracle.com>
 <470c58c5-c364-aca0-62fe-ef469c5cb390@oracle.com>
 <9868e92d-398b-be7c-5d45-020c19a61052@oracle.com>
Message-ID: <852abeab-7c91-69d3-53d6-57da71c8ccf6@oracle.com>

On 1/08/2019 5:13 pm, Baesken, Matthias wrote:
> Hi David + JC, thanks for the reviews.
>
> David - I added the suggested print outputs, and also the parameter to executeThreadDumps.

Okay - thanks for that.

David
-----

> Best regards, Matthias
>
>> -----Original Message-----
>> From: David Holmes
>> Sent: Wednesday, 31 July 2019 23:57
>> To: Baesken, Matthias; Jean Christophe Beyler
>> Cc: hotspot-dev at openjdk.java.net; serviceability-dev
>> Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast
>> Linux machines with Total safepoint time 0 ms
>>
>> On 1/08/2019 12:01 am, Baesken, Matthias wrote:
>>>
>>> Hi upload works again, now with webrev:
>>>
>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.2/
>>
>> Could you please add, for diagnostic purposes:
>>
>> System.out.println("Total safepoint time (ms): " + value);
>>
>> after:
>>
>> long value = executeThreadDumps();
>>
>> and
>>
>> long value2 = executeThreadDumps();
>>
>> that way if the test fails we can check logs to see what kind of
>> safepoint times have been observed previously. No need to see an updated
>> webrev just for that.
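The helper this thread converged on (keep forcing safepoints until the reported total changes, with the starting value passed in by the caller) can be sketched as follows. This is not the code from the webrev: `LongSupplier` stands in for `mbean.getTotalSafepointTime()`, `Runnable` stands in for `Thread.getAllStackTraces()`, and the class name and the explicit backwards check are assumptions added for illustration.

```java
import java.util.function.LongSupplier;

// Hypothetical helper, for illustration only.
class SafepointTimePoller {
    /**
     * Repeats the safepoint-inducing action until the reported total differs
     * from initialValue, then returns the new total. There is no internal
     * timeout: if the counter never changes, the caller's overall timeout
     * (for example the test harness) is expected to end the run.
     */
    static long executeThreadDumps(long initialValue,
                                   Runnable action,
                                   LongSupplier totalSafepointTime) {
        long value;
        do {
            action.run();
            value = totalSafepointTime.getAsLong();
        } while (value == initialValue);
        if (value < initialValue) {
            // A decreasing total would mean the time tracking is broken.
            throw new IllegalStateException(
                "Total safepoint time went backwards: " + initialValue + " -> " + value);
        }
        return value;
    }
}
```

In a test along the lines discussed here, the first call would pass 0 and the second call would pass the first result, for example `executeThreadDumps(value, Thread::getAllStackTraces, mbean::getTotalSafepointTime)`, where `mbean` is whatever object the test already uses to read the total.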
>> >> I have one further suggestion, take it or leave it, that >> executeThreadDumps() takes a parameter to specify the initial value, so >> we'd have: >> >> 60 long value = executeThreadDumps(0); >> >> and >> >> 68 long value2 = executeThreadDumps(value); >> >> This might help detect getTotalSafepointTime() going backwards slightly >> better than current code. >> >> Thanks, >> David >> >>> Best regards, Matthias >>> >>> >>>> -----Original Message----- >>>> From: Baesken, Matthias >>>> Sent: Mittwoch, 31. Juli 2019 14:05 >>>> To: 'David Holmes' ; Jean Christophe Beyler >>>> >>>> Cc: hotspot-dev at openjdk.java.net; serviceability-dev >>> dev at openjdk.java.net> >>>> Subject: RE: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on >> fast >>>> Linux machines with Total safepoint time 0 ms >>>> >>>> Hello, here is a version following the latest proposal of JC . >>>> >>>> Unfortunately attached as patch, sorry for that - the uploads / pushes >>>> currently do not work from here . >>>> >>>> Best regards, Matthias >>>> >>>> >>>>> -----Original Message----- >>>>> From: David Holmes >>>>> Sent: Mittwoch, 31. Juli 2019 05:04 >>>>> To: Jean Christophe Beyler >>>>> Cc: Baesken, Matthias ; hotspot- >>>>> dev at openjdk.java.net; serviceability-dev >>>> dev at openjdk.java.net> >>>>> Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on >>>> fast >>>>> Linux machines with Total safepoint time 0 ms >>>>> >>>>> On 31/07/2019 9:08 am, Jean Christophe Beyler wrote: >>>>>> FWIW, I would have done something like what David was suggesting, >> just >>>>>> slightly tweaked: >>>>>> >>>>>> public static long executeThreadDumps() { >>>>>> ?long value; >>>>>> ?long initial_value = mbean.getTotalSafepointTime(); >>>>>> ?do { >>>>>> ? ? ?Thread.getAllStackTraces(); >>>>>> ? ? 
?value = mbean.getTotalSafepointTime(); >>>>>> ?} while (value == initial_value); >>>>>> ?return value; >>>>>> } >>>>>> >>>>>> This ensures that the value is a new value as opposed to the current >>>>>> value and if something goes wrong, as David said, it will timeout; which >>>>>> is ok. >>>>> >>>>> Works for me. >>>>> >>>>>> But I come back to not really understanding why we are doing this at >>>>>> this point of relaxing (just get a new value of safepoint time). >>>>>> Because, if we accept timeouts now as a failure here, then really the >>>>>> whole test becomes: >>>>>> >>>>>> executeThreadDumps(); >>>>>> executeThreadDumps(); >>>>>> >>>>>> Since?the first call will return when value > 0 and the second call will >>>>>> return when value2 > value (I still wonder why we want to ensure it >>>>>> works twice...). >>>>> >>>>> The test is trying to sanity check that we are actually recording the >>>>> time used by safepoints. So first check is that we can get a non-zero >>>>> value; second check is we get a greater non-zero value. It's just a >>>>> sanity test to try and catch if something gets unexpectedly broken in >>>>> the time tracking code. >>>>> >>>>>> So both failures and even testing for it is kind of redundant, once you >>>>>> have a do/while until a change? >>>>> >>>>> Yes - the problem with the tests that try to check internal VM behaviour >>>>> is that we have no specified way to do something, in this case execute >>>>> safepoints, that relates to internal VM behaviour, so we have to do >>>>> something we know will currently work even if not specified to do so - >>>>> e.g. dumping all thread stacks uses a global safepoint. The second >>>>> problem is that the timer granularity is so coarse that we then have to >>>>> guess how many times we need to do that something before seeing a >>>>> change. 
To make the test robust we can keep doing stuff until we see a >>>>> change and so the only way that will fail is if the overall timeout of >>>>> the test kicks in. Or we can try and second guess how long it should >>>>> take by introducing our own internal timeout - either directly or by >>>>> limiting the number of loops in this case. That has its own problems and >>>>> in general we have tried to reduce internal test timeouts (by removing >>>>> them) and let overall timeouts take charge. >>>>> >>>>> No ideal solution. And this has already consumed way too much of >>>>> everyone's time. >>>>> >>>>> Cheers, >>>>> David >>>>> >>>>>> Thanks, >>>>>> Jc >>>>>> >>>>>> >>>>>> On Tue, Jul 30, 2019 at 2:35 PM David Holmes >> >>>>> > wrote: >>>>>> >>>>>> On 30/07/2019 10:39 pm, Baesken, Matthias wrote: >>>>>> > Hi David,? ?"put that whole code (the while loop) in a helper >>>>>> method."? ?was JC's idea,? and I like the idea . >>>>>> >>>>>> Regardless I think the way you are using NUM_THREAD_DUMPS is >>>> really >>>>>> confusing. As an all-caps static you'd expect it to be a constant. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> > Let's see what others think . >>>>>> > >>>>>> >> >>>>>> >> Overall tests like this are not very useful, yet very fragile. >>>>>> >> >>>>>> > >>>>>> > I am also? fine with putting the test on the exclude list. >>>>>> > >>>>>> > Best regards, Matthias >>>>>> > >>>>>> > >>>>>> >> -----Original Message----- >>>>>> >> From: David Holmes >>>>> > >>>>>> >> Sent: Dienstag, 30. Juli 2019 14:12 >>>>>> >> To: Baesken, Matthias >>>>> >; Jean Christophe >>>>>> >> Beyler > >>>>>> >> Cc: hotspot-dev at openjdk.java.net >>>>>> ; serviceability-dev >>>>>> >>>>> >> dev at openjdk.java.net > >>>>>> >> Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java >>>>>> fails on fast >>>>>> >> Linux machines with Total safepoint time 0 ms >>>>>> >> >>>>>> >> Hi Matthias, >>>>>> >> >>>>>> >> On 30/07/2019 9:25 pm, Baesken, Matthias wrote: >>>>>> >>> Hello? 
JC / David,?? here is a second webrev? : >>>>>> >>> >>>>>> >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.1/ >>>>>> >>> >>>>>> >>> It moves?? the? thread dump execution into a? method >>>>>> >>> executeThreadDumps(long)?? ??, and also adds? while loops >>>>>> (but with a >>>>>> >>> limitation? for the number of thread dumps, really don?t >>>>>> >>> want to cause timeouts etc.).??? I removed a check for >>>>>> >>> MAX_VALUE_FOR_PASS?? because we cannot go over >>>>> Long.MAX_VALUE . >>>>>> >> >>>>>> >> I don't think executeThreadDumps is worth factoring out like out. >>>>>> >> >>>>>> >> The handling of NUM_THREAD_DUMPS is a bit confusing. I'd >> rather >>>> it >>>>>> >> remains a constant 100, and then you set a simple loop iteration >>>>>> count >>>>>> >> limit. Further with the proposed code when you get here: >>>>>> >> >>>>>> >>? ? 85? ? ? ? ?NUM_THREAD_DUMPS = NUM_THREAD_DUMPS * 2; >>>>>> >> >>>>>> >> you don't even know what value you may be starting with. >>>>>> >> >>>>>> >> But I was thinking of simply: >>>>>> >> >>>>>> >> long value = 0; >>>>>> >> do { >>>>>> >>? ? ? ?Thread.getAllStackTraces(); >>>>>> >>? ? ? ?value = mbean.getTotalSafepointTime(); >>>>>> >> } while (value == 0); >>>>>> >> >>>>>> >> We'd only hit a timeout if something is completely broken - >>>>>> which is fine. >>>>>> >> >>>>>> >> Overall tests like this are not very useful, yet very fragile. >>>>>> >> >>>>>> >> Thanks, >>>>>> >> David >>>>>> >> >>>>>> >>> Hope you like this version ?better. >>>>>> >>> >>>>>> >>> Best regards, Matthias >>>>>> >>> >>>>>> >>> *From:*Jean Christophe Beyler >>>>> > >>>>>> >>> *Sent:* Dienstag, 30. 
Juli 2019 05:39 >>>>>> >>> *To:* David Holmes >>>>> > >>>>>> >>> *Cc:* Baesken, Matthias >>>>> >; >>>>>> >>> hotspot-dev at openjdk.java.net >>>>>> ; serviceability-dev >>>>>> >>> >>>>> > >>>>>> >>> *Subject:* Re: RFR: [XS] 8228658: test >>>>>> GetTotalSafepointTime.java fails >>>>>> >>> on fast Linux machines with Total safepoint time 0 ms >>>>>> >>> >>>>>> >>> Hi Matthias, >>>>>> >>> >>>>>> >>> I wonder if you should not do what David is suggesting and then >>>>>> put that >>>>>> >>> whole code (the while loop) in a helper method. Below you >> have a >>>>>> >>> calculation again using value2 (which I wonder what the added >>>>>> value of >>>>>> >>> it is though) but anyway, that value2 could also be 0 at some >>>>>> point, no? >>>>>> >>> >>>>>> >>> So would it not be best to just refactor the getAllStackTraces >> and >>>>>> >>> calculate safepoint time in a helper method for both value / >> value2 >>>>>> >>> variables? >>>>>> >>> >>>>>> >>> Thanks, >>>>>> >>> >>>>>> >>> Jc >>>>>> >>> >>>>>> >>> On Mon, Jul 29, 2019 at 7:50 PM David Holmes >>>>>> >>>>>> >>> >>>>> >> wrote: >>>>>> >>> >>>>>> >>>? ? ? Hi Matthias, >>>>>> >>> >>>>>> >>>? ? ? On 29/07/2019 8:20 pm, Baesken, Matthias wrote: >>>>>> >>>? ? ? ?> Hello , please review this small test fix . >>>>>> >>>? ? ? ?> >>>>>> >>>? ? ? ?> The test >>>>>> >>> >>>>>> >> >>>>> >>>> >> test/jdk/sun/management/HotspotRuntimeMBean/GetTotalSafepointTime. >>>>>> >> java >>>>>> >>>? ? ? fails sometimes on fast Linux machines with this error >>>>>> message : >>>>>> >>>? ? ? ?> >>>>>> >>>? ? ? ?> java.lang.RuntimeException: Total safepoint time >>>>>> illegal value: 0 >>>>>> >>>? ? ? ms (MIN = 1; MAX = 9223372036854775807) >>>>>> >>>? ? ? ?> >>>>>> >>>? ? ? ?> looks like the total safepoint time is too low >>>>>> currently on these >>>>>> >>>? ? ? machines, it is < 1 ms. >>>>>> >>>? ? ? ?> >>>>>> >>>? ? ? ?> There might be several ways to handle this : >>>>>> >>>? ? ? ?> >>>>>> >>>? ? ? ?>? ? *? ?Change the test? 
in a way that it might generate higher safepoint times
>>>>>> >>>      >    *   Allow safepoint time == 0 ms
>>>>>> >>>      >    *   Offer an additional interface that gives safepoint times
>>>>>> >>>      with finer granularity (currently the HS has safepoint time values
>>>>>> >>>      in ns, see jdk/src/hotspot/share/runtime/safepoint.cpp
>>>>>> >>>      SafepointTracing::end)
>>>>>> >>>      >
>>>>>> >>>      > But it is converted to ms in this code
>>>>>> >>>      >
>>>>>> >>>      > 114 jlong RuntimeService::safepoint_time_ms() {
>>>>>> >>>      > 115   return UsePerfData ?
>>>>>> >>>      > 116     Management::ticks_to_ms(_safepoint_time_ticks->get_value()) : -1;
>>>>>> >>>      > 117 }
>>>>>> >>>      >
>>>>>> >>>      > 2064 jlong Management::ticks_to_ms(jlong ticks) {
>>>>>> >>>      > 2065   assert(os::elapsed_frequency() > 0, "Must be non-zero");
>>>>>> >>>      > 2066   return (jlong)(((double)ticks / (double)os::elapsed_frequency())
>>>>>> >>>      > 2067                  * (double)1000.0);
>>>>>> >>>      > 2068 }
>>>>>> >>>      >
>>>>>> >>>      >
>>>>>> >>>      >
>>>>>> >>>      > Currently I go for the first attempt (and try to generate
>>>>>> >>>      higher safepoint times in my patch).
>>>>>> >>>
>>>>>> >>>      Yes that's probably best. Coarse-grained timing on very fast machines
>>>>>> >>>      was bound to eventually lead to problems.
>>>>>> >>>
>>>>>> >>>      But perhaps a more future-proof approach is to just add a do-while loop
>>>>>> >>>      around the stack dumps and only exit when we have a non-zero safepoint
>>>>>> >>>      time?
>>>>>> >>>
>>>>>> >>>      Thanks,
>>>>>> >>>      David
>>>>>> >>>      -----
>>>>>> >>>
>>>>>> >>>      > Bug/webrev :
>>>>>> >>>      >
>>>>>> >>>      > https://bugs.openjdk.java.net/browse/JDK-8228658
>>>>>> >>>      >
>>>>>> >>>      > http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.0/
>>>>>> >>>      >
>>>>>> >>>      >
>>>>>> >>>      > Thanks, Matthias
>>>>>> >>>      >
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> --
>>>>>> >>>
>>>>>> >>> Thanks,
>>>>>> >>>
>>>>>> >>> Jc
>>>>>> >>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Thanks,
>>>>>> Jc

From daniel.daugherty at oracle.com Thu Aug 1 19:28:21 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 1 Aug 2019 15:28:21 -0400
Subject: RFR(T): 8228999 ProblemList vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java
Message-ID:

Greetings,

There are almost 70 sightings of this test failure in the CI.
Almost all of them are on Windows so time to ProblemList this test...

$ hg diff
diff -r 9afbcd27f26f test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt Thu Aug 01 09:29:13 2019 -0700
+++ b/test/hotspot/jtreg/ProblemList.txt Thu Aug 01 15:19:13 2019 -0400
@@ -177,6 +177,7 @@
 vmTestbase/nsk/jvmti/scenarios/jni_interception/JI05/ji05t001/TestDescription.java 8219652 aix-ppc64
 vmTestbase/nsk/jvmti/scenarios/jni_interception/JI06/ji06t001/TestDescription.java 8219652 aix-ppc64
 vmTestbase/nsk/jvmti/SetJNIFunctionTable/setjniftab001/TestDescription.java 8219652 aix-ppc64
+vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java 8221372 windows-x64

Thanks, in advance, for any questions, comments or suggestions.

Dan

From chris.plummer at oracle.com Thu Aug 1 20:04:46 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 1 Aug 2019 13:04:46 -0700
Subject: RFR(T): 8228999 ProblemList vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java
In-Reply-To:
References:
Message-ID:

Looks good.

Chris

On 8/1/19 12:28 PM, Daniel D. Daugherty wrote:
> Greetings,
>
> There are almost 70 sightings of this test failure in the CI.
> Almost all of them are on Windows so time to ProblemList
> this test...
>
> $ hg diff
> diff -r 9afbcd27f26f test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt Thu Aug 01 09:29:13 2019 -0700
> +++ b/test/hotspot/jtreg/ProblemList.txt Thu Aug 01 15:19:13 2019 -0400
> @@ -177,6 +177,7 @@
>  vmTestbase/nsk/jvmti/scenarios/jni_interception/JI05/ji05t001/TestDescription.java 8219652 aix-ppc64
>  vmTestbase/nsk/jvmti/scenarios/jni_interception/JI06/ji06t001/TestDescription.java 8219652 aix-ppc64
>  vmTestbase/nsk/jvmti/SetJNIFunctionTable/setjniftab001/TestDescription.java 8219652 aix-ppc64
> +vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java 8221372 windows-x64
>
>
> Thanks, in advance, for any questions, comments or suggestions.
>
> Dan

From daniel.daugherty at oracle.com Thu Aug 1 20:06:00 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 1 Aug 2019 16:06:00 -0400
Subject: RFR(T): 8228999 ProblemList vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java
In-Reply-To:
References:
Message-ID:

Thanks for the fast review!

Dan

On 8/1/19 4:04 PM, Chris Plummer wrote:
> Looks good.
>
> Chris
>
> On 8/1/19 12:28 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> There are almost 70 sightings of this test failure in the CI.
>> Almost all of them are on Windows so time to ProblemList
>> this test...
>>
>> $ hg diff
>> diff -r 9afbcd27f26f test/hotspot/jtreg/ProblemList.txt
>> --- a/test/hotspot/jtreg/ProblemList.txt Thu Aug 01 09:29:13 2019 -0700
>> +++ b/test/hotspot/jtreg/ProblemList.txt Thu Aug 01 15:19:13 2019 -0400
>> @@ -177,6 +177,7 @@
>>  vmTestbase/nsk/jvmti/scenarios/jni_interception/JI05/ji05t001/TestDescription.java 8219652 aix-ppc64
>>  vmTestbase/nsk/jvmti/scenarios/jni_interception/JI06/ji06t001/TestDescription.java 8219652 aix-ppc64
>>  vmTestbase/nsk/jvmti/SetJNIFunctionTable/setjniftab001/TestDescription.java 8219652 aix-ppc64
>> +vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java 8221372 windows-x64
>>
>>
>> Thanks, in advance, for any questions, comments or suggestions.
>>
>> Dan
>
>

From serguei.spitsyn at oracle.com Thu Aug 1 21:35:46 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 1 Aug 2019 14:35:46 -0700
Subject: RFR(T): 8228999 ProblemList vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java
In-Reply-To:
References:
Message-ID: <29434b22-e261-4b87-7fce-fd0a0b7dd942@oracle.com>

Hi Dan,

+1

Thank you for taking care of it!
Serguei

On 8/1/19 1:04 PM, Chris Plummer wrote:
> Looks good.
>
> Chris
>
> On 8/1/19 12:28 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> There are almost 70 sightings of this test failure in the CI.
>> Almost all of them are on Windows so time to ProblemList
>> this test...
>>
>> $ hg diff
>> diff -r 9afbcd27f26f test/hotspot/jtreg/ProblemList.txt
>> --- a/test/hotspot/jtreg/ProblemList.txt Thu Aug 01 09:29:13 2019 -0700
>> +++ b/test/hotspot/jtreg/ProblemList.txt Thu Aug 01 15:19:13 2019 -0400
>> @@ -177,6 +177,7 @@
>>  vmTestbase/nsk/jvmti/scenarios/jni_interception/JI05/ji05t001/TestDescription.java 8219652 aix-ppc64
>>  vmTestbase/nsk/jvmti/scenarios/jni_interception/JI06/ji06t001/TestDescription.java 8219652 aix-ppc64
>>  vmTestbase/nsk/jvmti/SetJNIFunctionTable/setjniftab001/TestDescription.java 8219652 aix-ppc64
>> +vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java 8221372 windows-x64
>>
>>
>> Thanks, in advance, for any questions, comments or suggestions.
>>
>> Dan
>
>

From daniel.daugherty at oracle.com Thu Aug 1 21:54:27 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 1 Aug 2019 17:54:27 -0400
Subject: RFR(T): 8228999 ProblemList vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java
In-Reply-To: <29434b22-e261-4b87-7fce-fd0a0b7dd942@oracle.com>
References: <29434b22-e261-4b87-7fce-fd0a0b7dd942@oracle.com>
Message-ID:

Thanks for the review. Can't list you as a reviewer since I already
pushed the changeset...

Dan

On 8/1/19 5:35 PM, serguei.spitsyn at oracle.com wrote:
> Hi Dan,
>
> +1
>
> Thank you for taking care of it!
> Serguei
>
> On 8/1/19 1:04 PM, Chris Plummer wrote:
>> Looks good.
>>
>> Chris
>>
>> On 8/1/19 12:28 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> There are almost 70 sightings of this test failure in the CI.
>>> Almost all of them are on Windows so time to ProblemList
>>> this test...
>>>
>>> $ hg diff
>>> diff -r 9afbcd27f26f test/hotspot/jtreg/ProblemList.txt
>>> --- a/test/hotspot/jtreg/ProblemList.txt Thu Aug 01 09:29:13 2019 -0700
>>> +++ b/test/hotspot/jtreg/ProblemList.txt Thu Aug 01 15:19:13 2019 -0400
>>> @@ -177,6 +177,7 @@
>>>  vmTestbase/nsk/jvmti/scenarios/jni_interception/JI05/ji05t001/TestDescription.java 8219652 aix-ppc64
>>>  vmTestbase/nsk/jvmti/scenarios/jni_interception/JI06/ji06t001/TestDescription.java 8219652 aix-ppc64
>>>  vmTestbase/nsk/jvmti/SetJNIFunctionTable/setjniftab001/TestDescription.java 8219652 aix-ppc64
>>> +vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java 8221372 windows-x64
>>>
>>>
>>> Thanks, in advance, for any questions, comments or suggestions.
>>>
>>> Dan
>>
>>
>

From jcbeyler at google.com Thu Aug 1 22:16:19 2019
From: jcbeyler at google.com (Jean Christophe Beyler)
Date: Thu, 1 Aug 2019 15:16:19 -0700
Subject: RFR (L) 8228998: Remove the testing against NSK_FALSE from tests
Message-ID:

Hi all,

It took me a while to pick this item back up but here we go :-). Here is a
webrev that removes all the if (.* == NSK_FALSE) and replaces them with
if (! .*).

Webrev: http://cr.openjdk.java.net/~jcbeyler/8228998/webrev/
Bug: https://bugs.openjdk.java.net/browse/JDK-8228998

For the EM tests, I also updated the returns to be boolean, entirely
removing the NSK_FALSE/NSK_TRUE parts because of the way the tests were
done. Let me know if you'd rather I divide those up.

This was tested by running the tests changed on my dev machine, I'll push
it to the submit repo after review :-)

Thanks and have a great evening,
Jc
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From serguei.spitsyn at oracle.com Thu Aug 1 23:06:57 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 1 Aug 2019 16:06:57 -0700
Subject: RFR (L) 8228998: Remove the testing against NSK_FALSE from tests
In-Reply-To:
References:
Message-ID: <3e57e978-52bb-2fc5-6993-ad444869b1f1@oracle.com>

Hi Jc,

Thank you for taking care of this!
Most of the links in the webrev cannot be resolved.
I'm getting the error: "403 - Forbidden".

The only item that works is:

  Cdiffs Udiffs Sdiffs Frames Old New ----- Raw
  test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/FollowReferences/followref001/followref001.cpp

Also, the patch is readable:
  http://cr.openjdk.java.net/%7Ejcbeyler/8228998/webrev/jdk-false.changeset

It looks pretty good.

Only one comment:

  http://cr.openjdk.java.net/%7Ejcbeyler/8228998/webrev/test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/FollowReferences/followref001/followref001.cpp.frames.html

A couple of fragments are not aligned properly:

 221 static bool getFieldsAndObjects(jvmtiEnv* jvmti,
 222 JNIEnv* jni,
 223 jclass debugeeClass,
 . . .
 434 static bool checkTestedObjects(jvmtiEnv* jvmti,
 435 JNIEnv* jni,
 436 int chainLength,
 437 ObjectDesc objectDescList[])

Thanks,
Serguei

On 8/1/19 3:16 PM, Jean Christophe Beyler wrote:
> Hi all,
>
> It took me a while to pick this item back up but here we go :-). Here is
> a webrev that removes all the if (.* == NSK_FALSE) and replaces them
> with if (! .*).
>
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8228998/webrev/
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8228998
>
> For the EM tests, I also updated the returns to be boolean, entirely
> removing the NSK_FALSE/NSK_TRUE parts because of the way the tests
> were done. Let me know if you'd rather I divide those up.
> > This was tested by running the tests changed on my dev machine, I'll > push it to the submit repo after review :-) > > Thanks and have a great evening, > Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From jcbeyler at google.com Thu Aug 1 23:53:14 2019 From: jcbeyler at google.com (Jean Christophe Beyler) Date: Thu, 1 Aug 2019 16:53:14 -0700 Subject: RFR (L) 8228998: Remove the testing against NSK_FALSE from tests In-Reply-To: <3e57e978-52bb-2fc5-6993-ad444869b1f1@oracle.com> References: <3e57e978-52bb-2fc5-6993-ad444869b1f1@oracle.com> Message-ID: Hi Serguei, My apologies. I fixed the forbidden on the old webrev link. Then I rechecked the white-spaces, and made webrev not ignore white space changes. Here is the new webrev: Webrev: http://cr.openjdk.java.net/~jcbeyler/8228998/webrev.01/ Bug: https://bugs.openjdk.java.net/browse/JDK-8228998 Thanks for your review :) Jc On Thu, Aug 1, 2019 at 4:07 PM wrote: > Hi Jc, > > Thank you for taking care about this! > Most of the links in the webrev can not be resolved. > I'm getting the error: "403 - Forbidden". > > The only item that works is: > > Cdiffs > > Udiffs > > Sdiffs > > Frames > > Old > > New > > ----- Raw > > > *test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/FollowReferences/followref001/followref001.cpp* > Also, the patch is readable: > > http://cr.openjdk.java.net/%7Ejcbeyler/8228998/webrev/jdk-false.changeset > > > It looks pretty good. > > Only a one comments: > > > http://cr.openjdk.java.net/%7Ejcbeyler/8228998/webrev/test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/FollowReferences/followref001/followref001.cpp.frames.html > > A couple of fragments are not aligned properly: > > 221 static bool getFieldsAndObjects(jvmtiEnv* jvmti, > 222 JNIEnv* jni, > 223 jclass debugeeClass, > . . . 
> > 434 static bool checkTestedObjects(jvmtiEnv* jvmti, > 435 JNIEnv* jni, > 436 int chainLength, > 437 ObjectDesc objectDescList[]) > > > Thanks, > Serguei > > > On 8/1/19 3:16 PM, Jean Christophe Beyler wrote: > > Hi all, > > It took me a while to pick this item back but here we go :-). Here is a > webrev that removes all the if (.* == NSK_FALSE) and replaces them with if > (! .*). > > Webrev: http://cr.openjdk.java.net/~jcbeyler/8228998/webrev/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8228998 > > For the EM tests, I also updated the returns to be boolean, entirely > removing the NSK_FALSE/NSK_TRUE parts because of the way the tests were > done. Let me know if you'd rather I divide those up. > > This was tested by running the tests changed on my dev machine, I'll push > it to the submit repo after review :-) > > Thanks and have a great evening, > Jc > > > -- Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Fri Aug 2 00:27:41 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 1 Aug 2019 17:27:41 -0700 Subject: RFR (L) 8228998: Remove the testing against NSK_FALSE from tests In-Reply-To: References: <3e57e978-52bb-2fc5-6993-ad444869b1f1@oracle.com> Message-ID: Hi Jc, Looks good. This still aligned incorrectly: 221 static bool getFieldsAndObjects(jvmtiEnv* jvmti, 222 JNIEnv* jni, 223 jclass debugeeClass, 224 jclass rootObjectClass, 225 jclass chainObjectClass, 226 jobject* rootObjectPtr, 227 jfieldID* reachableChainField, 228 jfieldID* unreachableChainField, 229 jfieldID* nextField) { Some copyright comments need an update. No need in another review if you fix it. You may want to update the webrev in place for other reviewers. Thanks, Serguei On 8/1/19 4:53 PM, Jean Christophe Beyler wrote: > Hi Serguei, > > My apologies. I fixed the forbidden on the old webrev link. 
Then I > rechecked the white-spaces, and made webrev not ignore white space > changes. Here is the new webrev: > > Webrev: http://cr.openjdk.java.net/~jcbeyler/8228998/webrev.01/ > > Bug: https://bugs.openjdk.java.net/browse/JDK-8228998 > > Thanks for your review :) > Jc > > > On Thu, Aug 1, 2019 at 4:07 PM > wrote: > > Hi Jc, > > Thank you for taking care about this! > Most of the links in the webrev can not be resolved. > I'm getting the error: "403 - Forbidden". > > The only item that works is: > > |Cdiffs > > Udiffs > > Sdiffs > > Frames > > Old > > New > > ----- Raw > > > | > *test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/FollowReferences/followref001/followref001.cpp* > > > Also, the patch is readable: > http://cr.openjdk.java.net/%7Ejcbeyler/8228998/webrev/jdk-false.changeset > > > It looks pretty good. > > Only a one comments: > > http://cr.openjdk.java.net/%7Ejcbeyler/8228998/webrev/test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/FollowReferences/followref001/followref001.cpp.frames.html > > A couple of fragments are not aligned properly: > > 221 static bool getFieldsAndObjects(jvmtiEnv* jvmti, > 222 JNIEnv* jni, > 223 jclass debugeeClass, > . . . > > 434 static bool checkTestedObjects(jvmtiEnv* jvmti, > 435 JNIEnv* jni, > 436 int chainLength, > 437 ObjectDesc objectDescList[]) > > > Thanks, > Serguei > > > On 8/1/19 3:16 PM, Jean Christophe Beyler wrote: >> Hi all, >> >> It took me a while to pick this item back but here we go :-). >> Here is a webrev that removes all the if (.* == NSK_FALSE) and >> replaces them with if (! .*). >> >> Webrev: http://cr.openjdk.java.net/~jcbeyler/8228998/webrev/ >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8228998 >> >> For the EM tests, I also updated the returns to be boolean, >> entirely removing the NSK_FALSE/NSK_TRUE parts because of the way >> the tests were done. Let me know if you'd rather I divide those up. 
>> >> This was tested by running the tests changed on my dev machine, >> I'll push it to the submit repo after review :-) >> >> Thanks and have a great evening, >> Jc > > > > -- > > Thanks, > Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From jcbeyler at google.com Fri Aug 2 01:55:53 2019 From: jcbeyler at google.com (Jean Christophe Beyler) Date: Thu, 1 Aug 2019 18:55:53 -0700 Subject: RFR (L) 8228998: Remove the testing against NSK_FALSE from tests In-Reply-To: References: <3e57e978-52bb-2fc5-6993-ad444869b1f1@oracle.com> Message-ID: Hi Serguei, Thanks :) Done, I updated it and the copyrights and did an in place replacement: Webrev: http://cr.openjdk.java.net/~jcbeyler/8228998/webrev.01/ Bug: https://bugs.openjdk.java.net/browse/JDK-8228998 Thanks again, Jc On Thu, Aug 1, 2019 at 5:27 PM wrote: > Hi Jc, > > Looks good. > > This still aligned incorrectly: > > 221 static bool getFieldsAndObjects(jvmtiEnv* jvmti, 222 JNIEnv* jni, 223 jclass debugeeClass, 224 jclass rootObjectClass, 225 jclass chainObjectClass, 226 jobject* rootObjectPtr, 227 jfieldID* reachableChainField, 228 jfieldID* unreachableChainField, 229 jfieldID* nextField) { > > > Some copyright comments need an update. > No need in another review if you fix it. > You may want to update the webrev in place for other reviewers. > > Thanks, > Serguei > > > On 8/1/19 4:53 PM, Jean Christophe Beyler wrote: > > Hi Serguei, > > My apologies. I fixed the forbidden on the old webrev link. Then I > rechecked the white-spaces, and made webrev not ignore white space changes. > Here is the new webrev: > > Webrev: http://cr.openjdk.java.net/~jcbeyler/8228998/webrev.01/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8228998 > > Thanks for your review :) > Jc > > > On Thu, Aug 1, 2019 at 4:07 PM wrote: > >> Hi Jc, >> >> Thank you for taking care about this! >> Most of the links in the webrev can not be resolved. >> I'm getting the error: "403 - Forbidden". 
>> >> The only item that works is: >> >> Cdiffs >> >> Udiffs >> >> Sdiffs >> >> Frames >> >> Old >> >> New >> >> ----- Raw >> >> >> *test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/FollowReferences/followref001/followref001.cpp* >> Also, the patch is readable: >> >> http://cr.openjdk.java.net/%7Ejcbeyler/8228998/webrev/jdk-false.changeset >> >> >> It looks pretty good. >> >> Only a one comments: >> >> >> http://cr.openjdk.java.net/%7Ejcbeyler/8228998/webrev/test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/FollowReferences/followref001/followref001.cpp.frames.html >> >> A couple of fragments are not aligned properly: >> >> 221 static bool getFieldsAndObjects(jvmtiEnv* jvmti, >> 222 JNIEnv* jni, >> 223 jclass debugeeClass, >> . . . >> >> 434 static bool checkTestedObjects(jvmtiEnv* jvmti, >> 435 JNIEnv* jni, >> 436 int chainLength, >> 437 ObjectDesc objectDescList[]) >> >> >> Thanks, >> Serguei >> >> >> On 8/1/19 3:16 PM, Jean Christophe Beyler wrote: >> >> Hi all, >> >> It took me a while to pick this item back but here we go :-). Here is a >> webrev that removes all the if (.* == NSK_FALSE) and replaces them with if >> (! .*). >> >> Webrev: http://cr.openjdk.java.net/~jcbeyler/8228998/webrev/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8228998 >> >> For the EM tests, I also updated the returns to be boolean, entirely >> removing the NSK_FALSE/NSK_TRUE parts because of the way the tests were >> done. Let me know if you'd rather I divide those up. >> >> This was tested by running the tests changed on my dev machine, I'll push >> it to the submit repo after review :-) >> >> Thanks and have a great evening, >> Jc >> >> >> > > -- > > Thanks, > Jc > > > -- Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.plummer at oracle.com Fri Aug 2 06:48:32 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 1 Aug 2019 23:48:32 -0700 Subject: RFR (L) 8228998: Remove the testing against NSK_FALSE from tests In-Reply-To: References: <3e57e978-52bb-2fc5-6993-ad444869b1f1@oracle.com> Message-ID: <80805460-255b-1ef3-6b49-f6f25487c022@oracle.com> An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Fri Aug 2 18:12:17 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 2 Aug 2019 11:12:17 -0700 Subject: RFR: 8170299: Debugger does not stop inside the low memory notifications code In-Reply-To: <9783ca89-0af8-2167-436a-e5ff2db631a3@oracle.com> References: <75B5F778-DC49-494B-AC12-270F301677CA@oracle.com> <60639d41-735a-00d3-c9db-1955f581b89a@oracle.com> <9783ca89-0af8-2167-436a-e5ff2db631a3@oracle.com> Message-ID: <9a805686-57bf-c158-a777-c3cb7e38f09f@oracle.com> On 7/31/19 11:50 PM, David Holmes wrote: > Hi Daniil, > > On 25/07/2019 3:34 am, Daniil Titov wrote: >> Hi David, >> >> Hope you had a great vacation! > > I did thank you. Apologies again for taking so long to get back to > this work. > >> Please find below the latest version of the change . The only >> difference from the version 01 is >> the corrected ordering of include statements as Serguei suggested. >> >> Webrev: https://cr.openjdk.java.net/~dtitov/8170299/webrev.02/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8170299 > > I'm still remain concerned about introducing yet-another-thread to the > system. The potential interactions with other threads is not at all > clear. 
>
> I'm also concerned that this thread has to be visible so that you can
> debug the notification code, yet at the same time being visible makes
> it vulnerable to application level actions that don't impact the
> service thread - in particular if we suspend all threads then this
> thread will be suspended too, if we resume a thread that triggers a
> notification, the notification thread won't be able to respond to it
> as it is suspended. The user won't know that they need to explicitly
> resume this internal system thread.

This is indeed problematic, but it seems less of an issue than running
java code on the service thread. BTW, are there any other cases where we
run java code on the service thread? It seems running java code on a
hidden thread is just asking for trouble. I assume if you hit a
breakpoint while doing this, it is simply ignored. Not exactly what the
debugger user is expecting.

Chris

>
> Also note in serviceThread.cpp we have:
>
>  129       // This ThreadBlockInVM object is not also considered to be
>  130       // suspend-equivalent because ServiceThread is not visible to
>  131       // external suspension.
>  132
>  133       ThreadBlockInVM tbivm(jt);
>
> and you copied that across to notificationThread.cpp as:
>
>   93       // Need state transition ThreadBlockInVM so that this thread
>   94       // will be handled by safepoint correctly when this thread is
>   95       // notified at a safepoint.
>   96
>   97       ThreadBlockInVM tbivm(jt);
>
> so this will continue to not be a suspend-equivalent condition even
> though this thread is visible and suspendible! So something seems
> wrong there. I'm unclear why we need to use the ThreadBlockInVM rather
> than defining the NotificationLock as a safepoint-check-always lock,
> rather than a safepoint-check-never lock? In fact with some recent
> changes to locks I'm not even sure it is legal for the notification
> thread to use a safepoint-check-never lock - have you re-based this
> recently?
>
> Thanks,
> David
>
>> Thanks!
>> --Daniil
>>
>> On 7/3/19, 11:47 PM, "David Holmes" wrote:
>>
>>      Hi Daniil,
>>
>>      On 4/07/2019 1:04 pm, Daniil Titov wrote:
>>      > Please review the change that fixes the problem with the
>>      > debugger not stopping in the low memory notification code.
>>      >
>>      > The problem here is that the ServiceThread that calls these
>>      > MXBean listeners is hidden from the external view, which prevents
>>      > the debugger from stopping in it.
>>      >
>>      > The fix introduces new NotificationThread that is visible to
>>      > the external view and offloads the ServiceThread from sending low
>>      > memory and other notifications that could result in Java calls (GC
>>      > and diagnostic commands notifications) by moving these activities
>>      > in this new NotificationThread.
>>
>>      There is a long and unfortunate history with this bug.
>>
>>      The original incarnation of this fix was introducing a new thread at the
>>      Java library level, and I had some concerns about that:
>>
>>      http://mail.openjdk.java.net/pipermail/serviceability-dev/2017-December/022612.html
>>
>>      That effort was resurrected at:
>>
>>      http://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024466.html
>>
>>      and
>>
>>      http://mail.openjdk.java.net/pipermail/serviceability-dev/2018-August/024849.html
>>
>>      but was left somewhat in limbo. There was a lot of doubt about the right
>>      way to fix this bug and whether introducing a new thread was too
>>      disruptive.
>>
>>      But introducing a new thread in the VM also has the same set of
>>      concerns! This needs consideration by the runtime team before going
>>      ahead. Introducing a new thread like this needs to be examined in
>>      detail - particularly the synchronization interactions with other
>>      threads. It also introduces another monitor designated safepoint-never
>>      at a time when we are in the process of cleaning up monitors so that
>>      JavaThreads will only use safepoint-check-always monitors.
>>
>>      Unfortunately I'm about to head out for two weeks vacation, and a number
>>      of other key runtime folk are also on vacation. But I'd ask that you
>>      hold off on this until we can look at it in more detail.
>>
>>      Thanks,
>>      David
>>      -----
>>
>>      > Testing: Mach5 tier1, tier2 and tier3 tests succeeded.
>>      >
>>      > Webrev: https://cr.openjdk.java.net/~dtitov/8170299/webrev.01/
>>      > Bug: https://bugs.openjdk.java.net/browse/JDK-8170299
>>      >
>>      > Thanks!
>>      > --Daniil
>>      >
>>      >
>>
>>

From jcbeyler at google.com Fri Aug 2 21:21:24 2019
From: jcbeyler at google.com (Jean Christophe Beyler)
Date: Fri, 2 Aug 2019 14:21:24 -0700
Subject: RFR (M) 8229036: Remove the testing against NSK_TRUE from tests
Message-ID:

Hi all,

Here is the webrev that does the removal of if (.* == NSK_TRUE) and
replaces them with if (.*).

Webrev: http://cr.openjdk.java.net/~jcbeyler/8229036/webrev.00/
Bug: https://bugs.openjdk.java.net/browse/JDK-8229036

This was tested by running the tests changed on my dev machine, I'll push
it to the submit repo after review :-)

Thanks and have a great day & weekend!,
Jc
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From chris.plummer at oracle.com Fri Aug 2 21:40:04 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 2 Aug 2019 14:40:04 -0700
Subject: RFR (M) 8229036: Remove the testing against NSK_TRUE from tests
In-Reply-To:
References:
Message-ID:

An HTML attachment was scrubbed...
URL:

From jcbeyler at google.com Fri Aug 2 22:00:08 2019
From: jcbeyler at google.com (Jean Christophe Beyler)
Date: Fri, 2 Aug 2019 15:00:08 -0700
Subject: RFR (M) 8229036: Remove the testing against NSK_TRUE from tests
In-Reply-To:
References:
Message-ID:

Hi Chris,

I only did it when there were repercussions to the change of if (.* == NSK_TRUE).
For example:
http://cr.openjdk.java.net/~jcbeyler/8229036/webrev.00/test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetTime/gettime001/gettime001.cpp.udiff.html

I wanted to move:

- if (success != NSK_TRUE) {
+ if (!success) {

But really I was thinking that success should be a bool then. However,
success was also assigned by:

success = checkTime(jvmti, &time, &prevTime, "VM_DEATH callback");

So I went to transform checkTime to return a boolean, and that rippled into
changing the NSK_FALSE to false as well.

I can reduce the scope of this webrev to only the if statements if you
prefer, I was just working on getting the various elements to bool instead
of int and NSK_TRUE/NSK_FALSE.

Another solution would be to maybe augment the scope of the bug item to:
Move NSK_TRUE/NSK_FALSE to true/false; and this webrev as a side-effect
covers all the "if (.* NSK_TRUE)" cases.

What do you think?
Jc

On Fri, Aug 2, 2019 at 2:40 PM Chris Plummer wrote:

> Hi JC,
>
> Why does this webrev also remove references to NSK_FALSE, and the previous
> one references to NSK_TRUE?
>
> thanks,
>
> Chris
>
> On 8/2/19 2:21 PM, Jean Christophe Beyler wrote:
>
> Hi all,
>
> Here is the webrev that does the removal of if (.* == NSK_TRUE) and
> replaces them with if (.*).
>
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8229036/webrev.00/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8229036
>
> This was tested by running the tests changed on my dev machine, I'll push
> it to the submit repo after review :-)
>
> Thanks and have a great day & weekend!,
> Jc
>
>
>

--

Thanks,
Jc
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From daniil.x.titov at oracle.com Fri Aug 2 22:16:05 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Fri, 02 Aug 2019 15:16:05 -0700
Subject: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
In-Reply-To: <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com>
References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com> <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com> <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com> <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com> <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com>
Message-ID: <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com>

Hi David,

Thank you for your detailed review. Please review a new version of the fix that includes
the changes you suggested:
- ThreadTableCreate_lock scope is reduced to cover the creation of the table only;
- ThreadTableCreate_lock is made _safepoint_check_always;
- ServiceThread is no longer responsible for the resizing of the thread table; instead,
  the thread table is changed to grow on demand by the thread that is doing the addition;
- fixed nits and formatting issues.

>> The change also includes additional optimization for some callers of find_JavaThread_from_java_tid()
>> as Daniel suggested.
> Not sure it's best to combine these, but if they are limited to the
> changes in management.cpp only then that may be okay.

The additional optimization for some callers of find_JavaThread_from_java_tid() is
limited to management.cpp (plus a new test) so I left it in the webrev, but I could
also move it into a separate issue if required.

> src/hotspot/share/runtime/threadSMR.cpp
> 755 jlong tid = SharedRuntime::get_java_tid(thread);
> 926 jlong tid = SharedRuntime::get_java_tid(thread);
> I think it cleaner/better to just use
> jlong tid = java_lang_Thread::thread_id(thread->threadObj());
> as we know thread is not NULL, it is a JavaThread and it has to have a
> non-null threadObj.
I had to leave this code unchanged since it turned out the threadObj is null
when the VM is destroyed:

V [libjvm.so+0xe165d7] oopDesc::long_field(int) const+0x67
V [libjvm.so+0x16e06c6] ThreadsSMRSupport::add_thread(JavaThread*)+0x116
V [libjvm.so+0x16d1302] Threads::add(JavaThread*, bool)+0x82
V [libjvm.so+0xef8369] attach_current_thread.part.197+0xc9
V [libjvm.so+0xec136c] jni_DestroyJavaVM+0x6c
C [libjli.so+0x4333] JavaMain+0x2c3
C [libjli.so+0x8159] ThreadJavaMain+0x9

> src/hotspot/share/services/threadTable.cpp
> 71 static uintx get_hash(Value const& value, bool* is_dead) {

> The is_dead parameter still bothers me here. I can't make enough sense
> out of the template code in ConcurrentHashtable to see why we have to
> have it, but I'm concerned that its very existence means we perhaps
> should not be trying to extend CHT in this context. ??

My understanding is that the is_dead parameter provides a mechanism for ConcurrentHashtable to remove stale entries that were not explicitly removed by calling the ConcurrentHashTable::remove() method. I think that just because in our case we don't use this mechanism doesn't mean we should not use ConcurrentHashTable.

> I would still want to see what impact this has on thread
> startup cost, both with and without the table being initialized.

I ran a test that initializes the table by calling ThreadMXBean.getThreadInfo(), starts some threads as a warm-up, and then creates and starts 100,000 threads (each thread just sleeps for 100 ms). With the thread table enabled, the 100,000 threads are created and started in about 15200 ms. With the thread table off, the test takes about 14800 ms. Based on this information, the enabled thread table makes thread startup about 2.7% slower.

Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.04/
Bug: https://bugs.openjdk.java.net/browse/JDK-8185005

Thanks!
--Daniil ?On 7/29/19, 12:53 AM, "David Holmes" wrote: Hi Daniil, Overall I think this is a reasonable approach but I would still like to see some performance and footprint numbers, both to verify it fixes the problem reported, and that we are not getting penalized elsewhere. On 25/07/2019 3:21 am, Daniil Titov wrote: > Hi David, Daniel, and Serguei, > > Please review the new version of the fix, that makes the thread table initialization on demand and > moves it inside ThreadsList::find_JavaThread_from_java_tid(). At the creation time the thread table > is initialized with the threads from the current thread list. We don't want to hold Threads_lock > inside find_JavaThread_from_java_tid(), thus new threads still could be created while the thread > table is being initialized . Such threads will be found by the linear search and added to the thread table > later, in ThreadsList::find_JavaThread_from_java_tid(). The initialization allows the created but unpopulated, or partially populated, table to be seen by other threads - is that your intention? It seems it should be okay as the other threads will then race with the initializing thread to add specific entries, and this is a concurrent map so that should be functionally correct. But if so then I think you can also reduce the scope of the ThreadTableCreate_lock so that it covers creation of the table only, not the initial population of the table. I like the approach of only initializing the table when needed and using that to control when the add/remove-thread code needs to update the table. But I would still want to see what impact this has on thread startup cost, both with and without the table being initialized. > The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > as Daniel suggested. Not sure it's best to combine these, but if they are limited to the changes in management.cpp only then that may be okay. 
It helps to be able to focus on the table related changes without being distracted by other optimizations. > That is correct that ResolvedMethodTable was used as a blueprint for the thread table, however, I tried > to strip it of the all functionality that is not required in the thread table case. The revised version seems better in that regard. But I still have a concern, see below. > We need to have the thread table resizable and allow it to grow as the number of threads increases to avoid > reserving excessive memory a-priori or deteriorating lookup times. The ServiceThread is responsible for > growing the thread table when required. Yes but why? Why can't this table be grown on demand by the thread that is doing the addition? For other tables we may have to delegate to the service thread because the current thread cannot perform the action, or it doesn't want to perform it at the time the need for the resize is detected (e.g. its detected at a safepoint and you want the resize to happen later outside the safepoint). It's not apparent to me that such restrictions apply here. > There is no ConcurrentHashTable available in Java 8 and for backporting this fix to Java 8 another implementation > of the hash table, probably originally suggested in the patch attached to the JBS issue, should be used. It will make > the backporting more complicated, however, adding a new Implementation of the hash table in Java 14 while it > already has ConcurrentHashTable doesn't seem reasonable for me. Ok. > Webrev: http://cr.openjdk.java.net/~dtitov/8185005/webrev.03 Some specific code comments: src/hotspot/share/runtime/mutexLocker.cpp + def(ThreadTableCreate_lock , PaddedMutex , special, false, Monitor::_safepoint_check_never); I think this needs to be a _safepoint_check_always lock. The table will be created by regular JavaThreads and they should (nearly) always be checking for safepoints if they are going to block acquiring the lock. 
And it isn't at all obvious that the thread doing the creation can't go to a safepoint whilst this lock is held. --- src/hotspot/share/runtime/threadSMR.cpp Nit: 618 JavaThread* thread = thread_at(i); you could reuse the new java_thread local you introduced at line 613 and just rename that "new" variable to "thread" so you don't have to change all other uses. 628 } else if (java_thread != NULL && ... You don't need to check != NULL here as you only get here when java_thread is not NULL. 755 jlong tid = SharedRuntime::get_java_tid(thread); 926 jlong tid = SharedRuntime::get_java_tid(thread); I think it cleaner/better to just use jlong tid = java_lang_Thread::thread_id(thread->threadObj()); as we know thread is not NULL, it is a JavaThread and it has to have a non-null threadObj. --- src/hotspot/share/services/management.cpp 1323 if (THREAD->is_Java_thread()) { 1324 JavaThread* current_thread = (JavaThread*)THREAD; These calls can only be made on a JavaThread so this be simplified to remove the is_Java_thread() call. Similarly in other places. --- src/hotspot/share/services/threadTable.cpp 55 class ThreadTableEntry : public CHeapObj { 56 private: 57 jlong _tid; I believe hotspot style is to not indent the access modifiers in C++ class declarations, so the above would just be: 55 class ThreadTableEntry : public CHeapObj { 56 private: 57 jlong _tid; etc. 60 ThreadTableEntry(jlong tid, JavaThread* java_thread) : 61 _tid(tid),_java_thread(java_thread) {} line 61 should be indented as it continues line 60. 67 class ThreadTableConfig : public AllStatic { ... 71 static uintx get_hash(Value const& value, bool* is_dead) { The is_dead parameter still bothers me here. I can't make enough sense out of the template code in ConcurrentHashtable to see why we have to have it, but I'm concerned that its very existence means we perhaps should not be trying to extend CHT in this context. ?? 115 size_t start_size_log = size_log > DefaultThreadTableSizeLog 116 ? 
size_log : DefaultThreadTableSizeLog; line 116 should be indented, though in this case I think a better layout would be: 115 size_t start_size_log = 116 size_log > DefaultThreadTableSizeLog ? size_log : DefaultThreadTableSizeLog; 131 double ThreadTable::get_load_factor() { 132 return (double)_items_count/_current_size; 133 } Not sure that is doing what you want/expect. It will perform integer division and then cast that whole integer to a double. If you want double arithmetic you need: return ((double)_items_count)/_current_size; 180 jlong _tid; 181 uintx _hash; Nit: no need for all those spaces before the variable name. 183 ThreadTableLookup(jlong tid) 184 : _tid(tid), _hash(primitive_hash(tid)) {} line 184 should be indented. 201 ThreadGet():_return(NULL) {} Nit: need space after : 211 assert(_is_initialized, "Thread table is not initialized"); 212 _has_work = false; line 211 is indented one space too far. 229 ThreadTableEntry* entry = new ThreadTableEntry(tid,java_thread); Nit: need space after , 252 return _local_table->remove(thread,lookup); Nit: need space after , Thanks, David ------ > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > Thanks! > --Daniil > > > ?On 7/8/19, 3:24 PM, "Daniel D. Daugherty" wrote: > > On 6/29/19 12:06 PM, Daniil Titov wrote: > > Hi Serguei and David, > > > > Serguei is right, ThreadTable::find_thread(java_tid) cannot return a JavaThread with an unmatched java_tid. > > > > Please find a new version of the fix that includes the changes Serguei suggested. > > > > Regarding the concern about the maintaining the thread table when it may never even be queried, one of > > the options could be to add ThreadTable ::isEnabled flag, set it to "false" by default, and wrap the calls to the thread table > > in ThreadsSMRSupport add_thread() and remove_thread() methods to check this flag. 
> > > > When ThreadsList::find_JavaThread_from_java_tid() is called for the first time it could check if ThreadTable ::isEnabled > > Is on and if not then set it on and populate the thread table with all existing threads from the thread list. > > I have the same concerns as David H. about this new ThreadTable. > ThreadsList::find_JavaThread_from_java_tid() is only called from code > in src/hotspot/share/services/management.cpp so I think that table > needs to enabled and populated only if it is going to be used. > > I've taken a look at the webrev below and I see that David has > followed up with additional comments. Before I do a crawl through > code review for this, I would like to see the ThreadTable stuff > made optional and David's other comments addressed. > > Another possible optimization is for callers of > find_JavaThread_from_java_tid() to save the calling thread's > tid value before they loop and if the current tid == saved_tid > then use the current JavaThread* instead of calling > find_JavaThread_from_java_tid() to get the JavaThread*. > > Dan > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.02/ > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > Thanks! > > --Daniil > > > > From: > > Organization: Oracle Corporation > > Date: Friday, June 28, 2019 at 7:56 PM > > To: Daniil Titov , OpenJDK Serviceability , "hotspot-runtime-dev at openjdk.java.net" , "jmx-dev at openjdk.java.net" > > Subject: Re: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth) > > > > Hi Daniil, > > > > I have several quick comments. > > > > The indent in the hotspot c/c++ files has to be 2, not 4. 
> > > > https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/src/hotspot/share/runtime/threadSMR.cpp.frames.html > > 614 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const { > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > 616 if (java_thread == NULL && java_tid == PMIMORDIAL_JAVA_TID) { > > 617 // ThreadsSMRSupport::add_thread() is not called for the primordial > > 618 // thread. Thus, we find this thread with a linear search and add it > > 619 // to the thread table. > > 620 for (uint i = 0; i < length(); i++) { > > 621 JavaThread* thread = thread_at(i); > > 622 if (is_valid_java_thread(java_tid,thread)) { > > 623 ThreadTable::add_thread(java_tid, thread); > > 624 return thread; > > 625 } > > 626 } > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > 628 return java_thread; > > 629 } > > 630 return NULL; > > 631 } > > 632 bool ThreadsList::is_valid_java_thread(jlong java_tid, JavaThread* java_thread) { > > 633 oop tobj = java_thread->threadObj(); > > 634 // Ignore the thread if it hasn't run yet, has exited > > 635 // or is starting to exit. > > 636 return (tobj != NULL && !java_thread->is_exiting() && > > 637 java_tid == java_lang_Thread::thread_id(tobj)); > > 638 } > > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > > I'd suggest to rename find_thread() to find_thread_by_tid(). > > > > A space is missed after the comma: > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > > An empty line is needed before L632. > > > > The name 'is_valid_java_thread' looks wrong (or confusing) to me. > > Something like 'is_alive_java_thread_with_tid()' would be better. > > It'd better to list parameters in the opposite order. 
> > > > The call to is_valid_java_thread() is confusing: > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > > Why would the call ThreadTable::find_thread(java_tid) return a JavaThread with an unmatched java_tid? > > > > > > Thanks, > > Serguei > > > > On 6/28/19, 9:40 PM, "David Holmes" wrote: > > > > Hi Daniil, > > > > The definition and use of this hashtable (yet another hashtable > > implementation!) will need careful examination. We have to be concerned > > about the cost of maintaining it when it may never even be queried. You > > would need to look at footprint cost and performance impact. > > > > Unfortunately I'm just about to board a plane and will be out for the > > next few days. I will try to look at this asap next week, but we will > > need a lot more data on it. > > > > Thanks, > > David > > > > On 6/28/19 3:31 PM, Daniil Titov wrote: > > Please review the change that improves performance of ThreadMXBean MXBean methods returning the > > information for specific threads. The change introduces the thread table that uses ConcurrentHashTable > > to store one-to-one the mapping between the thread ids and JavaThread objects and replaces the linear > > search over the thread list in ThreadsList::find_JavaThread_from_java_tid(jlong tid) method with the lookup > > in the thread table. > > > > Testing: Mach5 tier1,tier2 and tier3 tests successfully passed. > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/ > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > Thanks! > > > > Best regards, > > Daniil > > > > > > > > > > > > > > > > > > From chris.plummer at oracle.com Fri Aug 2 23:20:06 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 2 Aug 2019 16:20:06 -0700 Subject: RFR (M) 8229036: Remove the testing against NSK_TRUE from tests In-Reply-To: References: Message-ID: <0180f1e6-85e4-f174-d89e-91cae5987fce@oracle.com> An HTML attachment was scrubbed... 
URL: From david.holmes at oracle.com Mon Aug 5 01:34:34 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 5 Aug 2019 11:34:34 +1000 Subject: RFR: 8170299: Debugger does not stop inside the low memory notifications code In-Reply-To: <9a805686-57bf-c158-a777-c3cb7e38f09f@oracle.com> References: <75B5F778-DC49-494B-AC12-270F301677CA@oracle.com> <60639d41-735a-00d3-c9db-1955f581b89a@oracle.com> <9783ca89-0af8-2167-436a-e5ff2db631a3@oracle.com> <9a805686-57bf-c158-a777-c3cb7e38f09f@oracle.com> Message-ID: <194bd23a-0f16-19a9-a3e7-d02fa6d58369@oracle.com> Hi Chris, On 3/08/2019 4:12 am, Chris Plummer wrote: > On 7/31/19 11:50 PM, David Holmes wrote: >> Hi Daniil, >> >> On 25/07/2019 3:34 am, Daniil Titov wrote: >>> Hi David, >>> >>> Hope you had a great vacation! >> >> I did thank you. Apologies again for taking so long to get back to >> this work. >> >>> Please find below the latest version of the change . The only >>> difference from the version 01 is >>> the corrected ordering of include statements as Serguei suggested. >>> >>> Webrev: https://cr.openjdk.java.net/~dtitov/8170299/webrev.02/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8170299 >> >> I'm still remain concerned about introducing yet-another-thread to the >> system. The potential interactions with other threads is not at all >> clear. >> >> I'm also concerned that this thread has to be visible so that you can >> debug the notification code, yet at the same time being visible makes >> it vulnerable to application level actions that don't impact the >> service thread - in particular if we suspend all threads then this >> thread will be suspended too, if we resume a thread that triggers a >> notification, the notification thread won't be able to respond to it >> as it is suspended. The user won't know that they need to explicitly >> resume this internal system thread. > > This is indeed problematic, but it seems less of an issue than running > java code on the service thread. 
BTW, are there any other cases where we
> run java code on the service thread? It seems running java code on a
> hidden thread is just asking for trouble. I assume if you hit a
> breakpoint while doing this, it is simply ignored. Not exactly what the
> debugger user is expecting.

The ServiceThread was introduced as a generalization of the LowMemoryDetectorThread, specifically because it was needed to run Java code (in particular load Java classes) in response to other event notifications (ie compiler load events). See JDK-6766644 fixed in JDK 7. So it has always been the case that the low memory notification code has executed in a hidden thread; and since JDK-6766644 other Java code has executed in this same hidden thread.

As I said at the start this bug has a long and complex history. It is far from clear that it can, or even should, be fixed, due to the other compatibility problems that the fix will introduce.

If you look at the docs for MemoryPoolMXBean:

https://docs.oracle.com/javase/10/docs/api/java/lang/management/MemoryPoolMXBean.html

it recommends doing minimal work in the actual notification code and instead hand off the real work to another thread:

"The handleNotification method should be designed to do a very minimal amount of work and return without delay to avoid causing delay in delivering subsequent notifications. Time-consuming actions should be performed by a separate thread."

with that approach you don't need to be able to debug the notification code executed by the service-thread because it should be trivial.

I would agree that the threading model for the notification system is under-specified and that these kinds of details should have been considered originally and any limitations clearly spelt out. Perhaps all we should do here is improve the documentation? Either way any change in doc or behaviour will require a CSR request.

Cheers,
David
-----

> Chris
>
>>
>> Also note in serviceThread.cpp we have:
>>
>>  129
// This ThreadBlockInVM object is not also considered to be
>>  130       // suspend-equivalent because ServiceThread is not visible to
>>  131       // external suspension.
>>  132
>>  133       ThreadBlockInVM tbivm(jt);
>>
>> and you copied that across to notificationThread.cpp as:
>>
>>   93       // Need state transition ThreadBlockInVM so that this thread
>>   94       // will be handled by safepoint correctly when this thread is
>>   95       // notified at a safepoint.
>>   96
>>   97       ThreadBlockInVM tbivm(jt);
>>
>> so this will continue to not be a suspend-equivalent condition even
>> though this thread is visible and suspendible! So something seems
>> wrong there. I'm unclear why we need to use the ThreadBlockInVM rather
>> than defining the NotificationLock as a safepoint-checks-always lock,
>> rather than a safepoint-check-never lock? In fact with some recent
>> changes to locks I'm not even sure it is legal for the notification
>> thread to use a safepoint-check-never lock - have you re-based this
>> recently?
>>
>> Thanks,
>> David
>>
>>> Thanks!
>>> --Daniil
>>>
>>> On 7/3/19, 11:47 PM, "David Holmes" wrote:
>>>
>>>     Hi Daniil,
>>>
>>>     On 4/07/2019 1:04 pm, Daniil Titov wrote:
>>>     > Please review the change the fixes the problem with the debugger not stopping in the low memory notification code.
>>>     >
>>>     > The problem here is that the ServiceThread that calls these MXBean listeners is hidden from the external view that prevents the debugger from stopping in it.
>>>     >
>>>     > The fix introduces new NotificationThread that is visible to the external view and offloads the ServiceThread from sending low memory and other notifications that could result in Java calls ( GC and diagnostic commands notifications) by moving these activities in this new NotificationThread.
>>>
>>>     There is a long and unfortunate history with this bug.
>>>
>>>
The original incarnation of this fix was introducing a new thread at the
>>>     Java library level, and I had some concerns about that:
>>>
>>>     http://mail.openjdk.java.net/pipermail/serviceability-dev/2017-December/022612.html
>>>
>>>     That effort was resurrected at:
>>>
>>>     http://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024466.html
>>>
>>>     and
>>>
>>>     http://mail.openjdk.java.net/pipermail/serviceability-dev/2018-August/024849.html
>>>
>>>     but was left somewhat in limbo. There was a lot of doubt about the right
>>>     way to fix this bug and whether introducing a new thread was too disruptive.
>>>
>>>     But introducing a new thread in the VM also has the same set of
>>>     concerns! This needs consideration by the runtime team before going
>>>     ahead. Introducing a new thread likes this needs to be examined in
>>>     detail - particularly the synchronization interactions with other
>>>     threads. It also introduces another monitor designated safepoint-never
>>>     at a time when we are in the process of cleaning up monitors so that
>>>     JavaThreads will only use safepoint-check-always monitors.
>>>
>>>     Unfortunately I'm about to head out for two weeks vacation, and a number
>>>     of other key runtime folk are also on vacation. but I'd ask that you
>>>     hold off on this until we can look at it in more detail.
>>>
>>>     Thanks,
>>>     David
>>>     -----
>>>
>>>     > Testing: Mach5 tier1,tier2 and tier3 tests succeeded.
>>>     >
>>>     > Webrev: https://cr.openjdk.java.net/~dtitov/8170299/webrev.01/
>>>     > Bug: https://bugs.openjdk.java.net/browse/JDK-8170299
>>>     >
>>>     > Thanks!
>>>     > --Daniil
>>>     >
>>>
> >>> >>> > > From david.holmes at oracle.com Mon Aug 5 02:54:07 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 5 Aug 2019 12:54:07 +1000 Subject: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth) In-Reply-To: <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com> References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com> <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com> <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com> <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com> <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com> <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com> Message-ID: Hi Daniil, On 3/08/2019 8:16 am, Daniil Titov wrote: > Hi David, > > Thank you for your detailed review. Please review a new version of the fix that includes > the changes you suggested: > - ThreadTableCreate_lock scope is reduced to cover the creation of the table only; > - ThreadTableCreate_lock is made _safepoint_check_always; Okay. > - ServiceThread is no longer responsible for the resizing of the thread table, instead, > the thread table is changed to grow on demand by the thread that is doing the addition; Okay - I'm happy to get the serviceThread out of the picture here. > - fixed nits and formatting issues. Okay. >>> The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() >>> as Daniel suggested. >> Not sure it's best to combine these, but if they are limited to the >> changes in management.cpp only then that may be okay. > > The additional optimization for some callers of find_JavaThread_from_java_tid() is > limited to management.cpp (plus a new test) so I left them in the webrev but > I also could move it in the separate issue if required. I'd prefer this part of be separated out, but won't insist. Let's see if Dan or Serguei have a strong opinion. 
> > src/hotspot/share/runtime/threadSMR.cpp > >755 jlong tid = SharedRuntime::get_java_tid(thread); > > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > I think it cleaner/better to just use > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > as we know thread is not NULL, it is a JavaThread and it has to have a > > non-null threadObj. > > I had to leave this code unchanged since it turned out the threadObj is null > when VM is destroyed: > > V [libjvm.so+0xe165d7] oopDesc::long_field(int) const+0x67 > V [libjvm.so+0x16e06c6] ThreadsSMRSupport::add_thread(JavaThread*)+0x116 > V [libjvm.so+0x16d1302] Threads::add(JavaThread*, bool)+0x82 > V [libjvm.so+0xef8369] attach_current_thread.part.197+0xc9 > V [libjvm.so+0xec136c] jni_DestroyJavaVM+0x6c > C [libjli.so+0x4333] JavaMain+0x2c3 > C [libjli.so+0x8159] ThreadJavaMain+0x9 This is actually nothing to do with the VM being destroyed, but is an issue with JNI_AttachCurrentThread and its interaction with the ThreadSMR iterators. The attach process is: - create JavaThread - mark as "is attaching via jni" - add to ThreadsList - create java.lang.Thread object (you can only execute Java code after you are attached) - mark as "attach completed" So while a thread "is attaching" it will be seen by the ThreadSMR thread iterator but will have a NULL java.lang.Thread object. We special-case attaching threads in a number of places in the VM and I think we should be explicitly doing something here to filter out attaching threads, rather than just being tolerant of a NULL j.l.Thread object. Specifically in ThreadsSMRSupport::add_thread: if (ThreadTable::is_initialized() && !thread->is_attaching_via_jni()) { jlong tid = java_lang_Thread::thread_id(thread->threadObj()); ThreadTable::add_thread(tid, thread); } Note that in ThreadsSMRSupport::remove_thread we can use the same guard, which covers the case the JNI attach encountered an error trying to create the j.l.Thread object. 
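To make the effect of that guard concrete, here is a small self-contained model. It is a sketch only: ThreadObj, JavaThreadModel and ThreadTableModel are hypothetical stand-ins invented for illustration, not the HotSpot types, and a std::unordered_map plays the role of the concurrent table. The one thing taken from the discussion above is the shape of the guard itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for the java.lang.Thread object; only the id matters here.
struct ThreadObj { int64_t tid; };

// Hypothetical stand-in for a JavaThread (not the real HotSpot class).
struct JavaThreadModel {
  ThreadObj* thread_obj = nullptr;   // stays null until the attach creates it
  bool attaching_via_jni = false;
  bool is_attaching_via_jni() const { return attaching_via_jni; }
};

// Hypothetical stand-in for the thread table, backed by a plain hash map.
class ThreadTableModel {
 public:
  void initialize() { _initialized = true; }
  bool is_initialized() const { return _initialized; }
  void add(int64_t tid, JavaThreadModel* t) { _map[tid] = t; }
  void remove(int64_t tid) { _map.erase(tid); }
  JavaThreadModel* find(int64_t tid) const {
    auto it = _map.find(tid);
    return it == _map.end() ? nullptr : it->second;
  }
  std::size_t size() const { return _map.size(); }
 private:
  bool _initialized = false;
  std::unordered_map<int64_t, JavaThreadModel*> _map;
};

// Same guard as suggested above: a thread that is still attaching is skipped,
// so its (possibly null) thread object is never dereferenced.
void add_thread(ThreadTableModel& table, JavaThreadModel* thread) {
  if (table.is_initialized() && !thread->is_attaching_via_jni()) {
    table.add(thread->thread_obj->tid, thread);
  }
}

// The same guard on removal covers an attach that failed before the
// thread object was created.
void remove_thread(ThreadTableModel& table, JavaThreadModel* thread) {
  if (table.is_initialized() && !thread->is_attaching_via_jni()) {
    table.remove(thread->thread_obj->tid);
  }
}
```

In this model a thread flagged as attaching never reaches the table on either path, so the null thread object is harmless. Whether an attaching thread should later be added once the attach completes, or simply picked up by the linear-search fallback, is not something this sketch tries to decide.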
>> src/hotspot/share/services/threadTable.cpp >> 71 static uintx get_hash(Value const& value, bool* is_dead) { > >> The is_dead parameter still bothers me here. I can't make enough sense >> out of the template code in ConcurrentHashtable to see why we have to >> have it, but I'm concerned that its very existence means we perhaps >> should not be trying to extend CHT in this context. ?? > > My understanding is that is_dead parameter provides a mechanism for > ConcurrentHashtable to remove stale entries that were not explicitly > removed by calling ConcurrentHashTable::remove() method. > I think that just because in our case we don't use this mechanism doesn't > mean we should not use ConcurrentHashTable. Can you confirm that this usage is okay with Robbin Ehn please. He's back from vacation this week. >> I would still want to see what impact this has on thread >> startup cost, both with and without the table being initialized. > > I run a test that initializes the table by calling ThreadMXBean.get getThreadInfo(), > starts some threads as a worm-up, and then creates and starts 100,000 threads > (each thread just sleeps for 100 ms). In case when the thread table is enabled > 100,000 threads are created and started for about 15200 ms. If the thread table > is off the test takes about 14800 ms. Based on this information the enabled > thread table makes the thread startup about 2.7% slower. That doesn't sound very good. I think we may need to Claes involved to help investigate overall performance impact here. > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.04/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 No further code comments. I didn't look at the test in detail. Thanks, David > Thanks! 
> --Daniil > > > ?On 7/29/19, 12:53 AM, "David Holmes" wrote: > > Hi Daniil, > > Overall I think this is a reasonable approach but I would still like to > see some performance and footprint numbers, both to verify it fixes the > problem reported, and that we are not getting penalized elsewhere. > > On 25/07/2019 3:21 am, Daniil Titov wrote: > > Hi David, Daniel, and Serguei, > > > > Please review the new version of the fix, that makes the thread table initialization on demand and > > moves it inside ThreadsList::find_JavaThread_from_java_tid(). At the creation time the thread table > > is initialized with the threads from the current thread list. We don't want to hold Threads_lock > > inside find_JavaThread_from_java_tid(), thus new threads still could be created while the thread > > table is being initialized . Such threads will be found by the linear search and added to the thread table > > later, in ThreadsList::find_JavaThread_from_java_tid(). > > The initialization allows the created but unpopulated, or partially > populated, table to be seen by other threads - is that your intention? > It seems it should be okay as the other threads will then race with the > initializing thread to add specific entries, and this is a concurrent > map so that should be functionally correct. But if so then I think you > can also reduce the scope of the ThreadTableCreate_lock so that it > covers creation of the table only, not the initial population of the table. > > I like the approach of only initializing the table when needed and using > that to control when the add/remove-thread code needs to update the > table. But I would still want to see what impact this has on thread > startup cost, both with and without the table being initialized. > > > The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > > as Daniel suggested. 
> > Not sure it's best to combine these, but if they are limited to the > changes in management.cpp only then that may be okay. It helps to be > able to focus on the table related changes without being distracted by > other optimizations. > > > That is correct that ResolvedMethodTable was used as a blueprint for the thread table, however, I tried > > to strip it of the all functionality that is not required in the thread table case. > > The revised version seems better in that regard. But I still have a > concern, see below. > > > We need to have the thread table resizable and allow it to grow as the number of threads increases to avoid > > reserving excessive memory a-priori or deteriorating lookup times. The ServiceThread is responsible for > > growing the thread table when required. > > Yes but why? Why can't this table be grown on demand by the thread that > is doing the addition? For other tables we may have to delegate to the > service thread because the current thread cannot perform the action, or > it doesn't want to perform it at the time the need for the resize is > detected (e.g. its detected at a safepoint and you want the resize to > happen later outside the safepoint). It's not apparent to me that such > restrictions apply here. > > > There is no ConcurrentHashTable available in Java 8 and for backporting this fix to Java 8 another implementation > > of the hash table, probably originally suggested in the patch attached to the JBS issue, should be used. It will make > > the backporting more complicated, however, adding a new Implementation of the hash table in Java 14 while it > > already has ConcurrentHashTable doesn't seem reasonable for me. > > Ok. 
> > > Webrev: http://cr.openjdk.java.net/~dtitov/8185005/webrev.03 > > Some specific code comments: > > src/hotspot/share/runtime/mutexLocker.cpp > > + def(ThreadTableCreate_lock , PaddedMutex , special, > false, Monitor::_safepoint_check_never); > > I think this needs to be a _safepoint_check_always lock. The table will > be created by regular JavaThreads and they should (nearly) always be > checking for safepoints if they are going to block acquiring the lock. > And it isn't at all obvious that the thread doing the creation can't go > to a safepoint whilst this lock is held. > > --- > > src/hotspot/share/runtime/threadSMR.cpp > > Nit: > > 618 JavaThread* thread = thread_at(i); > > you could reuse the new java_thread local you introduced at line 613 and > just rename that "new" variable to "thread" so you don't have to change > all other uses. > > 628 } else if (java_thread != NULL && ... > > You don't need to check != NULL here as you only get here when > java_thread is not NULL. > > 755 jlong tid = SharedRuntime::get_java_tid(thread); > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > I think it cleaner/better to just use > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > as we know thread is not NULL, it is a JavaThread and it has to have a > non-null threadObj. > > --- > > src/hotspot/share/services/management.cpp > > 1323 if (THREAD->is_Java_thread()) { > 1324 JavaThread* current_thread = (JavaThread*)THREAD; > > These calls can only be made on a JavaThread so this be simplified to > remove the is_Java_thread() call. Similarly in other places. > > --- > > src/hotspot/share/services/threadTable.cpp > > 55 class ThreadTableEntry : public CHeapObj { > 56 private: > 57 jlong _tid; > > I believe hotspot style is to not indent the access modifiers in C++ > class declarations, so the above would just be: > > 55 class ThreadTableEntry : public CHeapObj { > 56 private: > 57 jlong _tid; > > etc. 
> > 60 ThreadTableEntry(jlong tid, JavaThread* java_thread) : > 61 _tid(tid),_java_thread(java_thread) {} > > line 61 should be indented as it continues line 60. > > 67 class ThreadTableConfig : public AllStatic { > ... > 71 static uintx get_hash(Value const& value, bool* is_dead) { > > The is_dead parameter still bothers me here. I can't make enough sense > out of the template code in ConcurrentHashtable to see why we have to > have it, but I'm concerned that its very existence means we perhaps > should not be trying to extend CHT in this context. ?? > > 115 size_t start_size_log = size_log > DefaultThreadTableSizeLog > 116 ? size_log : DefaultThreadTableSizeLog; > > line 116 should be indented, though in this case I think a better layout > would be: > > 115 size_t start_size_log = > 116 size_log > DefaultThreadTableSizeLog ? size_log : > DefaultThreadTableSizeLog; > > 131 double ThreadTable::get_load_factor() { > 132 return (double)_items_count/_current_size; > 133 } > > Not sure that is doing what you want/expect. It will perform integer > division and then cast that whole integer to a double. If you want > double arithmetic you need: > > return ((double)_items_count)/_current_size; > > 180 jlong _tid; > 181 uintx _hash; > > Nit: no need for all those spaces before the variable name. > > 183 ThreadTableLookup(jlong tid) > 184 : _tid(tid), _hash(primitive_hash(tid)) {} > > line 184 should be indented. > > 201 ThreadGet():_return(NULL) {} > > Nit: need space after : > > 211 assert(_is_initialized, "Thread table is not initialized"); > 212 _has_work = false; > > line 211 is indented one space too far. > > 229 ThreadTableEntry* entry = new ThreadTableEntry(tid,java_thread); > > Nit: need space after , > > 252 return _local_table->remove(thread,lookup); > > Nit: need space after , > > Thanks, > David > ------ > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > Thanks! > > --Daniil > > > > > > ?On 7/8/19, 3:24 PM, "Daniel D. 
Daugherty" wrote: > > > > On 6/29/19 12:06 PM, Daniil Titov wrote: > > > Hi Serguei and David, > > > > > > Serguei is right, ThreadTable::find_thread(java_tid) cannot return a JavaThread with an unmatched java_tid. > > > > > > Please find a new version of the fix that includes the changes Serguei suggested. > > > > > > Regarding the concern about the maintaining the thread table when it may never even be queried, one of > > > the options could be to add ThreadTable ::isEnabled flag, set it to "false" by default, and wrap the calls to the thread table > > > in ThreadsSMRSupport add_thread() and remove_thread() methods to check this flag. > > > > > > When ThreadsList::find_JavaThread_from_java_tid() is called for the first time it could check if ThreadTable ::isEnabled > > > Is on and if not then set it on and populate the thread table with all existing threads from the thread list. > > > > I have the same concerns as David H. about this new ThreadTable. > > ThreadsList::find_JavaThread_from_java_tid() is only called from code > > in src/hotspot/share/services/management.cpp so I think that table > > needs to enabled and populated only if it is going to be used. > > > > I've taken a look at the webrev below and I see that David has > > followed up with additional comments. Before I do a crawl through > > code review for this, I would like to see the ThreadTable stuff > > made optional and David's other comments addressed. > > > > Another possible optimization is for callers of > > find_JavaThread_from_java_tid() to save the calling thread's > > tid value before they loop and if the current tid == saved_tid > > then use the current JavaThread* instead of calling > > find_JavaThread_from_java_tid() to get the JavaThread*. > > > > Dan > > > > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.02/ > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > Thanks! 
> > > --Daniil > > > > > > From: > > > Organization: Oracle Corporation > > > Date: Friday, June 28, 2019 at 7:56 PM > > > To: Daniil Titov , OpenJDK Serviceability , "hotspot-runtime-dev at openjdk.java.net" , "jmx-dev at openjdk.java.net" > > > Subject: Re: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth) > > > > > > Hi Daniil, > > > > > > I have several quick comments. > > > > > > The indent in the hotspot c/c++ files has to be 2, not 4. > > > > > > https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/src/hotspot/share/runtime/threadSMR.cpp.frames.html > > > 614 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const { > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > 616 if (java_thread == NULL && java_tid == PMIMORDIAL_JAVA_TID) { > > > 617 // ThreadsSMRSupport::add_thread() is not called for the primordial > > > 618 // thread. Thus, we find this thread with a linear search and add it > > > 619 // to the thread table. > > > 620 for (uint i = 0; i < length(); i++) { > > > 621 JavaThread* thread = thread_at(i); > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > 623 ThreadTable::add_thread(java_tid, thread); > > > 624 return thread; > > > 625 } > > > 626 } > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > 628 return java_thread; > > > 629 } > > > 630 return NULL; > > > 631 } > > > 632 bool ThreadsList::is_valid_java_thread(jlong java_tid, JavaThread* java_thread) { > > > 633 oop tobj = java_thread->threadObj(); > > > 634 // Ignore the thread if it hasn't run yet, has exited > > > 635 // or is starting to exit. > > > 636 return (tobj != NULL && !java_thread->is_exiting() && > > > 637 java_tid == java_lang_Thread::thread_id(tobj)); > > > 638 } > > > > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > > > > I'd suggest to rename find_thread() to find_thread_by_tid(). 
> > > > > > A space is missed after the comma: > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > > > > An empty line is needed before L632. > > > > > > The name 'is_valid_java_thread' looks wrong (or confusing) to me. > > > Something like 'is_alive_java_thread_with_tid()' would be better. > > > It'd better to list parameters in the opposite order. > > > > > > The call to is_valid_java_thread() is confusing: > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > > > > Why would the call ThreadTable::find_thread(java_tid) return a JavaThread with an unmatched java_tid? > > > > > > > > > Thanks, > > > Serguei > > > > > > On 6/28/19, 9:40 PM, "David Holmes" wrote: > > > > > > Hi Daniil, > > > > > > The definition and use of this hashtable (yet another hashtable > > > implementation!) will need careful examination. We have to be concerned > > > about the cost of maintaining it when it may never even be queried. You > > > would need to look at footprint cost and performance impact. > > > > > > Unfortunately I'm just about to board a plane and will be out for the > > > next few days. I will try to look at this asap next week, but we will > > > need a lot more data on it. > > > > > > Thanks, > > > David > > > > > > On 6/28/19 3:31 PM, Daniil Titov wrote: > > > Please review the change that improves performance of ThreadMXBean MXBean methods returning the > > > information for specific threads. The change introduces the thread table that uses ConcurrentHashTable > > > to store one-to-one the mapping between the thread ids and JavaThread objects and replaces the linear > > > search over the thread list in ThreadsList::find_JavaThread_from_java_tid(jlong tid) method with the lookup > > > in the thread table. > > > > > > Testing: Mach5 tier1,tier2 and tier3 tests successfully passed. 
> > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/ > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > Thanks! > > > > > > Best regards, > > > Daniil > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From jcbeyler at google.com Tue Aug 6 15:49:55 2019 From: jcbeyler at google.com (Jean Christophe Beyler) Date: Tue, 6 Aug 2019 08:49:55 -0700 Subject: RFR (M) 8229036: Remove the testing against NSK_TRUE from tests In-Reply-To: <0180f1e6-85e4-f174-d89e-91cae5987fce@oracle.com> References: <0180f1e6-85e4-f174-d89e-91cae5987fce@oracle.com> Message-ID: Thanks Chris! Could I get a second review please? :) Jc On Fri, Aug 2, 2019 at 4:20 PM Chris Plummer wrote: > Ok. I think the changes are fine as is. > > thanks, > > Chris > > On 8/2/19 3:00 PM, Jean Christophe Beyler wrote: > > Hi Chris, > > I only did it when there were repercussions to the change of if (.* == > NSK_TRUE). > > For example: > > http://cr.openjdk.java.net/~jcbeyler/8229036/webrev.00/test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetTime/gettime001/gettime001.cpp.udiff.html > > I wanted to move: > > - if (success != NSK_TRUE) { > + if (!success) { > > But really I was thinking that success should be a bool then. However, > success was assigned also by: > > success = checkTime(jvmti, &time, &prevTime, "VM_DEATH callback"); > > So I went to transform checkTime to return a boolean, and that rippled into changing the NSK_FALSE to false as well. > > I can reduce the scope of this webrev to only being the if statement if you prefer, I was just working on getting the various elements to bool instead of int and NSK_TRUE/NSK_FALSE. > > Another solution would be to maybe augment the scope of the bug item to : Move NSK_TRUE/NSK_FALSE to true/false; and this webrev as a side-effect covers all the "if (.* NSK_TRUE)" cases. > > What do you think? 
>
> Jc
>
> On Fri, Aug 2, 2019 at 2:40 PM Chris Plummer wrote:
>
>> Hi JC,
>>
>> Why does this webrev also remove references NSK_FALSE, and the previous
>> one references to NSK_TRUE?
>>
>> thanks,
>>
>> Chris
>>
>> On 8/2/19 2:21 PM, Jean Christophe Beyler wrote:
>>
>> Hi all,
>>
>> Here is the webrev that does the removal of if (.* == NSK_TRUE) and
>> replaces them with if (.*).
>>
>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8229036/webrev.00/
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229036
>>
>> This was tested by running the tests changed on my dev machine, I'll push
>> it to the submit repo after review :-)
>>
>> Thanks and have a great day & weekend!,
>> Jc
>
> --
>
> Thanks,
> Jc

--

Thanks,
Jc

From serguei.spitsyn at oracle.com Tue Aug 6 16:02:39 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 6 Aug 2019 09:02:39 -0700
Subject: RFR (M) 8229036: Remove the testing against NSK_TRUE from tests
In-Reply-To:
References: <0180f1e6-85e4-f174-d89e-91cae5987fce@oracle.com>
Message-ID:

From robbin.ehn at oracle.com Wed Aug 7 10:16:44 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 7 Aug 2019 12:16:44 +0200
Subject: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
In-Reply-To: <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com>
References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com> <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com> <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com> <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com> <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com> <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com>
Message-ID: <30776902-2164-9654-3c08-0656d30d495c@oracle.com>

Hi Daniil,

On 8/3/19 12:16 AM, Daniil Titov wrote:
>> The is_dead parameter still bothers me here.
I can't make enough sense
>> out of the template code in ConcurrentHashtable to see why we have to
>> have it, but I'm concerned that its very existence means we perhaps
>> should not be trying to extend CHT in this context. ??
>
> My understanding is that the is_dead parameter provides a mechanism for
> ConcurrentHashtable to remove stale entries that were not explicitly
> removed by calling the ConcurrentHashTable::remove() method.
> I think that just because in our case we don't use this mechanism doesn't
> mean we should not use ConcurrentHashTable.

is_dead is an optimization for cleaning on inserts. When we have indirect
references to oops as values, they can be dead when loading them. By feeding
back this information we reduce the chance of having to do a table scan to
remove those dead entries. Should be refactored.

CHT sets it to false by default; if you don't touch it, it will just be false
and unused, so the compiler can remove that code.

>
>> I would still want to see what impact this has on thread
>> startup cost, both with and without the table being initialized.
>
> I ran a test that initializes the table by calling ThreadMXBean.getThreadInfo(),
> starts some threads as a warm-up, and then creates and starts 100,000 threads
> (each thread just sleeps for 100 ms). When the thread table is enabled,
> 100,000 threads are created and started in about 15200 ms. If the thread table
> is off the test takes about 14800 ms. Based on this information the enabled
> thread table makes the thread startup about 2.7% slower.
>
> Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.04/

Seems good!

Thanks, Robbin

> Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
>
> Thanks!
> --Daniil
>
>
> On 7/29/19, 12:53 AM, "David Holmes" wrote:
>
> Hi Daniil,
>
> Overall I think this is a reasonable approach but I would still like to
> see some performance and footprint numbers, both to verify it fixes the
> problem reported, and that we are not getting penalized elsewhere.
> > On 25/07/2019 3:21 am, Daniil Titov wrote: > > Hi David, Daniel, and Serguei, > > > > Please review the new version of the fix, that makes the thread table initialization on demand and > > moves it inside ThreadsList::find_JavaThread_from_java_tid(). At the creation time the thread table > > is initialized with the threads from the current thread list. We don't want to hold Threads_lock > > inside find_JavaThread_from_java_tid(), thus new threads still could be created while the thread > > table is being initialized . Such threads will be found by the linear search and added to the thread table > > later, in ThreadsList::find_JavaThread_from_java_tid(). > > The initialization allows the created but unpopulated, or partially > populated, table to be seen by other threads - is that your intention? > It seems it should be okay as the other threads will then race with the > initializing thread to add specific entries, and this is a concurrent > map so that should be functionally correct. But if so then I think you > can also reduce the scope of the ThreadTableCreate_lock so that it > covers creation of the table only, not the initial population of the table. > > I like the approach of only initializing the table when needed and using > that to control when the add/remove-thread code needs to update the > table. But I would still want to see what impact this has on thread > startup cost, both with and without the table being initialized. > > > The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > > as Daniel suggested. > > Not sure it's best to combine these, but if they are limited to the > changes in management.cpp only then that may be okay. It helps to be > able to focus on the table related changes without being distracted by > other optimizations. 
> > > That is correct that ResolvedMethodTable was used as a blueprint for the thread table, however, I tried > > to strip it of the all functionality that is not required in the thread table case. > > The revised version seems better in that regard. But I still have a > concern, see below. > > > We need to have the thread table resizable and allow it to grow as the number of threads increases to avoid > > reserving excessive memory a-priori or deteriorating lookup times. The ServiceThread is responsible for > > growing the thread table when required. > > Yes but why? Why can't this table be grown on demand by the thread that > is doing the addition? For other tables we may have to delegate to the > service thread because the current thread cannot perform the action, or > it doesn't want to perform it at the time the need for the resize is > detected (e.g. its detected at a safepoint and you want the resize to > happen later outside the safepoint). It's not apparent to me that such > restrictions apply here. > > > There is no ConcurrentHashTable available in Java 8 and for backporting this fix to Java 8 another implementation > > of the hash table, probably originally suggested in the patch attached to the JBS issue, should be used. It will make > > the backporting more complicated, however, adding a new Implementation of the hash table in Java 14 while it > > already has ConcurrentHashTable doesn't seem reasonable for me. > > Ok. > > > Webrev: http://cr.openjdk.java.net/~dtitov/8185005/webrev.03 > > Some specific code comments: > > src/hotspot/share/runtime/mutexLocker.cpp > > + def(ThreadTableCreate_lock , PaddedMutex , special, > false, Monitor::_safepoint_check_never); > > I think this needs to be a _safepoint_check_always lock. The table will > be created by regular JavaThreads and they should (nearly) always be > checking for safepoints if they are going to block acquiring the lock. 
> And it isn't at all obvious that the thread doing the creation can't go > to a safepoint whilst this lock is held. > > --- > > src/hotspot/share/runtime/threadSMR.cpp > > Nit: > > 618 JavaThread* thread = thread_at(i); > > you could reuse the new java_thread local you introduced at line 613 and > just rename that "new" variable to "thread" so you don't have to change > all other uses. > > 628 } else if (java_thread != NULL && ... > > You don't need to check != NULL here as you only get here when > java_thread is not NULL. > > 755 jlong tid = SharedRuntime::get_java_tid(thread); > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > I think it cleaner/better to just use > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > as we know thread is not NULL, it is a JavaThread and it has to have a > non-null threadObj. > > --- > > src/hotspot/share/services/management.cpp > > 1323 if (THREAD->is_Java_thread()) { > 1324 JavaThread* current_thread = (JavaThread*)THREAD; > > These calls can only be made on a JavaThread so this be simplified to > remove the is_Java_thread() call. Similarly in other places. > > --- > > src/hotspot/share/services/threadTable.cpp > > 55 class ThreadTableEntry : public CHeapObj { > 56 private: > 57 jlong _tid; > > I believe hotspot style is to not indent the access modifiers in C++ > class declarations, so the above would just be: > > 55 class ThreadTableEntry : public CHeapObj { > 56 private: > 57 jlong _tid; > > etc. > > 60 ThreadTableEntry(jlong tid, JavaThread* java_thread) : > 61 _tid(tid),_java_thread(java_thread) {} > > line 61 should be indented as it continues line 60. > > 67 class ThreadTableConfig : public AllStatic { > ... > 71 static uintx get_hash(Value const& value, bool* is_dead) { > > The is_dead parameter still bothers me here. 
I can't make enough sense > out of the template code in ConcurrentHashtable to see why we have to > have it, but I'm concerned that its very existence means we perhaps > should not be trying to extend CHT in this context. ?? > > 115 size_t start_size_log = size_log > DefaultThreadTableSizeLog > 116 ? size_log : DefaultThreadTableSizeLog; > > line 116 should be indented, though in this case I think a better layout > would be: > > 115 size_t start_size_log = > 116 size_log > DefaultThreadTableSizeLog ? size_log : > DefaultThreadTableSizeLog; > > 131 double ThreadTable::get_load_factor() { > 132 return (double)_items_count/_current_size; > 133 } > > Not sure that is doing what you want/expect. It will perform integer > division and then cast that whole integer to a double. If you want > double arithmetic you need: > > return ((double)_items_count)/_current_size; > > 180 jlong _tid; > 181 uintx _hash; > > Nit: no need for all those spaces before the variable name. > > 183 ThreadTableLookup(jlong tid) > 184 : _tid(tid), _hash(primitive_hash(tid)) {} > > line 184 should be indented. > > 201 ThreadGet():_return(NULL) {} > > Nit: need space after : > > 211 assert(_is_initialized, "Thread table is not initialized"); > 212 _has_work = false; > > line 211 is indented one space too far. > > 229 ThreadTableEntry* entry = new ThreadTableEntry(tid,java_thread); > > Nit: need space after , > > 252 return _local_table->remove(thread,lookup); > > Nit: need space after , > > Thanks, > David > ------ > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > Thanks! > > --Daniil > > > > > > ?On 7/8/19, 3:24 PM, "Daniel D. Daugherty" wrote: > > > > On 6/29/19 12:06 PM, Daniil Titov wrote: > > > Hi Serguei and David, > > > > > > Serguei is right, ThreadTable::find_thread(java_tid) cannot return a JavaThread with an unmatched java_tid. > > > > > > Please find a new version of the fix that includes the changes Serguei suggested. 
> > > > > > Regarding the concern about the maintaining the thread table when it may never even be queried, one of > > > the options could be to add ThreadTable ::isEnabled flag, set it to "false" by default, and wrap the calls to the thread table > > > in ThreadsSMRSupport add_thread() and remove_thread() methods to check this flag. > > > > > > When ThreadsList::find_JavaThread_from_java_tid() is called for the first time it could check if ThreadTable ::isEnabled > > > Is on and if not then set it on and populate the thread table with all existing threads from the thread list. > > > > I have the same concerns as David H. about this new ThreadTable. > > ThreadsList::find_JavaThread_from_java_tid() is only called from code > > in src/hotspot/share/services/management.cpp so I think that table > > needs to enabled and populated only if it is going to be used. > > > > I've taken a look at the webrev below and I see that David has > > followed up with additional comments. Before I do a crawl through > > code review for this, I would like to see the ThreadTable stuff > > made optional and David's other comments addressed. > > > > Another possible optimization is for callers of > > find_JavaThread_from_java_tid() to save the calling thread's > > tid value before they loop and if the current tid == saved_tid > > then use the current JavaThread* instead of calling > > find_JavaThread_from_java_tid() to get the JavaThread*. > > > > Dan > > > > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.02/ > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > Thanks! 
> > > --Daniil > > > > > > From: > > > Organization: Oracle Corporation > > > Date: Friday, June 28, 2019 at 7:56 PM > > > To: Daniil Titov , OpenJDK Serviceability , "hotspot-runtime-dev at openjdk.java.net" , "jmx-dev at openjdk.java.net" > > > Subject: Re: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth) > > > > > > Hi Daniil, > > > > > > I have several quick comments. > > > > > > The indent in the hotspot c/c++ files has to be 2, not 4. > > > > > > https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/src/hotspot/share/runtime/threadSMR.cpp.frames.html > > > 614 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const { > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > 616 if (java_thread == NULL && java_tid == PMIMORDIAL_JAVA_TID) { > > > 617 // ThreadsSMRSupport::add_thread() is not called for the primordial > > > 618 // thread. Thus, we find this thread with a linear search and add it > > > 619 // to the thread table. > > > 620 for (uint i = 0; i < length(); i++) { > > > 621 JavaThread* thread = thread_at(i); > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > 623 ThreadTable::add_thread(java_tid, thread); > > > 624 return thread; > > > 625 } > > > 626 } > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > 628 return java_thread; > > > 629 } > > > 630 return NULL; > > > 631 } > > > 632 bool ThreadsList::is_valid_java_thread(jlong java_tid, JavaThread* java_thread) { > > > 633 oop tobj = java_thread->threadObj(); > > > 634 // Ignore the thread if it hasn't run yet, has exited > > > 635 // or is starting to exit. > > > 636 return (tobj != NULL && !java_thread->is_exiting() && > > > 637 java_tid == java_lang_Thread::thread_id(tobj)); > > > 638 } > > > > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > > > > I'd suggest to rename find_thread() to find_thread_by_tid(). 
> > > > > > A space is missed after the comma: > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > > > > An empty line is needed before L632. > > > > > > The name 'is_valid_java_thread' looks wrong (or confusing) to me. > > > Something like 'is_alive_java_thread_with_tid()' would be better. > > > It'd better to list parameters in the opposite order. > > > > > > The call to is_valid_java_thread() is confusing: > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > > > > Why would the call ThreadTable::find_thread(java_tid) return a JavaThread with an unmatched java_tid? > > > > > > > > > Thanks, > > > Serguei > > > > > > On 6/28/19, 9:40 PM, "David Holmes" wrote: > > > > > > Hi Daniil, > > > > > > The definition and use of this hashtable (yet another hashtable > > > implementation!) will need careful examination. We have to be concerned > > > about the cost of maintaining it when it may never even be queried. You > > > would need to look at footprint cost and performance impact. > > > > > > Unfortunately I'm just about to board a plane and will be out for the > > > next few days. I will try to look at this asap next week, but we will > > > need a lot more data on it. > > > > > > Thanks, > > > David > > > > > > On 6/28/19 3:31 PM, Daniil Titov wrote: > > > Please review the change that improves performance of ThreadMXBean MXBean methods returning the > > > information for specific threads. The change introduces the thread table that uses ConcurrentHashTable > > > to store one-to-one the mapping between the thread ids and JavaThread objects and replaces the linear > > > search over the thread list in ThreadsList::find_JavaThread_from_java_tid(jlong tid) method with the lookup > > > in the thread table. > > > > > > Testing: Mach5 tier1,tier2 and tier3 tests successfully passed. 
> > >
> > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/
> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
> > >
> > > Thanks!
> > >
> > > Best regards,
> > > Daniil
> > >
> >
>

From daniil.x.titov at oracle.com Wed Aug 7 22:38:55 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Wed, 07 Aug 2019 15:38:55 -0700
Subject: RFR: 8195600: [Graal] jdi tests timeouts with Graal because debuggee vm is not resumed
Message-ID:

Please review the change that fixes the failing tests when running with Graal.

The issue originally included several vmTestbase/nsk/jdi tests but only 2 of them still fail:
- vmTestbase/nsk/jdi/VirtualMachine/instanceCounts/instancecounts003/instancecounts003.java
- vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects002/referringObjects002.java

The problem with these two tests is that they consume all memory to force the class unloading that results in the exception during JVMCI compiler initialization and the test failure. The fix filters these tests out to not run with Graal compiler.

Webrev: http://cr.openjdk.java.net/~dtitov/8195600/webrev.01/
Bug: https://bugs.openjdk.java.net/browse/JDK-8195600

Thanks,
Daniil

From david.holmes at oracle.com Thu Aug 8 00:11:50 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 8 Aug 2019 10:11:50 +1000
Subject: RFR(S): 8228960: [TESTBUG] containers/docker/TestJcmdWithSideCar.java: jcmd reports main class as 'Unknown'
In-Reply-To: <5D4B58E9.2070002@oracle.com>
References: <9c9ec6b1-139a-4e67-8460-b586456fd493@oracle.com> <854c1dfe97937389230599e72c22aa0cbb0bf600.camel@redhat.com> <5D4B58E9.2070002@oracle.com>
Message-ID: <9418a5e6-750d-80bf-4060-d023812f37e7@oracle.com>

On 8/08/2019 9:04 am, Mikhailo Seledtsov wrote:
> Hi Severin, Bob,
>
>   Thank you for reviewing the code.
>
> On 8/7/19, 11:38 AM, Bob Vandette wrote:
>> Can't you come up with a better way of synchronizing the test by
>> possibly writing a
>> file and waiting for it to exist with a timeout?
> I will try out this approach.

This seems like a fundamental problem with jcmd - so cc'ing
serviceability-dev. But I'm pretty sure they recently addressed a
similar issue with the premature sending of the attach signal?

David
-----

> Thanks,
> Misha
>> Isn't there a shared volume between the two
>> processes?
>>
>> We've been fighting test reliability for a while now. I can only hope
>> we're getting
>> to the end.
>>
>> Bob.
>>
>>> On Aug 7, 2019, at 2:18 PM, Severin Gehwolf wrote:
>>>
>>> Hi Misha,
>>>
>>> On Tue, 2019-08-06 at 20:17 -0700, mikhailo.seledtsov at oracle.com wrote:
>>>> Please review this change that fixes a container test
>>>> TestJcmdWithSideCar.
>>>>
>>>> My investigation indicated that a root cause for this failure is:
>>>> JCMD -l shows 'Unknown' for class name because the main class has not
>>>> been loaded yet.
>>>> The target test JVM has started, it is initializing, but has not loaded
>>>> the main test class.
>>> That's what I've found too.
>>>
>>>> The proposed solution is to try 'jcmd -l' several times, with a short
>>>> sleep in between.
>>> Thread.sleep() isn't great, but I'm not sure there is an alternative.
>>>
>>>> Also I have commented out the testCase02() due to another bug:
>>>> "JDK-8228850: jhsdb jinfo fails with ClassCastException:
>>>> s.j.h.oops.TypeArray cannot be cast to s.j.h.oops.Instance",
>>>> which is not a test bug. IMO, it is better to run the test and skip a
>>>> sub-case than to skip the entire test.
>>>>
>>>>     JBS: https://bugs.openjdk.java.net/browse/JDK-8228960
>>>>     Webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.00/
>>> Looks OK to me.
>>>
>>> Thanks,
>>> Severin
>>>

From chris.plummer at oracle.com Thu Aug 8 01:56:42 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 7 Aug 2019 18:56:42 -0700
Subject: RFR(S): 8227645: Some tests in serviceability/sa run with fixed -Xmx values and risk running out of memory
Message-ID:

Hello,

Please review the following:

http://cr.openjdk.java.net/~cjplummer/8227645/webrev.00/webrev.open/
https://bugs.openjdk.java.net/browse/JDK-8227645

I moved the offending tests to their own directory and added "exclusiveAccess.dirs=." for that directory. There were two extra support classes I had to move also (they aren't tests), and also a minor @library fix due to a dependency on another file in the sa test directory.

thanks,

Chris

From nick.gasson at arm.com Thu Aug 8 09:32:03 2019
From: nick.gasson at arm.com (Nick Gasson)
Date: Thu, 8 Aug 2019 17:32:03 +0800
Subject: RFR: 8229118: [TESTBUG] serviceability/sa/ClhsdbFindPC fails on AArch64
Message-ID: <4a646485-4230-9632-dfd4-4368d79ba4cd@arm.com>

Hi,

Bug: https://bugs.openjdk.java.net/browse/JDK-8229118
Webrev: http://cr.openjdk.java.net/~ngasson/8229118/webrev.0/

This test starts a sub-process with -Xcomp and then uses the SA to get a stack trace of it. It expects to see this line:

  In code in NMethod for jdk/test/lib/apps/LingeredApp.main

But actually on AArch64 the stack trace looks like this:

  - java.lang.Thread.sleep(long) @bci=0, pc=0x0000ffff74603d08, Method*=0x0000ffff031baf98 (Compiled frame; information may be imprecise)
  - jdk.test.lib.apps.LingeredApp.main(java.lang.String[]) @bci=53, line=502, pc=0x0000ffff6c9276e0, Method*=0x0000ffff03611d48 (Interpreted frame)

The main method is interpreted even though we're running with -Xcomp. That's because it is deoptimized almost immediately, because main calls some methods on java.nio.file.Paths, but that class hasn't been loaded when main is compiled.
X86 can patch in the address of the method on-the-fly, but AArch64 can't do this because of restrictions on which instructions can be legally rewritten. This patch lifts the code that uses the java.nio classes out of LingeredApp::main into a separate static method. LingeredApp.main now only uses classes that are loaded very early in boot, before main is compiled. The stack trace now looks like: "main" #1 prio=5 tid=0x0000ffffb4022800 nid=0xd610 waiting on condition [0x0000ffffbb755000] java.lang.Thread.State: TIMED_WAITING (sleeping) JavaThread state: _thread_blocked - java.lang.Thread.sleep(long) @bci=0, pc=0x0000fffface414c8, Method*=0x0000ffff3dac8a28 (Compiled frame; information may be imprecise) - jdk.test.lib.apps.LingeredApp.pollLockFile(java.lang.String) @bci=30, line=499, pc=0x0000ffffa50818e0, Method*=0x0000ffff3c122cf0 (Interpreted frame) - jdk.test.lib.apps.LingeredApp.main(java.lang.String[]) @bci=25, line=529, pc=0x0000ffffa59afd58, Method*=0x0000ffff3c122de0 (Compiled frame) I.e. pollLockFile was deoptimized to an interpreted frame but LingeredApp.main is still a compiled frame which is what ClhsdbFindPC is looking for. This solution does seem a bit hacky, so if it's not acceptable an alternative is to just skip the -Xcomp part of the test on AArch64. Ran a full jtreg test on AArch64/x86 to check for regressions. 
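[Editorial illustration, not taken from the webrev.] The refactoring described above, lifting the java.nio code out of main into a separate static method, could look roughly like this. Only the method name pollLockFile comes from the stack trace; the class name LingeredAppSketch, the polling interval, and the loop body are assumptions, and the real LingeredApp does more.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative stand-in for jdk.test.lib.apps.LingeredApp, not the real class.
class LingeredAppSketch {

    // Every java.nio.file reference lives in this helper, so compiling main
    // does not depend on classes that may not be loaded yet.
    static void pollLockFile(String lockName) throws InterruptedException {
        Path lock = Paths.get(lockName);
        while (Files.exists(lock)) {
            Thread.sleep(100); // assumed polling interval
        }
    }

    // main itself references no java.nio.file classes.
    public static void main(String[] args) throws InterruptedException {
        if (args.length < 1) {
            System.err.println("usage: LingeredAppSketch <lock-file>");
            return;
        }
        pollLockFile(args[0]);
    }
}
```

With this shape, a deoptimization triggered by an unloaded java.nio class would be expected to hit the helper's frame rather than main's, which matches the second stack trace above.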
Thanks, Nick From adinn at redhat.com Thu Aug 8 10:16:40 2019 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 8 Aug 2019 11:16:40 +0100 Subject: RFR: 8229118: [TESTBUG] serviceability/sa/ClhsdbFindPC fails on AArch64 In-Reply-To: <4a646485-4230-9632-dfd4-4368d79ba4cd@arm.com> References: <4a646485-4230-9632-dfd4-4368d79ba4cd@arm.com> Message-ID: <39f8fd3f-fcce-0200-3922-33405978e297@redhat.com> Hi Nick, On 08/08/2019 10:32, Nick Gasson wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8229118 > Webrev: http://cr.openjdk.java.net/~ngasson/8229118/webrev.0/ > > This test starts a sub-process with -Xcomp and then uses the SA to get a > stack trace of it. It expects to see this line: > > In code in NMethod for jdk/test/lib/apps/LingeredApp.main > > But actually on AArch64 the stack trace looks like this: > > - java.lang.Thread.sleep(long) @bci=0, pc=0x0000ffff74603d08, Method*=0x0000ffff031baf98 (Compiled frame; information may be imprecise) > - jdk.test.lib.apps.LingeredApp.main(java.lang.String[]) @bci=53, line=502, pc=0x0000ffff6c9276e0, Method*=0x0000ffff03611d48 (Interpreted frame) > > The main method is interpreted even though we're running with > -Xcomp. That's because it is deoptimized almost immediately, because > main calls some methods on java.nio.file.Paths, but that class hasn't > been loaded when main is compiled. > > X86 can patch in the address of the method on-the-fly, but AArch64 can't > do this because of restrictions on which instructions can be legally > rewritten. > > This patch lifts the code that uses the java.nio classes out of > LingeredApp::main into a separate static method. LingeredApp.main now > only uses classes that are loaded very early in boot, before main is > compiled. 
The stack trace now looks like: > > "main" #1 prio=5 tid=0x0000ffffb4022800 nid=0xd610 waiting on condition [0x0000ffffbb755000] > java.lang.Thread.State: TIMED_WAITING (sleeping) > JavaThread state: _thread_blocked > - java.lang.Thread.sleep(long) @bci=0, pc=0x0000fffface414c8, Method*=0x0000ffff3dac8a28 (Compiled frame; information may be imprecise) > - jdk.test.lib.apps.LingeredApp.pollLockFile(java.lang.String) @bci=30, line=499, pc=0x0000ffffa50818e0, Method*=0x0000ffff3c122cf0 (Interpreted frame) > - jdk.test.lib.apps.LingeredApp.main(java.lang.String[]) @bci=25, line=529, pc=0x0000ffffa59afd58, Method*=0x0000ffff3c122de0 (Compiled frame) > > I.e. pollLockFile was deoptimized to an interpreted frame but > LingeredApp.main is still a compiled frame which is what ClhsdbFindPC is > looking for. > > This solution does seem a bit hacky, so if it's not acceptable an > alternative is to just skip the -Xcomp part of the test on AArch64. > > Ran a full jtreg test on AArch64/x86 to check for regressions. Yuck! That's a nice hack to avoid the indeterminate effect of -Xcomp. However, my gut feeling is still that relying on -Xcomp in tests is just a /really/ bad idea and I'd prefer to omit it but . . . I'm not 100% clear what the point of this test is but it looks like it is meant to exercise the stack backtrace code when there is a compiled method on the stack. If so then I guess your hack fits the bill while removing the -Xcomp flag from the command line would not fulfil the test's remit. If that is the point of the test then I agree, reluctantly, that your hack is the right solution. On those grounds I'm happy to accept the patch. However, I'd prefer someone else (Andrew Haley?) also to review this before it gets pushed. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From aph at redhat.com Thu Aug 8 10:37:05 2019 From: aph at redhat.com (Andrew Haley) Date: Thu, 8 Aug 2019 11:37:05 +0100 Subject: [aarch64-port-dev ] RFR: 8229118: [TESTBUG] serviceability/sa/ClhsdbFindPC fails on AArch64 In-Reply-To: <39f8fd3f-fcce-0200-3922-33405978e297@redhat.com> References: <4a646485-4230-9632-dfd4-4368d79ba4cd@arm.com> <39f8fd3f-fcce-0200-3922-33405978e297@redhat.com> Message-ID: On 8/8/19 11:16 AM, Andrew Dinn wrote: > I'm not 100% clear what the point of this test is but it looks like it > is meant to exercise the stack backtrace code when there is a compiled > method on the stack. If so then I guess your hack fits the bill while > removing the -Xcomp flag from the command line would not fulfil the > test's remit. If that is the point of the test then I agree, > reluctantly, that your hack is the right solution. On those grounds I'm > happy to accept the patch. However, I'd prefer someone else (Andrew > Haley?) also to review this before it gets pushed. Eww. I suppose that -Xcomp often always fails on AArch64 because we deoptimize so readily. Given that the test is not supposed to be testing -Xcomp but SA I guess the test is OK, but it's very fragile. It does mean that any tests which depend on -Xcomp don't really work on AArch64. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From jcbeyler at google.com Thu Aug 8 11:31:19 2019 From: jcbeyler at google.com (Jean Christophe Beyler) Date: Thu, 8 Aug 2019 04:31:19 -0700 Subject: Re: RFR(S): 8227645: Some tests in serviceability/sa run with fixed -Xmx values and risk running out of memory In-Reply-To: References: Message-ID: Hi Chris, Looks good to me, Jc On Wed, Aug 7, 2019 at 6:57 PM Chris Plummer wrote: > Hello, > > Please review the following: > > http://cr.openjdk.java.net/~cjplummer/8227645/webrev.00/webrev.open/ > https://bugs.openjdk.java.net/browse/JDK-8227645 > > I moved the offending tests to their own directory and added > "exclusiveAccess.dirs=." for that directory. There were two extra > support classes I had to move also (they aren't tests), and also a minor > @library fix due to a dependency on another file in the sa test directory. > > thanks, > > Chris > -- Thanks, Jc From jcbeyler at google.com Thu Aug 8 11:38:25 2019 From: jcbeyler at google.com (Jean Christophe Beyler) Date: Thu, 8 Aug 2019 04:38:25 -0700 Subject: RFR: 8195600: [Graal] jdi tests timeouts with Graal because debuggee vm is not resumed In-Reply-To: References: Message-ID: Hi Daniil, Looks good to me, Jc On Wed, Aug 7, 2019 at 3:39 PM Daniil Titov wrote: > Please review the change that fixes the failing tests when running with > Graal.
The issue originally > included several vmTestbase/nsk/jdi tests but only 2 of them still fail: > - > vmTestbase/nsk/jdi/VirtualMachine/instanceCounts/instancecounts003/instancecounts003.java > - > vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects002/referringObjects002.java > > The problem with these two tests is that they consume all memory to force > the class unloading that > results in the exception during JVMCI compiler initialization and the test > failure. > > The fix filters these tests out to not run with Graal compiler. > > Webrev: http://cr.openjdk.java.net/~dtitov/8195600/webrev.01/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8195600 > > Thanks, > Daniil > > > -- Thanks, Jc From daniil.x.titov at oracle.com Thu Aug 8 14:45:14 2019 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Thu, 08 Aug 2019 07:45:14 -0700 Subject: RFR(S): 8227645: Some tests in serviceability/sa run with fixed -Xmx values and risk running out of memory In-Reply-To: <7600B13D-6FA5-46FA-83E6-710E85695A5E@oracle.com> References: <7600B13D-6FA5-46FA-83E6-710E85695A5E@oracle.com> Message-ID: Hi Chris, The change looks good to me. Thanks, Daniil On 8/7/19, 6:57 PM, "serviceability-dev on behalf of Chris Plummer" wrote: Hello, Please review the following: http://cr.openjdk.java.net/~cjplummer/8227645/webrev.00/webrev.open/ https://bugs.openjdk.java.net/browse/JDK-8227645 I moved the offending tests to their own directory and added "exclusiveAccess.dirs=." for that directory. There were two extra support classes I had to move also (they aren't tests), and also a minor @library fix due to a dependency on another file in the sa test directory.
thanks, Chris From chris.plummer at oracle.com Thu Aug 8 17:21:07 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 8 Aug 2019 10:21:07 -0700 Subject: RFR: 8195600: [Graal] jdi tests timeouts with Graal because debuggee vm is not resumed In-Reply-To: References: Message-ID: <855244c9-014c-59d1-cda0-b5f38057f588@oracle.com> Hi Daniil, My only objection is at some point it seems we need to be able to run these tests with graal (and other tests that have been disabled due to graal) because graal might be the only compiler, and we'll lose test coverage without these tests. Currently we have 260 jtreg tests disabled due to graal. I'm not sure to what extent they are waiting on graal fixes or otherwise have a bug filed to eventually fix them. Would be nice if we had a process in place to make sure these issues are eventually addressed. The fact that tests that exhaust memory in general seem to be incompatible with graal would seem to be the bigger issue that needs to be addressed. thanks, Chris On 8/7/19 3:38 PM, Daniil Titov wrote: > Please review the change that fixes the failing tests when running with Graal. The issue originally > included several vmTestbase/nsk/jdi tests but only 2 of them still fail: > - vmTestbase/nsk/jdi/VirtualMachine/instanceCounts/instancecounts003/instancecounts003.java > - vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects002/referringObjects002.java > > The problem with these two tests is that they consume all memory to force the class unloading that > results in the exception during JVMCI compiler initialization and the test failure. > > The fix filters these tests out to not run with Graal compiler.
> > Webrev: http://cr.openjdk.java.net/~dtitov/8195600/webrev.01/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8195600 > > Thanks, > Daniil > > From dean.long at oracle.com Thu Aug 8 23:33:37 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Thu, 8 Aug 2019 16:33:37 -0700 Subject: RFR: 8195600: [Graal] jdi tests timeouts with Graal because debuggee vm is not resumed In-Reply-To: <855244c9-014c-59d1-cda0-b5f38057f588@oracle.com> References: <855244c9-014c-59d1-cda0-b5f38057f588@oracle.com> Message-ID: <8df66d4e-df9d-c502-3510-30c22ba58445@oracle.com> This is the kind of failure that is expected to go away with libgraal. You can add the tests to the Graal-specific problem list (see JDK-8196611) and they should be re-enabled with libgraal (see JDK-8207267). dl On 8/8/19 10:21 AM, Chris Plummer wrote: > Hi Daniil, > > My only objection is at some point it seems we need to be able to run > these tests with graal (and other tests that have been disabled due to > graal) because graal might be the only compiler, and we'll lose test > coverage without these tests. Currently we have 260 jtreg tests > disabled due to graal. I'm not sure to what extent they are waiting on > graal fixes or otherwise have a bug filed to eventually fix them. > Would be nice if we had a process in place to make sure these issues > are eventually addressed. The fact that tests that exhaust memory in > general seem to be incompatible with graal would seem to be the bigger > issue that needs to be addressed. > > thanks, > > Chris > > On 8/7/19 3:38 PM, Daniil Titov wrote: >> Please review the change that fixes the failing tests when running >> with Graal.
The issue originally >> included several vmTestbase/nsk/jdi tests but only 2 of them still fail: >> - >> vmTestbase/nsk/jdi/VirtualMachine/instanceCounts/instancecounts003/instancecounts003.java >> - >> vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects002/referringObjects002.java >> >> The problem with these two tests is that they consume all memory to >> force the class unloading that >> results in the exception during JVMCI compiler initialization and the >> test failure. >> >> The fix filters these tests out to not run with Graal compiler. >> >> Webrev: http://cr.openjdk.java.net/~dtitov/8195600/webrev.01/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8195600 >> >> Thanks, >> Daniil >> >> > From chris.plummer at oracle.com Fri Aug 9 00:08:11 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 8 Aug 2019 17:08:11 -0700 Subject: RFR: 8195600: [Graal] jdi tests timeouts with Graal because debuggee vm is not resumed In-Reply-To: <8df66d4e-df9d-c502-3510-30c22ba58445@oracle.com> References: <855244c9-014c-59d1-cda0-b5f38057f588@oracle.com> <8df66d4e-df9d-c502-3510-30c22ba58445@oracle.com> Message-ID: <15699850-37f5-93f6-6a55-525a4d099bd3@oracle.com> That sounds like a better approach to me. thanks, Chris On 8/8/19 4:33 PM, dean.long at oracle.com wrote: > This is the kind of failure that is expected to go away with libgraal. > You can add the tests to the Graal-specific problem list (see > JDK-8196611) and they should be re-enabled with libgraal (see > JDK-8207267). > > dl > > On 8/8/19 10:21 AM, Chris Plummer wrote: >> Hi Daniil, >> >> My only objection is at some point it seems we need to be able to run >> these tests with graal (and other tests that have been disabled due >> to graal) because graal might be the only compiler, and we'll lose >> test coverage without these tests. Currently we have 260 jtreg tests >> disabled due to graal.
I'm not sure to what extent they are waiting >> on graal fixes or otherwise have a bug filed to eventually fix them. >> Would be nice if we had a process in place to make sure these issues >> are eventually addressed. The fact that tests that exhaust memory in >> general seem to be incompatible with graal would seem to be the bigger >> issue that needs to be addressed. >> >> thanks, >> >> Chris >> >> On 8/7/19 3:38 PM, Daniil Titov wrote: >>> Please review the change that fixes the failing tests when running >>> with Graal. The issue originally >>> included several vmTestbase/nsk/jdi tests but only 2 of them still >>> fail: >>> - >>> vmTestbase/nsk/jdi/VirtualMachine/instanceCounts/instancecounts003/instancecounts003.java >>> - >>> vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects002/referringObjects002.java >>> >>> The problem with these two tests is that they consume all memory to >>> force the class unloading that >>> results in the exception during JVMCI compiler initialization and >>> the test failure. >>> >>> The fix filters these tests out to not run with Graal compiler.
>>> >>> Webrev: http://cr.openjdk.java.net/~dtitov/8195600/webrev.01/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8195600 >>> >>> Thanks, >>> Daniil >>> >>> >> > From chris.plummer at oracle.com Fri Aug 9 02:42:53 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 8 Aug 2019 19:42:53 -0700 Subject: RFR: 8195600: [Graal] jdi tests timeouts with Graal because debuggee vm is not resumed In-Reply-To: <15699850-37f5-93f6-6a55-525a4d099bd3@oracle.com> References: <855244c9-014c-59d1-cda0-b5f38057f588@oracle.com> <8df66d4e-df9d-c502-3510-30c22ba58445@oracle.com> <15699850-37f5-93f6-6a55-525a4d099bd3@oracle.com> Message-ID: <95eb05da-cf13-f14b-74a6-9e8bf604b29b@oracle.com> Actually looking at JDK-8207267 a little closer, it looks like its job is to re-enable tests that have been disabled with @requires !vm.graal.enabled, so it looks like we have two different approaches going on here. Which is preferred? If the preference is to problem list, do we want to undo JDK-8207261 (except use JDK-8196611 as the CR)? Chris On 8/8/19 5:08 PM, Chris Plummer wrote: > That sounds like a better approach to me. > > thanks, > > Chris > > On 8/8/19 4:33 PM, dean.long at oracle.com wrote: >> This is the kind of failure that is expected to go away with >> libgraal. You can add the tests to the Graal-specific problem list >> (see JDK-8196611) and they should be re-enabled with libgraal (see >> JDK-8207267). >> >> dl >> >> On 8/8/19 10:21 AM, Chris Plummer wrote: >>> Hi Daniil, >>> >>> My only objection is at some point it seems we need to be able to >>> run these tests with graal (and other tests that have been disabled >>> due to graal) because graal might be the only compiler, and we'll >>> lose test coverage without these tests. Currently we have 260 jtreg >>> tests disabled due to graal. I'm not sure to what extent they are >>> waiting on graal fixes or otherwise have a bug filed to eventually >>> fix them.
Would be nice if we had a process in place to make sure >>> these issues are eventually addressed. The fact that tests that >>> exhaust memory in general seem to be incompatible with graal would >>> seem to be the bigger issue that needs to be addressed. >>> >>> thanks, >>> >>> Chris >>> >>> On 8/7/19 3:38 PM, Daniil Titov wrote: >>>> Please review the change that fixes the failing tests when running >>>> with Graal. The issue originally >>>> included several vmTestbase/nsk/jdi tests but only 2 of them still >>>> fail: >>>> - >>>> vmTestbase/nsk/jdi/VirtualMachine/instanceCounts/instancecounts003/instancecounts003.java >>>> - >>>> vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects002/referringObjects002.java >>>> >>>> The problem with these two tests is that they consume all memory to >>>> force the class unloading that >>>> results in the exception during JVMCI compiler initialization and >>>> the test failure. >>>> >>>> The fix filters these tests out to not run with Graal compiler. >>>> >>>> Webrev: http://cr.openjdk.java.net/~dtitov/8195600/webrev.01/ >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8195600 >>>> >>>> Thanks, >>>> Daniil >>>> >>>> >>> >> > > From dean.long at oracle.com Fri Aug 9 22:37:03 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Fri, 9 Aug 2019 15:37:03 -0700 Subject: RFR: 8195600: [Graal] jdi tests timeouts with Graal because debuggee vm is not resumed In-Reply-To: <95eb05da-cf13-f14b-74a6-9e8bf604b29b@oracle.com> References: <855244c9-014c-59d1-cda0-b5f38057f588@oracle.com> <8df66d4e-df9d-c502-3510-30c22ba58445@oracle.com> <15699850-37f5-93f6-6a55-525a4d099bd3@oracle.com> <95eb05da-cf13-f14b-74a6-9e8bf604b29b@oracle.com> Message-ID: <3c292c47-2962-d79b-e99d-7c3b7376ac5e@oracle.com> Good question?
When we have libgraal, there will still be an option (at least for debugging) to turn it off and use Graal the same way we do now, so it seems like the @requires would need to take that into account once we have libgraal. Maybe we will need a new "vm.libgraal.enabled" or make "vm.graal.enabled" be false for libgraal? It does seem a little backwards to require tests to know about the OOM handling details of different JVM features. Instead, how about if we let the test assert that it requires "vm.no-background-oom" or whatever, and let the JVM decide if it supports it. CC'ing hotspot-compiler-dev. dl On 8/8/19 7:42 PM, Chris Plummer wrote: > Actually looking at JDK-8207267 a little closer, it looks like its > job is to re-enable tests that have been disabled with @requires > !vm.graal.enabled, so it looks like we have two different approaches > going on here. Which is preferred? If the preference is to problem > list, do we want to undo JDK-8207261 (except use JDK-8196611 as the CR)? > > Chris > > On 8/8/19 5:08 PM, Chris Plummer wrote: >> That sounds like a better approach to me. >> >> thanks, >> >> Chris >> >> On 8/8/19 4:33 PM, dean.long at oracle.com wrote: >>> This is the kind of failure that is expected to go away with >>> libgraal. You can add the tests to the Graal-specific problem list >>> (see JDK-8196611) and they should be re-enabled with libgraal (see >>> JDK-8207267). >>> >>> dl >>> >>> On 8/8/19 10:21 AM, Chris Plummer wrote: >>>> Hi Daniil, >>>> >>>> My only objection is at some point it seems we need to be able to >>>> run these tests with graal (and other tests that have been disabled >>>> due to graal) because graal might be the only compiler, and we'll >>>> lose test coverage without these tests. Currently we have 260 jtreg >>>> tests disabled due to graal. I'm not sure to what extent they are >>>> waiting on graal fixes or otherwise have a bug filed to eventually >>>> fix them.
Would be nice if we had a process in place to make sure >>>> these issues are eventually addressed. The fact that tests that >>>> exhaust memory in general seem to be incompatible with graal would >>>> seem to be the bigger issue that needs to be addressed. >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 8/7/19 3:38 PM, Daniil Titov wrote: >>>>> Please review the change that fixes the failing tests when running >>>>> with Graal. The issue originally >>>>> included several vmTestbase/nsk/jdi tests but only 2 of them still >>>>> fail: >>>>> - >>>>> vmTestbase/nsk/jdi/VirtualMachine/instanceCounts/instancecounts003/instancecounts003.java >>>>> - >>>>> vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects002/referringObjects002.java >>>>> >>>>> The problem with these two tests is that they consume all memory >>>>> to force the class unloading that >>>>> results in the exception during JVMCI compiler initialization and >>>>> the test failure. >>>>> >>>>> The fix filters these tests out to not run with Graal compiler. >>>>> >>>>> Webrev: http://cr.openjdk.java.net/~dtitov/8195600/webrev.01/ >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8195600 >>>>> >>>>> Thanks, >>>>> Daniil >>>>> >>>>> >>>> >>> >> >> > > From yasuenag at gmail.com Sat Aug 10 11:14:02 2019 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Sat, 10 Aug 2019 20:14:02 +0900 Subject: PING: RFR: 8226204: SA: Refactoring for option processing in SALauncher In-Reply-To: References: Message-ID: <2d4bee10-768d-f54d-8bbb-dd70cb25abe1@gmail.com> PING: Could you review it? > JBS: https://bugs.openjdk.java.net/browse/JDK-8226204 > webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.00/ Yasumasa On 2019/07/24 10:18, Yasumasa Suenaga wrote: > Hi all, > > Please review this change: > > JBS: https://bugs.openjdk.java.net/browse/JDK-8226204 > webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.00/ > > This enhancement has been proposed in [1].
> > SALauncher (jhsdb implementation) processes the options for each subcommand (e.g. jstack, hsdb). > But they exist in many places with similar code. > So there is some room for refactoring. > > This change has passed the tests on submit repo and serviceability/sa tests. > > > Thanks, > > Yasumasa > > > [1] https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-June/028376.html From poonam.bajaj at oracle.com Sun Aug 11 14:25:42 2019 From: poonam.bajaj at oracle.com (Poonam Parhar) Date: Sun, 11 Aug 2019 07:25:42 -0700 Subject: RFR 8215523: jstat reports incorrect values for OU for CMS GC In-Reply-To: References: <64c5a88a-a39d-fedb-7738-998bca61b6a7@oracle.com> <4bb04022-9460-3956-cf86-447bafab0bf4@oracle.com> <3744fdba-2755-34fb-f0d5-76543a1faa68@redhat.com> <4e7eee4a06416761e1ee63be678c71f1426c9adb.camel@oracle.com> Message-ID: <5b320fa0-fb15-8509-80d5-fcc7f12bd253@oracle.com> Hello, The fix for this bug had to be backed out with '8227178: Backout of 8215523' because it had caused timeout failures for some of the CMS tests. Those failures get resolved by adding the following check before calling recalculate_used_stable() in CompactibleFreeListSpace::allocate():

    // During GC we do not need to recalculate the stable used value for
    // every allocation in old gen. It is done once at the end of GC instead
    // for performance reasons.
    if (!CMSHeap::heap()->is_gc_active()) {
      recalculate_used_stable();
    }

Please review the updated webrev: http://cr.openjdk.java.net/~poonam/8215523/webrev.02/ Thanks, Poonam On 7/2/19 6:42 AM, Poonam Parhar wrote: > Hi Aleksey, Thomas, > > It wasn't meant to be non-public. I have opened it.
> > Thanks, > Poonam > > On 7/2/19 3:36 AM, Thomas Schatzl wrote: >> Hi, >> >> On Tue, 2019-07-02 at 10:10 +0200, Aleksey Shipilev wrote: >>> Hi, >>> >>> On 6/21/19 10:30 PM, Poonam Parhar wrote: >>>> On 6/21/19 12:21 PM, Poonam Parhar wrote: >>>>> Bug 8215523 : >>>>> jstat reports incorrect values for >>>>> OU for CMS GC >>> This bug is non-public, was it really meant to be? >>> >> there does not seem to be anything confidential in the public areas >> of the bug. Maybe Poonam can open it after looking at it again, and >> eventually open it (and add a token "Description" ;) ). >> >> Thanks, >> Thomas >> > From daniil.x.titov at oracle.com Mon Aug 12 03:49:57 2019 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Sun, 11 Aug 2019 20:49:57 -0700 Subject: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth) In-Reply-To: References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com> <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com> <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com> <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com> <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com> <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com> Message-ID: Hi David, Robbin, Daniel, and Serguei, Please review a new version of the fix. As David suggested I created a separate Jira issue [1] to cover additional optimization for some callers of find_JavaThread_from_java_tid() and this version of the fix no longer includes changes in management.cpp (and the test related to these changes). Regarding the impact the previous version of the fix had on the thread startup time at heavy load (e.g.
when 5000 threads are created and destroyed every second) I tried a different approach that makes calls to ThreadTable::add_thread and ThreadTable::remove_thread asynchronous and offloads the work for actual modifications of the thread table to a periodic task that runs every 5 seconds. With the same stress test scenario (the test does some warm-up and then measures the time it takes to create and start 100,000 threads; every thread just sleeps for 100 ms) the impact on the thread startup time was reduced to 1.2% (from 2.7%). The cause of this impact in this stress test scenario is that as soon as the thread table is initialized, additional work to insert and delete entries in the thread table has to be performed, even if com.sun.management.ThreadMXBean methods are no longer called. For example, in the stress test mentioned above, every second about 5000 entries had to be inserted in the table and then deleted. That doesn't look right and the new version of the fix uses a different approach: the thread is added to the thread table only when this thread is requested by the com.sun.management.ThreadMXBean bean. Every time find_JavaThread_from_java_tid() is called for a new tid, the thread is found by iterating over the thread list and added to the thread table. All subsequent calls to find_JavaThread_from_java_tid() for the same tid return the thread from the thread table. Running the stress test with the thread table enabled and with it disabled showed no difference in the average thread startup times. [1]: https://bugs.openjdk.java.net/browse/JDK-8229391 Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.05/ Thanks, Daniil On 8/4/19, 7:54 PM, "David Holmes" wrote: Hi Daniil, On 3/08/2019 8:16 am, Daniil Titov wrote: > Hi David, > > Thank you for your detailed review.
Please review a new version of the fix that includes > the changes you suggested: > - ThreadTableCreate_lock scope is reduced to cover the creation of the table only; > - ThreadTableCreate_lock is made _safepoint_check_always; Okay. > - ServiceThread is no longer responsible for the resizing of the thread table, instead, > the thread table is changed to grow on demand by the thread that is doing the addition; Okay - I'm happy to get the serviceThread out of the picture here. > - fixed nits and formatting issues. Okay. >>> The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() >>> as Daniel suggested. >> Not sure it's best to combine these, but if they are limited to the >> changes in management.cpp only then that may be okay. > > The additional optimization for some callers of find_JavaThread_from_java_tid() is > limited to management.cpp (plus a new test) so I left them in the webrev but > I also could move it to a separate issue if required. I'd prefer this part to be separated out, but won't insist. Let's see if Dan or Serguei have a strong opinion. > > src/hotspot/share/runtime/threadSMR.cpp > > 755 jlong tid = SharedRuntime::get_java_tid(thread); > > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > I think it's cleaner/better to just use > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > as we know thread is not NULL, it is a JavaThread and it has to have a > > non-null threadObj.
>
> I had to leave this code unchanged since it turned out the threadObj is null
> when VM is destroyed:
>
> V [libjvm.so+0xe165d7] oopDesc::long_field(int) const+0x67
> V [libjvm.so+0x16e06c6] ThreadsSMRSupport::add_thread(JavaThread*)+0x116
> V [libjvm.so+0x16d1302] Threads::add(JavaThread*, bool)+0x82
> V [libjvm.so+0xef8369] attach_current_thread.part.197+0xc9
> V [libjvm.so+0xec136c] jni_DestroyJavaVM+0x6c
> C [libjli.so+0x4333] JavaMain+0x2c3
> C [libjli.so+0x8159] ThreadJavaMain+0x9

This is actually nothing to do with the VM being destroyed, but is an issue with JNI_AttachCurrentThread and its interaction with the ThreadSMR iterators. The attach process is:

- create JavaThread
- mark as "is attaching via jni"
- add to ThreadsList
- create java.lang.Thread object (you can only execute Java code after you are attached)
- mark as "attach completed"

So while a thread "is attaching" it will be seen by the ThreadSMR thread iterator but will have a NULL java.lang.Thread object. We special-case attaching threads in a number of places in the VM and I think we should be explicitly doing something here to filter out attaching threads, rather than just being tolerant of a NULL j.l.Thread object. Specifically in ThreadsSMRSupport::add_thread:

    if (ThreadTable::is_initialized() && !thread->is_attaching_via_jni()) {
      jlong tid = java_lang_Thread::thread_id(thread->threadObj());
      ThreadTable::add_thread(tid, thread);
    }

Note that in ThreadsSMRSupport::remove_thread we can use the same guard, which covers the case the JNI attach encountered an error trying to create the j.l.Thread object.

>> src/hotspot/share/services/threadTable.cpp
>> 71 static uintx get_hash(Value const& value, bool* is_dead) {
>
>> The is_dead parameter still bothers me here. I can't make enough sense
>> out of the template code in ConcurrentHashTable to see why we have to
>> have it, but I'm concerned that its very existence means we perhaps
>> should not be trying to extend CHT in this context. ??
>
> My understanding is that the is_dead parameter provides a mechanism for
> ConcurrentHashTable to remove stale entries that were not explicitly
> removed by calling the ConcurrentHashTable::remove() method.
> I think that just because in our case we don't use this mechanism doesn't
> mean we should not use ConcurrentHashTable.

Can you confirm that this usage is okay with Robbin Ehn please. He's
back from vacation this week.

>> I would still want to see what impact this has on thread
>> startup cost, both with and without the table being initialized.
>
> I ran a test that initializes the table by calling ThreadMXBean.getThreadInfo(),
> starts some threads as a warm-up, and then creates and starts 100,000 threads
> (each thread just sleeps for 100 ms). When the thread table is enabled,
> 100,000 threads are created and started in about 15200 ms. If the thread table
> is off the test takes about 14800 ms. Based on this information the enabled
> thread table makes the thread startup about 2.7% slower.

That doesn't sound very good. I think we may need to get Claes involved to
help investigate overall performance impact here.

> Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.04/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8185005

No further code comments.

I didn't look at the test in detail.

Thanks,
David

> Thanks!
> --Daniil
>
>
> On 7/29/19, 12:53 AM, "David Holmes" wrote:
>
>     Hi Daniil,
>
>     Overall I think this is a reasonable approach but I would still like to
>     see some performance and footprint numbers, both to verify it fixes the
>     problem reported, and that we are not getting penalized elsewhere.
>
>     On 25/07/2019 3:21 am, Daniil Titov wrote:
>     > Hi David, Daniel, and Serguei,
>     >
>     > Please review the new version of the fix, that makes the thread table initialization on demand and
>     > moves it inside ThreadsList::find_JavaThread_from_java_tid().
At the creation time the thread table > > is initialized with the threads from the current thread list. We don't want to hold Threads_lock > > inside find_JavaThread_from_java_tid(), thus new threads still could be created while the thread > > table is being initialized . Such threads will be found by the linear search and added to the thread table > > later, in ThreadsList::find_JavaThread_from_java_tid(). > > The initialization allows the created but unpopulated, or partially > populated, table to be seen by other threads - is that your intention? > It seems it should be okay as the other threads will then race with the > initializing thread to add specific entries, and this is a concurrent > map so that should be functionally correct. But if so then I think you > can also reduce the scope of the ThreadTableCreate_lock so that it > covers creation of the table only, not the initial population of the table. > > I like the approach of only initializing the table when needed and using > that to control when the add/remove-thread code needs to update the > table. But I would still want to see what impact this has on thread > startup cost, both with and without the table being initialized. > > > The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > > as Daniel suggested. > > Not sure it's best to combine these, but if they are limited to the > changes in management.cpp only then that may be okay. It helps to be > able to focus on the table related changes without being distracted by > other optimizations. > > > That is correct that ResolvedMethodTable was used as a blueprint for the thread table, however, I tried > > to strip it of the all functionality that is not required in the thread table case. > > The revised version seems better in that regard. But I still have a > concern, see below. 
> > > We need to have the thread table resizable and allow it to grow as the number of threads increases to avoid > > reserving excessive memory a-priori or deteriorating lookup times. The ServiceThread is responsible for > > growing the thread table when required. > > Yes but why? Why can't this table be grown on demand by the thread that > is doing the addition? For other tables we may have to delegate to the > service thread because the current thread cannot perform the action, or > it doesn't want to perform it at the time the need for the resize is > detected (e.g. its detected at a safepoint and you want the resize to > happen later outside the safepoint). It's not apparent to me that such > restrictions apply here. > > > There is no ConcurrentHashTable available in Java 8 and for backporting this fix to Java 8 another implementation > > of the hash table, probably originally suggested in the patch attached to the JBS issue, should be used. It will make > > the backporting more complicated, however, adding a new Implementation of the hash table in Java 14 while it > > already has ConcurrentHashTable doesn't seem reasonable for me. > > Ok. > > > Webrev: http://cr.openjdk.java.net/~dtitov/8185005/webrev.03 > > Some specific code comments: > > src/hotspot/share/runtime/mutexLocker.cpp > > + def(ThreadTableCreate_lock , PaddedMutex , special, > false, Monitor::_safepoint_check_never); > > I think this needs to be a _safepoint_check_always lock. The table will > be created by regular JavaThreads and they should (nearly) always be > checking for safepoints if they are going to block acquiring the lock. > And it isn't at all obvious that the thread doing the creation can't go > to a safepoint whilst this lock is held. 
> > --- > > src/hotspot/share/runtime/threadSMR.cpp > > Nit: > > 618 JavaThread* thread = thread_at(i); > > you could reuse the new java_thread local you introduced at line 613 and > just rename that "new" variable to "thread" so you don't have to change > all other uses. > > 628 } else if (java_thread != NULL && ... > > You don't need to check != NULL here as you only get here when > java_thread is not NULL. > > 755 jlong tid = SharedRuntime::get_java_tid(thread); > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > I think it cleaner/better to just use > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > as we know thread is not NULL, it is a JavaThread and it has to have a > non-null threadObj. > > --- > > src/hotspot/share/services/management.cpp > > 1323 if (THREAD->is_Java_thread()) { > 1324 JavaThread* current_thread = (JavaThread*)THREAD; > > These calls can only be made on a JavaThread so this be simplified to > remove the is_Java_thread() call. Similarly in other places. > > --- > > src/hotspot/share/services/threadTable.cpp > > 55 class ThreadTableEntry : public CHeapObj { > 56 private: > 57 jlong _tid; > > I believe hotspot style is to not indent the access modifiers in C++ > class declarations, so the above would just be: > > 55 class ThreadTableEntry : public CHeapObj { > 56 private: > 57 jlong _tid; > > etc. > > 60 ThreadTableEntry(jlong tid, JavaThread* java_thread) : > 61 _tid(tid),_java_thread(java_thread) {} > > line 61 should be indented as it continues line 60. > > 67 class ThreadTableConfig : public AllStatic { > ... > 71 static uintx get_hash(Value const& value, bool* is_dead) { > > The is_dead parameter still bothers me here. I can't make enough sense > out of the template code in ConcurrentHashtable to see why we have to > have it, but I'm concerned that its very existence means we perhaps > should not be trying to extend CHT in this context. ?? 
> > 115 size_t start_size_log = size_log > DefaultThreadTableSizeLog > 116 ? size_log : DefaultThreadTableSizeLog; > > line 116 should be indented, though in this case I think a better layout > would be: > > 115 size_t start_size_log = > 116 size_log > DefaultThreadTableSizeLog ? size_log : > DefaultThreadTableSizeLog; > > 131 double ThreadTable::get_load_factor() { > 132 return (double)_items_count/_current_size; > 133 } > > Not sure that is doing what you want/expect. It will perform integer > division and then cast that whole integer to a double. If you want > double arithmetic you need: > > return ((double)_items_count)/_current_size; > > 180 jlong _tid; > 181 uintx _hash; > > Nit: no need for all those spaces before the variable name. > > 183 ThreadTableLookup(jlong tid) > 184 : _tid(tid), _hash(primitive_hash(tid)) {} > > line 184 should be indented. > > 201 ThreadGet():_return(NULL) {} > > Nit: need space after : > > 211 assert(_is_initialized, "Thread table is not initialized"); > 212 _has_work = false; > > line 211 is indented one space too far. > > 229 ThreadTableEntry* entry = new ThreadTableEntry(tid,java_thread); > > Nit: need space after , > > 252 return _local_table->remove(thread,lookup); > > Nit: need space after , > > Thanks, > David > ------ > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > Thanks! > > --Daniil > > > > > > ?On 7/8/19, 3:24 PM, "Daniel D. Daugherty" wrote: > > > > On 6/29/19 12:06 PM, Daniil Titov wrote: > > > Hi Serguei and David, > > > > > > Serguei is right, ThreadTable::find_thread(java_tid) cannot return a JavaThread with an unmatched java_tid. > > > > > > Please find a new version of the fix that includes the changes Serguei suggested. 
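A note on the get_load_factor() comment quoted above (threadTable.cpp lines 131-133 in webrev.03): in C++ a cast binds more tightly than `/`, so `(double)_items_count/_current_size` already converts `_items_count` to double before dividing, and the extra parentheses in the suggested form do not change the result. The following is a minimal standalone check, not code from the webrev; the function names are hypothetical and `std::size_t` is assumed for the two counters, whose real types are not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for ThreadTable::_items_count and _current_size.
// The expression as written in the webrev: the cast applies to items_count
// first, so the division is carried out in double.
double load_factor_as_written(std::size_t items_count, std::size_t current_size) {
  return (double)items_count/current_size;
}

// The form suggested in the review: same parse, same result.
double load_factor_parenthesized(std::size_t items_count, std::size_t current_size) {
  return ((double)items_count)/current_size;
}

// What truncation would actually look like: integer division first,
// then conversion of the already truncated quotient.
double load_factor_truncating(std::size_t items_count, std::size_t current_size) {
  return (double)(items_count/current_size);
}

// Small self-check tying the three forms together.
void check_load_factor_forms() {
  assert(load_factor_as_written(3, 2) == load_factor_parenthesized(3, 2));
  assert(load_factor_as_written(3, 2) != load_factor_truncating(3, 2));
}
```

Only the third form loses the fractional part, so the original line 132 should behave as intended; the added parentheses are a readability choice.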
> > > > > > Regarding the concern about the maintaining the thread table when it may never even be queried, one of > > > the options could be to add ThreadTable ::isEnabled flag, set it to "false" by default, and wrap the calls to the thread table > > > in ThreadsSMRSupport add_thread() and remove_thread() methods to check this flag. > > > > > > When ThreadsList::find_JavaThread_from_java_tid() is called for the first time it could check if ThreadTable ::isEnabled > > > Is on and if not then set it on and populate the thread table with all existing threads from the thread list. > > > > I have the same concerns as David H. about this new ThreadTable. > > ThreadsList::find_JavaThread_from_java_tid() is only called from code > > in src/hotspot/share/services/management.cpp so I think that table > > needs to enabled and populated only if it is going to be used. > > > > I've taken a look at the webrev below and I see that David has > > followed up with additional comments. Before I do a crawl through > > code review for this, I would like to see the ThreadTable stuff > > made optional and David's other comments addressed. > > > > Another possible optimization is for callers of > > find_JavaThread_from_java_tid() to save the calling thread's > > tid value before they loop and if the current tid == saved_tid > > then use the current JavaThread* instead of calling > > find_JavaThread_from_java_tid() to get the JavaThread*. > > > > Dan > > > > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.02/ > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > Thanks! 
> > > --Daniil > > > > > > From: > > > Organization: Oracle Corporation > > > Date: Friday, June 28, 2019 at 7:56 PM > > > To: Daniil Titov , OpenJDK Serviceability , "hotspot-runtime-dev at openjdk.java.net" , "jmx-dev at openjdk.java.net" > > > Subject: Re: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth) > > > > > > Hi Daniil, > > > > > > I have several quick comments. > > > > > > The indent in the hotspot c/c++ files has to be 2, not 4. > > > > > > https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/src/hotspot/share/runtime/threadSMR.cpp.frames.html > > > 614 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const { > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > 616 if (java_thread == NULL && java_tid == PMIMORDIAL_JAVA_TID) { > > > 617 // ThreadsSMRSupport::add_thread() is not called for the primordial > > > 618 // thread. Thus, we find this thread with a linear search and add it > > > 619 // to the thread table. > > > 620 for (uint i = 0; i < length(); i++) { > > > 621 JavaThread* thread = thread_at(i); > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > 623 ThreadTable::add_thread(java_tid, thread); > > > 624 return thread; > > > 625 } > > > 626 } > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > 628 return java_thread; > > > 629 } > > > 630 return NULL; > > > 631 } > > > 632 bool ThreadsList::is_valid_java_thread(jlong java_tid, JavaThread* java_thread) { > > > 633 oop tobj = java_thread->threadObj(); > > > 634 // Ignore the thread if it hasn't run yet, has exited > > > 635 // or is starting to exit. > > > 636 return (tobj != NULL && !java_thread->is_exiting() && > > > 637 java_tid == java_lang_Thread::thread_id(tobj)); > > > 638 } > > > > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > > > > I'd suggest to rename find_thread() to find_thread_by_tid(). 
> > > > > > A space is missed after the comma: > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > > > > An empty line is needed before L632. > > > > > > The name 'is_valid_java_thread' looks wrong (or confusing) to me. > > > Something like 'is_alive_java_thread_with_tid()' would be better. > > > It'd better to list parameters in the opposite order. > > > > > > The call to is_valid_java_thread() is confusing: > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > > > > Why would the call ThreadTable::find_thread(java_tid) return a JavaThread with an unmatched java_tid? > > > > > > > > > Thanks, > > > Serguei > > > > > > On 6/28/19, 9:40 PM, "David Holmes" wrote: > > > > > > Hi Daniil, > > > > > > The definition and use of this hashtable (yet another hashtable > > > implementation!) will need careful examination. We have to be concerned > > > about the cost of maintaining it when it may never even be queried. You > > > would need to look at footprint cost and performance impact. > > > > > > Unfortunately I'm just about to board a plane and will be out for the > > > next few days. I will try to look at this asap next week, but we will > > > need a lot more data on it. > > > > > > Thanks, > > > David > > > > > > On 6/28/19 3:31 PM, Daniil Titov wrote: > > > Please review the change that improves performance of ThreadMXBean MXBean methods returning the > > > information for specific threads. The change introduces the thread table that uses ConcurrentHashTable > > > to store one-to-one the mapping between the thread ids and JavaThread objects and replaces the linear > > > search over the thread list in ThreadsList::find_JavaThread_from_java_tid(jlong tid) method with the lookup > > > in the thread table. > > > > > > Testing: Mach5 tier1,tier2 and tier3 tests successfully passed. 
> > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/ > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > Thanks! > > > > > > Best regards, > > > Daniil > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From sgehwolf at redhat.com Mon Aug 12 08:22:43 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Mon, 12 Aug 2019 10:22:43 +0200 Subject: RFR 8215523: jstat reports incorrect values for OU for CMS GC In-Reply-To: <5b320fa0-fb15-8509-80d5-fcc7f12bd253@oracle.com> References: <64c5a88a-a39d-fedb-7738-998bca61b6a7@oracle.com> <4bb04022-9460-3956-cf86-447bafab0bf4@oracle.com> <3744fdba-2755-34fb-f0d5-76543a1faa68@redhat.com> <4e7eee4a06416761e1ee63be678c71f1426c9adb.camel@oracle.com> <5b320fa0-fb15-8509-80d5-fcc7f12bd253@oracle.com> Message-ID: Hi, On Sun, 2019-08-11 at 07:25 -0700, Poonam Parhar wrote: > Hello, > > The fix for this bug had to be backed out with '8227178: Backout of > 8215523' because it had caused timeout failures for some of the CMS > tests. Those failures get resolved by adding the following check > before calling recalculate_used_stable() in > CompactibleFreeListSpace::allocate(): > > 1387 // During GC we do not need to recalculate the stable used value for > 1388 // every allocation in old gen. It is done once at the end of GC instead > 1389 // for performance reasons. > 1390 if (!CMSHeap::heap()->is_gc_active()) { > 1391 recalculate_used_stable(); > 1392 } > 1393 > > Please review the updated webrev: > http://cr.openjdk.java.net/~poonam/8215523/webrev.02/ + // Returns monotonically increasing stable used space bytes for CMS. + // This is required for jhat and other memory monitoring tools jhat has been removed a while ago: jhat => jstat Aside: Why has there not been a new bug filed "Redo: jstat reports incorrect values for OU for CMS GC". It's confusing to look at JDK- 8215523, see it resolved and mention a pushed commit in the comments. 
Isn't that what's usually been done for backouts? Thanks, Severin From david.holmes at oracle.com Mon Aug 12 08:40:38 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 12 Aug 2019 18:40:38 +1000 Subject: RFR 8215523: jstat reports incorrect values for OU for CMS GC In-Reply-To: References: <64c5a88a-a39d-fedb-7738-998bca61b6a7@oracle.com> <4bb04022-9460-3956-cf86-447bafab0bf4@oracle.com> <3744fdba-2755-34fb-f0d5-76543a1faa68@redhat.com> <4e7eee4a06416761e1ee63be678c71f1426c9adb.camel@oracle.com> <5b320fa0-fb15-8509-80d5-fcc7f12bd253@oracle.com> Message-ID: <2eb84736-d28d-0107-f79c-a168faa70b94@oracle.com> Poonam, A new bug must be filed to redo the changes originally done under 8215523. Thanks, David On 12/08/2019 6:22 pm, Severin Gehwolf wrote: > Hi, > > On Sun, 2019-08-11 at 07:25 -0700, Poonam Parhar wrote: >> Hello, >> >> The fix for this bug had to be backed out with '8227178: Backout of >> 8215523' because it had caused timeout failures for some of the CMS >> tests. Those failures get resolved by adding the following check >> before calling recalculate_used_stable() in >> CompactibleFreeListSpace::allocate(): >> >> 1387 // During GC we do not need to recalculate the stable used value for >> 1388 // every allocation in old gen. It is done once at the end of GC instead >> 1389 // for performance reasons. >> 1390 if (!CMSHeap::heap()->is_gc_active()) { >> 1391 recalculate_used_stable(); >> 1392 } >> 1393 >> >> Please review the updated webrev: >> http://cr.openjdk.java.net/~poonam/8215523/webrev.02/ > > + // Returns monotonically increasing stable used space bytes for CMS. > + // This is required for jhat and other memory monitoring tools > > jhat has been removed a while ago: jhat => jstat > > Aside: Why has there not been a new bug filed "Redo: jstat reports > incorrect values for OU for CMS GC". It's confusing to look at JDK- > 8215523, see it resolved and mention a pushed commit in the comments. 
> Isn't that what's usually been done for backouts? > > Thanks, > Severin > From nick.gasson at arm.com Mon Aug 12 10:24:35 2019 From: nick.gasson at arm.com (Nick Gasson) Date: Mon, 12 Aug 2019 18:24:35 +0800 Subject: RFR: 8229118: [TESTBUG] serviceability/sa/ClhsdbFindPC fails on AArch64 In-Reply-To: <39f8fd3f-fcce-0200-3922-33405978e297@redhat.com> References: <4a646485-4230-9632-dfd4-4368d79ba4cd@arm.com> <39f8fd3f-fcce-0200-3922-33405978e297@redhat.com> Message-ID: <5071d6c9-73f8-4c7d-4ae8-3b3d1bcddf82@arm.com> Thanks Andrew. Can someone from the serviceability team check this is OK to push? Nick On 08/08/2019 18:16, Andrew Dinn wrote: > Hi Nick, > > On 08/08/2019 10:32, Nick Gasson wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8229118 >> Webrev: http://cr.openjdk.java.net/~ngasson/8229118/webrev.0/ >> >> This test starts a sub-process with -Xcomp and then uses the SA to get a >> stack trace of it. It expects to see this line: >> >> In code in NMethod for jdk/test/lib/apps/LingeredApp.main >> >> But actually on AArch64 the stack trace looks like this: >> >> - java.lang.Thread.sleep(long) @bci=0, pc=0x0000ffff74603d08, Method*=0x0000ffff031baf98 (Compiled frame; information may be imprecise) >> - jdk.test.lib.apps.LingeredApp.main(java.lang.String[]) @bci=53, line=502, pc=0x0000ffff6c9276e0, Method*=0x0000ffff03611d48 (Interpreted frame) >> >> The main method is interpreted even though we're running with >> -Xcomp. That's because it is deoptimized almost immediately, because >> main calls some methods on java.nio.file.Paths, but that class hasn't >> been loaded when main is compiled. >> >> X86 can patch in the address of the method on-the-fly, but AArch64 can't >> do this because of restrictions on which instructions can be legally >> rewritten. >> >> This patch lifts the code that uses the java.nio classes out of >> LingeredApp::main into a separate static method. 
LingeredApp.main now >> only uses classes that are loaded very early in boot, before main is >> compiled. The stack trace now looks like: >> >> "main" #1 prio=5 tid=0x0000ffffb4022800 nid=0xd610 waiting on condition [0x0000ffffbb755000] >> java.lang.Thread.State: TIMED_WAITING (sleeping) >> JavaThread state: _thread_blocked >> - java.lang.Thread.sleep(long) @bci=0, pc=0x0000fffface414c8, Method*=0x0000ffff3dac8a28 (Compiled frame; information may be imprecise) >> - jdk.test.lib.apps.LingeredApp.pollLockFile(java.lang.String) @bci=30, line=499, pc=0x0000ffffa50818e0, Method*=0x0000ffff3c122cf0 (Interpreted frame) >> - jdk.test.lib.apps.LingeredApp.main(java.lang.String[]) @bci=25, line=529, pc=0x0000ffffa59afd58, Method*=0x0000ffff3c122de0 (Compiled frame) >> >> I.e. pollLockFile was deoptimized to an interpreted frame but >> LingeredApp.main is still a compiled frame which is what ClhsdbFindPC is >> looking for. >> >> This solution does seem a bit hacky, so if it's not acceptable an >> alternative is to just skip the -Xcomp part of the test on AArch64. >> >> Ran a full jtreg test on AArch64/x86 to check for regressions. > Yuck! That's a nice hack to avoid the indeterminate effect of -Xcomp. > However, my gut feeling is still that relying on -Xcomp in tests is just > a /really/ bad idea and I'd prefer to omit it but . . . > > I'm not 100% clear what the point of this test is but it looks like it > is meant to exercise the stack backtrace code when there is a compiled > method on the stack. If so then I guess your hack fits the bill while > removing the -Xcomp flag from the command line would not fulfil the > test's remit. If that is the point of the test then I agree, > reluctantly, that your hack is the right solution. On those grounds I'm > happy to accept the patch. However, I'd prefer someone else (Andrew > Haley?) also to review this before it gets pushed. 
>
> regards,
>
>
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in England and Wales under Company Registration No. 03798903
> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
>

From robbin.ehn at oracle.com Mon Aug 12 12:22:40 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 12 Aug 2019 14:22:40 +0200
Subject: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
In-Reply-To:
References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com>
 <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com>
 <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com>
 <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com>
 <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com>
 <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com>
Message-ID:

Hi Daniil,

I took a new deeper dive into this.

This line seems to have some issues:

  if (ThreadTable::is_initialized() && thread->in_thread_table() &&
      !thread->is_attaching_via_jni()) {

If you create new threads that attach and then die, the table will just
keep growing. So you must remove them also?

Secondly you should not use volatile semantics for _in_thread_table.
The load in the if-statement can be reordered with _is_initialized,
which could lead to a leak or a rogue pointer in the table.

So both "static volatile bool _is_initialized;" and
"volatile bool _in_thread_table;" should be stored with store_release
and loaded with load_acquire.

Unfortunately it looks like there would still be races in
ThreadTable::add_thread, e.g. with a context switch at:

  if (_local_table->insert(thread, lookup, entry)) {
    // HERE
    java_thread->set_in_thread_table(true);

*Remove side can pass the if-statement without removing.

Since this thread may also be exiting at any moment, e.g. with a context switch at:

  if (tobj != NULL && !thread->is_exiting() &&
      java_tid == java_lang_Thread::thread_id(tobj)) {
    // HERE
    ThreadTable::add_thread(java_tid, thread);

*Add side can add a thread that is exiting.
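The release/acquire pairing asked for above can be sketched in standard C++. This is only an illustration under stated assumptions: `std::atomic` stands in for the HotSpot store_release/load_acquire primitives mentioned in the message, and `Table`, `g_table`, `g_initialized`, `create_table` and `table_if_initialized` are hypothetical names, not the webrev's code. It shows the publication ordering for an "initialized" flag only; it does not close the add/remove races described above.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the thread table; not the webrev's ThreadTable.
struct Table {
  int size_log;
};

static Table g_table_storage{0};
static Table* g_table = nullptr;               // plain data guarded by the flag
static std::atomic<bool> g_initialized{false}; // publication flag

// Writer: fill in the table first, then publish with a release store so
// that the writes above cannot be reordered after the flag update.
void create_table(int size_log) {
  g_table_storage.size_log = size_log;
  g_table = &g_table_storage;
  g_initialized.store(true, std::memory_order_release);
}

// Reader: an acquire load of the flag; if it observes true, the table
// contents written before the release store are visible as well.
const Table* table_if_initialized() {
  if (g_initialized.load(std::memory_order_acquire)) {
    return g_table;
  }
  return nullptr;
}
```

A reader that sees the flag set is guaranteed to see a fully set-up table, which is the property a plain volatile load does not provide on its own.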
Mixing in a third thread looking up a random tid and getting a JavaThread*, it
must validate it against its ThreadsList, making the hashtable useless.

So I think the only one adding and removing should be the thread itself.
1: Add to ThreadsList
2: Add to ThreadTable
3: Remove from ThreadTable
4: Remove from ThreadsList

Between 1-2 and 3-4 the thread would be looked up via linear scan.

I don't see an easy way around the start-up issue with this.
Maybe have the cache in Java.
Pass in the thread obj into a
java_sun_management_ThreadImpl_getThreadTotalCpuTime3 instead, thus skipping any
look-ups in native.

Thanks, Robbin

On 8/12/19 5:49 AM, Daniil Titov wrote:
> Hi David, Robbin, Daniel, and Serguei,
>
> Please review a new version of the fix.
>
> As David suggested I created a separate Jira issue [1] to cover additional optimization for
> some callers of find_JavaThread_from_java_tid() and this version of the fix no longer includes
> changes in management.cpp (and the test related to these changes).
>
> Regarding the impact the previous version of the fix had on the thread startup time at heavy load (e.g.
> when 5000 threads are created and destroyed every second) I tried a different approach that makes
> calls to ThreadTable::add_thread and ThreadTable::remove_thread asynchronous and offloads the
> work for actual modifications of the thread table to a periodic task that runs every 5 seconds. With the
> same stress test scenario (the test does some warm-up and then measures the time it takes to create
> and start 100,000 threads; every thread just sleeps for 100 ms) the impact on the thread startup time
> was reduced to 1.2% (from 2.7%).
>
> The cause of this impact in this stress test scenario is that as soon as the thread table is initialized,
> additional work to insert and delete entries in the thread table has to be performed, even if
> com.sun.management.ThreadMXBean methods are no longer called.
For example, In the stress test > mentioned above, every second about 5000 entries had to be inserted in the table and then deleted. > > That doesn't look right and the new version of the fix uses the different approach: the thread is added to > the thread table only when this thread is requested by com.sun.management.ThreadMXBean bean. Every > time when find_JavaThread_from_java_tid() is called for a new tid, the thread is found by the iterating over > the thread list and added to the thread table. All consequent calls to find_JavaThread_from_java_tid() for > the same tid returns the thread from the thread table. > > Running stress test for the cases when the thread table is enabled and not showed no difference in the > average thread startup times. > > [1] : https://bugs.openjdk.java.net/browse/JDK-8229391 > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.05/ > > Thanks, > Daniil > > ?On 8/4/19, 7:54 PM, "David Holmes" wrote: > > Hi Daniil, > > On 3/08/2019 8:16 am, Daniil Titov wrote: > > Hi David, > > > > Thank you for your detailed review. Please review a new version of the fix that includes > > the changes you suggested: > > - ThreadTableCreate_lock scope is reduced to cover the creation of the table only; > > - ThreadTableCreate_lock is made _safepoint_check_always; > > Okay. > > > - ServiceThread is no longer responsible for the resizing of the thread table, instead, > > the thread table is changed to grow on demand by the thread that is doing the addition; > > Okay - I'm happy to get the serviceThread out of the picture here. > > > - fixed nits and formatting issues. > > Okay. > > >>> The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > >>> as Daniel suggested. > >> Not sure it's best to combine these, but if they are limited to the > >> changes in management.cpp only then that may be okay. 
> > > > The additional optimization for some callers of find_JavaThread_from_java_tid() is > > limited to management.cpp (plus a new test) so I left them in the webrev but > > I also could move it in the separate issue if required. > > I'd prefer this part of be separated out, but won't insist. Let's see if > Dan or Serguei have a strong opinion. > > > > src/hotspot/share/runtime/threadSMR.cpp > > >755 jlong tid = SharedRuntime::get_java_tid(thread); > > > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > > I think it cleaner/better to just use > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > > as we know thread is not NULL, it is a JavaThread and it has to have a > > > non-null threadObj. > > > > I had to leave this code unchanged since it turned out the threadObj is null > > when VM is destroyed: > > > > V [libjvm.so+0xe165d7] oopDesc::long_field(int) const+0x67 > > V [libjvm.so+0x16e06c6] ThreadsSMRSupport::add_thread(JavaThread*)+0x116 > > V [libjvm.so+0x16d1302] Threads::add(JavaThread*, bool)+0x82 > > V [libjvm.so+0xef8369] attach_current_thread.part.197+0xc9 > > V [libjvm.so+0xec136c] jni_DestroyJavaVM+0x6c > > C [libjli.so+0x4333] JavaMain+0x2c3 > > C [libjli.so+0x8159] ThreadJavaMain+0x9 > > This is actually nothing to do with the VM being destroyed, but is an > issue with JNI_AttachCurrentThread and its interaction with the > ThreadSMR iterators. The attach process is: > - create JavaThread > - mark as "is attaching via jni" > - add to ThreadsList > - create java.lang.Thread object (you can only execute Java code after > you are attached) > - mark as "attach completed" > > So while a thread "is attaching" it will be seen by the ThreadSMR thread > iterator but will have a NULL java.lang.Thread object. > > We special-case attaching threads in a number of places in the VM and I > think we should be explicitly doing something here to filter out > attaching threads, rather than just being tolerant of a NULL j.l.Thread > object. 
Specifically in ThreadsSMRSupport::add_thread:
>
>     if (ThreadTable::is_initialized() && !thread->is_attaching_via_jni()) {
>       jlong tid = java_lang_Thread::thread_id(thread->threadObj());
>       ThreadTable::add_thread(tid, thread);
>     }
>
>     Note that in ThreadsSMRSupport::remove_thread we can use the same guard,
>     which covers the case where the JNI attach encountered an error trying to
>     create the j.l.Thread object.
>
>     >> src/hotspot/share/services/threadTable.cpp
>     >> 71     static uintx get_hash(Value const& value, bool* is_dead) {
>     >
>     >> The is_dead parameter still bothers me here. I can't make enough sense
>     >> out of the template code in ConcurrentHashTable to see why we have to
>     >> have it, but I'm concerned that its very existence means we perhaps
>     >> should not be trying to extend CHT in this context. ??
>     >
>     > My understanding is that the is_dead parameter provides a mechanism for
>     > ConcurrentHashTable to remove stale entries that were not explicitly
>     > removed by calling the ConcurrentHashTable::remove() method.
>     > I think that just because in our case we don't use this mechanism doesn't
>     > mean we should not use ConcurrentHashTable.
>
>     Can you confirm that this usage is okay with Robbin Ehn please. He's
>     back from vacation this week.
>
>     >> I would still want to see what impact this has on thread
>     >> startup cost, both with and without the table being initialized.
>     >
>     > I ran a test that initializes the table by calling ThreadMXBean.getThreadInfo(),
>     > starts some threads as a warm-up, and then creates and starts 100,000 threads
>     > (each thread just sleeps for 100 ms). When the thread table is enabled,
>     > 100,000 threads are created and started in about 15200 ms. If the thread table
>     > is off the test takes about 14800 ms. Based on this information the enabled
>     > thread table makes the thread startup about 2.7% slower.
>
>     That doesn't sound very good. I think we may need to get Claes involved to
>     help investigate overall performance impact here.
> > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.04/ > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > No further code comments. > > I didn't look at the test in detail. > > Thanks, > David > > > Thanks! > > --Daniil > > > > > > ?On 7/29/19, 12:53 AM, "David Holmes" wrote: > > > > Hi Daniil, > > > > Overall I think this is a reasonable approach but I would still like to > > see some performance and footprint numbers, both to verify it fixes the > > problem reported, and that we are not getting penalized elsewhere. > > > > On 25/07/2019 3:21 am, Daniil Titov wrote: > > > Hi David, Daniel, and Serguei, > > > > > > Please review the new version of the fix, that makes the thread table initialization on demand and > > > moves it inside ThreadsList::find_JavaThread_from_java_tid(). At the creation time the thread table > > > is initialized with the threads from the current thread list. We don't want to hold Threads_lock > > > inside find_JavaThread_from_java_tid(), thus new threads still could be created while the thread > > > table is being initialized . Such threads will be found by the linear search and added to the thread table > > > later, in ThreadsList::find_JavaThread_from_java_tid(). > > > > The initialization allows the created but unpopulated, or partially > > populated, table to be seen by other threads - is that your intention? > > It seems it should be okay as the other threads will then race with the > > initializing thread to add specific entries, and this is a concurrent > > map so that should be functionally correct. But if so then I think you > > can also reduce the scope of the ThreadTableCreate_lock so that it > > covers creation of the table only, not the initial population of the table. > > > > I like the approach of only initializing the table when needed and using > > that to control when the add/remove-thread code needs to update the > > table. 
But I would still want to see what impact this has on thread > > startup cost, both with and without the table being initialized. > > > > > The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > > > as Daniel suggested. > > > > Not sure it's best to combine these, but if they are limited to the > > changes in management.cpp only then that may be okay. It helps to be > > able to focus on the table related changes without being distracted by > > other optimizations. > > > > > That is correct that ResolvedMethodTable was used as a blueprint for the thread table, however, I tried > > > to strip it of the all functionality that is not required in the thread table case. > > > > The revised version seems better in that regard. But I still have a > > concern, see below. > > > > > We need to have the thread table resizable and allow it to grow as the number of threads increases to avoid > > > reserving excessive memory a-priori or deteriorating lookup times. The ServiceThread is responsible for > > > growing the thread table when required. > > > > Yes but why? Why can't this table be grown on demand by the thread that > > is doing the addition? For other tables we may have to delegate to the > > service thread because the current thread cannot perform the action, or > > it doesn't want to perform it at the time the need for the resize is > > detected (e.g. its detected at a safepoint and you want the resize to > > happen later outside the safepoint). It's not apparent to me that such > > restrictions apply here. > > > > > There is no ConcurrentHashTable available in Java 8 and for backporting this fix to Java 8 another implementation > > > of the hash table, probably originally suggested in the patch attached to the JBS issue, should be used. 
It will make > > > the backporting more complicated, however, adding a new Implementation of the hash table in Java 14 while it > > > already has ConcurrentHashTable doesn't seem reasonable for me. > > > > Ok. > > > > > Webrev: http://cr.openjdk.java.net/~dtitov/8185005/webrev.03 > > > > Some specific code comments: > > > > src/hotspot/share/runtime/mutexLocker.cpp > > > > + def(ThreadTableCreate_lock , PaddedMutex , special, > > false, Monitor::_safepoint_check_never); > > > > I think this needs to be a _safepoint_check_always lock. The table will > > be created by regular JavaThreads and they should (nearly) always be > > checking for safepoints if they are going to block acquiring the lock. > > And it isn't at all obvious that the thread doing the creation can't go > > to a safepoint whilst this lock is held. > > > > --- > > > > src/hotspot/share/runtime/threadSMR.cpp > > > > Nit: > > > > 618 JavaThread* thread = thread_at(i); > > > > you could reuse the new java_thread local you introduced at line 613 and > > just rename that "new" variable to "thread" so you don't have to change > > all other uses. > > > > 628 } else if (java_thread != NULL && ... > > > > You don't need to check != NULL here as you only get here when > > java_thread is not NULL. > > > > 755 jlong tid = SharedRuntime::get_java_tid(thread); > > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > > > I think it cleaner/better to just use > > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > > > as we know thread is not NULL, it is a JavaThread and it has to have a > > non-null threadObj. > > > > --- > > > > src/hotspot/share/services/management.cpp > > > > 1323 if (THREAD->is_Java_thread()) { > > 1324 JavaThread* current_thread = (JavaThread*)THREAD; > > > > These calls can only be made on a JavaThread so this be simplified to > > remove the is_Java_thread() call. Similarly in other places. 
> > > > --- > > > > src/hotspot/share/services/threadTable.cpp > > > > 55 class ThreadTableEntry : public CHeapObj { > > 56 private: > > 57 jlong _tid; > > > > I believe hotspot style is to not indent the access modifiers in C++ > > class declarations, so the above would just be: > > > > 55 class ThreadTableEntry : public CHeapObj { > > 56 private: > > 57 jlong _tid; > > > > etc. > > > > 60 ThreadTableEntry(jlong tid, JavaThread* java_thread) : > > 61 _tid(tid),_java_thread(java_thread) {} > > > > line 61 should be indented as it continues line 60. > > > > 67 class ThreadTableConfig : public AllStatic { > > ... > > 71 static uintx get_hash(Value const& value, bool* is_dead) { > > > > The is_dead parameter still bothers me here. I can't make enough sense > > out of the template code in ConcurrentHashtable to see why we have to > > have it, but I'm concerned that its very existence means we perhaps > > should not be trying to extend CHT in this context. ?? > > > > 115 size_t start_size_log = size_log > DefaultThreadTableSizeLog > > 116 ? size_log : DefaultThreadTableSizeLog; > > > > line 116 should be indented, though in this case I think a better layout > > would be: > > > > 115 size_t start_size_log = > > 116 size_log > DefaultThreadTableSizeLog ? size_log : > > DefaultThreadTableSizeLog; > > > > 131 double ThreadTable::get_load_factor() { > > 132 return (double)_items_count/_current_size; > > 133 } > > > > Not sure that is doing what you want/expect. It will perform integer > > division and then cast that whole integer to a double. If you want > > double arithmetic you need: > > > > return ((double)_items_count)/_current_size; > > > > 180 jlong _tid; > > 181 uintx _hash; > > > > Nit: no need for all those spaces before the variable name. > > > > 183 ThreadTableLookup(jlong tid) > > 184 : _tid(tid), _hash(primitive_hash(tid)) {} > > > > line 184 should be indented. 
> > > > 201 ThreadGet():_return(NULL) {} > > > > Nit: need space after : > > > > 211 assert(_is_initialized, "Thread table is not initialized"); > > 212 _has_work = false; > > > > line 211 is indented one space too far. > > > > 229 ThreadTableEntry* entry = new ThreadTableEntry(tid,java_thread); > > > > Nit: need space after , > > > > 252 return _local_table->remove(thread,lookup); > > > > Nit: need space after , > > > > Thanks, > > David > > ------ > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > Thanks! > > > --Daniil > > > > > > > > > ?On 7/8/19, 3:24 PM, "Daniel D. Daugherty" wrote: > > > > > > On 6/29/19 12:06 PM, Daniil Titov wrote: > > > > Hi Serguei and David, > > > > > > > > Serguei is right, ThreadTable::find_thread(java_tid) cannot return a JavaThread with an unmatched java_tid. > > > > > > > > Please find a new version of the fix that includes the changes Serguei suggested. > > > > > > > > Regarding the concern about the maintaining the thread table when it may never even be queried, one of > > > > the options could be to add ThreadTable ::isEnabled flag, set it to "false" by default, and wrap the calls to the thread table > > > > in ThreadsSMRSupport add_thread() and remove_thread() methods to check this flag. > > > > > > > > When ThreadsList::find_JavaThread_from_java_tid() is called for the first time it could check if ThreadTable ::isEnabled > > > > Is on and if not then set it on and populate the thread table with all existing threads from the thread list. > > > > > > I have the same concerns as David H. about this new ThreadTable. > > > ThreadsList::find_JavaThread_from_java_tid() is only called from code > > > in src/hotspot/share/services/management.cpp so I think that table > > > needs to enabled and populated only if it is going to be used. > > > > > > I've taken a look at the webrev below and I see that David has > > > followed up with additional comments. 
Before I do a crawl through > > > code review for this, I would like to see the ThreadTable stuff > > > made optional and David's other comments addressed. > > > > > > Another possible optimization is for callers of > > > find_JavaThread_from_java_tid() to save the calling thread's > > > tid value before they loop and if the current tid == saved_tid > > > then use the current JavaThread* instead of calling > > > find_JavaThread_from_java_tid() to get the JavaThread*. > > > > > > Dan > > > > > > > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.02/ > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > > > Thanks! > > > > --Daniil > > > > > > > > From: > > > > Organization: Oracle Corporation > > > > Date: Friday, June 28, 2019 at 7:56 PM > > > > To: Daniil Titov , OpenJDK Serviceability , "hotspot-runtime-dev at openjdk.java.net" , "jmx-dev at openjdk.java.net" > > > > Subject: Re: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth) > > > > > > > > Hi Daniil, > > > > > > > > I have several quick comments. > > > > > > > > The indent in the hotspot c/c++ files has to be 2, not 4. > > > > > > > > https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/src/hotspot/share/runtime/threadSMR.cpp.frames.html > > > > 614 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const { > > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > > 616 if (java_thread == NULL && java_tid == PMIMORDIAL_JAVA_TID) { > > > > 617 // ThreadsSMRSupport::add_thread() is not called for the primordial > > > > 618 // thread. Thus, we find this thread with a linear search and add it > > > > 619 // to the thread table. 
> > > > 620 for (uint i = 0; i < length(); i++) { > > > > 621 JavaThread* thread = thread_at(i); > > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > > 623 ThreadTable::add_thread(java_tid, thread); > > > > 624 return thread; > > > > 625 } > > > > 626 } > > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > > 628 return java_thread; > > > > 629 } > > > > 630 return NULL; > > > > 631 } > > > > 632 bool ThreadsList::is_valid_java_thread(jlong java_tid, JavaThread* java_thread) { > > > > 633 oop tobj = java_thread->threadObj(); > > > > 634 // Ignore the thread if it hasn't run yet, has exited > > > > 635 // or is starting to exit. > > > > 636 return (tobj != NULL && !java_thread->is_exiting() && > > > > 637 java_tid == java_lang_Thread::thread_id(tobj)); > > > > 638 } > > > > > > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > > > > > > I'd suggest to rename find_thread() to find_thread_by_tid(). > > > > > > > > A space is missed after the comma: > > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > > > > > > An empty line is needed before L632. > > > > > > > > The name 'is_valid_java_thread' looks wrong (or confusing) to me. > > > > Something like 'is_alive_java_thread_with_tid()' would be better. > > > > It'd better to list parameters in the opposite order. > > > > > > > > The call to is_valid_java_thread() is confusing: > > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > > > > > > Why would the call ThreadTable::find_thread(java_tid) return a JavaThread with an unmatched java_tid? > > > > > > > > > > > > Thanks, > > > > Serguei > > > > > > > > On 6/28/19, 9:40 PM, "David Holmes" wrote: > > > > > > > > Hi Daniil, > > > > > > > > The definition and use of this hashtable (yet another hashtable > > > > implementation!) will need careful examination. 
We have to be concerned > > > > about the cost of maintaining it when it may never even be queried. You > > > > would need to look at footprint cost and performance impact. > > > > > > > > Unfortunately I'm just about to board a plane and will be out for the > > > > next few days. I will try to look at this asap next week, but we will > > > > need a lot more data on it. > > > > > > > > Thanks, > > > > David > > > > > > > > On 6/28/19 3:31 PM, Daniil Titov wrote: > > > > Please review the change that improves performance of ThreadMXBean MXBean methods returning the > > > > information for specific threads. The change introduces the thread table that uses ConcurrentHashTable > > > > to store one-to-one the mapping between the thread ids and JavaThread objects and replaces the linear > > > > search over the thread list in ThreadsList::find_JavaThread_from_java_tid(jlong tid) method with the lookup > > > > in the thread table. > > > > > > > > Testing: Mach5 tier1,tier2 and tier3 tests successfully passed. > > > > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/ > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > > > Thanks! 
> > > > > > > > Best regards, > > > > Daniil > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From poonam.bajaj at oracle.com Mon Aug 12 12:56:01 2019 From: poonam.bajaj at oracle.com (Poonam Parhar) Date: Mon, 12 Aug 2019 05:56:01 -0700 Subject: RFR 8215523: jstat reports incorrect values for OU for CMS GC In-Reply-To: References: <64c5a88a-a39d-fedb-7738-998bca61b6a7@oracle.com> <4bb04022-9460-3956-cf86-447bafab0bf4@oracle.com> <3744fdba-2755-34fb-f0d5-76543a1faa68@redhat.com> <4e7eee4a06416761e1ee63be678c71f1426c9adb.camel@oracle.com> <5b320fa0-fb15-8509-80d5-fcc7f12bd253@oracle.com> Message-ID: Hello Severin, On 8/12/19 1:22 AM, Severin Gehwolf wrote: > Hi, > > On Sun, 2019-08-11 at 07:25 -0700, Poonam Parhar wrote: >> Hello, >> >> The fix for this bug had to be backed out with '8227178: Backout of >> 8215523' because it had caused timeout failures for some of the CMS >> tests. Those failures get resolved by adding the following check >> before calling recalculate_used_stable() in >> CompactibleFreeListSpace::allocate(): >> >> 1387 // During GC we do not need to recalculate the stable used value for >> 1388 // every allocation in old gen. It is done once at the end of GC instead >> 1389 // for performance reasons. >> 1390 if (!CMSHeap::heap()->is_gc_active()) { >> 1391 recalculate_used_stable(); >> 1392 } >> 1393 >> >> Please review the updated webrev: >> http://cr.openjdk.java.net/~poonam/8215523/webrev.02/ > + // Returns monotonically increasing stable used space bytes for CMS. > + // This is required for jhat and other memory monitoring tools > > jhat has been removed a while ago: jhat => jstat A typo from the previous changes. Will fix it. > > Aside: Why has there not been a new bug filed "Redo: jstat reports > incorrect values for OU for CMS GC". It's confusing to look at JDK- > 8215523, see it resolved and mention a pushed commit in the comments. > Isn't that what's usually been done for backouts? 
My mistake. I will file another bug and will then re-submit the review request. Thanks, Poonam > Thanks, > Severin > From adam.farley at uk.ibm.com Mon Aug 12 14:34:21 2019 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Mon, 12 Aug 2019 15:34:21 +0100 Subject: RFR: 8229378: jdwp library loader in linker_md.c quietly truncates on buffer overflow Message-ID: Hi All, This is a known bug, mentioned in a code comment. Here is the fix for that bug. Reviewers and sponsors requested. Short version: if you set sun.boot.library.path to something beyond a system's max path length, the current code will return an empty string (rather than printing a useful error message and shutting down). This is also a problem if you've specified multiple paths with a separator, as this code seems to wrongly assess whether the *total* length exceeds max path length. So two 200 char paths on windows will cause failure, as the total length is 400 (which is beyond max length for windows). Note that the os.cpp bit of the webrev will not be included in the final webrev, it just makes this change trivially testable. Bug: https://bugs.openjdk.java.net/browse/JDK-8229378 Webrev: http://cr.openjdk.java.net/~afarley/8229378/webrev/ Best Regards Adam Farley IBM Runtimes Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Mon Aug 12 20:35:06 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 12 Aug 2019 13:35:06 -0700 Subject: RFR: 8229378: jdwp library loader in linker_md.c quietly truncates on buffer overflow In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... 
URL: From chris.plummer at oracle.com Mon Aug 12 20:45:02 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 12 Aug 2019 13:45:02 -0700 Subject: RFR: 8229118: [TESTBUG] serviceability/sa/ClhsdbFindPC fails on AArch64 In-Reply-To: <5071d6c9-73f8-4c7d-4ae8-3b3d1bcddf82@arm.com> References: <4a646485-4230-9632-dfd4-4368d79ba4cd@arm.com> <39f8fd3f-fcce-0200-3922-33405978e297@redhat.com> <5071d6c9-73f8-4c7d-4ae8-3b3d1bcddf82@arm.com> Message-ID: Hi Nick, Adding to Andrew's comments, maybe the solution is to have the test extend LingeredApp so it can produce a more reliable stack trace than the default one you get with LingeredApp. If that's too much trouble, I don't mind the solution you came up with, but it seems that writing a LingeredApp subclass that is specific to this test would be cleaner. thanks, Chris On 8/12/19 3:24 AM, Nick Gasson wrote: > Thanks Andrew. Can someone from the serviceability team check this is > OK to push? > > Nick > > > On 08/08/2019 18:16, Andrew Dinn wrote: >> Hi Nick, >> >> On 08/08/2019 10:32, Nick Gasson wrote: >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229118 >>> Webrev: http://cr.openjdk.java.net/~ngasson/8229118/webrev.0/ >>> >>> This test starts a sub-process with -Xcomp and then uses the SA to >>> get a >>> stack trace of it. It expects to see this line: >>> >>> In code in NMethod for jdk/test/lib/apps/LingeredApp.main >>> >>> But actually on AArch64 the stack trace looks like this: >>> >>> - java.lang.Thread.sleep(long) @bci=0, pc=0x0000ffff74603d08, >>> Method*=0x0000ffff031baf98 (Compiled frame; information may be >>> imprecise) >>> - jdk.test.lib.apps.LingeredApp.main(java.lang.String[]) @bci=53, >>> line=502, pc=0x0000ffff6c9276e0, Method*=0x0000ffff03611d48 >>> (Interpreted frame) >>> >>> The main method is interpreted even though we're running with >>> -Xcomp.
That's because it is deoptimized almost immediately, because >>> main calls some methods on java.nio.file.Paths, but that class hasn't >>> been loaded when main is compiled. >>> >>> X86 can patch in the address of the method on-the-fly, but AArch64 >>> can't >>> do this because of restrictions on which instructions can be legally >>> rewritten. >>> >>> This patch lifts the code that uses the java.nio classes out of >>> LingeredApp::main into a separate static method. LingeredApp.main now >>> only uses classes that are loaded very early in boot, before main is >>> compiled. The stack trace now looks like: >>> >>> "main" #1 prio=5 tid=0x0000ffffb4022800 nid=0xd610 waiting on >>> condition [0x0000ffffbb755000] >>> java.lang.Thread.State: TIMED_WAITING (sleeping) >>> JavaThread state: _thread_blocked >>> - java.lang.Thread.sleep(long) @bci=0, pc=0x0000fffface414c8, >>> Method*=0x0000ffff3dac8a28 (Compiled frame; information may be >>> imprecise) >>> - jdk.test.lib.apps.LingeredApp.pollLockFile(java.lang.String) >>> @bci=30, line=499, pc=0x0000ffffa50818e0, Method*=0x0000ffff3c122cf0 >>> (Interpreted frame) >>> - jdk.test.lib.apps.LingeredApp.main(java.lang.String[]) @bci=25, >>> line=529, pc=0x0000ffffa59afd58, Method*=0x0000ffff3c122de0 >>> (Compiled frame) >>> >>> I.e. pollLockFile was deoptimized to an interpreted frame but >>> LingeredApp.main is still a compiled frame which is what >>> ClhsdbFindPC is >>> looking for. >>> >>> This solution does seem a bit hacky, so if it's not acceptable an >>> alternative is to just skip the -Xcomp part of the test on AArch64. >>> >>> Ran a full jtreg test on AArch64/x86 to check for regressions. >> Yuck! That's a nice hack to avoid the indeterminate effect of -Xcomp. >> However, my gut feeling is still that relying on -Xcomp in tests is just >> a /really/ bad idea and I'd prefer to omit it but . . .
>> >> I'm not 100% clear what the point of this test is but it looks like it >> is meant to exercise the stack backtrace code when there is a compiled >> method on the stack. If so then I guess your hack fits the bill while >> removing the -Xcomp flag from the command line would not fulfil the >> test's remit. If that is the point of the test then I agree, >> reluctantly, that your hack is the right solution. On those grounds I'm >> happy to accept the patch. However, I'd prefer someone else (Andrew >> Haley?) also to review this before it gets pushed. >> >> regards, >> >> >> Andrew Dinn >> ----------- >> Senior Principal Software Engineer >> Red Hat UK Ltd >> Registered in England and Wales under Company Registration No. 03798903 >> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander >> From mikhailo.seledtsov at oracle.com Mon Aug 12 22:59:05 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Mon, 12 Aug 2019 15:59:05 -0700 Subject: RFR(S): 8228960: [TESTBUG] containers/docker/TestJcmdWithSideCar.java: jcmd reports main class as 'Unknown' In-Reply-To: <9418a5e6-750d-80bf-4060-d023812f37e7@oracle.com> References: <9c9ec6b1-139a-4e67-8460-b586456fd493@oracle.com> <854c1dfe97937389230599e72c22aa0cbb0bf600.camel@redhat.com> <5D4B58E9.2070002@oracle.com> <9418a5e6-750d-80bf-4060-d023812f37e7@oracle.com> Message-ID: Here is an updated webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.01/ I am using a simple file-based mechanism to communicate between the processes. The "EventGeneratorLoop" process creates a specific "signal" file on a shared mounted volume, while the main test process waits for the file to exist before running the test cases. Passes on Linux-x64 Docker-enabled host. Testing in the test cluster is in progress. Thank you, Misha On 8/7/19 5:11 PM, David Holmes wrote: > On 8/08/2019 9:04 am, Mikhailo Seledtsov wrote: >> Hi Severin, Bob, >> >> Thank you for reviewing the code.
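The file-based handshake described in the updated webrev message above can be sketched roughly as follows. This is an illustrative sketch only, not the code from the webrev: the class name SignalFileWait, the method waitForFile, the polling interval, and the temporary directory standing in for the shared Docker volume are all assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of "wait for a signal file with a timeout".
class SignalFileWait {

    // Polls until `signal` exists or `timeoutMillis` has elapsed.
    // Returns true if the file appeared in time, false on timeout.
    static boolean waitForFile(Path signal, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!Files.exists(signal)) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollMillis);
        }
        return true;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // A temp directory stands in for the volume shared by both containers.
        Path shared = Files.createTempDirectory("shared-volume");
        Path signal = shared.resolve("signal");

        // Stand-in for the side-car process that creates the file once it is ready.
        Thread producer = new Thread(() -> {
            try {
                Thread.sleep(200);
                Files.createFile(signal);
            } catch (IOException | InterruptedException e) {
                throw new RuntimeException(e);
            }
        });
        producer.start();

        System.out.println("signal seen: " + waitForFile(signal, 5_000, 50));
        producer.join();
        System.out.println("absent file seen: " + waitForFile(shared.resolve("absent"), 300, 50));
    }
}
```

Compared with retrying 'jcmd -l' and sleeping, the waiting side here only depends on an event that the target process produces explicitly, and the timeout bounds how long a broken run can hang.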
>> >> On 8/7/19, 11:38 AM, Bob Vandette wrote: >>> Can't you come up with a better way of synchronizing the test by >>> possibly writing a >>> file and waiting for it to exist with a timeout? >> I will try out this approach. > > This seems like a fundamental problem with jcmd - so cc'ing > serviceability-dev. > > But I'm pretty sure they recently addressed a similar issue with the > premature sending of the attach signal? > > David > ----- > >> Thanks, >> Misha >>> Isn't there a shared volume between the two >>> processes? >>> >>> We've been fighting test reliability for a while now. I can only >>> hope we're getting >>> to the end. >>> >>> Bob. >>> >>>> On Aug 7, 2019, at 2:18 PM, Severin Gehwolf >>>> wrote: >>>> >>>> Hi Misha, >>>> >>>> On Tue, 2019-08-06 at 20:17 -0700, mikhailo.seledtsov at oracle.com >>>> wrote: >>>>> Please review this change that fixes a container test >>>>> TestJcmdWithSideCar. >>>>> >>>>> My investigation indicated that a root cause for this failure is: >>>>> JCMD -l shows 'Unknown' for class name because the main class has not >>>>> been loaded yet. >>>>> The target test JVM has started, it is initializing, but has not >>>>> loaded >>>>> the main test class. >>>> That's what I've found too. >>>> >>>>> The proposed solution is to try 'jcmd -l' several times, with a short >>>>> sleep in between. >>>> Thread.sleep() isn't great, but I'm not sure there is an alternative. >>>> >>>>> Also I have commented out the testCase02() due to another bug: >>>>> "JDK-8228850: jhsdb jinfo fails with ClassCastException: >>>>> s.j.h.oops.TypeArray cannot be cast to s.j.h.oops.Instance", >>>>> which is not a test bug. IMO, it is better to run the test and skip a >>>>> sub-case than to skip the entire test. >>>>> >>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8228960 >>>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.00/ >>>> Looks OK to me.
>>>> >>>> Thanks, >>>> Severin >>>> From daniil.x.titov at oracle.com Mon Aug 12 23:24:30 2019 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Mon, 12 Aug 2019 16:24:30 -0700 Subject: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth) In-Reply-To: References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com> <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com> <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com> <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com> <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com> <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com> Message-ID: <48311B39-43F9-49E5-BEC7-A64F9D2588AF@oracle.com> Hi Robbin, Thank you very much for reviewing this version of the fix! Based on your findings it seems as it makes sense to make a step back and continue with the approach we took before in the previous version of the webrev (webrev.04), and get more information about the impact on the startup time it has. I will consult with Claus regarding this and then share the findings. Thanks again, --Daniil ?On 8/12/19, 5:22 AM, "Robbin Ehn" wrote: Hi Daniil, I took a new deeper dive into this. This line seems to have some issues: if (ThreadTable::is_initialized() && thread->in_thread_table() && !thread->is_attaching_via_jni()) { If you create new threads which attaches and then dies, the table will just keep growing. So you must remove them also ? Secondly you should not use volatile semantics for _in_thread_table. The load in the if-statement can be reordered with _is_initialized. Which could lead to a leak, rogue pointer in the table. So both "static volatile bool _is_initialized;" and "volatile bool _in_thread_table; " should be stored with store_release and loaded with load_acquire. Unfortunately it looks like there still would be races if ThreadTable::add_thread e.g. 
context switch at: if (_local_table->insert(thread, lookup, entry)) { // HERE java_thread->set_in_thread_table(true); *Remove side can pass the if-statement without removing. Since this thread also maybe exiting at any moment, e.g. context switch: if (tobj != NULL && !thread->is_exiting() && java_tid == java_lang_Thread::thread_id(tobj)) { // HERE ThreadTable::add_thread(java_tid, thread); *Add side can add a thread that is exiting. Mixing in a third thread looking up a random tid and getting a JavaThread*, it must validate it against it's ThreadsList. Making the hashtable useless. So I think the only one adding and removing should be the thread itself. 1:Add to ThreadsList 2:Add to ThreadTable 3:Remove from ThreadTable 4:Remove ThreadsList Between 1-2 and 3-4 the thread would be looked-up via linear scan. I don't see an easy way around the start-up issue with this. Maybe have the cache in Java. Pass in the thread obj into a java_sun_management_ThreadImpl_getThreadTotalCpuTime3 instead, thus skipping any look-ups in native. Thanks, Robbin On 8/12/19 5:49 AM, Daniil Titov wrote: > Hi David, Robbin, Daniel, and Serguei, > > Please review a new version of the fix. > > As David suggested I created a separated Jira issue [1] to cover additional optimization for > some callers of find_JavaThread_from_java_tid() and this version of the fix no longer includes > changes in management.cpp ( and the test related with these changes). > > Regarding the impact the previous version of the fix had on the thread startup time at heavy load (e.g. > when 5000 threads are created and destroyed every second) I tried a different approach that makes > calls to ThreadTable::add_thread and ThreadTable::remove_thread asynchronous and offloads the > work for actual modifications of the thread table to a periodic task that runs every 5 seconds. 
With the > same stress test scenario (the test does some warm-up and then measures the time it takes to create > and start 100,000 threads; every thread just sleeps for 100 ms) the impact on the thread startup time > was reduced to 1.2% ( from 2.7%). > > The cause of this impact in this stress test scenario is that as soon as the thread table is initialized, > an additional work to insert and delete entries in the thread table should be performed, even if > com.sun.management.ThreadMXBean methods are no longer called. For example, In the stress test > mentioned above, every second about 5000 entries had to be inserted in the table and then deleted. > > That doesn't look right and the new version of the fix uses the different approach: the thread is added to > the thread table only when this thread is requested by com.sun.management.ThreadMXBean bean. Every > time when find_JavaThread_from_java_tid() is called for a new tid, the thread is found by the iterating over > the thread list and added to the thread table. All consequent calls to find_JavaThread_from_java_tid() for > the same tid returns the thread from the thread table. > > Running stress test for the cases when the thread table is enabled and not showed no difference in the > average thread startup times. > > [1] : https://bugs.openjdk.java.net/browse/JDK-8229391 > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.05/ > > Thanks, > Daniil > > ?On 8/4/19, 7:54 PM, "David Holmes" wrote: > > Hi Daniil, > > On 3/08/2019 8:16 am, Daniil Titov wrote: > > Hi David, > > > > Thank you for your detailed review. Please review a new version of the fix that includes > > the changes you suggested: > > - ThreadTableCreate_lock scope is reduced to cover the creation of the table only; > > - ThreadTableCreate_lock is made _safepoint_check_always; > > Okay. 
> > > - ServiceThread is no longer responsible for the resizing of the thread table, instead, > > the thread table is changed to grow on demand by the thread that is doing the addition; > > Okay - I'm happy to get the serviceThread out of the picture here. > > > - fixed nits and formatting issues. > > Okay. > > >>> The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > >>> as Daniel suggested. > >> Not sure it's best to combine these, but if they are limited to the > >> changes in management.cpp only then that may be okay. > > > > The additional optimization for some callers of find_JavaThread_from_java_tid() is > > limited to management.cpp (plus a new test) so I left them in the webrev but > > I also could move it in the separate issue if required. > > I'd prefer this part of be separated out, but won't insist. Let's see if > Dan or Serguei have a strong opinion. > > > > src/hotspot/share/runtime/threadSMR.cpp > > >755 jlong tid = SharedRuntime::get_java_tid(thread); > > > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > > I think it cleaner/better to just use > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > > as we know thread is not NULL, it is a JavaThread and it has to have a > > > non-null threadObj. > > > > I had to leave this code unchanged since it turned out the threadObj is null > > when VM is destroyed: > > > > V [libjvm.so+0xe165d7] oopDesc::long_field(int) const+0x67 > > V [libjvm.so+0x16e06c6] ThreadsSMRSupport::add_thread(JavaThread*)+0x116 > > V [libjvm.so+0x16d1302] Threads::add(JavaThread*, bool)+0x82 > > V [libjvm.so+0xef8369] attach_current_thread.part.197+0xc9 > > V [libjvm.so+0xec136c] jni_DestroyJavaVM+0x6c > > C [libjli.so+0x4333] JavaMain+0x2c3 > > C [libjli.so+0x8159] ThreadJavaMain+0x9 > > This is actually nothing to do with the VM being destroyed, but is an > issue with JNI_AttachCurrentThread and its interaction with the > ThreadSMR iterators. 
The attach process is: > - create JavaThread > - mark as "is attaching via jni" > - add to ThreadsList > - create java.lang.Thread object (you can only execute Java code after > you are attached) > - mark as "attach completed" > > So while a thread "is attaching" it will be seen by the ThreadSMR thread > iterator but will have a NULL java.lang.Thread object. > > We special-case attaching threads in a number of places in the VM and I > think we should be explicitly doing something here to filter out > attaching threads, rather than just being tolerant of a NULL j.l.Thread > object. Specifically in ThreadsSMRSupport::add_thread: > > if (ThreadTable::is_initialized() && !thread->is_attaching_via_jni()) { > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > ThreadTable::add_thread(tid, thread); > } > > Note that in ThreadsSMRSupport::remove_thread we can use the same guard, > which covers the case the JNI attach encountered an error trying to > create the j.l.Thread object. > > >> src/hotspot/share/services/threadTable.cpp > >> 71 static uintx get_hash(Value const& value, bool* is_dead) { > > > >> The is_dead parameter still bothers me here. I can't make enough sense > >> out of the template code in ConcurrentHashtable to see why we have to > >> have it, but I'm concerned that its very existence means we perhaps > >> should not be trying to extend CHT in this context. ?? > > > > My understanding is that is_dead parameter provides a mechanism for > > ConcurrentHashtable to remove stale entries that were not explicitly > > removed by calling ConcurrentHashTable::remove() method. > > I think that just because in our case we don't use this mechanism doesn't > > mean we should not use ConcurrentHashTable. > > Can you confirm that this usage is okay with Robbin Ehn please. He's > back from vacation this week. > > >> I would still want to see what impact this has on thread > >> startup cost, both with and without the table being initialized. 
> > > > I ran a test that initializes the table by calling ThreadMXBean.getThreadInfo(), > > starts some threads as a warm-up, and then creates and starts 100,000 threads > > (each thread just sleeps for 100 ms). When the thread table is enabled, > > 100,000 threads are created and started in about 15200 ms. If the thread table > > is off the test takes about 14800 ms. Based on this information the enabled > > thread table makes the thread startup about 2.7% slower. > > That doesn't sound very good. I think we may need to get Claes involved to > help investigate overall performance impact here. > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.04/ > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > No further code comments. > > I didn't look at the test in detail. > > Thanks, > David > > > Thanks! > > --Daniil > > > > > > On 7/29/19, 12:53 AM, "David Holmes" wrote: > > > > Hi Daniil, > > > > Overall I think this is a reasonable approach but I would still like to > > see some performance and footprint numbers, both to verify it fixes the > > problem reported, and that we are not getting penalized elsewhere. > > > > On 25/07/2019 3:21 am, Daniil Titov wrote: > > > Hi David, Daniel, and Serguei, > > > > > > Please review the new version of the fix, that makes the thread table initialization on demand and > > > moves it inside ThreadsList::find_JavaThread_from_java_tid(). At the creation time the thread table > > > is initialized with the threads from the current thread list. We don't want to hold Threads_lock > > > inside find_JavaThread_from_java_tid(), thus new threads still could be created while the thread > > > table is being initialized. Such threads will be found by the linear search and added to the thread table > > > later, in ThreadsList::find_JavaThread_from_java_tid(). > > > > The initialization allows the created but unpopulated, or partially > > populated, table to be seen by other threads - is that your intention?
> > It seems it should be okay as the other threads will then race with the > > initializing thread to add specific entries, and this is a concurrent > > map so that should be functionally correct. But if so then I think you > > can also reduce the scope of the ThreadTableCreate_lock so that it > > covers creation of the table only, not the initial population of the table. > > > > I like the approach of only initializing the table when needed and using > > that to control when the add/remove-thread code needs to update the > > table. But I would still want to see what impact this has on thread > > startup cost, both with and without the table being initialized. > > > > > The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > > > as Daniel suggested. > > > > Not sure it's best to combine these, but if they are limited to the > > changes in management.cpp only then that may be okay. It helps to be > > able to focus on the table related changes without being distracted by > > other optimizations. > > > > > That is correct that ResolvedMethodTable was used as a blueprint for the thread table, however, I tried > > > to strip it of the all functionality that is not required in the thread table case. > > > > The revised version seems better in that regard. But I still have a > > concern, see below. > > > > > We need to have the thread table resizable and allow it to grow as the number of threads increases to avoid > > > reserving excessive memory a-priori or deteriorating lookup times. The ServiceThread is responsible for > > > growing the thread table when required. > > > > Yes but why? Why can't this table be grown on demand by the thread that > > is doing the addition? For other tables we may have to delegate to the > > service thread because the current thread cannot perform the action, or > > it doesn't want to perform it at the time the need for the resize is > > detected (e.g. 
its detected at a safepoint and you want the resize to > > happen later outside the safepoint). It's not apparent to me that such > > restrictions apply here. > > > > > There is no ConcurrentHashTable available in Java 8 and for backporting this fix to Java 8 another implementation > > > of the hash table, probably originally suggested in the patch attached to the JBS issue, should be used. It will make > > > the backporting more complicated, however, adding a new Implementation of the hash table in Java 14 while it > > > already has ConcurrentHashTable doesn't seem reasonable for me. > > > > Ok. > > > > > Webrev: http://cr.openjdk.java.net/~dtitov/8185005/webrev.03 > > > > Some specific code comments: > > > > src/hotspot/share/runtime/mutexLocker.cpp > > > > + def(ThreadTableCreate_lock , PaddedMutex , special, > > false, Monitor::_safepoint_check_never); > > > > I think this needs to be a _safepoint_check_always lock. The table will > > be created by regular JavaThreads and they should (nearly) always be > > checking for safepoints if they are going to block acquiring the lock. > > And it isn't at all obvious that the thread doing the creation can't go > > to a safepoint whilst this lock is held. > > > > --- > > > > src/hotspot/share/runtime/threadSMR.cpp > > > > Nit: > > > > 618 JavaThread* thread = thread_at(i); > > > > you could reuse the new java_thread local you introduced at line 613 and > > just rename that "new" variable to "thread" so you don't have to change > > all other uses. > > > > 628 } else if (java_thread != NULL && ... > > > > You don't need to check != NULL here as you only get here when > > java_thread is not NULL. 
> > > > 755 jlong tid = SharedRuntime::get_java_tid(thread); > > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > > > I think it cleaner/better to just use > > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > > > as we know thread is not NULL, it is a JavaThread and it has to have a > > non-null threadObj. > > > > --- > > > > src/hotspot/share/services/management.cpp > > > > 1323 if (THREAD->is_Java_thread()) { > > 1324 JavaThread* current_thread = (JavaThread*)THREAD; > > > > These calls can only be made on a JavaThread so this be simplified to > > remove the is_Java_thread() call. Similarly in other places. > > > > --- > > > > src/hotspot/share/services/threadTable.cpp > > > > 55 class ThreadTableEntry : public CHeapObj { > > 56 private: > > 57 jlong _tid; > > > > I believe hotspot style is to not indent the access modifiers in C++ > > class declarations, so the above would just be: > > > > 55 class ThreadTableEntry : public CHeapObj { > > 56 private: > > 57 jlong _tid; > > > > etc. > > > > 60 ThreadTableEntry(jlong tid, JavaThread* java_thread) : > > 61 _tid(tid),_java_thread(java_thread) {} > > > > line 61 should be indented as it continues line 60. > > > > 67 class ThreadTableConfig : public AllStatic { > > ... > > 71 static uintx get_hash(Value const& value, bool* is_dead) { > > > > The is_dead parameter still bothers me here. I can't make enough sense > > out of the template code in ConcurrentHashtable to see why we have to > > have it, but I'm concerned that its very existence means we perhaps > > should not be trying to extend CHT in this context. ?? > > > > 115 size_t start_size_log = size_log > DefaultThreadTableSizeLog > > 116 ? size_log : DefaultThreadTableSizeLog; > > > > line 116 should be indented, though in this case I think a better layout > > would be: > > > > 115 size_t start_size_log = > > 116 size_log > DefaultThreadTableSizeLog ? 
size_log : > > DefaultThreadTableSizeLog; > > > > 131 double ThreadTable::get_load_factor() { > > 132 return (double)_items_count/_current_size; > > 133 } > > > > Not sure that is doing what you want/expect. It will perform integer > > division and then cast that whole integer to a double. If you want > > double arithmetic you need: > > > > return ((double)_items_count)/_current_size; > > > > 180 jlong _tid; > > 181 uintx _hash; > > > > Nit: no need for all those spaces before the variable name. > > > > 183 ThreadTableLookup(jlong tid) > > 184 : _tid(tid), _hash(primitive_hash(tid)) {} > > > > line 184 should be indented. > > > > 201 ThreadGet():_return(NULL) {} > > > > Nit: need space after : > > > > 211 assert(_is_initialized, "Thread table is not initialized"); > > 212 _has_work = false; > > > > line 211 is indented one space too far. > > > > 229 ThreadTableEntry* entry = new ThreadTableEntry(tid,java_thread); > > > > Nit: need space after , > > > > 252 return _local_table->remove(thread,lookup); > > > > Nit: need space after , > > > > Thanks, > > David > > ------ > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > Thanks! > > > --Daniil > > > > > > > > > ?On 7/8/19, 3:24 PM, "Daniel D. Daugherty" wrote: > > > > > > On 6/29/19 12:06 PM, Daniil Titov wrote: > > > > Hi Serguei and David, > > > > > > > > Serguei is right, ThreadTable::find_thread(java_tid) cannot return a JavaThread with an unmatched java_tid. > > > > > > > > Please find a new version of the fix that includes the changes Serguei suggested. > > > > > > > > Regarding the concern about the maintaining the thread table when it may never even be queried, one of > > > > the options could be to add ThreadTable ::isEnabled flag, set it to "false" by default, and wrap the calls to the thread table > > > > in ThreadsSMRSupport add_thread() and remove_thread() methods to check this flag. 
> > > > > > > > When ThreadsList::find_JavaThread_from_java_tid() is called for the first time it could check if ThreadTable::isEnabled > > > > is on and if not then set it on and populate the thread table with all existing threads from the thread list. > > > > > > I have the same concerns as David H. about this new ThreadTable. > > > ThreadsList::find_JavaThread_from_java_tid() is only called from code > > > in src/hotspot/share/services/management.cpp so I think that table > > > needs to be enabled and populated only if it is going to be used. > > > > > > I've taken a look at the webrev below and I see that David has > > > followed up with additional comments. Before I do a crawl through > > > code review for this, I would like to see the ThreadTable stuff > > > made optional and David's other comments addressed. > > > > > > Another possible optimization is for callers of > > > find_JavaThread_from_java_tid() to save the calling thread's > > > tid value before they loop and if the current tid == saved_tid > > > then use the current JavaThread* instead of calling > > > find_JavaThread_from_java_tid() to get the JavaThread*. > > > > > > Dan > > > > > > > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.02/ > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > > > Thanks! > > > > --Daniil > > > > > > > > From: > > > > Organization: Oracle Corporation > > > > Date: Friday, June 28, 2019 at 7:56 PM > > > > To: Daniil Titov , OpenJDK Serviceability , "hotspot-runtime-dev at openjdk.java.net" , "jmx-dev at openjdk.java.net" > > > > Subject: Re: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth) > > > > > > > > Hi Daniil, > > > > > > > > I have several quick comments. > > > > > > > > The indent in the hotspot c/c++ files has to be 2, not 4.
> > > > > > > > https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/src/hotspot/share/runtime/threadSMR.cpp.frames.html > > > > 614 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const { > > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > > 616 if (java_thread == NULL && java_tid == PMIMORDIAL_JAVA_TID) { > > > > 617 // ThreadsSMRSupport::add_thread() is not called for the primordial > > > > 618 // thread. Thus, we find this thread with a linear search and add it > > > > 619 // to the thread table. > > > > 620 for (uint i = 0; i < length(); i++) { > > > > 621 JavaThread* thread = thread_at(i); > > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > > 623 ThreadTable::add_thread(java_tid, thread); > > > > 624 return thread; > > > > 625 } > > > > 626 } > > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > > 628 return java_thread; > > > > 629 } > > > > 630 return NULL; > > > > 631 } > > > > 632 bool ThreadsList::is_valid_java_thread(jlong java_tid, JavaThread* java_thread) { > > > > 633 oop tobj = java_thread->threadObj(); > > > > 634 // Ignore the thread if it hasn't run yet, has exited > > > > 635 // or is starting to exit. > > > > 636 return (tobj != NULL && !java_thread->is_exiting() && > > > > 637 java_tid == java_lang_Thread::thread_id(tobj)); > > > > 638 } > > > > > > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > > > > > > I'd suggest to rename find_thread() to find_thread_by_tid(). > > > > > > > > A space is missed after the comma: > > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > > > > > > An empty line is needed before L632. > > > > > > > > The name 'is_valid_java_thread' looks wrong (or confusing) to me. > > > > Something like 'is_alive_java_thread_with_tid()' would be better. > > > > It'd better to list parameters in the opposite order. 
> > > > > > > > The call to is_valid_java_thread() is confusing: > > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > > > > > > Why would the call ThreadTable::find_thread(java_tid) return a JavaThread with an unmatched java_tid? > > > > > > > > > > > > Thanks, > > > > Serguei > > > > > > > > On 6/28/19, 9:40 PM, "David Holmes" wrote: > > > > > > > > Hi Daniil, > > > > > > > > The definition and use of this hashtable (yet another hashtable > > > > implementation!) will need careful examination. We have to be concerned > > > > about the cost of maintaining it when it may never even be queried. You > > > > would need to look at footprint cost and performance impact. > > > > > > > > Unfortunately I'm just about to board a plane and will be out for the > > > > next few days. I will try to look at this asap next week, but we will > > > > need a lot more data on it. > > > > > > > > Thanks, > > > > David > > > > > > > > On 6/28/19 3:31 PM, Daniil Titov wrote: > > > > Please review the change that improves performance of ThreadMXBean MXBean methods returning the > > > > information for specific threads. The change introduces the thread table that uses ConcurrentHashTable > > > > to store one-to-one the mapping between the thread ids and JavaThread objects and replaces the linear > > > > search over the thread list in ThreadsList::find_JavaThread_from_java_tid(jlong tid) method with the lookup > > > > in the thread table. > > > > > > > > Testing: Mach5 tier1,tier2 and tier3 tests successfully passed. > > > > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/ > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > > > Thanks! 
> > > > > > > > Best regards, > > > > Daniil > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From david.holmes at oracle.com Mon Aug 12 23:34:54 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 13 Aug 2019 09:34:54 +1000 Subject: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth) In-Reply-To: <48311B39-43F9-49E5-BEC7-A64F9D2588AF@oracle.com> References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com> <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com> <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com> <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com> <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com> <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com> <48311B39-43F9-49E5-BEC7-A64F9D2588AF@oracle.com> Message-ID: <81f607e5-4a65-9e14-8ad0-fdcb8e03c6a4@oracle.com> Hi Daniil, On 13/08/2019 9:24 am, Daniil Titov wrote: > Hi Robbin, > > Thank you very much for reviewing this version of the fix! Based on your findings > it seems as it makes sense to make a step back and continue with the > approach we took before in the previous version of the webrev (webrev.04), > and get more information about the impact on the startup time it has. I will > consult with Claus regarding this and then share the findings. That seems a good approach to me. It wasn't at all clear to me that the latest proposed approach would actually solve the original problem in a satisfactory way - it would depend on how constant the set of threads being queried was. There is no perfect solution here as any fix to the reported problem incurs overhead elsewhere. Even evaluating the merits of the different trade-offs is hard to do - we could end up with a compromise solution that fails to satisfy anyone. David ----- > Thanks again, > --Daniil > > > > > > > ?On 8/12/19, 5:22 AM, "Robbin Ehn" wrote: > > Hi Daniil, > > I took a new deeper dive into this. 
> > This line seems to have some issues: > > if (ThreadTable::is_initialized() && thread->in_thread_table() && > !thread->is_attaching_via_jni()) { > > If you create new threads which attach and then die, the table will just keep > growing. So you must remove them also? > > Secondly you should not use volatile semantics for _in_thread_table. > The load in the if-statement can be reordered with _is_initialized. > Which could lead to a leak, rogue pointer in the table. > > So both "static volatile bool _is_initialized;" and "volatile bool > _in_thread_table;" > should be stored with store_release and loaded with load_acquire. > > Unfortunately it looks like there still would be races in > ThreadTable::add_thread, e.g. a context switch at: > > if (_local_table->insert(thread, lookup, entry)) { > // HERE > java_thread->set_in_thread_table(true); > > *Remove side can pass the if-statement without removing. > > Since this thread may also be exiting at any moment, e.g. context switch: > > if (tobj != NULL && !thread->is_exiting() && > java_tid == java_lang_Thread::thread_id(tobj)) { > // HERE > ThreadTable::add_thread(java_tid, thread); > > *Add side can add a thread that is exiting. > > Mixing in a third thread looking up a random tid and getting a JavaThread*, it > must validate it against its ThreadsList. Making the hashtable useless. > > So I think the only one adding and removing should be the thread itself. > 1:Add to ThreadsList > 2:Add to ThreadTable > 3:Remove from ThreadTable > 4:Remove from ThreadsList > > Between 1-2 and 3-4 the thread would be looked-up via linear scan. > I don't see an easy way around the start-up issue with this. > > Maybe have the cache in Java. > Pass in the thread obj into a > java_sun_management_ThreadImpl_getThreadTotalCpuTime3 instead, > thus skipping any look-ups in native. > > Thanks, Robbin > > > On 8/12/19 5:49 AM, Daniil Titov wrote: > > Hi David, Robbin, Daniel, and Serguei, > > > > Please review a new version of the fix.
> > > > As David suggested I created a separate Jira issue [1] to cover additional optimization for > > some callers of find_JavaThread_from_java_tid() and this version of the fix no longer includes > > changes in management.cpp (and the test related to these changes). > > > > Regarding the impact the previous version of the fix had on the thread startup time at heavy load (e.g. > > when 5000 threads are created and destroyed every second) I tried a different approach that makes > > calls to ThreadTable::add_thread and ThreadTable::remove_thread asynchronous and offloads the > > work for actual modifications of the thread table to a periodic task that runs every 5 seconds. With the > > same stress test scenario (the test does some warm-up and then measures the time it takes to create > > and start 100,000 threads; every thread just sleeps for 100 ms) the impact on the thread startup time > > was reduced to 1.2% (from 2.7%). > > > > The cause of this impact in this stress test scenario is that as soon as the thread table is initialized, > > additional work to insert and delete entries in the thread table has to be performed, even if > > com.sun.management.ThreadMXBean methods are no longer called. For example, in the stress test > > mentioned above, every second about 5000 entries had to be inserted in the table and then deleted. > > > > That doesn't look right and the new version of the fix uses a different approach: the thread is added to > > the thread table only when this thread is requested by the com.sun.management.ThreadMXBean bean. Every > > time find_JavaThread_from_java_tid() is called for a new tid, the thread is found by iterating over > > the thread list and added to the thread table. All subsequent calls to find_JavaThread_from_java_tid() for > > the same tid return the thread from the thread table.
> > > > Running the stress test with the thread table enabled and disabled showed no difference in the > > average thread startup times. > > > > [1] : https://bugs.openjdk.java.net/browse/JDK-8229391 > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.05/ > > > > Thanks, > > Daniil > > > > On 8/4/19, 7:54 PM, "David Holmes" wrote: > > > > Hi Daniil, > > > > On 3/08/2019 8:16 am, Daniil Titov wrote: > > > Hi David, > > > > > > Thank you for your detailed review. Please review a new version of the fix that includes > > > the changes you suggested: > > > - ThreadTableCreate_lock scope is reduced to cover the creation of the table only; > > > - ThreadTableCreate_lock is made _safepoint_check_always; > > > > Okay. > > > > > - ServiceThread is no longer responsible for the resizing of the thread table, instead, > > > the thread table is changed to grow on demand by the thread that is doing the addition; > > > > Okay - I'm happy to get the serviceThread out of the picture here. > > > > > - fixed nits and formatting issues. > > > > Okay. > > > > >>> The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > > >>> as Daniel suggested. > > >> Not sure it's best to combine these, but if they are limited to the > > >> changes in management.cpp only then that may be okay. > > > > > > The additional optimization for some callers of find_JavaThread_from_java_tid() is > > > limited to management.cpp (plus a new test) so I left them in the webrev but > > > I also could move it into a separate issue if required. > > > > I'd prefer this part to be separated out, but won't insist. Let's see if > > Dan or Serguei have a strong opinion.
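The "add only when requested" idea in the message above can be sketched as follows. This is a minimal single-threaded illustration, not the webrev code: FakeThread and LazyThreadCache are hypothetical names, std::unordered_map stands in for the concurrent table, and removal of entries for exited threads is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a JavaThread; only the Java-level tid matters here.
struct FakeThread {
  int64_t tid;
};

// Sketch of "add on first request": the cache is filled only by lookups,
// so starting or exiting a thread never touches it.
class LazyThreadCache {
 public:
  explicit LazyThreadCache(const std::vector<FakeThread*>& threads)
      : _threads(threads) {}

  FakeThread* find(int64_t tid) {
    auto it = _cache.find(tid);
    if (it != _cache.end()) {
      return it->second;                // fast path: this tid was requested before
    }
    for (FakeThread* t : _threads) {    // slow path: linear scan of the thread list
      assert(t != nullptr && "thread list must not hold null entries");
      ++_scanned;
      if (t->tid == tid) {
        _cache.emplace(tid, t);         // remember it for later requests
        return t;
      }
    }
    return nullptr;                     // unknown tid: nothing is cached
  }

  size_t cached() const { return _cache.size(); }
  size_t scanned() const { return _scanned; }  // list entries visited so far

 private:
  const std::vector<FakeThread*>& _threads;    // caller keeps the list alive
  std::unordered_map<int64_t, FakeThread*> _cache;
  size_t _scanned = 0;
};
```

A second find() for the same tid is served from the cache without visiting the list again, which is the property the stress-test comparison relies on.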
> > > > > > src/hotspot/share/runtime/threadSMR.cpp > > > >755 jlong tid = SharedRuntime::get_java_tid(thread); > > > > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > > > I think it cleaner/better to just use > > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > > > as we know thread is not NULL, it is a JavaThread and it has to have a > > > > non-null threadObj. > > > > > > I had to leave this code unchanged since it turned out the threadObj is null > > > when VM is destroyed: > > > > > > V [libjvm.so+0xe165d7] oopDesc::long_field(int) const+0x67 > > > V [libjvm.so+0x16e06c6] ThreadsSMRSupport::add_thread(JavaThread*)+0x116 > > > V [libjvm.so+0x16d1302] Threads::add(JavaThread*, bool)+0x82 > > > V [libjvm.so+0xef8369] attach_current_thread.part.197+0xc9 > > > V [libjvm.so+0xec136c] jni_DestroyJavaVM+0x6c > > > C [libjli.so+0x4333] JavaMain+0x2c3 > > > C [libjli.so+0x8159] ThreadJavaMain+0x9 > > > > This is actually nothing to do with the VM being destroyed, but is an > > issue with JNI_AttachCurrentThread and its interaction with the > > ThreadSMR iterators. The attach process is: > > - create JavaThread > > - mark as "is attaching via jni" > > - add to ThreadsList > > - create java.lang.Thread object (you can only execute Java code after > > you are attached) > > - mark as "attach completed" > > > > So while a thread "is attaching" it will be seen by the ThreadSMR thread > > iterator but will have a NULL java.lang.Thread object. > > > > We special-case attaching threads in a number of places in the VM and I > > think we should be explicitly doing something here to filter out > > attaching threads, rather than just being tolerant of a NULL j.l.Thread > > object. 
Specifically in ThreadsSMRSupport::add_thread: > > > > if (ThreadTable::is_initialized() && !thread->is_attaching_via_jni()) { > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > ThreadTable::add_thread(tid, thread); > > } > > > > Note that in ThreadsSMRSupport::remove_thread we can use the same guard, > > which covers the case the JNI attach encountered an error trying to > > create the j.l.Thread object. > > > > >> src/hotspot/share/services/threadTable.cpp > > >> 71 static uintx get_hash(Value const& value, bool* is_dead) { > > > > > >> The is_dead parameter still bothers me here. I can't make enough sense > > >> out of the template code in ConcurrentHashtable to see why we have to > > >> have it, but I'm concerned that its very existence means we perhaps > > >> should not be trying to extend CHT in this context. ?? > > > > > > My understanding is that is_dead parameter provides a mechanism for > > > ConcurrentHashtable to remove stale entries that were not explicitly > > > removed by calling ConcurrentHashTable::remove() method. > > > I think that just because in our case we don't use this mechanism doesn't > > > mean we should not use ConcurrentHashTable. > > > > Can you confirm that this usage is okay with Robbin Ehn please. He's > > back from vacation this week. > > > > >> I would still want to see what impact this has on thread > > >> startup cost, both with and without the table being initialized. > > > > > > I ran a test that initializes the table by calling ThreadMXBean.getThreadInfo(), > > > starts some threads as a warm-up, and then creates and starts 100,000 threads > > > (each thread just sleeps for 100 ms). When the thread table is enabled, > > > 100,000 threads are created and started in about 15200 ms. If the thread table > > > is off the test takes about 14800 ms. Based on this information the enabled > > > thread table makes the thread startup about 2.7% slower. > > > > That doesn't sound very good.
I think we may need to get Claes involved to > > help investigate overall performance impact here. > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.04/ > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > No further code comments. > > > > I didn't look at the test in detail. > > > > Thanks, > > David > > > > > Thanks! > > > --Daniil > > > > > > > > > On 7/29/19, 12:53 AM, "David Holmes" wrote: > > > > > > Hi Daniil, > > > > > > Overall I think this is a reasonable approach but I would still like to > > > see some performance and footprint numbers, both to verify it fixes the > > > problem reported, and that we are not getting penalized elsewhere. > > > > > > On 25/07/2019 3:21 am, Daniil Titov wrote: > > > > Hi David, Daniel, and Serguei, > > > > > > > > Please review the new version of the fix, that makes the thread table initialization on demand and > > > > moves it inside ThreadsList::find_JavaThread_from_java_tid(). At the creation time the thread table > > > > is initialized with the threads from the current thread list. We don't want to hold Threads_lock > > > > inside find_JavaThread_from_java_tid(), thus new threads still could be created while the thread > > > > table is being initialized. Such threads will be found by the linear search and added to the thread table > > > > later, in ThreadsList::find_JavaThread_from_java_tid(). > > > > > > The initialization allows the created but unpopulated, or partially > > > populated, table to be seen by other threads - is that your intention? > > > It seems it should be okay as the other threads will then race with the > > > initializing thread to add specific entries, and this is a concurrent > > > map so that should be functionally correct. But if so then I think you > > > can also reduce the scope of the ThreadTableCreate_lock so that it > > > covers creation of the table only, not the initial population of the table.
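The narrower lock scope suggested here can be pictured with a small standard-C++ sketch. It is only an illustration under stated assumptions: SketchTable, create_lock, ensure_created and populate are hypothetical names, std::mutex and std::atomic stand in for HotSpot's own lock and ordering primitives, and the table is never freed, as would be the case for a VM-lifetime structure.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>

// Hypothetical stand-in for the thread table. A real concurrent table would
// not need the internal mutex that keeps this sketch simple.
class SketchTable {
 public:
  void add(int64_t tid, void* thread) {
    std::lock_guard<std::mutex> guard(_m);
    _map.emplace(tid, thread);          // no-op if the tid is already present
  }
  size_t size() {
    std::lock_guard<std::mutex> guard(_m);
    return _map.size();
  }
 private:
  std::mutex _m;
  std::unordered_map<int64_t, void*> _map;
};

static std::mutex create_lock;          // plays the role of ThreadTableCreate_lock
static std::atomic<SketchTable*> the_table{nullptr};

// Returns the table, creating it on first use. The lock is held only while
// the table object is created and published.
SketchTable* ensure_created() {
  SketchTable* t = the_table.load(std::memory_order_acquire);
  if (t == nullptr) {
    std::lock_guard<std::mutex> guard(create_lock);
    t = the_table.load(std::memory_order_relaxed);
    if (t == nullptr) {
      t = new SketchTable();            // intentionally never deleted
      the_table.store(t, std::memory_order_release);
    }
  }
  return t;
}

// Initial population runs outside create_lock. Concurrent callers may add
// the same tids, which the table tolerates.
void populate(SketchTable* t, const int64_t* tids, size_t n) {
  assert(t != nullptr);
  for (size_t i = 0; i < n; i++) {
    t->add(tids[i], nullptr);           // a real entry would carry the thread pointer
  }
}
```

With this shape, two threads that race on the first lookup both see the same table after ensure_created() and can fill it at the same time; duplicate adds are harmless.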
> > > > > > I like the approach of only initializing the table when needed and using > > > that to control when the add/remove-thread code needs to update the > > > table. But I would still want to see what impact this has on thread > > > startup cost, both with and without the table being initialized. > > > > > > > The change also includes additional optimization for some callers of find_JavaThread_from_java_tid() > > > > as Daniel suggested. > > > > > > Not sure it's best to combine these, but if they are limited to the > > > changes in management.cpp only then that may be okay. It helps to be > > > able to focus on the table related changes without being distracted by > > > other optimizations. > > > > > > > That is correct that ResolvedMethodTable was used as a blueprint for the thread table, however, I tried > > > > to strip it of the all functionality that is not required in the thread table case. > > > > > > The revised version seems better in that regard. But I still have a > > > concern, see below. > > > > > > > We need to have the thread table resizable and allow it to grow as the number of threads increases to avoid > > > > reserving excessive memory a-priori or deteriorating lookup times. The ServiceThread is responsible for > > > > growing the thread table when required. > > > > > > Yes but why? Why can't this table be grown on demand by the thread that > > > is doing the addition? For other tables we may have to delegate to the > > > service thread because the current thread cannot perform the action, or > > > it doesn't want to perform it at the time the need for the resize is > > > detected (e.g. its detected at a safepoint and you want the resize to > > > happen later outside the safepoint). It's not apparent to me that such > > > restrictions apply here. 
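Growing on demand by the adding thread, as asked about above, might look like the following single-threaded sketch. GrowingTable, the chained-bucket layout and the load threshold of 2.0 are assumptions made for illustration; the real table is concurrent and resizes through its own mechanism.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Single-threaded sketch: the caller of add() pays for the resize itself,
// so no service thread is involved. Duplicate tids are not checked.
class GrowingTable {
 public:
  explicit GrowingTable(size_t buckets = 4) : _buckets(buckets) {
    assert(buckets > 0);
  }

  void add(int64_t tid, void* thread) {
    _buckets[index(tid, _buckets.size())].push_back({tid, thread});
    ++_items;
    if (load_factor() > kMaxLoad) {
      grow();                           // done right here, by the adding thread
    }
  }

  void* find(int64_t tid) const {
    for (const auto& entry : _buckets[index(tid, _buckets.size())]) {
      if (entry.first == tid) return entry.second;
    }
    return nullptr;
  }

  double load_factor() const {
    // One operand is converted first, so the division is done in double.
    return static_cast<double>(_items) / _buckets.size();
  }

  size_t bucket_count() const { return _buckets.size(); }

 private:
  using Bucket = std::vector<std::pair<int64_t, void*>>;
  static constexpr double kMaxLoad = 2.0;   // assumed threshold

  static size_t index(int64_t tid, size_t n) {
    return static_cast<size_t>(static_cast<uint64_t>(tid) % n);
  }

  void grow() {
    std::vector<Bucket> bigger(_buckets.size() * 2);
    for (const Bucket& bucket : _buckets) {
      for (const auto& entry : bucket) {
        bigger[index(entry.first, bigger.size())].push_back(entry);
      }
    }
    _buckets.swap(bigger);
  }

  std::vector<Bucket> _buckets;
  size_t _items = 0;
};
```

The cost of a resize lands on whichever add() crosses the threshold, which is acceptable only if that caller is allowed to do the work at that point; that is the restriction the question above is probing.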
> > > > > > > There is no ConcurrentHashTable available in Java 8 and for backporting this fix to Java 8 another implementation > > > > of the hash table, probably originally suggested in the patch attached to the JBS issue, should be used. It will make > > > > the backporting more complicated, however, adding a new Implementation of the hash table in Java 14 while it > > > > already has ConcurrentHashTable doesn't seem reasonable for me. > > > > > > Ok. > > > > > > > Webrev: http://cr.openjdk.java.net/~dtitov/8185005/webrev.03 > > > > > > Some specific code comments: > > > > > > src/hotspot/share/runtime/mutexLocker.cpp > > > > > > + def(ThreadTableCreate_lock , PaddedMutex , special, > > > false, Monitor::_safepoint_check_never); > > > > > > I think this needs to be a _safepoint_check_always lock. The table will > > > be created by regular JavaThreads and they should (nearly) always be > > > checking for safepoints if they are going to block acquiring the lock. > > > And it isn't at all obvious that the thread doing the creation can't go > > > to a safepoint whilst this lock is held. > > > > > > --- > > > > > > src/hotspot/share/runtime/threadSMR.cpp > > > > > > Nit: > > > > > > 618 JavaThread* thread = thread_at(i); > > > > > > you could reuse the new java_thread local you introduced at line 613 and > > > just rename that "new" variable to "thread" so you don't have to change > > > all other uses. > > > > > > 628 } else if (java_thread != NULL && ... > > > > > > You don't need to check != NULL here as you only get here when > > > java_thread is not NULL. > > > > > > 755 jlong tid = SharedRuntime::get_java_tid(thread); > > > 926 jlong tid = SharedRuntime::get_java_tid(thread); > > > > > > I think it cleaner/better to just use > > > > > > jlong tid = java_lang_Thread::thread_id(thread->threadObj()); > > > > > > as we know thread is not NULL, it is a JavaThread and it has to have a > > > non-null threadObj. 
> > > > > > --- > > > > > > src/hotspot/share/services/management.cpp > > > > > > 1323 if (THREAD->is_Java_thread()) { > > > 1324 JavaThread* current_thread = (JavaThread*)THREAD; > > > > > > These calls can only be made on a JavaThread so this be simplified to > > > remove the is_Java_thread() call. Similarly in other places. > > > > > > --- > > > > > > src/hotspot/share/services/threadTable.cpp > > > > > > 55 class ThreadTableEntry : public CHeapObj { > > > 56 private: > > > 57 jlong _tid; > > > > > > I believe hotspot style is to not indent the access modifiers in C++ > > > class declarations, so the above would just be: > > > > > > 55 class ThreadTableEntry : public CHeapObj { > > > 56 private: > > > 57 jlong _tid; > > > > > > etc. > > > > > > 60 ThreadTableEntry(jlong tid, JavaThread* java_thread) : > > > 61 _tid(tid),_java_thread(java_thread) {} > > > > > > line 61 should be indented as it continues line 60. > > > > > > 67 class ThreadTableConfig : public AllStatic { > > > ... > > > 71 static uintx get_hash(Value const& value, bool* is_dead) { > > > > > > The is_dead parameter still bothers me here. I can't make enough sense > > > out of the template code in ConcurrentHashtable to see why we have to > > > have it, but I'm concerned that its very existence means we perhaps > > > should not be trying to extend CHT in this context. ?? > > > > > > 115 size_t start_size_log = size_log > DefaultThreadTableSizeLog > > > 116 ? size_log : DefaultThreadTableSizeLog; > > > > > > line 116 should be indented, though in this case I think a better layout > > > would be: > > > > > > 115 size_t start_size_log = > > > 116 size_log > DefaultThreadTableSizeLog ? size_log : > > > DefaultThreadTableSizeLog; > > > > > > 131 double ThreadTable::get_load_factor() { > > > 132 return (double)_items_count/_current_size; > > > 133 } > > > > > > Not sure that is doing what you want/expect. It will perform integer > > > division and then cast that whole integer to a double. 
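A side note on the get_load_factor() comment above, offered as a sketch and not as a ruling on the patch: in C++ a cast binds more tightly than the `/` operator, so `(double)_items_count/_current_size` already converts `_items_count` before dividing, and the division is carried out in floating point. Extra parentheses around the cast leave the result unchanged and mainly help readability. The function and parameter names below are hypothetical stand-ins, not the ThreadTable code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a load-factor helper: cast, then divide.
// The cast applies to `items` only, so this is floating-point division.
double load_factor_plain(std::size_t items, std::size_t size) {
  assert(size != 0);  // illustrative precondition, not taken from the patch
  return (double)items / size;
}

// Same computation with the cast parenthesized explicitly.
double load_factor_parenthesized(std::size_t items, std::size_t size) {
  assert(size != 0);
  return ((double)items) / size;
}
```

Both helpers return the same value for any non-zero size; integer division would only occur if the cast wrapped the whole quotient, as in `(double)(items / size)`.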
If you want > > > double arithmetic you need: > > > > > > return ((double)_items_count)/_current_size; > > > > > > 180 jlong _tid; > > > 181 uintx _hash; > > > > > > Nit: no need for all those spaces before the variable name. > > > > > > 183 ThreadTableLookup(jlong tid) > > > 184 : _tid(tid), _hash(primitive_hash(tid)) {} > > > > > > line 184 should be indented. > > > > > > 201 ThreadGet():_return(NULL) {} > > > > > > Nit: need space after : > > > > > > 211 assert(_is_initialized, "Thread table is not initialized"); > > > 212 _has_work = false; > > > > > > line 211 is indented one space too far. > > > > > > 229 ThreadTableEntry* entry = new ThreadTableEntry(tid,java_thread); > > > > > > Nit: need space after , > > > > > > 252 return _local_table->remove(thread,lookup); > > > > > > Nit: need space after , > > > > > > Thanks, > > > David > > > ------ > > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > > > Thanks! > > > > --Daniil > > > > > > > > > > > > ?On 7/8/19, 3:24 PM, "Daniel D. Daugherty" wrote: > > > > > > > > On 6/29/19 12:06 PM, Daniil Titov wrote: > > > > > Hi Serguei and David, > > > > > > > > > > Serguei is right, ThreadTable::find_thread(java_tid) cannot return a JavaThread with an unmatched java_tid. > > > > > > > > > > Please find a new version of the fix that includes the changes Serguei suggested. > > > > > > > > > > Regarding the concern about the maintaining the thread table when it may never even be queried, one of > > > > > the options could be to add ThreadTable ::isEnabled flag, set it to "false" by default, and wrap the calls to the thread table > > > > > in ThreadsSMRSupport add_thread() and remove_thread() methods to check this flag. > > > > > > > > > > When ThreadsList::find_JavaThread_from_java_tid() is called for the first time it could check if ThreadTable ::isEnabled > > > > > Is on and if not then set it on and populate the thread table with all existing threads from the thread list. 
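The lazy-enable idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the HotSpot implementation: `FakeThread` and `LazyThreadTable` are hypothetical names, and a `std::unordered_map` guarded by a mutex stands in for ConcurrentHashTable.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for JavaThread; only the tid matters here.
struct FakeThread {
  std::int64_t tid;
};

class LazyThreadTable {
 public:
  // Add-thread path: the table is only maintained once it has been enabled.
  void on_thread_added(FakeThread* t) {
    std::lock_guard<std::mutex> guard(lock_);
    assert(t != nullptr);
    all_threads_.push_back(t);
    if (enabled_) table_[t->tid] = t;
  }

  // Remove-thread path: same rule, no table work before the first lookup.
  void on_thread_removed(FakeThread* t) {
    std::lock_guard<std::mutex> guard(lock_);
    all_threads_.erase(
        std::remove(all_threads_.begin(), all_threads_.end(), t),
        all_threads_.end());
    if (enabled_) table_.erase(t->tid);
  }

  // First lookup enables the table and populates it from the thread list.
  FakeThread* find(std::int64_t tid) {
    std::lock_guard<std::mutex> guard(lock_);
    if (!enabled_) {
      for (FakeThread* t : all_threads_) table_[t->tid] = t;
      enabled_ = true;
    }
    auto it = table_.find(tid);
    return it == table_.end() ? nullptr : it->second;
  }

  bool enabled() const {
    std::lock_guard<std::mutex> guard(lock_);
    return enabled_;
  }

 private:
  mutable std::mutex lock_;
  bool enabled_ = false;
  std::vector<FakeThread*> all_threads_;
  std::unordered_map<std::int64_t, FakeThread*> table_;
};
```

With this shape, a process that never queries by tid pays only for the flag check on thread start and exit, which is the cost the reviewers asked to have measured.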
> > > > > > > > I have the same concerns as David H. about this new ThreadTable. > > > > ThreadsList::find_JavaThread_from_java_tid() is only called from code > > > > in src/hotspot/share/services/management.cpp so I think that table > > > > needs to enabled and populated only if it is going to be used. > > > > > > > > I've taken a look at the webrev below and I see that David has > > > > followed up with additional comments. Before I do a crawl through > > > > code review for this, I would like to see the ThreadTable stuff > > > > made optional and David's other comments addressed. > > > > > > > > Another possible optimization is for callers of > > > > find_JavaThread_from_java_tid() to save the calling thread's > > > > tid value before they loop and if the current tid == saved_tid > > > > then use the current JavaThread* instead of calling > > > > find_JavaThread_from_java_tid() to get the JavaThread*. > > > > > > > > Dan > > > > > > > > > > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.02/ > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > > > > > Thanks! > > > > > --Daniil > > > > > > > > > > From: > > > > > Organization: Oracle Corporation > > > > > Date: Friday, June 28, 2019 at 7:56 PM > > > > > To: Daniil Titov , OpenJDK Serviceability , "hotspot-runtime-dev at openjdk.java.net" , "jmx-dev at openjdk.java.net" > > > > > Subject: Re: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth) > > > > > > > > > > Hi Daniil, > > > > > > > > > > I have several quick comments. > > > > > > > > > > The indent in the hotspot c/c++ files has to be 2, not 4. 
> > > > > > > > > > https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/src/hotspot/share/runtime/threadSMR.cpp.frames.html > > > > > 614 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const { > > > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > > > 616 if (java_thread == NULL && java_tid == PMIMORDIAL_JAVA_TID) { > > > > > 617 // ThreadsSMRSupport::add_thread() is not called for the primordial > > > > > 618 // thread. Thus, we find this thread with a linear search and add it > > > > > 619 // to the thread table. > > > > > 620 for (uint i = 0; i < length(); i++) { > > > > > 621 JavaThread* thread = thread_at(i); > > > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > > > 623 ThreadTable::add_thread(java_tid, thread); > > > > > 624 return thread; > > > > > 625 } > > > > > 626 } > > > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > > > 628 return java_thread; > > > > > 629 } > > > > > 630 return NULL; > > > > > 631 } > > > > > 632 bool ThreadsList::is_valid_java_thread(jlong java_tid, JavaThread* java_thread) { > > > > > 633 oop tobj = java_thread->threadObj(); > > > > > 634 // Ignore the thread if it hasn't run yet, has exited > > > > > 635 // or is starting to exit. > > > > > 636 return (tobj != NULL && !java_thread->is_exiting() && > > > > > 637 java_tid == java_lang_Thread::thread_id(tobj)); > > > > > 638 } > > > > > > > > > > 615 JavaThread* java_thread = ThreadTable::find_thread(java_tid); > > > > > > > > > > I'd suggest to rename find_thread() to find_thread_by_tid(). > > > > > > > > > > A space is missed after the comma: > > > > > 622 if (is_valid_java_thread(java_tid,thread)) { > > > > > > > > > > An empty line is needed before L632. > > > > > > > > > > The name 'is_valid_java_thread' looks wrong (or confusing) to me. > > > > > Something like 'is_alive_java_thread_with_tid()' would be better. 
> > > > > It'd better to list parameters in the opposite order. > > > > > > > > > > The call to is_valid_java_thread() is confusing: > > > > > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) { > > > > > > > > > > Why would the call ThreadTable::find_thread(java_tid) return a JavaThread with an unmatched java_tid? > > > > > > > > > > > > > > > Thanks, > > > > > Serguei > > > > > > > > > > On 6/28/19, 9:40 PM, "David Holmes" wrote: > > > > > > > > > > Hi Daniil, > > > > > > > > > > The definition and use of this hashtable (yet another hashtable > > > > > implementation!) will need careful examination. We have to be concerned > > > > > about the cost of maintaining it when it may never even be queried. You > > > > > would need to look at footprint cost and performance impact. > > > > > > > > > > Unfortunately I'm just about to board a plane and will be out for the > > > > > next few days. I will try to look at this asap next week, but we will > > > > > need a lot more data on it. > > > > > > > > > > Thanks, > > > > > David > > > > > > > > > > On 6/28/19 3:31 PM, Daniil Titov wrote: > > > > > Please review the change that improves performance of ThreadMXBean MXBean methods returning the > > > > > information for specific threads. The change introduces the thread table that uses ConcurrentHashTable > > > > > to store one-to-one the mapping between the thread ids and JavaThread objects and replaces the linear > > > > > search over the thread list in ThreadsList::find_JavaThread_from_java_tid(jlong tid) method with the lookup > > > > > in the thread table. > > > > > > > > > > Testing: Mach5 tier1,tier2 and tier3 tests successfully passed. > > > > > > > > > > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/ > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005 > > > > > > > > > > Thanks! 
> > > > >
> > > > > Best regards,
> > > > > Daniil

From mikhailo.seledtsov at oracle.com Tue Aug 13 00:22:43 2019
From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com)
Date: Mon, 12 Aug 2019 17:22:43 -0700
Subject: RFR(S): 8228960: [TESTBUG] containers/docker/TestJcmdWithSideCar.java: jcmd reports main class as 'Unknown'
In-Reply-To:
References: <9c9ec6b1-139a-4e67-8460-b586456fd493@oracle.com> <854c1dfe97937389230599e72c22aa0cbb0bf600.camel@redhat.com> <5D4B58E9.2070002@oracle.com> <9418a5e6-750d-80bf-4060-d023812f37e7@oracle.com>
Message-ID: <6cf3a353-e5c5-28e9-113b-98ded3c1f9e7@oracle.com>

Unfortunately, this approach does not seem to work on many of our test cluster machines. The creation of a "signal" file results in "PermissionDenied".

The possible reason is the selinux configuration, or some other permission related stuff. The container tries to create a new file on a mounted volume on a host system, but host system denies it. I will look a bit deeper into this, but I think this type of issue can be encountered on any automated test system. Hence, we may have to abandon this approach.

Thanks,

Misha

On 8/12/19 3:59 PM, mikhailo.seledtsov at oracle.com wrote:
> Here is an updated webrev:
> http://cr.openjdk.java.net/~mseledtsov/8228960.01/
>
> I am using a simple file-based mechanism to communicate between the
> processes. The "EventGeneratorLoop" process creates a specific
> "signal" file on a shared mounted volume, while the main test process
> waits for the file to exist before running the test cases.
>
> Passes on Linux-x64 Docker-enabled host. Testing in the test cluster
> is in progress.
>
>
> Thank you,
>
> Misha
>
> On 8/7/19 5:11 PM, David Holmes wrote:
>> On 8/08/2019 9:04 am, Mikhailo Seledtsov wrote:
>>> Hi Severin, Bob,
>>>
>>> Thank you for reviewing the code.
>>>
>>> On 8/7/19, 11:38 AM, Bob Vandette wrote:
>>>> Can't you come up with a better way of synchronizing the test by
>>>> possibly writing a
>>>> file and waiting for it to exist with a timeout?
>>> I will try out this approach.
>>
>> This seems like a fundamental problem with jcmd - so cc'ing
>> serviceability-dev.
>>
>> But I'm pretty sure they recently addressed a similar issue with the
>> premature sending of the attach signal?
>>
>> David
>> -----
>>
>>> Thanks,
>>> Misha
>>>> Isn't there a shared volume between the two
>>>> processes?
>>>>
>>>> We've been fighting test reliability for a while now. I can only
>>>> hope we're getting
>>>> to the end.
>>>>
>>>> Bob.
>>>>
>>>>> On Aug 7, 2019, at 2:18 PM, Severin Gehwolf
>>>>> wrote:
>>>>>
>>>>> Hi Misha,
>>>>>
>>>>> On Tue, 2019-08-06 at 20:17 -0700, mikhailo.seledtsov at oracle.com
>>>>> wrote:
>>>>>> Please review this change that fixes a container test
>>>>>> TestJcmdWithSideCar.
>>>>>>
>>>>>> My investigation indicated that a root cause for this failure is:
>>>>>> JCMD -l shows 'Unknown' for class name because the main class has
>>>>>> not
>>>>>> been loaded yet.
>>>>>> The target test JVM has started, it is initializing, but has not
>>>>>> loaded
>>>>>> the main test class.
>>>>> That's what I've found too.
>>>>>
>>>>>> The proposed solution is to try 'jcmd -l' several times, with a
>>>>>> short
>>>>>> sleep in between.
>>>>> Thread.sleep() isn't great, but I'm not sure there is an alternative.
>>>>>
>>>>>> Also I have commented out the testCase02() due to another bug:
>>>>>> "JDK-8228850: jhsdb jinfo fails with ClassCastException:
>>>>>> s.j.h.oops.TypeArray cannot be cast to s.j.h.oops.Instance",
>>>>>> which is not a test bug. IMO, it is better to run the test and
>>>>>> skip a
>>>>>> sub-case than to skip the entire test.
>>>>>>
>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8228960
>>>>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.00/
>>>>> Looks OK to me.
>>>>> >>>>> Thanks, >>>>> Severin >>>>> From nick.gasson at arm.com Tue Aug 13 09:38:07 2019 From: nick.gasson at arm.com (Nick Gasson) Date: Tue, 13 Aug 2019 17:38:07 +0800 Subject: RFR: 8229118: [TESTBUG] serviceability/sa/ClhsdbFindPC fails on AArch64 In-Reply-To: References: <4a646485-4230-9632-dfd4-4368d79ba4cd@arm.com> <39f8fd3f-fcce-0200-3922-33405978e297@redhat.com> <5071d6c9-73f8-4c7d-4ae8-3b3d1bcddf82@arm.com> Message-ID: <178290d1-7904-d59e-3f64-3f57a51ac071@arm.com> Hi Chris, > > Adding to Andrew comments, maybe the solution is to have the test extend > LingeredApp so it can produce a more reliable stack trace other than the > default one you get with LingeredApp. If that's too much trouble, I > don't mind the solution you came up with, but seems writing a > LingeredApp subclass that is specific for this test would be cleaner. Thanks for the suggestion, this does seem much cleaner. Please check the updated webrev here: http://cr.openjdk.java.net/~ngasson/8229118/webrev.1/ Thanks, Nick From bob.vandette at oracle.com Tue Aug 13 13:29:09 2019 From: bob.vandette at oracle.com (Bob Vandette) Date: Tue, 13 Aug 2019 09:29:09 -0400 Subject: RFR(S): 8228960: [TESTBUG] containers/docker/TestJcmdWithSideCar.java: jcmd reports main class as 'Unknown' In-Reply-To: <6cf3a353-e5c5-28e9-113b-98ded3c1f9e7@oracle.com> References: <9c9ec6b1-139a-4e67-8460-b586456fd493@oracle.com> <854c1dfe97937389230599e72c22aa0cbb0bf600.camel@redhat.com> <5D4B58E9.2070002@oracle.com> <9418a5e6-750d-80bf-4060-d023812f37e7@oracle.com> <6cf3a353-e5c5-28e9-113b-98ded3c1f9e7@oracle.com> Message-ID: What if you just poll for the creation of the file waiting some small amount of time between polling with a maximum timeout. Bob. > On Aug 12, 2019, at 8:22 PM, mikhailo.seledtsov at oracle.com wrote: > > Unfortunately, this approach does not seem to work on many of our test cluster machines. The creation of a "signal" file results in "PermissionDenied". 
>
> The possible reason is the selinux configuration, or some other permission related stuff. The container tries to create a new file on a mounted volume on a host system, but host system denies it. I will look a bit deeper into this, but I think this type of issue can be encountered on any automated test system. Hence, we may have to abandon this approach.
>
>
> Thanks,
>
> Misha
>
>
> On 8/12/19 3:59 PM, mikhailo.seledtsov at oracle.com wrote:
>> Here is an updated webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.01/
>>
>> I am using a simple file-based mechanism to communicate between the processes. The "EventGeneratorLoop" process creates a specific "signal" file on a shared mounted volume, while the main test process waits for the file to exist before running the test cases.
>>
>> Passes on Linux-x64 Docker-enabled host. Testing in the test cluster is in progress.
>>
>>
>> Thank you,
>>
>> Misha
>>
>> On 8/7/19 5:11 PM, David Holmes wrote:
>>> On 8/08/2019 9:04 am, Mikhailo Seledtsov wrote:
>>>> Hi Severin, Bob,
>>>>
>>>> Thank you for reviewing the code.
>>>>
>>>> On 8/7/19, 11:38 AM, Bob Vandette wrote:
>>>>> Can't you come up with a better way of synchronizing the test by possibly writing a
>>>>> file and waiting for it to exist with a timeout?
>>>> I will try out this approach.
>>>
>>> This seems like a fundamental problem with jcmd - so cc'ing serviceability-dev.
>>>
>>> But I'm pretty sure they recently addressed a similar issue with the premature sending of the attach signal?
>>>
>>> David
>>> -----
>>>
>>>> Thanks,
>>>> Misha
>>>>> Isn't there a shared volume between the two
>>>>> processes?
>>>>>
>>>>> We've been fighting test reliability for a while now. I can only hope we're getting
>>>>> to the end.
>>>>>
>>>>> Bob.
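The "wait for the file to exist, with a timeout" suggestion quoted above can be sketched as a simple polling loop. This is only an illustration: the test under review is written in Java, `wait_for_file` is a hypothetical helper, and "exists" is approximated here by "can be opened for reading". It would not help with the PermissionDenied failure, which happens on the writing side.

```cpp
#include <cassert>
#include <chrono>
#include <fstream>
#include <string>
#include <thread>

// Polls until `path` can be opened for reading or `timeout` elapses.
// Returns true if the file showed up in time, false otherwise.
bool wait_for_file(const std::string& path,
                   std::chrono::milliseconds timeout,
                   std::chrono::milliseconds interval) {
  assert(interval.count() > 0);  // a zero interval would spin the CPU
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  for (;;) {
    if (std::ifstream(path).good()) return true;
    if (std::chrono::steady_clock::now() >= deadline) return false;
    std::this_thread::sleep_for(interval);
  }
}
```

A monotonic clock is used for the deadline so that wall-clock adjustments on the test host cannot shorten or stretch the wait.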
>>>>> >>>>>> On Aug 7, 2019, at 2:18 PM, Severin Gehwolf wrote: >>>>>> >>>>>> Hi Misha, >>>>>> >>>>>> On Tue, 2019-08-06 at 20:17 -0700, mikhailo.seledtsov at oracle.com wrote: >>>>>>> Please review this change that fixes a container test TestJcmdWithSideCar. >>>>>>> >>>>>>> My investigation indicated that a root cause for this failure is: >>>>>>> JCMD -l shows 'Unknown' for class name because the main class has not >>>>>>> been loaded yet. >>>>>>> The target test JVM has started, it is initializing, but has not loaded >>>>>>> the main test class. >>>>>> That's what I've found too. >>>>>> >>>>>>> The proposed solution is to try 'jcmd -l' several times, with a short >>>>>>> sleep in between. >>>>>> Thread.sleep() isn't great, but I'm not sure there is an alternative. >>>>>> >>>>>>> Also I have commented out the testCase02() due to another bug: >>>>>>> "JDK-8228850: jhsdb jinfo fails with ClassCastException: >>>>>>> s.j.h.oops.TypeArray cannot be cast to s.j.h.oops.Instance", >>>>>>> which is not a test bug. IMO, it is better to run the test and skip a >>>>>>> sub-case than to skip the entire test. >>>>>>> >>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8228960 >>>>>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.00/ >>>>>> Looks OK to me. >>>>>> >>>>>> Thanks, >>>>>> Severin >>>>>> From bob.vandette at oracle.com Tue Aug 13 13:34:01 2019 From: bob.vandette at oracle.com (Bob Vandette) Date: Tue, 13 Aug 2019 09:34:01 -0400 Subject: RFR(S): 8228960: [TESTBUG] containers/docker/TestJcmdWithSideCar.java: jcmd reports main class as 'Unknown' In-Reply-To: References: <9c9ec6b1-139a-4e67-8460-b586456fd493@oracle.com> <854c1dfe97937389230599e72c22aa0cbb0bf600.camel@redhat.com> <5D4B58E9.2070002@oracle.com> <9418a5e6-750d-80bf-4060-d023812f37e7@oracle.com> <6cf3a353-e5c5-28e9-113b-98ded3c1f9e7@oracle.com> Message-ID: Sorry, I just looked at the webrev and you are trying the approach I suggested. I thought you were trying to use file change notification. 
Where does the workdir get created? Does it have 777 permissions?

Bob.

> On Aug 13, 2019, at 9:29 AM, Bob Vandette wrote:
>
> What if you just poll for the creation of the file waiting some small amount of time between polling with a maximum timeout.
>
> Bob.
>
>
>> On Aug 12, 2019, at 8:22 PM, mikhailo.seledtsov at oracle.com wrote:
>>
>> Unfortunately, this approach does not seem to work on many of our test cluster machines. The creation of a "signal" file results in "PermissionDenied".
>>
>> The possible reason is the selinux configuration, or some other permission related stuff. The container tries to create a new file on a mounted volume on a host system, but host system denies it. I will look a bit deeper into this, but I think this type of issue can be encountered on any automated test system. Hence, we may have to abandon this approach.
>>
>>
>> Thanks,
>>
>> Misha
>>
>>
>> On 8/12/19 3:59 PM, mikhailo.seledtsov at oracle.com wrote:
>>> Here is an updated webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.01/
>>>
>>> I am using a simple file-based mechanism to communicate between the processes. The "EventGeneratorLoop" process creates a specific "signal" file on a shared mounted volume, while the main test process waits for the file to exist before running the test cases.
>>>
>>> Passes on Linux-x64 Docker-enabled host. Testing in the test cluster is in progress.
>>>
>>>
>>> Thank you,
>>>
>>> Misha
>>>
>>> On 8/7/19 5:11 PM, David Holmes wrote:
>>>> On 8/08/2019 9:04 am, Mikhailo Seledtsov wrote:
>>>>> Hi Severin, Bob,
>>>>>
>>>>> Thank you for reviewing the code.
>>>>>
>>>>> On 8/7/19, 11:38 AM, Bob Vandette wrote:
>>>>>> Can't you come up with a better way of synchronizing the test by possibly writing a
>>>>>> file and waiting for it to exist with a timeout?
>>>>> I will try out this approach.
>>>>
>>>> This seems like a fundamental problem with jcmd - so cc'ing serviceability-dev.
>>>>
>>>> But I'm pretty sure they recently addressed a similar issue with the premature sending of the attach signal?
>>>>
>>>> David
>>>> -----
>>>>
>>>>> Thanks,
>>>>> Misha
>>>>>> Isn't there a shared volume between the two
>>>>>> processes?
>>>>>>
>>>>>> We've been fighting test reliability for a while now. I can only hope we're getting
>>>>>> to the end.
>>>>>>
>>>>>> Bob.
>>>>>>
>>>>>>> On Aug 7, 2019, at 2:18 PM, Severin Gehwolf wrote:
>>>>>>>
>>>>>>> Hi Misha,
>>>>>>>
>>>>>>> On Tue, 2019-08-06 at 20:17 -0700, mikhailo.seledtsov at oracle.com wrote:
>>>>>>>> Please review this change that fixes a container test TestJcmdWithSideCar.
>>>>>>>>
>>>>>>>> My investigation indicated that a root cause for this failure is:
>>>>>>>> JCMD -l shows 'Unknown' for class name because the main class has not
>>>>>>>> been loaded yet.
>>>>>>>> The target test JVM has started, it is initializing, but has not loaded
>>>>>>>> the main test class.
>>>>>>> That's what I've found too.
>>>>>>>
>>>>>>>> The proposed solution is to try 'jcmd -l' several times, with a short
>>>>>>>> sleep in between.
>>>>>>> Thread.sleep() isn't great, but I'm not sure there is an alternative.
>>>>>>>
>>>>>>>> Also I have commented out the testCase02() due to another bug:
>>>>>>>> "JDK-8228850: jhsdb jinfo fails with ClassCastException:
>>>>>>>> s.j.h.oops.TypeArray cannot be cast to s.j.h.oops.Instance",
>>>>>>>> which is not a test bug. IMO, it is better to run the test and skip a
>>>>>>>> sub-case than to skip the entire test.
>>>>>>>>
>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8228960
>>>>>>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.00/
>>>>>>> Looks OK to me.
>>>>>>> >>>>>>> Thanks, >>>>>>> Severin >>>>>>> > From adam.farley at uk.ibm.com Tue Aug 13 15:41:38 2019 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Tue, 13 Aug 2019 16:41:38 +0100 Subject: RFR: 8229378: jdwp library loader in linker_md.c quietly truncates on buffer overflow In-Reply-To: References: Message-ID: Hi Chris, Thanks! I understand we need a second reviewer/sponsor to get this change in. Any volunteers? Best Regards Adam Farley IBM Runtimes Chris Plummer wrote on 12/08/2019 21:35:06: > From: Chris Plummer > To: Adam Farley8 , serviceability-dev at openjdk.java.net > Date: 12/08/2019 21:35 > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c > quietly truncates on buffer overflow > > Hi Adam, > > It looks good to me. > > thanks, > > Chris > > On 8/12/19 7:34 AM, Adam Farley8 wrote: > Hi All, > > This is a known bug, mentioned in a code comment. > > Here is the fix for that bug. > > Reviewers and sponsors requested. > > Short version: if you set sun.boot.library.path to > something beyond a system's max path length, the > current code will return an empty string (rather than > printing a useful error message and shutting down). > > This is also a problem if you've specified multiple > paths with a separator, as this code seems to wrongly > assess whether the *total* length exceeds max path > length. So two 200 char paths on windows will cause > failure, as the total length is 400 (which is beyond > max length for windows). > > Note that the os.cpp bit of the webrev will not be included > in the final webrev, it just makes this change trivially > testable. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8229378 > Webrev: http://cr.openjdk.java.net/~afarley/8229378/webrev/ > > > Best Regards > > Adam Farley > IBM Runtimes > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with > number 741598. 
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.daugherty at oracle.com Tue Aug 13 15:48:31 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 13 Aug 2019 11:48:31 -0400 Subject: RFR: 8229378: jdwp library loader in linker_md.c quietly truncates on buffer overflow In-Reply-To: References: Message-ID: I don't see any information about how this change was tested... Is there something on another email thread? Dan On 8/13/19 11:41 AM, Adam Farley8 wrote: > Hi Chris, > > Thanks! > > I understand we need a second reviewer/sponsor to get this change in. > Any volunteers? > > Best Regards > > Adam Farley > IBM Runtimes > > > Chris Plummer wrote on 12/08/2019 21:35:06: > > > From: Chris Plummer > > To: Adam Farley8 , serviceability-dev at openjdk.java.net > > Date: 12/08/2019 21:35 > > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c > > quietly truncates on buffer overflow > > > > Hi Adam, > > > > It looks good to me. > > > > thanks, > > > > Chris > > > > On 8/12/19 7:34 AM, Adam Farley8 wrote: > > Hi All, > > > > This is a known bug, mentioned in a code comment. > > > > Here is the fix for that bug. > > > > Reviewers and sponsors requested. > > > > Short version: if you set sun.boot.library.path to > > something beyond a system's max path length, the > > current code will return an empty string (rather than > > printing a useful error message and shutting down). > > > > This is also a problem if you've specified multiple > > paths with a separator, as this code seems to wrongly > > assess whether the *total* length exceeds max path > > length. 
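To make the per-entry versus total-length distinction concrete, here is a rough sketch of a check applied to each separator-delimited entry on its own. It is an assumption-laden illustration and not the linker_md.c code: `each_entry_fits`, `sep` and `max_path` are hypothetical names, and the limit is passed in instead of being taken from any platform header.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true when every entry of a separator-delimited path list fits
// within `max_path` characters. The combined length of the list is never
// compared against the limit.
bool each_entry_fits(const std::string& paths, char sep, std::size_t max_path) {
  assert(max_path > 0);  // illustrative precondition
  std::size_t start = 0;
  for (;;) {
    const std::size_t end = paths.find(sep, start);
    const std::size_t stop = (end == std::string::npos) ? paths.size() : end;
    if (stop - start > max_path) return false;   // this single entry is too long
    if (end == std::string::npos) return true;   // last entry checked
    start = end + 1;                             // move past the separator
  }
}
```

Under this kind of check a list is rejected only when one of its entries is individually too long.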
So two 200 char paths on windows will cause > > failure, as the total length is 400 (which is beyond > > max length for windows). > > > > Note that the os.cpp bit of the webrev will not be included > > in the final webrev, it just makes this change trivially > > testable. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8229378 > > Webrev: http://cr.openjdk.java.net/~afarley/8229378/webrev/ > > > > > > Best Regards > > > > Adam Farley > > IBM Runtimes > > > > Unless stated otherwise above: > > IBM United Kingdom Limited - Registered in England and Wales with > > number 741598. > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire > PO6 3AU > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with > number 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From adam.farley at uk.ibm.com Tue Aug 13 16:04:40 2019 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Tue, 13 Aug 2019 17:04:40 +0100 Subject: RFR: 8229378: jdwp library loader in linker_md.c quietly truncates on buffer overflow In-Reply-To: References: Message-ID: Hi Daniel, We meet again. :) The test case for this defect can be found in the bug. I broke it down into multiple steps for readability. Do you feel more testing is required? Best Regards Adam Farley IBM Runtimes "Daniel D. Daugherty" wrote on 13/08/2019 16:48:31: > From: "Daniel D. Daugherty" > To: Adam Farley8 , Chris Plummer > > Cc: serviceability-dev at openjdk.java.net > Date: 13/08/2019 16:54 > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c > quietly truncates on buffer overflow > > I don't see any information about how this change was tested... > Is there something on another email thread? > > Dan > > On 8/13/19 11:41 AM, Adam Farley8 wrote: > Hi Chris, > > Thanks! > > I understand we need a second reviewer/sponsor to get this change > in. 
Any volunteers? > > Best Regards > > Adam Farley > IBM Runtimes > > > Chris Plummer wrote on 12/08/2019 21:35:06: > > > From: Chris Plummer > > To: Adam Farley8 , serviceability- > dev at openjdk.java.net > > Date: 12/08/2019 21:35 > > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c > > quietly truncates on buffer overflow > > > > Hi Adam, > > > > It looks good to me. > > > > thanks, > > > > Chris > > > > On 8/12/19 7:34 AM, Adam Farley8 wrote: > > Hi All, > > > > This is a known bug, mentioned in a code comment. > > > > Here is the fix for that bug. > > > > Reviewers and sponsors requested. > > > > Short version: if you set sun.boot.library.path to > > something beyond a system's max path length, the > > current code will return an empty string (rather than > > printing a useful error message and shutting down). > > > > This is also a problem if you've specified multiple > > paths with a separator, as this code seems to wrongly > > assess whether the *total* length exceeds max path > > length. So two 200 char paths on windows will cause > > failure, as the total length is 400 (which is beyond > > max length for windows). > > > > Note that the os.cpp bit of the webrev will not be included > > in the final webrev, it just makes this change trivially > > testable. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8229378 > > Webrev: http://cr.openjdk.java.net/~afarley/8229378/webrev/ > > > > > > Best Regards > > > > Adam Farley > > IBM Runtimes > > > > Unless stated otherwise above: > > IBM United Kingdom Limited - Registered in England and Wales with > > number 741598. > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with > number 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Tue Aug 13 16:04:43 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 13 Aug 2019 09:04:43 -0700 Subject: RFR: 8229378: jdwp library loader in linker_md.c quietly truncates on buffer overflow In-Reply-To: References: Message-ID: <9bab421c-08a1-b554-6ac9-35290856ee56@oracle.com> An HTML attachment was scrubbed... URL: From adam.farley at uk.ibm.com Tue Aug 13 16:28:23 2019 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Tue, 13 Aug 2019 17:28:23 +0100 Subject: RFR: 8229378: jdwp library loader in linker_md.c quietly truncates on buffer overflow In-Reply-To: <9bab421c-08a1-b554-6ac9-35290856ee56@oracle.com> References: <9bab421c-08a1-b554-6ac9-35290856ee56@oracle.com> Message-ID: Hi Serguei, Daniel, My testing was limited to the bug specific test case I mentioned, and the following jdwp tests: test/jdk/com/sun/jdi/Jdwp* test/hotspot/jtreg/serviceability/jdwp Best Regards Adam Farley IBM Runtimes "serguei.spitsyn at oracle.com" wrote on 13/08/2019 17:04:43: > From: "serguei.spitsyn at oracle.com" > To: daniel.daugherty at oracle.com, Adam Farley8 > , Chris Plummer > Cc: serviceability-dev at openjdk.java.net > Date: 13/08/2019 17:08 > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c > quietly truncates on buffer overflow > > Hi Adam, > > I'm looking at your fix. > Also interested about your testing. > > Thanks, > Serguei > > On 8/13/19 08:48, Daniel D. Daugherty wrote: > I don't see any information about how this change was tested... > Is there something on another email thread? > > Dan > > On 8/13/19 11:41 AM, Adam Farley8 wrote: > Hi Chris, > > Thanks! > > I understand we need a second reviewer/sponsor to get this change > in. Any volunteers? 
> > Best Regards > > Adam Farley > IBM Runtimes > > > Chris Plummer wrote on 12/08/2019 21:35:06: > > > From: Chris Plummer > > To: Adam Farley8 , serviceability- > dev at openjdk.java.net > > Date: 12/08/2019 21:35 > > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c > > quietly truncates on buffer overflow > > > > Hi Adam, > > > > It looks good to me. > > > > thanks, > > > > Chris > > > > On 8/12/19 7:34 AM, Adam Farley8 wrote: > > Hi All, > > > > This is a known bug, mentioned in a code comment. > > > > Here is the fix for that bug. > > > > Reviewers and sponsors requested. > > > > Short version: if you set sun.boot.library.path to > > something beyond a system's max path length, the > > current code will return an empty string (rather than > > printing a useful error message and shutting down). > > > > This is also a problem if you've specified multiple > > paths with a separator, as this code seems to wrongly > > assess whether the *total* length exceeds max path > > length. So two 200 char paths on windows will cause > > failure, as the total length is 400 (which is beyond > > max length for windows). > > > > Note that the os.cpp bit of the webrev will not be included > > in the final webrev, it just makes this change trivially > > testable. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8229378 > > Webrev: http://cr.openjdk.java.net/~afarley/8229378/webrev/ > > > > > > Best Regards > > > > Adam Farley > > IBM Runtimes > > > > Unless stated otherwise above: > > IBM United Kingdom Limited - Registered in England and Wales with > > number 741598. > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with > number 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Tue Aug 13 16:55:51 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 13 Aug 2019 09:55:51 -0700 Subject: RFR: 8229118: [TESTBUG] serviceability/sa/ClhsdbFindPC fails on AArch64 In-Reply-To: <178290d1-7904-d59e-3f64-3f57a51ac071@arm.com> References: <4a646485-4230-9632-dfd4-4368d79ba4cd@arm.com> <39f8fd3f-fcce-0200-3922-33405978e297@redhat.com> <5071d6c9-73f8-4c7d-4ae8-3b3d1bcddf82@arm.com> <178290d1-7904-d59e-3f64-3f57a51ac071@arm.com> Message-ID: On 8/13/19 2:38 AM, Nick Gasson wrote: > Hi Chris, > >> >> Adding to Andrew comments, maybe the solution is to have the test extend >> LingeredApp so it can produce a more reliable stack trace other than the >> default one you get with LingeredApp. If that's too much trouble, I >> don't mind the solution you came up with, but seems writing a >> LingeredApp subclass that is specific for this test would be cleaner. > > Thanks for the suggestion, this does seem much cleaner. Please check > the updated webrev here: > > http://cr.openjdk.java.net/~ngasson/8229118/webrev.1/ > > Thanks, > Nick The changes look good, although I think the new file should go in the serviceability/sa test directory, unless you think this is a generally useful class that might be used by tests outside of the sa. 
thanks,

Chris

From mikhailo.seledtsov at oracle.com Tue Aug 13 18:57:30 2019
From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com)
Date: Tue, 13 Aug 2019 11:57:30 -0700
Subject: RFR(S): 8228960: [TESTBUG] containers/docker/TestJcmdWithSideCar.java: jcmd reports main class as 'Unknown'
In-Reply-To:
References: <9c9ec6b1-139a-4e67-8460-b586456fd493@oracle.com> <854c1dfe97937389230599e72c22aa0cbb0bf600.camel@redhat.com> <5D4B58E9.2070002@oracle.com> <9418a5e6-750d-80bf-4060-d023812f37e7@oracle.com> <6cf3a353-e5c5-28e9-113b-98ded3c1f9e7@oracle.com>
Message-ID:

Hi Bob,

The workdir (JTwork/scratch) is created with the "test user" permissions. Let me try to place the "signal" file in /tmp instead, since /tmp should normally have a 777 permission on Linux.

If this works, I will have to add some unique number to the file name, perhaps a PID of a child process.

I will try this, and let you know how it works.


Thank you,

Misha

On 8/13/19 6:34 AM, Bob Vandette wrote:
> Sorry, I just looked at the webrev and you are trying the approach I suggested. I thought you
> were trying to use file change notification.
>
> Where does the workdir get created? Does it have 777 permissions?
>
> Bob.
>
>
>> On Aug 13, 2019, at 9:29 AM, Bob Vandette wrote:
>>
>> What if you just poll for the creation of the file waiting some small amount of time between polling with a maximum timeout.
>>
>> Bob.
>>
>>
>>> On Aug 12, 2019, at 8:22 PM, mikhailo.seledtsov at oracle.com wrote:
>>>
>>> Unfortunately, this approach does not seem to work on many of our test cluster machines. The creation of a "signal" file results in "PermissionDenied".
>>>
>>> The possible reason is the selinux configuration, or some other permission related stuff. The container tries to create a new file on a mounted volume on a host system, but host system denies it. I will look a bit deeper into this, but I think this type of issue can be encountered on any automated test system.
Hence, we may have to abandon this approach.
>>>
>>>
>>> Thanks,
>>>
>>> Misha
>>>
>>>
>>> On 8/12/19 3:59 PM, mikhailo.seledtsov at oracle.com wrote:
>>>> Here is an updated webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.01/
>>>>
>>>> I am using a simple file-based mechanism to communicate between the processes. The "EventGeneratorLoop" process creates a specific "signal" file on a shared mounted volume, while the main test process waits for the file to exist before running the test cases.
>>>>
>>>> Passes on Linux-x64 Docker-enabled host. Testing in the test cluster is in progress.
>>>>
>>>>
>>>> Thank you,
>>>>
>>>> Misha
>>>>
>>>> On 8/7/19 5:11 PM, David Holmes wrote:
>>>>> On 8/08/2019 9:04 am, Mikhailo Seledtsov wrote:
>>>>>> Hi Severin, Bob,
>>>>>>
>>>>>> Thank you for reviewing the code.
>>>>>>
>>>>>> On 8/7/19, 11:38 AM, Bob Vandette wrote:
>>>>>>> Can't you come up with a better way of synchronizing the test by possibly writing a
>>>>>>> file and waiting for it to exist with a timeout?
>>>>>> I will try out this approach.
>>>>> This seems like a fundamental problem with jcmd - so cc'ing serviceability-dev.
>>>>>
>>>>> But I'm pretty sure they recently addressed a similar issue with the premature sending of the attach signal?
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>> Thanks,
>>>>>> Misha
>>>>>>> Isn't there a shared volume between the two
>>>>>>> processes?
>>>>>>>
>>>>>>> We've been fighting test reliability for a while now. I can only hope we're getting
>>>>>>> to the end.
>>>>>>>
>>>>>>> Bob.
>>>>>>>
>>>>>>>> On Aug 7, 2019, at 2:18 PM, Severin Gehwolf wrote:
>>>>>>>>
>>>>>>>> Hi Misha,
>>>>>>>>
>>>>>>>> On Tue, 2019-08-06 at 20:17 -0700, mikhailo.seledtsov at oracle.com wrote:
>>>>>>>>> Please review this change that fixes a container test TestJcmdWithSideCar.
>>>>>>>>>
>>>>>>>>> My investigation indicated that a root cause for this failure is:
>>>>>>>>> JCMD -l shows 'Unknown' for class name because the main class has not
>>>>>>>>> been loaded yet.
>>>>>>>>> The target test JVM has started, it is initializing, but has not loaded
>>>>>>>>> the main test class.
>>>>>>>> That's what I've found too.
>>>>>>>>
>>>>>>>>> The proposed solution is to try 'jcmd -l' several times, with a short
>>>>>>>>> sleep in between.
>>>>>>>> Thread.sleep() isn't great, but I'm not sure there is an alternative.
>>>>>>>>
>>>>>>>>> Also I have commented out the testCase02() due to another bug:
>>>>>>>>> "JDK-8228850: jhsdb jinfo fails with ClassCastException:
>>>>>>>>> s.j.h.oops.TypeArray cannot be cast to s.j.h.oops.Instance",
>>>>>>>>> which is not a test bug. IMO, it is better to run the test and skip a
>>>>>>>>> sub-case than to skip the entire test.
>>>>>>>>>
>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8228960
>>>>>>>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.00/
>>>>>>>> Looks OK to me.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Severin
>>>>>>>>

From bob.vandette at oracle.com Tue Aug 13 19:06:06 2019
From: bob.vandette at oracle.com (Bob Vandette)
Date: Tue, 13 Aug 2019 15:06:06 -0400
Subject: RFR(S): 8228960: [TESTBUG] containers/docker/TestJcmdWithSideCar.java: jcmd reports main class as 'Unknown'
In-Reply-To:
References: <9c9ec6b1-139a-4e67-8460-b586456fd493@oracle.com> <854c1dfe97937389230599e72c22aa0cbb0bf600.camel@redhat.com> <5D4B58E9.2070002@oracle.com> <9418a5e6-750d-80bf-4060-d023812f37e7@oracle.com> <6cf3a353-e5c5-28e9-113b-98ded3c1f9e7@oracle.com>
Message-ID:

> On Aug 13, 2019, at 2:57 PM, mikhailo.seledtsov at oracle.com wrote:
>
> Hi Bob,
>
> The workdir (JTwork/scratch) is created with the "test user" permissions. Let me try to place the "signal" file in /tmp instead, since /tmp should normally have a 777 permission on Linux.
Aren't you creating a file inside a docker container and then checking for its existence outside of the container?
Isn't the root user running inside the container?

Both processes don't see the same /tmp right? So that shouldn't help.

If scratch has 777 permissions, anyone can create a file. You have to be careful that you can clean up the
file from outside the container. I'd make sure to create it with 777.

Bob.
>
> If this works, I will have to add some unique number to the file name, perhaps a PID of a child process.
>
> I will try this, and let you know how it works.
>
>
> Thank you,
>
> Misha
>
> On 8/13/19 6:34 AM, Bob Vandette wrote:
>> Sorry, I just looked at the webrev and you are trying the approach I suggested. I thought you
>> were trying to use file change notification.
>>
>> Where does the workdir get created? Does it have 777 permissions?
>>
>> Bob.
>>
>>
>>> On Aug 13, 2019, at 9:29 AM, Bob Vandette wrote:
>>>
>>> What if you just poll for the creation of the file waiting some small amount of time between polling with a maximum timeout.
>>>
>>> Bob.
>>>
>>>
>>>> On Aug 12, 2019, at 8:22 PM, mikhailo.seledtsov at oracle.com wrote:
>>>>
>>>> Unfortunately, this approach does not seem to work on many of our test cluster machines. The creation of a "signal" file results in "PermissionDenied".
>>>>
>>>> The possible reason is the selinux configuration, or some other permission related stuff. The container tries to create a new file on a mounted volume on a host system, but host system denies it. I will look a bit deeper into this, but I think this type of issue can be encountered on any automated test system. Hence, we may have to abandon this approach.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Misha
>>>>
>>>>
>>>> On 8/12/19 3:59 PM, mikhailo.seledtsov at oracle.com wrote:
>>>>> Here is an updated webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.01/
>>>>>
>>>>> I am using a simple file-based mechanism to communicate between the processes.
The "EventGeneratorLoop" process creates a specific "signal" file on a shared mounted volume, while the main test process waits for the file to exist before running the test cases.
>>>>>
>>>>> Passes on Linux-x64 Docker-enabled host. Testing in the test cluster is in progress.
>>>>>
>>>>>
>>>>> Thank you,
>>>>>
>>>>> Misha
>>>>>
>>>>> On 8/7/19 5:11 PM, David Holmes wrote:
>>>>>> On 8/08/2019 9:04 am, Mikhailo Seledtsov wrote:
>>>>>>> Hi Severin, Bob,
>>>>>>>
>>>>>>> Thank you for reviewing the code.
>>>>>>>
>>>>>>> On 8/7/19, 11:38 AM, Bob Vandette wrote:
>>>>>>>> Can't you come up with a better way of synchronizing the test by possibly writing a
>>>>>>>> file and waiting for it to exist with a timeout?
>>>>>>> I will try out this approach.
>>>>>> This seems like a fundamental problem with jcmd - so cc'ing serviceability-dev.
>>>>>>
>>>>>> But I'm pretty sure they recently addressed a similar issue with the premature sending of the attach signal?
>>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>> Thanks,
>>>>>>> Misha
>>>>>>>> Isn't there a shared volume between the two
>>>>>>>> processes?
>>>>>>>>
>>>>>>>> We've been fighting test reliability for a while now. I can only hope we're getting
>>>>>>>> to the end.
>>>>>>>>
>>>>>>>> Bob.
>>>>>>>>
>>>>>>>>> On Aug 7, 2019, at 2:18 PM, Severin Gehwolf wrote:
>>>>>>>>>
>>>>>>>>> Hi Misha,
>>>>>>>>>
>>>>>>>>> On Tue, 2019-08-06 at 20:17 -0700, mikhailo.seledtsov at oracle.com wrote:
>>>>>>>>>> Please review this change that fixes a container test TestJcmdWithSideCar.
>>>>>>>>>>
>>>>>>>>>> My investigation indicated that a root cause for this failure is:
>>>>>>>>>> JCMD -l shows 'Unknown' for class name because the main class has not
>>>>>>>>>> been loaded yet.
>>>>>>>>>> The target test JVM has started, it is initializing, but has not loaded
>>>>>>>>>> the main test class.
>>>>>>>>> That's what I've found too.
>>>>>>>>>
>>>>>>>>>> The proposed solution is to try 'jcmd -l' several times, with a short
>>>>>>>>>> sleep in between.
>>>>>>>>> Thread.sleep() isn't great, but I'm not sure there is an alternative.
>>>>>>>>>
>>>>>>>>>> Also I have commented out the testCase02() due to another bug:
>>>>>>>>>> "JDK-8228850: jhsdb jinfo fails with ClassCastException:
>>>>>>>>>> s.j.h.oops.TypeArray cannot be cast to s.j.h.oops.Instance",
>>>>>>>>>> which is not a test bug. IMO, it is better to run the test and skip a
>>>>>>>>>> sub-case than to skip the entire test.
>>>>>>>>>>
>>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8228960
>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.00/
>>>>>>>>> Looks OK to me.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Severin
>>>>>>>>>

From mikhailo.seledtsov at oracle.com Tue Aug 13 19:28:37 2019
From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com)
Date: Tue, 13 Aug 2019 12:28:37 -0700
Subject: RFR(S): 8228960: [TESTBUG] containers/docker/TestJcmdWithSideCar.java: jcmd reports main class as 'Unknown'
In-Reply-To:
References: <9c9ec6b1-139a-4e67-8460-b586456fd493@oracle.com> <854c1dfe97937389230599e72c22aa0cbb0bf600.camel@redhat.com> <5D4B58E9.2070002@oracle.com> <9418a5e6-750d-80bf-4060-d023812f37e7@oracle.com> <6cf3a353-e5c5-28e9-113b-98ded3c1f9e7@oracle.com>
Message-ID:

On 8/13/19 12:06 PM, Bob Vandette wrote:
>
>> On Aug 13, 2019, at 2:57 PM, mikhailo.seledtsov at oracle.com wrote:
>>
>> Hi Bob,
>>
>> The workdir (JTwork/scratch) is created with the "test user" permissions. Let me try to place the "signal" file in /tmp instead, since /tmp should normally have a 777 permission on Linux.
> Aren't you creating a file inside a docker container and then checking for its existence outside of the container?
Correct
> Isn't the root user running inside the container?

By default it is. But it still fails to create a file, for some reason.
Can be related to selinux settings (for instance, see this article: https://stackoverflow.com/questions/24288616/permission-denied-on-accessing-host-directory-in-docker/31334443), I can not change those.

My hope is that /tmp is configured to be accessed by a container engine as a general purpose directory, hence I was thinking to try it out.

>
> Both processes don't see the same /tmp right? So that shouldn't help.
In my next experiment, I will map a /tmp from host to be a /host-tmp inside the container (--volume /tmp:/host-tmp), then write a signal file to /host-tmp.
>
> If scratch has 777 permissions, anyone can create a file.
scratch has "rwxr-xr-x"
> You have to be careful that you can clean up the
> file from outside the container. I'd make sure to create it with 777.

I do use deleteOnExit(), so it should work (unless the JVM crashes). I guess I could add extra layer of safety here, and set the permissions to 777. Thank you for advice.


Thank you,

Misha

>
> Bob.
>
>> If this works, I will have to add some unique number to the file name, perhaps a PID of a child process.
>>
>> I will try this, and let you know how it works.
>>
>>
>> Thank you,
>>
>> Misha
>>
>> On 8/13/19 6:34 AM, Bob Vandette wrote:
>>> Sorry, I just looked at the webrev and you are trying the approach I suggested. I thought you
>>> were trying to use file change notification.
>>>
>>> Where does the workdir get created? Does it have 777 permissions?
>>>
>>> Bob.
>>>
>>>
>>>> On Aug 13, 2019, at 9:29 AM, Bob Vandette wrote:
>>>>
>>>> What if you just poll for the creation of the file waiting some small amount of time between polling with a maximum timeout.
>>>>
>>>> Bob.
>>>>
>>>>
>>>>> On Aug 12, 2019, at 8:22 PM, mikhailo.seledtsov at oracle.com wrote:
>>>>>
>>>>> Unfortunately, this approach does not seem to work on many of our test cluster machines. The creation of a "signal" file results in "PermissionDenied".
>>>>>
>>>>> The possible reason is the selinux configuration, or some other permission related stuff. The container tries to create a new file on a mounted volume on a host system, but host system denies it. I will look a bit deeper into this, but I think this type of issue can be encountered on any automated test system. Hence, we may have to abandon this approach.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Misha
>>>>>
>>>>>
>>>>> On 8/12/19 3:59 PM, mikhailo.seledtsov at oracle.com wrote:
>>>>>> Here is an updated webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.01/
>>>>>>
>>>>>> I am using a simple file-based mechanism to communicate between the processes. The "EventGeneratorLoop" process creates a specific "signal" file on a shared mounted volume, while the main test process waits for the file to exist before running the test cases.
>>>>>>
>>>>>> Passes on Linux-x64 Docker-enabled host. Testing in the test cluster is in progress.
>>>>>>
>>>>>>
>>>>>> Thank you,
>>>>>>
>>>>>> Misha
>>>>>>
>>>>>> On 8/7/19 5:11 PM, David Holmes wrote:
>>>>>>> On 8/08/2019 9:04 am, Mikhailo Seledtsov wrote:
>>>>>>>> Hi Severin, Bob,
>>>>>>>>
>>>>>>>> Thank you for reviewing the code.
>>>>>>>>
>>>>>>>> On 8/7/19, 11:38 AM, Bob Vandette wrote:
>>>>>>>>> Can't you come up with a better way of synchronizing the test by possibly writing a
>>>>>>>>> file and waiting for it to exist with a timeout?
>>>>>>>> I will try out this approach.
>>>>>>> This seems like a fundamental problem with jcmd - so cc'ing serviceability-dev.
>>>>>>>
>>>>>>> But I'm pretty sure they recently addressed a similar issue with the premature sending of the attach signal?
>>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Misha
>>>>>>>>> Isn't there a shared volume between the two
>>>>>>>>> processes?
>>>>>>>>>
>>>>>>>>> We've been fighting test reliability for a while now. I can only hope we're getting
>>>>>>>>> to the end.
>>>>>>>>>
>>>>>>>>> Bob.
>>>>>>>>>
>>>>>>>>>> On Aug 7, 2019, at 2:18 PM, Severin Gehwolf wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Misha,
>>>>>>>>>>
>>>>>>>>>> On Tue, 2019-08-06 at 20:17 -0700, mikhailo.seledtsov at oracle.com wrote:
>>>>>>>>>>> Please review this change that fixes a container test TestJcmdWithSideCar.
>>>>>>>>>>>
>>>>>>>>>>> My investigation indicated that a root cause for this failure is:
>>>>>>>>>>> JCMD -l shows 'Unknown' for class name because the main class has not
>>>>>>>>>>> been loaded yet.
>>>>>>>>>>> The target test JVM has started, it is initializing, but has not loaded
>>>>>>>>>>> the main test class.
>>>>>>>>>> That's what I've found too.
>>>>>>>>>>
>>>>>>>>>>> The proposed solution is to try 'jcmd -l' several times, with a short
>>>>>>>>>>> sleep in between.
>>>>>>>>>> Thread.sleep() isn't great, but I'm not sure there is an alternative.
>>>>>>>>>>
>>>>>>>>>>> Also I have commented out the testCase02() due to another bug:
>>>>>>>>>>> "JDK-8228850: jhsdb jinfo fails with ClassCastException:
>>>>>>>>>>> s.j.h.oops.TypeArray cannot be cast to s.j.h.oops.Instance",
>>>>>>>>>>> which is not a test bug. IMO, it is better to run the test and skip a
>>>>>>>>>>> sub-case than to skip the entire test.
>>>>>>>>>>>
>>>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8228960
>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.00/
>>>>>>>>>> Looks OK to me.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Severin
>>>>>>>>>>

From bob.vandette at oracle.com Tue Aug 13 21:05:02 2019
From: bob.vandette at oracle.com (Bob Vandette)
Date: Tue, 13 Aug 2019 17:05:02 -0400
Subject: RFR(S): 8228960: [TESTBUG] containers/docker/TestJcmdWithSideCar.java: jcmd reports main class as 'Unknown'
In-Reply-To:
References: <9c9ec6b1-139a-4e67-8460-b586456fd493@oracle.com> <854c1dfe97937389230599e72c22aa0cbb0bf600.camel@redhat.com> <5D4B58E9.2070002@oracle.com> <9418a5e6-750d-80bf-4060-d023812f37e7@oracle.com> <6cf3a353-e5c5-28e9-113b-98ded3c1f9e7@oracle.com>
Message-ID: <3143B636-9729-4238-9149-3A562B288643@oracle.com>

> On Aug 13, 2019, at 3:28 PM, mikhailo.seledtsov at oracle.com wrote:
>
>
> On 8/13/19 12:06 PM, Bob Vandette wrote:
>>
>>> On Aug 13, 2019, at 2:57 PM, mikhailo.seledtsov at oracle.com wrote:
>>>
>>> Hi Bob,
>>>
>>> The workdir (JTwork/scratch) is created with the "test user" permissions. Let me try to place the "signal" file in /tmp instead, since /tmp should normally have a 777 permission on Linux.
>> Aren't you creating a file inside a docker container and then checking for its existence outside of the container?
> Correct
>> Isn't the root user running inside the container?
>
> By default it is. But it still fails to create a file, for some reason. Can be related to selinux settings (for instance, see this article: https://stackoverflow.com/questions/24288616/permission-denied-on-accessing-host-directory-in-docker/31334443), I can not change those.

Is your JTWork/scratch on an NFS mounted file system? If this is the case then the problem is that root is equivalent to nobody on mounted file systems and can't create files unless the directory has 777 permissions. I just confirmed this.

You'd have to either run the container test as test-user or change the scratch directory permission.

Bob.
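
For illustration only, the polling idea discussed in this thread (wait for the "signal" file to appear, sleeping briefly between checks, with a maximum timeout) could be sketched roughly as below. This is not the code from the webrev; the class name SignalFileWaiter, the method waitForFile, the file name, and the timeout values are all hypothetical, and the second thread merely stands in for the process running inside the container.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;

// Hypothetical helper, not taken from the webrev under review.
class SignalFileWaiter {

    /**
     * Polls until the given file exists or the timeout elapses.
     * Returns true if the file was seen, false if the wait timed out.
     */
    static boolean waitForFile(Path file, Duration timeout, Duration interval)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (Files.exists(file)) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(interval.toMillis());
        }
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("signal-demo");
        // A PID suffix keeps the name unique, as suggested in the thread.
        Path signal = dir.resolve("signal-" + ProcessHandle.current().pid());

        // Stand-in for the containerized process: creates the file after a delay.
        Thread writer = new Thread(() -> {
            try {
                Thread.sleep(200);
                Files.createFile(signal);
            } catch (IOException | InterruptedException e) {
                throw new RuntimeException(e);
            }
        });
        writer.start();

        boolean seen = waitForFile(signal, Duration.ofSeconds(5), Duration.ofMillis(50));
        writer.join();
        System.out.println("signal file seen: " + seen);

        // Clean up explicitly rather than relying on deleteOnExit().
        Files.deleteIfExists(signal);
        Files.deleteIfExists(dir);
    }
}
```

The sketch does nothing about the permission problem itself: if the writer cannot create the file in the shared directory (SELinux labels, or root being mapped to nobody on an NFS mount), the wait simply times out and returns false.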
>
> My hope is that /tmp is configured to be accessed by a container engine as a general purpose directory, hence I was thinking to try it out.
>
>>
>> Both processes don't see the same /tmp right? So that shouldn't help.
> In my next experiment, I will map a /tmp from host to be a /host-tmp inside the container (--volume /tmp:/host-tmp), then write a signal file to /host-tmp.
>>
>> If scratch has 777 permissions, anyone can create a file.
> scratch has "rwxr-xr-x"
>> You have to be careful that you can clean up the
>> file from outside the container. I'd make sure to create it with 777.
>
> I do use deleteOnExit(), so it should work (unless the JVM crashes). I guess I could add extra layer of safety here, and set the permissions to 777. Thank you for advice.
>
>
> Thank you,
>
> Misha
>
>>
>> Bob.
>>
>>> If this works, I will have to add some unique number to the file name, perhaps a PID of a child process.
>>>
>>> I will try this, and let you know how it works.
>>>
>>>
>>> Thank you,
>>>
>>> Misha
>>>
>>> On 8/13/19 6:34 AM, Bob Vandette wrote:
>>>> Sorry, I just looked at the webrev and you are trying the approach I suggested. I thought you
>>>> were trying to use file change notification.
>>>>
>>>> Where does the workdir get created? Does it have 777 permissions?
>>>>
>>>> Bob.
>>>>
>>>>
>>>>> On Aug 13, 2019, at 9:29 AM, Bob Vandette wrote:
>>>>>
>>>>> What if you just poll for the creation of the file waiting some small amount of time between polling with a maximum timeout.
>>>>>
>>>>> Bob.
>>>>>
>>>>>
>>>>>> On Aug 12, 2019, at 8:22 PM, mikhailo.seledtsov at oracle.com wrote:
>>>>>>
>>>>>> Unfortunately, this approach does not seem to work on many of our test cluster machines. The creation of a "signal" file results in "PermissionDenied".
>>>>>>
>>>>>> The possible reason is the selinux configuration, or some other permission related stuff. The container tries to create a new file on a mounted volume on a host system, but host system denies it.
I will look a bit deeper into this, but I think this type of issue can be encountered on any automated test system. Hence, we may have to abandon this approach.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Misha
>>>>>>
>>>>>>
>>>>>> On 8/12/19 3:59 PM, mikhailo.seledtsov at oracle.com wrote:
>>>>>>> Here is an updated webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.01/
>>>>>>>
>>>>>>> I am using a simple file-based mechanism to communicate between the processes. The "EventGeneratorLoop" process creates a specific "signal" file on a shared mounted volume, while the main test process waits for the file to exist before running the test cases.
>>>>>>>
>>>>>>> Passes on Linux-x64 Docker-enabled host. Testing in the test cluster is in progress.
>>>>>>>
>>>>>>>
>>>>>>> Thank you,
>>>>>>>
>>>>>>> Misha
>>>>>>>
>>>>>>> On 8/7/19 5:11 PM, David Holmes wrote:
>>>>>>>> On 8/08/2019 9:04 am, Mikhailo Seledtsov wrote:
>>>>>>>>> Hi Severin, Bob,
>>>>>>>>>
>>>>>>>>> Thank you for reviewing the code.
>>>>>>>>>
>>>>>>>>> On 8/7/19, 11:38 AM, Bob Vandette wrote:
>>>>>>>>>> Can't you come up with a better way of synchronizing the test by possibly writing a
>>>>>>>>>> file and waiting for it to exist with a timeout?
>>>>>>>>> I will try out this approach.
>>>>>>>> This seems like a fundamental problem with jcmd - so cc'ing serviceability-dev.
>>>>>>>>
>>>>>>>> But I'm pretty sure they recently addressed a similar issue with the premature sending of the attach signal?
>>>>>>>>
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Misha
>>>>>>>>>> Isn't there a shared volume between the two
>>>>>>>>>> processes?
>>>>>>>>>>
>>>>>>>>>> We've been fighting test reliability for a while now. I can only hope we're getting
>>>>>>>>>> to the end.
>>>>>>>>>>
>>>>>>>>>> Bob.
>>>>>>>>>>
>>>>>>>>>>> On Aug 7, 2019, at 2:18 PM, Severin Gehwolf wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Misha,
>>>>>>>>>>>
>>>>>>>>>>> On Tue, 2019-08-06 at 20:17 -0700, mikhailo.seledtsov at oracle.com wrote:
>>>>>>>>>>>> Please review this change that fixes a container test TestJcmdWithSideCar.
>>>>>>>>>>>>
>>>>>>>>>>>> My investigation indicated that a root cause for this failure is:
>>>>>>>>>>>> JCMD -l shows 'Unknown' for class name because the main class has not
>>>>>>>>>>>> been loaded yet.
>>>>>>>>>>>> The target test JVM has started, it is initializing, but has not loaded
>>>>>>>>>>>> the main test class.
>>>>>>>>>>> That's what I've found too.
>>>>>>>>>>>
>>>>>>>>>>>> The proposed solution is to try 'jcmd -l' several times, with a short
>>>>>>>>>>>> sleep in between.
>>>>>>>>>>> Thread.sleep() isn't great, but I'm not sure there is an alternative.
>>>>>>>>>>>
>>>>>>>>>>>> Also I have commented out the testCase02() due to another bug:
>>>>>>>>>>>> "JDK-8228850: jhsdb jinfo fails with ClassCastException:
>>>>>>>>>>>> s.j.h.oops.TypeArray cannot be cast to s.j.h.oops.Instance",
>>>>>>>>>>>> which is not a test bug. IMO, it is better to run the test and skip a
>>>>>>>>>>>> sub-case than to skip the entire test.
>>>>>>>>>>>>
>>>>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8228960
>>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.00/
>>>>>>>>>>> Looks OK to me.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Severin
>>>>>>>>>>>

From chris.plummer at oracle.com Tue Aug 13 22:47:25 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 13 Aug 2019 15:47:25 -0700
Subject: PING: RFR: 8226204: SA: Refactoring for option processing in SALauncher
In-Reply-To: <2d4bee10-768d-f54d-8bbb-dd70cb25abe1@gmail.com>
References: <2d4bee10-768d-f54d-8bbb-dd70cb25abe1@gmail.com>
Message-ID: <8837c2e8-745c-380f-d12a-4710ef5d1f43@oracle.com>

Hi Yasumasa,

The changes look ok to me, although I've got to admit the language and library features used by toolMap are a bit beyond what I'm comfortable with (I'm one of those that find many uses of newer language and library feature to be more of a hindrance to understanding code than they are a benefit to simplifying or streamlining code). But I'm ok with it and assume it works as the reader would expect (after staring at it for a bit).

I likely won't be able to do any re-review if more changes are needed since I'll be out of the office for a while. I think Serguei is going to do the 2nd review, so assuming he's ok with it, and any additional changes are minor, you can still count me as a reviewer.

thanks,

Chris

On 8/10/19 4:14 AM, Yasumasa Suenaga wrote:
> PING: Could you review it?
>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8226204
>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.00/
>
>
> Yasumasa
>
>
> On 2019/07/24 10:18, Yasumasa Suenaga wrote:
>> Hi all,
>>
>> Please review this change:
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8226204
>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.00/
>>
>> This enhancement has been proposed in [1].
>>
>> SALauncher (jhsdb implementation) processes the option for each
>> subcommand (e.g. jstack, hsdb).
>> But they exist in many place with similar code.
>> So there is some room for refactoring.
>>
>> This change has passed the tests on submit repo and serviceability/sa
>> tests.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> [1]
>> https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-June/028376.html

From nick.gasson at arm.com Wed Aug 14 01:26:31 2019
From: nick.gasson at arm.com (Nick Gasson)
Date: Wed, 14 Aug 2019 09:26:31 +0800
Subject: RFR: 8229118: [TESTBUG] serviceability/sa/ClhsdbFindPC fails on AArch64
In-Reply-To:
References: <4a646485-4230-9632-dfd4-4368d79ba4cd@arm.com> <39f8fd3f-fcce-0200-3922-33405978e297@redhat.com> <5071d6c9-73f8-4c7d-4ae8-3b3d1bcddf82@arm.com> <178290d1-7904-d59e-3f64-3f57a51ac071@arm.com>
Message-ID:

Hi Chris,

> The changes look good, although I think the new file should go in the
> serviceability/sa test directory, unless you think this is a generally
> useful class that might be used by tests outside of the sa.
>

The new file is under test/hotspot/jtreg/serviceability/sa/ - the same
directory as ClhsdbFindPC.java - did you mean somewhere else?

Thanks,
Nick

From chris.plummer at oracle.com Wed Aug 14 02:28:52 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 13 Aug 2019 19:28:52 -0700
Subject: RFR: 8229118: [TESTBUG] serviceability/sa/ClhsdbFindPC fails on AArch64
In-Reply-To:
References: <4a646485-4230-9632-dfd4-4368d79ba4cd@arm.com> <39f8fd3f-fcce-0200-3922-33405978e297@redhat.com> <5071d6c9-73f8-4c7d-4ae8-3b3d1bcddf82@arm.com> <178290d1-7904-d59e-3f64-3f57a51ac071@arm.com>
Message-ID: <04ae9472-05e5-91a9-b986-428806a7ee15@oracle.com>

On 8/13/19 6:26 PM, Nick Gasson wrote:
> Hi Chris,
>
>> The changes look good, although I think the new file should go in the
>> serviceability/sa test directory, unless you think this is a generally
>> useful class that might be used by tests outside of the sa.
>>
>
> The new file is under test/hotspot/jtreg/serviceability/sa/ - the same
> directory as ClhsdbFindPC.java - did you mean somewhere else?
>
> Thanks,
> Nick
>
Oh, sorry. For some reason I thought it was in the lib directory with LingeredApp. Yes, it's good the way it is.

thanks,

Chris

From yasuenag at gmail.com Wed Aug 14 04:43:03 2019
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Wed, 14 Aug 2019 13:43:03 +0900
Subject: PING: RFR: 8226204: SA: Refactoring for option processing in SALauncher
In-Reply-To: <8837c2e8-745c-380f-d12a-4710ef5d1f43@oracle.com>
References: <2d4bee10-768d-f54d-8bbb-dd70cb25abe1@gmail.com> <8837c2e8-745c-380f-d12a-4710ef5d1f43@oracle.com>
Message-ID:

Thanks Chris!
I'm waiting for Serguei's review.

Yasumasa

On 2019/08/14 7:47, Chris Plummer wrote:
> Hi Yasumasa,
>
> The changes look ok to me, although I've got to admit the language and library features used by toolMap are a bit beyond what I'm comfortable with (I'm one of those that find many uses of newer language and library feature to be more of a hindrance to understanding code than they are a benefit to simplifying or streamlining code). But I'm ok with it and assume it works as the reader would expect (after staring at it for a bit).
>
> I likely won't be able to do any re-review if more changes are needed since I'll be out of the office for a while. I think Serguei is going to do the 2nd review, so assuming he's ok with it, and any additional changes are minor, you can still count me as a reviewer.
>
> thanks,
>
> Chris
>
> On 8/10/19 4:14 AM, Yasumasa Suenaga wrote:
>> PING: Could you review it?
>>
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8226204
>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.00/
>>
>>
>> Yasumasa
>>
>>
>> On 2019/07/24 10:18, Yasumasa Suenaga wrote:
>>> Hi all,
>>>
>>> Please review this change:
>>>
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8226204
>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.00/
>>>
>>> This enhancement has been proposed in [1].
>>>
>>> SALauncher (jhsdb implementation) processes the option for each subcommand (e.g. jstack, hsdb).
>>> But they exist in many place with similar code.
>>> So there is some room for refactoring.
>>>
>>> This change has passed the tests on submit repo and serviceability/sa tests.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> [1] https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-June/028376.html
>
>

From adinn at redhat.com Wed Aug 14 08:10:42 2019
From: adinn at redhat.com (Andrew Dinn)
Date: Wed, 14 Aug 2019 09:10:42 +0100
Subject: RFR: 8229118: [TESTBUG] serviceability/sa/ClhsdbFindPC fails on AArch64
In-Reply-To: <04ae9472-05e5-91a9-b986-428806a7ee15@oracle.com>
References: <4a646485-4230-9632-dfd4-4368d79ba4cd@arm.com> <39f8fd3f-fcce-0200-3922-33405978e297@redhat.com> <5071d6c9-73f8-4c7d-4ae8-3b3d1bcddf82@arm.com> <178290d1-7904-d59e-3f64-3f57a51ac071@arm.com> <04ae9472-05e5-91a9-b986-428806a7ee15@oracle.com>
Message-ID: <4c9f5bb8-2a67-258e-9b65-acc91237055f@redhat.com>

On 14/08/2019 03:28, Chris Plummer wrote:
> On 8/13/19 6:26 PM, Nick Gasson wrote:
>> Hi Chris,
>>
>>> The changes look good, although I think the new file should go in the
>>> serviceability/sa test directory, unless you think this is a generally
>>> useful class that might be used by tests outside of the sa.
>>>
>>
>> The new file is under test/hotspot/jtreg/serviceability/sa/ - the same
>> directory as ClhsdbFindPC.java - did you mean somewhere else?
>>
>> Thanks,
>> Nick
>>
> Oh, sorry. For some reason I thought it was in the lib directory with
> LingeredApp. Yes, it's good the way it is.
I'm still happy with this patch to go in after these changes.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No.
03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

From poonam.bajaj at oracle.com Wed Aug 14 13:13:53 2019
From: poonam.bajaj at oracle.com (Poonam Parhar)
Date: Wed, 14 Aug 2019 06:13:53 -0700
Subject: RFR 8229420: [Redo] jstat reports incorrect values for OU for CMS GC
Message-ID: <573749a6-2389-6de2-4f2f-abc3b0e1dd6d@oracle.com>

Hello,

The fix for JDK-8215523 had to be backed out with '8227178: Backout of
8215523' because it had caused timeout failures for some of the CMS tests.

Changeset of JDK-8215523:
http://hg.openjdk.java.net/jdk/jdk/rev/734e58d8477b

Those failures get resolved by adding the following check before calling
recalculate_used_stable() in CompactibleFreeListSpace::allocate():

  // During GC we do not need to recalculate the stable used value for
  // every allocation in old gen. It is done once at the end of GC instead
  // for performance reasons.
  if (!CMSHeap::heap()->is_gc_active()) {
    recalculate_used_stable();
  }

Please review the webrev with the updated fix:
http://cr.openjdk.java.net/~poonam/8229420/webrev.00/

Thanks,
Poonam

From daniel.daugherty at oracle.com Wed Aug 14 18:25:07 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 14 Aug 2019 14:25:07 -0400
Subject: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
In-Reply-To: <81f607e5-4a65-9e14-8ad0-fdcb8e03c6a4@oracle.com>
References: <4C4212D0-BFFF-4C85-ACC6-05200F220C3F@oracle.com> <2d6dede1-aa79-99ce-a823-773fa2e19827@oracle.com> <6E7B043A-4647-4931-977C-1854CA7EBEC1@oracle.com> <76BCC96D-DB5D-409A-95D5-3A64B893832D@oracle.com> <7e0ba39e-e5b7-f56b-66ea-820a0a35ec2c@oracle.com> <87748188-3BD4-4A8B-938A-89DBC8F3C57A@oracle.com> <48311B39-43F9-49E5-BEC7-A64F9D2588AF@oracle.com> <81f607e5-4a65-9e14-8ad0-fdcb8e03c6a4@oracle.com>
Message-ID: <2bf948a5-1397-3459-030a-be9598b4e7fc@oracle.com>

So based on the last three message on this thread:

webrev.05 is withdrawn for the moment
webrev.04 is the current webrev, but needs to have some startup time
          issues resolved before moving forward.

I'm going to hold off on re-reviewing anything at the moment until
the dust settles...

Dan


On 8/12/19 7:34 PM, David Holmes wrote:
> Hi Daniil,
>
> On 13/08/2019 9:24 am, Daniil Titov wrote:
>> Hi Robbin,
>>
>> Thank you very much for reviewing this version of the fix! Based on your findings
>> it seems as it makes sense to make a step back and continue with the
>> approach we took before in the previous version of the webrev (webrev.04),
>> and get more information about the impact on the startup time it has. I will
>> consult with Claus regarding this and then share the findings.
>
> That seems a good approach to me. It wasn't at all clear to me that
> the latest proposed approach would actually solve the original problem
> in a satisfactory way - it would depend on how constant the set of
> threads being queried was.
>
> There is no perfect solution here as any fix to the reported problem
> incurs overhead elsewhere. Even evaluating the merits of the different
> trade-offs is hard to do - we could end up with a compromise solution
> that fails to satisfy anyone.
>
> David
> -----
>
>> Thanks again,
>> --Daniil
>>
>>
>> On 8/12/19, 5:22 AM, "Robbin Ehn" wrote:
>>
>>     Hi Daniil,
>>
>>     I took a new deeper dive into this.
>>
>>     This line seems to have some issues:
>>
>>     if (ThreadTable::is_initialized() && thread->in_thread_table() &&
>>         !thread->is_attaching_via_jni()) {
>>
>>     If you create new threads which attaches and then dies, the table will just keep
>>     growing. So you must remove them also ?
>>
>>     Secondly you should not use volatile semantics for _in_thread_table.
>>     The load in the if-statement can be reordered with _is_initialized.
>>     Which could lead to a leak, rogue pointer in the table.
>>
>>     So both "static volatile bool _is_initialized;" and "volatile bool _in_thread_table;"
>>     should be stored with store_release and loaded with load_acquire.
>>
>>     Unfortunately it looks like there still would be races if
>>     ThreadTable::add_thread e.g. context switch at:
>>
>>     if (_local_table->insert(thread, lookup, entry)) {
>>       // HERE
>>       java_thread->set_in_thread_table(true);
>>
>>     *Remove side can pass the if-statement without removing.
>>
>>     Since this thread also maybe exiting at any moment, e.g. context switch:
>>
>>     if (tobj != NULL && !thread->is_exiting() &&
>>         java_tid == java_lang_Thread::thread_id(tobj)) {
>>       // HERE
>>       ThreadTable::add_thread(java_tid, thread);
>>
>>     *Add side can add a thread that is exiting.
>>
>>     Mixing in a third thread looking up a random tid and getting a JavaThread*, it
>>     must validate it against it's ThreadsList. Making the hashtable useless.
>>
>>     So I think the only one adding and removing should be the thread itself.
>>     1:Add to ThreadsList
>>     2:Add to ThreadTable
>>     3:Remove from ThreadTable
>>     4:Remove ThreadsList
>>
>>     Between 1-2 and 3-4 the thread would be looked-up via linear scan.
>>     I don't see an easy way around the start-up issue with this.
>>
>>     Maybe have the cache in Java.
>>     Pass in the thread obj into a
>>     java_sun_management_ThreadImpl_getThreadTotalCpuTime3 instead,
>>     thus skipping any look-ups in native.
>>
>>     Thanks, Robbin
>>
>>     On 8/12/19 5:49 AM, Daniil Titov wrote:
>>     > Hi David, Robbin, Daniel, and Serguei,
>>     >
>>     > Please review a new version of the fix.
>>     >
>>     > As David suggested I created a separated Jira issue [1] to cover additional optimization for
>>     > some callers of find_JavaThread_from_java_tid() and this version of the fix no longer includes
>>     > changes in management.cpp ( and the test related with these changes).
>>     >
>>     > Regarding the impact the previous version of the fix had on the thread startup time at heavy load (e.g.
>>     > when 5000 threads are created and destroyed every second) I tried a different approach that makes
>>     > calls to ThreadTable::add_thread and ThreadTable::remove_thread asynchronous and offloads the
>>     > work for actual modifications of the thread table to a periodic task that runs every 5 seconds. With the
>>     > same stress test scenario (the test does some warm-up and then measures the time it takes to create
>>     > and start 100,000 threads; every thread just sleeps for 100 ms) the impact on the thread startup time
>>     > was reduced to 1.2% ( from 2.7%).
>>     >
>>     > The cause of this impact in this stress test scenario is that as soon as the thread table is initialized,
>>     > an additional work to insert and delete entries in the thread table should be performed, even if
>>     > com.sun.management.ThreadMXBean methods are no longer called. For example, In the stress test
>>     > mentioned above, every second about 5000 entries had to be inserted in the table and then deleted.
>>     >
>>     > That doesn't look right and the new version of the fix uses the different approach: the thread is added to
>>     > the thread table only when this thread is requested by com.sun.management.ThreadMXBean bean. Every
>>     > time when find_JavaThread_from_java_tid() is called for a new tid, the thread is found by the iterating over
>>     > the thread list and added to the thread table. All consequent calls to find_JavaThread_from_java_tid() for
>>     > the same tid returns the thread from the thread table.
>>     >
>>     > Running stress test for the cases when the thread table is enabled and not showed no difference in the
>>     > average thread startup times.
>>     >
>>     > [1] : https://bugs.openjdk.java.net/browse/JDK-8229391
>>     >
>>     > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
>>     > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.05/
>>     >
>>     > Thanks,
>>     > Daniil
>>     >
>>     > On 8/4/19, 7:54 PM, "David Holmes" wrote:
>>     >
>>     >     Hi Daniil,
>>     >
>>     >     On 3/08/2019 8:16 am, Daniil Titov wrote:
>>     >     > Hi David,
>>     >     >
>>     >     > Thank you for your detailed review. Please review a new version of the fix that includes
>>     >     > the changes you suggested:
>>     >     > - ThreadTableCreate_lock scope is reduced to cover the creation of the table only;
>>     >     > - ThreadTableCreate_lock is made _safepoint_check_always;
>>     >
>>     >     Okay.
>>     >
>>     >     > - ServiceThread is no longer responsible for the resizing of the thread table, instead,
>>     >     >   the thread table is changed to grow on demand by the thread that is doing the addition;
>>     >
>>     >     Okay - I'm happy to get the serviceThread out of the picture here.
>>     >
>>     >     > - fixed nits and formatting issues.
>>     >
>>     >     Okay.
>>     >
>>     >     >>> The change also includes additional optimization for some callers of find_JavaThread_from_java_tid()
>>     >     >>> as Daniel suggested.
>>     >     >> Not sure it's best to combine these, but if they are limited to the
>>     >     >> changes in management.cpp only then that may be okay.
>>     >     >
>>     >     > The additional optimization for some callers of find_JavaThread_from_java_tid() is
>>     >     > limited to management.cpp (plus a new test) so I left them in the webrev but
>>     >     > I also could move it in the separate issue if required.
>>     >
>>     >     I'd prefer this part of be separated out, but won't insist. Let's see if
>>     >     Dan or Serguei have a strong opinion.
>>     >
>>     >     >    > src/hotspot/share/runtime/threadSMR.cpp
>>     >     >    > 755     jlong tid = SharedRuntime::get_java_tid(thread);
>>     >     >    > 926     jlong tid = SharedRuntime::get_java_tid(thread);
>>     >     >    > I think it cleaner/better to just use
>>     >     >    > jlong tid = java_lang_Thread::thread_id(thread->threadObj());
>>     >     >    > as we know thread is not NULL, it is a JavaThread and it has to have a
>>     >     >    > non-null threadObj.
>>     >     >
>>     >     > I had to leave this code unchanged since it turned out the threadObj is null
>>     >     > when VM is destroyed:
>>     >     >
>>     >     > V  [libjvm.so+0xe165d7]  oopDesc::long_field(int) const+0x67
>>     >     > V  [libjvm.so+0x16e06c6]  ThreadsSMRSupport::add_thread(JavaThread*)+0x116
>>     >     > V  [libjvm.so+0x16d1302]  Threads::add(JavaThread*, bool)+0x82
>>     >     > V  [libjvm.so+0xef8369]  attach_current_thread.part.197+0xc9
>>     >     > V  [libjvm.so+0xec136c]  jni_DestroyJavaVM+0x6c
>>     >     > C  [libjli.so+0x4333]  JavaMain+0x2c3
>>     >     > C  [libjli.so+0x8159]  ThreadJavaMain+0x9
>>     >
>>     >     This is actually nothing to do with the VM being destroyed, but is an
>>     >     issue with JNI_AttachCurrentThread and its interaction with the
>>     >     ThreadSMR iterators. The attach process is:
>>     >     - create JavaThread
>>     >     - mark as "is attaching via jni"
>>     >     - add to ThreadsList
>>     >     - create java.lang.Thread object (you can only execute Java code after
>>     >     you are attached)
>>     >     - mark as "attach completed"
>>     >
>>     >     So while a thread "is attaching" it will be seen by the ThreadSMR thread
>>     >     iterator but will have a NULL java.lang.Thread object.
>>     >
>>     >     We special-case attaching threads in a number of places in the VM and I
>>     >     think we should be explicitly doing something here to filter out
>>     >     attaching threads, rather than just being tolerant of a NULL j.l.Thread
>>     >     object. Specifically in ThreadsSMRSupport::add_thread:
>>     >
>>     >     if (ThreadTable::is_initialized() && !thread->is_attaching_via_jni()) {
>>     >        jlong tid = java_lang_Thread::thread_id(thread->threadObj());
>>     >        ThreadTable::add_thread(tid, thread);
>>     >     }
>>     >
>>     >     Note that in ThreadsSMRSupport::remove_thread we can use the same guard,
>>     >     which covers the case the JNI attach encountered an error trying to
>>     >     create the j.l.Thread object.
>>     >
>>     >     >> src/hotspot/share/services/threadTable.cpp
>>     >     >> 71     static uintx get_hash(Value const& value, bool* is_dead) {
>>     >     >
>>     >     >> The is_dead parameter still bothers me here. I can't make enough sense
>>     >     >> out of the template code in ConcurrentHashtable to see why we have to
>>     >     >> have it, but I'm concerned that its very existence means we perhaps
>>     >     >> should not be trying to extend CHT in this context. ??
>>     >     >
>>     >     > My understanding is that is_dead parameter provides a mechanism for
>>     >     > ConcurrentHashtable to remove stale entries that were not explicitly
>>     >     > removed by calling ConcurrentHashTable::remove() method.
>>     >     > I think that just because in our case we don't use this mechanism doesn't
>>     >     > mean we should not use ConcurrentHashTable.
>>     >
>>     >     Can you confirm that this usage is okay with Robbin Ehn please. He's
>>     >     back from vacation this week.
>>     >
>>     >     >> I would still want to see what impact this has on thread
>>     >     >> startup cost, both with and without the table being initialized.
>>     >     >
>>     >     > I run a test that initializes the table by calling ThreadMXBean.get getThreadInfo(),
>>     >     > starts some threads as a worm-up, and then creates and starts 100,000 threads
>>     >     > (each thread just sleeps for 100 ms). In case when the thread table is enabled
>>     >     > 100,000 threads are created and started for about 15200 ms. If the thread table
>>     >     > is off the test takes about 14800 ms. Based on this information the enabled
>>     >     > thread table makes the thread startup about 2.7% slower.
>>     >
>>     >     That doesn't sound very good. I think we may need to Claes involved to
>>     >     help investigate overall performance impact here.
>>     >
>>     >     > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.04/
>>     >     > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
>>     >
>>     >     No further code comments.
>>     >
>>     >     I didn't look at the test in detail.
>>     >
>>     >     Thanks,
>>     >     David
>>     >
>>     >     > Thanks!
>>     >     > --Daniil
>>     >     >
>>     >     >
>>     >     > On 7/29/19, 12:53 AM, "David Holmes" wrote:
>>     >     >
>>     >     >     Hi Daniil,
>>     >     >
>>     >     >     Overall I think this is a reasonable approach but I would still like to
>>     >     >     see some performance and footprint numbers, both to verify it fixes the
>>     >     >     problem reported, and that we are not getting penalized elsewhere.
>>     >     >
>>     >     >     On 25/07/2019 3:21 am, Daniil Titov wrote:
>>     >     >     > Hi David, Daniel, and Serguei,
>>     >     >     >
>>     >     >     > Please review the new version of the fix, that makes the thread table initialization on demand and
>>     >     >     > moves it inside ThreadsList::find_JavaThread_from_java_tid(). At the creation time the thread table
>>     >     >     > is initialized with the threads from the current thread list. We don't want to hold Threads_lock
>>     >     >     > inside find_JavaThread_from_java_tid(), thus new threads still could be created while the thread
>>     >     >     > table is being initialized . Such threads will be found by the linear search and added to the thread table
>>     >     >     > later, in ThreadsList::find_JavaThread_from_java_tid().
>>     >     >
>>     >     >     The initialization allows the created but unpopulated, or partially
>>     >     >     populated, table to be seen by other threads - is that your intention?
>>     >     >     It seems it should be okay as the other threads will then race with the
>>     >     >     initializing thread to add specific entries, and this is a concurrent
>>     >     >     map so that should be functionally correct. But if so then I think you
>>     >     >     can also reduce the scope of the ThreadTableCreate_lock so that it
>>     >     >     covers creation of the table only, not the initial population of the table.
>>     >     >
>>     >     >     I like the approach of only initializing the table when needed and using
>>     >     >     that to control when the add/remove-thread code needs to update the
>>     >     >     table. But I would still want to see what impact this has on thread
>>     >     >     startup cost, both with and without the table being initialized.
>>     >     >
>>     >     >     > The change also includes additional optimization for some callers of find_JavaThread_from_java_tid()
>>     >     >     > as Daniel suggested.
>>     >     >
>>     >     >     Not sure it's best to combine these, but if they are limited to the
>>     >     >     changes in management.cpp only then that may be okay. It helps to be
>>     >     >     able to focus on the table related changes without being distracted by
>>     >     >     other optimizations.
>>     >     >
>>     >     >     > That is correct that ResolvedMethodTable was used as a blueprint for the thread table, however, I tried
>>     >     >     > to strip it of the all functionality that is not required in the thread table case.
>>     >     >
>>     >     >     The revised version seems better in that regard. But I still have a
>>     >     >     concern, see below.
>>     >     >
>>     >     >     > We need to have the thread table resizable and allow it to grow as the number of threads increases to avoid
>>     >     >     > reserving excessive memory a-priori or deteriorating lookup times. The ServiceThread is responsible for
>>     >     >     > growing the thread table when required.
>>     >     >
>>     >     >     Yes but why? Why can't this table be grown on demand by the thread that
>>     >     >     is doing the addition? For other tables we may have to delegate to the
>>     >     >     service thread because the current thread cannot perform the action, or
>>     >     >     it doesn't want to perform it at the time the need for the resize is
>>     >     >     detected (e.g. its detected at a safepoint and you want the resize to
>>     >     >     happen later outside the safepoint). It's not apparent to me that such
>>     >     >     restrictions apply here.
>>     >     >
>>     >     >     > There is no ConcurrentHashTable available in Java 8 and for backporting this fix to Java 8 another implementation
>>     >     >     > of the hash table, probably originally suggested in the patch attached to the JBS issue, should be used. It will make
>>     >     >     > the backporting more complicated, however, adding a new Implementation of the hash table in Java 14 while it
>>     >     >     > already has ConcurrentHashTable doesn't seem reasonable for me.
>>     >     >
>>     >     >     Ok.
>>     >     >
>>     >     >     > Webrev: http://cr.openjdk.java.net/~dtitov/8185005/webrev.03
>>     >     >
>>     >     >     Some specific code comments:
>>     >     >
>>     >     >     src/hotspot/share/runtime/mutexLocker.cpp
>>     >     >
>>     >     >     +   def(ThreadTableCreate_lock       , PaddedMutex  , special, false, Monitor::_safepoint_check_never);
>>     >     >
>>     >     >     I think this needs to be a _safepoint_check_always lock. The table will
>>     >     >     be created by regular JavaThreads and they should (nearly) always be
>>     >     >     checking for safepoints if they are going to block acquiring the lock.
>>     >     >     And it isn't at all obvious that the thread doing the creation can't go
>>     >     >     to a safepoint whilst this lock is held.
>>     >     >
>>     >     >     ---
>>     >     >
>>     >     >     src/hotspot/share/runtime/threadSMR.cpp
>>     >     >
>>     >     >     Nit:
>>     >     >
>>     >     >       618       JavaThread* thread = thread_at(i);
>>     >     >
>>     >     >     you could reuse the new java_thread local you introduced at line 613 and
>>     >     >     just rename that "new" variable to "thread" so you don't have to change
>>     >     >     all other uses.
>>     >     >
>>     >     >     628   } else if (java_thread != NULL && ...
>>     >     >
>>     >     >     You don't need to check != NULL here as you only get here when
>>     >     >     java_thread is not NULL.
>>     >     >
>>     >     >       755     jlong tid = SharedRuntime::get_java_tid(thread);
>>     >     >       926     jlong tid = SharedRuntime::get_java_tid(thread);
>>     >     >
>>     >     >     I think it cleaner/better to just use
>>     >     >
>>     >     >     jlong tid = java_lang_Thread::thread_id(thread->threadObj());
>>     >     >
>>     >     >     as we know thread is not NULL, it is a JavaThread and it has to have a
>>     >     >     non-null threadObj.
>>     >     >
>>     >     >     ---
>>     >     >
>>     >     >     src/hotspot/share/services/management.cpp
>>     >     >
>>     >     >     1323         if (THREAD->is_Java_thread()) {
>>     >     >     1324           JavaThread* current_thread = (JavaThread*)THREAD;
>>     >     >
>>     >     >     These calls can only be made on a JavaThread so this be simplified to
>>     >     >     remove the is_Java_thread() call. Similarly in other places.
>>     >     >
>>     >     >     ---
>>     >     >
>>     >     >     src/hotspot/share/services/threadTable.cpp
>>     >     >
>>     >     >        55 class ThreadTableEntry : public CHeapObj {
>>     >     >        56   private:
>>     >     >        57     jlong _tid;
>>     >     >
>>     >     >     I believe hotspot style is to not indent the access modifiers in C++
>>     >     >     class declarations, so the above would just be:
>>     >     >
>>     >     >        55 class ThreadTableEntry : public CHeapObj {
>>     >     >        56 private:
>>     >     >        57   jlong _tid;
>>     >     >
>>     >     >     etc.
>>     >     >
>>     >     >       60     ThreadTableEntry(jlong tid, JavaThread* java_thread) :
>>     >     >       61 _tid(tid),_java_thread(java_thread) {}
>>     >     >
>>     >     >     line 61 should be indented as it continues line 60.
>>     >     >
>>     >     >        67 class ThreadTableConfig : public AllStatic {
>>     >     >        ...
>>     >     >        71     static uintx get_hash(Value const& value, bool* is_dead) {
>>     >     >
>>     >     >     The is_dead parameter still bothers me here. I can't make enough sense
>>     >     >     out of the template code in ConcurrentHashtable to see why we have to
>>     >     >     have it, but I'm concerned that its very existence means we perhaps
>>     >     >     should not be trying to extend CHT in this context. ??
>>     >     >
>>     >     >       115   size_t start_size_log = size_log > DefaultThreadTableSizeLog
>>     >     >       116   ? size_log : DefaultThreadTableSizeLog;
>>     >     >
>>     >     >     line 116 should be indented, though in this case I think a better layout
>>     >     >     would be:
>>     >     >
>>     >     >       115   size_t start_size_log =
>>     >     >       116       size_log > DefaultThreadTableSizeLog ? size_log : DefaultThreadTableSizeLog;
>>     >     >
>>     >     >       131 double ThreadTable::get_load_factor() {
>>     >     >       132   return (double)_items_count/_current_size;
>>     >     >       133 }
>>     >     >
>>     >     >     Not sure that is doing what you want/expect. It will perform integer
>>     >     >     division and then cast that whole integer to a double. If you want
>>     >     >     double arithmetic you need:
>>     >     >
>>     >     >     return ((double)_items_count)/_current_size;
>>     >     >
>>     >     >     180     jlong          _tid;
>>     >     >     181     uintx         _hash;
>>     >     >
>>     >     >     Nit: no need for all those spaces before the variable name.
>>     >     >
>>     >     >       183     ThreadTableLookup(jlong tid)
>>     >     >       184     : _tid(tid), _hash(primitive_hash(tid)) {}
>>     >     >
>>     >     >     line 184 should be indented.
>>     >     >
>>     >     >     201     ThreadGet():_return(NULL) {}
>>     >     >
>>     >     >     Nit: need space after :
>>     >     >
>>     >     >       211    assert(_is_initialized, "Thread table is not initialized");
>>     >     >       212   _has_work = false;
>>     >     >
>>     >     >     line 211 is indented one space too far.
>>     >     >
>>     >     >     229     ThreadTableEntry* entry = new ThreadTableEntry(tid,java_thread);
>>     >     >
>>     >     >     Nit: need space after ,
>>     >     >
>>     >     >     252   return _local_table->remove(thread,lookup);
>>     >     >
>>     >     >     Nit: need space after ,
>>     >     >
>>     >     >     Thanks,
>>     >     >     David
>>     >     >     ------
>>     >     >
>>     >     >     > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
>>     >     >     >
>>     >     >     > Thanks!
>>     >     >     > --Daniil
>>     >     >     >
>>     >     >     >
>>     >     >     > On 7/8/19, 3:24 PM, "Daniel D. Daugherty" wrote:
>>     >     >     >
>>     >     >     >     On 6/29/19 12:06 PM, Daniil Titov wrote:
>>     >     >     >     > Hi Serguei and David,
>>     >     >     >     >
>>     >     >     >     > Serguei is right, ThreadTable::find_thread(java_tid) cannot return a JavaThread with an unmatched java_tid.
>>     >     >     >     >
>>     >     >     >     > Please find a new version of the fix that includes the changes Serguei suggested.
>>     >     >     >     >
>>     >     >     >     > Regarding the concern about the maintaining the thread table when it may never even be queried, one of
>>     >     >     >     > the options could be to add ThreadTable ::isEnabled flag, set it to "false" by default, and wrap the calls to the thread table
>>     >     >     >     > in ThreadsSMRSupport add_thread() and remove_thread() methods to check this flag.
>>     >     >     >     >
>>     >     >     >     > When ThreadsList::find_JavaThread_from_java_tid() is called for the first time it could check if ThreadTable ::isEnabled
>>     >     >     >     > Is on and if not then set it on and populate the thread table with all existing threads from the thread list.
>>     >     >     >
>>     >     >     >     I have the same concerns as David H. about this new ThreadTable.
>>     >     >     >     ThreadsList::find_JavaThread_from_java_tid() is only called from code
>>     >     >     >     in src/hotspot/share/services/management.cpp so I think that table
>>     >     >     >     needs to enabled and populated only if it is going to be used.
>>     >     >     >
>>     >     >     >     I've taken a look at the webrev below and I see that David has
>>     >     >     >     followed up with additional comments. Before I do a crawl through
>>     >     >     >     code review for this, I would like to see the ThreadTable stuff
>>     >     >     >     made optional and David's other comments addressed.
>>     >     >     >
>>     >     >     >     Another possible optimization is for callers of
>>     >     >     >     find_JavaThread_from_java_tid() to save the calling thread's
>>     >     >     >     tid value before they loop and if the current tid == saved_tid
>>     >     >     >     then use the current JavaThread* instead of calling
>>     >     >     >     find_JavaThread_from_java_tid() to get the JavaThread*.
>>     >     >     >
>>     >     >     >     Dan
>>     >     >     >
>>     >     >     >     >
>>     >     >     >     > Webrev: https://cr.openjdk.java.net/~dtitov/8185005/webrev.02/
>>     >     >     >     > Bug: https://bugs.openjdk.java.net/browse/JDK-8185005
>>     >     >     >     >
>>     >     >     >     > Thanks!
>>     >     >     >     > --Daniil
>>     >     >     >     >
>>     >     >     >     > From:
>>     >     >     >     > Organization: Oracle Corporation
>>     >     >     >     > Date: Friday, June 28, 2019 at 7:56 PM
>>     >     >     >     > To: Daniil Titov , OpenJDK Serviceability , "hotspot-runtime-dev at openjdk.java.net" , "jmx-dev at openjdk.java.net"
>>     >     >     >     > Subject: Re: RFR: 8185005: Improve performance of ThreadMXBean.getThreadInfo(long ids[], int maxDepth)
>>     >     >     >     >
>>     >     >     >     > Hi Daniil,
>>     >     >     >     >
>>     >     >     >     > I have several quick comments.
>>     >     >     >     >
>>     >     >     >     > The indent in the hotspot c/c++ files has to be 2, not 4.
>>     >     >     >     >
>>     >     >     >     > https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/src/hotspot/share/runtime/threadSMR.cpp.frames.html
>>     >     >     >     > 614 JavaThread* ThreadsList::find_JavaThread_from_java_tid(jlong java_tid) const {
>>     >     >     >     > 615     JavaThread* java_thread = ThreadTable::find_thread(java_tid);
>>     >     >     >     > 616     if (java_thread == NULL && java_tid == PMIMORDIAL_JAVA_TID) {
>>     >     >     >     > 617         // ThreadsSMRSupport::add_thread() is not called for the primordial
>>     >     >     >     > 618         // thread. Thus, we find this thread with a linear search and add it
>>     >     >     >     > 619         // to the thread table.
>>     >     >     >     > 620         for (uint i = 0; i < length(); i++) {
>>     >     >     >     > 621             JavaThread* thread = thread_at(i);
>>     >     >     >     > 622             if (is_valid_java_thread(java_tid,thread)) {
>>     >     >     >     > 623                 ThreadTable::add_thread(java_tid, thread);
>>     >     >     >     > 624                 return thread;
>>     >     >     >     > 625             }
>>     >     >     >     > 626         }
>>     >     >     >     > 627     } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) {
>>     >     >     >     > 628         return java_thread;
>>     >     >     >     > 629     }
>>     >     >     >     > 630     return NULL;
>>     >     >     >     > 631 }
>>     >     >     >     > 632 bool ThreadsList::is_valid_java_thread(jlong java_tid, JavaThread* java_thread) {
>>     >     >     >     > 633     oop tobj = java_thread->threadObj();
>>     >     >     >     > 634     // Ignore the thread if it hasn't run yet, has exited
>>     >     >     >     > 635     // or is starting to exit.
>>     >     >     >     > 636     return (tobj != NULL && !java_thread->is_exiting() &&
>>     >     >     >     > 637         java_tid == java_lang_Thread::thread_id(tobj));
>>     >     >     >     > 638 }
>>     >     >     >     >
>>     >     >     >     > 615     JavaThread* java_thread = ThreadTable::find_thread(java_tid);
>>     >     >     >     >
>>     >     >     >     > I'd suggest to rename find_thread() to find_thread_by_tid().
>>     >     >     >     >
>>     >     >     >     > A space is missed after the comma:
>>     >     >     >     > 622 if (is_valid_java_thread(java_tid,thread)) {
>>     >     >     >     >
>>     >     >     >     > An empty line is needed before L632.
>>     >     >     >     >
>>     >     >     >     > The name 'is_valid_java_thread' looks wrong (or confusing) to me.
>>     >     >     >     > Something like 'is_alive_java_thread_with_tid()' would be better.
>>     >     >     >     > It'd better to list parameters in the opposite order.
>>     >     >     >     >
>>     >     >     >     > The call to is_valid_java_thread() is confusing:
>>     >     >     >     > 627 } else if (java_thread != NULL && is_valid_java_thread(java_tid, java_thread)) {
>>     >     >     >     >
>>     >     >     >     > Why would the call ThreadTable::find_thread(java_tid) return a JavaThread with an unmatched java_tid?
>>     >     >     >     >
>>     >     >     >     >
>>     >     >     >     > Thanks,
>>     >     >     >     > Serguei
>>     >     >     >     >
>>     >     >     >     > On 6/28/19, 9:40 PM, "David Holmes" wrote:
>>     >     >     >     >
>>     >     >     >     >     Hi Daniil,
>>     >     >     >     >
>>     >     >     >     >     The definition and use of this hashtable (yet another hashtable
>>     >     >     >     >     implementation!) will need careful examination. We have to be concerned
>>     >     >     >     >     about the cost of maintaining it when it may never even be queried. You
>>     >     >     >     >     would need to look at footprint cost and performance impact.
>>     >     >     >     >
>????? >????? > >> ???? >????? >????? >????? >????? Unfortunately I'm just about to >> board a plane and will be out for the >> ???? >????? >????? >????? >????? next few days. I will try to look at >> this asap next week, but we will >> ???? >????? >????? >????? >????? need a lot more data on it. >> ???? >????? >????? >????? > >> ???? >????? >????? >????? >????? Thanks, >> ???? >????? >????? >????? >????? David >> ???? >????? >????? >????? > >> ???? >????? >????? >????? > On 6/28/19 3:31 PM, Daniil Titov wrote: >> ???? >????? >????? >????? > Please review the change that improves >> performance of ThreadMXBean MXBean methods returning the >> ???? >????? >????? >????? > information for specific threads. The >> change introduces the thread table that uses ConcurrentHashTable >> ???? >????? >????? >????? > to store one-to-one the mapping between >> the thread ids and JavaThread objects and replaces the linear >> ???? >????? >????? >????? > search over the thread list in >> ThreadsList::find_JavaThread_from_java_tid(jlong tid) method with the >> lookup >> ???? >????? >????? >????? > in the thread table. >> ???? >????? >????? >????? > >> ???? >????? >????? >????? > Testing: Mach5 tier1,tier2 and tier3 >> tests successfully passed. >> ???? >????? >????? >????? > >> ???? >????? >????? >????? > Webrev: >> https://cr.openjdk.java.net/~dtitov/8185005/webrev.01/ >> ???? >????? >????? >????? > Bug: >> https://bugs.openjdk.java.net/browse/JDK-8185005 >> ???? >????? >????? >????? > >> ???? >????? >????? >????? > Thanks! >> ???? >????? >????? >????? > >> ???? >????? >????? >????? > Best regards, >> ???? >????? >????? >????? > Daniil >> ???? >????? >????? >????? > >> ???? >????? >????? >????? > >> ???? >????? >????? >????? > >> ???? >????? >????? >????? > >> ???? >????? >????? >????? > >> ???? >????? >????? >????? > >> ???? >????? >????? >????? > >> ???? >????? >????? > >> ???? >????? >????? > >> ???? >????? >????? > >> ???? >????? >????? > >> ???? >????? > >> ???? >????? 
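
Editorial sketch of the "optional ThreadTable" idea discussed in the quoted thread above: keep the table disabled by default, and populate it from the thread list the first time a lookup by tid is made. This is a minimal illustration, not the code from the webrev. `FakeThread`, `LazyThreadTable` and the use of `std::unordered_map` are assumptions made for the example; the actual proposal uses HotSpot's ConcurrentHashTable and JavaThread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for JavaThread; only the Java thread id matters here.
struct FakeThread {
  int64_t tid;
};

// Sketch of a tid -> thread table that stays empty until the first lookup.
class LazyThreadTable {
 public:
  // Called whenever a thread is added to the thread list.
  void add_thread(FakeThread* t) {
    assert(t != nullptr);
    std::lock_guard<std::mutex> guard(_lock);
    _threads.push_back(t);
    if (_enabled) {
      _table[t->tid] = t;  // maintain the table only once it is in use
    }
  }

  // Called whenever a thread is removed from the thread list.
  void remove_thread(FakeThread* t) {
    assert(t != nullptr);
    std::lock_guard<std::mutex> guard(_lock);
    for (std::size_t i = 0; i < _threads.size(); i++) {
      if (_threads[i] == t) {
        _threads.erase(_threads.begin() + i);
        break;
      }
    }
    if (_enabled) {
      _table.erase(t->tid);
    }
  }

  // The first call enables the table and populates it from the thread list.
  FakeThread* find_thread_by_tid(int64_t tid) {
    std::lock_guard<std::mutex> guard(_lock);
    if (!_enabled) {
      for (FakeThread* t : _threads) {
        _table[t->tid] = t;
      }
      _enabled = true;
    }
    auto it = _table.find(tid);
    return it == _table.end() ? nullptr : it->second;
  }

  bool is_enabled() const {
    std::lock_guard<std::mutex> guard(_lock);
    return _enabled;
  }

 private:
  mutable std::mutex _lock;
  bool _enabled = false;
  std::vector<FakeThread*> _threads;                // stand-in for the thread list
  std::unordered_map<int64_t, FakeThread*> _table;  // tid -> thread
};
```

With this shape, a VM that never queries threads by tid pays only for the `_enabled` check on add and remove, which is the footprint concern raised in the thread.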

From david.holmes at oracle.com Thu Aug 15 06:22:35 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 15 Aug 2019 16:22:35 +1000
Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor
Message-ID: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com>

Bug: https://bugs.openjdk.java.net/browse/JDK-8229160

Preliminary webrev (still has rough edges):
http://cr.openjdk.java.net/~dholmes/8229160/webrev.prelim/

Background:

We've had this comment for a long time:

// The raw monitor subsystem is entirely distinct from normal
// java-synchronization or jni-synchronization. raw monitors are not
// associated with objects. They can be implemented in any manner
// that makes sense. The original implementors decided to piggy-back
// the raw-monitor implementation on the existing Java objectMonitor mechanism.
// This flaw needs to fixed. We should reimplement raw monitors as sui-generis.
// Specifically, we should not implement raw monitors via java monitors.
// Time permitting, we should disentangle and deconvolve the two implementations
// and move the resulting raw monitor implementation over to the JVMTI directories.
// Ideally, the raw monitor implementation would be built on top of
// park-unpark and nothing else.

This is an attempt to do that disentangling so that we can then consider
changes to ObjectMonitor without having to worry about JvmtiRawMonitors.
But rather than building on low-level park/unpark (which would require
the same manual queue management and much of the same complex code as
exists in ObjectMonitor) I decided to try and do this on top of
PlatformMonitor.

The reason this is just a RFC rather than RFR is that I overlooked a
non-trivial aspect of JvmtiRawMonitors: like Java monitors (as
implemented by ObjectMonitor) they interact with the Thread.interrupt
mechanism.
This is not clearly stated in the JVM TI specification [1] but only
in passing by the possible errors for RawMonitorWait:

JVMTI_ERROR_INTERRUPT
    Wait was interrupted, try again

As I explain in the bug report there is no way to build in proper
interrupt support using PlatformMonitor as there is no way we can
"interrupt" the low-level pthread_cond_wait. But we can approximate it.
What I've done in this preliminary version is just check interrupt state
before and after the actual "wait" but we won't get woken by the
interrupt once we have actually blocked. Alternatively we could use a
periodic polling approach and wake up every N ms to check for interruption.

The only use of JvmtiRawMonitors in the JDK libraries (JDWP) is not
affected by this choice as that code ignores the interrupt until the
real action it was waiting for has occurred. The interrupt is then
reposted later.

But more generally there could be users of JvmtiRawMonitors that
expect/require that RawMonitorWait is responsive to Thread.interrupt in
a manner similar to Object.wait. And if any of them are reading this
then I'd like to know - hence this RFC :)

FYI testing to date:
- tiers 1-3 all platforms
- hotspot: serviceability/jvmti, serviceability/jdwp,
           vmTestbase/nsk/jvmti, vmTestbase/nsk/jdwp
- JDK: com/sun/jdi

Comments/opinions appreciated.
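
Editorial sketch of the two approximations described above: checking the interrupt state before and after the blocking wait, and the periodic polling alternative. It is not the code from the webrev. `InterruptState`, `checked_wait` and `polling_wait` are hypothetical names, the blocking primitive is passed in as a callable so the example stays self-contained, and clearing the flag when an interrupt is reported is an assumption.

```cpp
#include <atomic>
#include <cassert>

enum WaitResult { WAIT_OK, WAIT_INTERRUPTED };

// Hypothetical stand-in for a thread's interrupt status.
struct InterruptState {
  std::atomic<bool> interrupted{false};

  void interrupt() { interrupted.store(true); }

  // Returns the current status and clears it (assumed semantics).
  bool test_and_clear() { return interrupted.exchange(false); }
};

// Approximation 1: look at the interrupt status only before and after the
// blocking call. An interrupt that arrives while blocked does not wake us;
// it is noticed once block() returns for some other reason.
template <typename BlockFn>
WaitResult checked_wait(InterruptState& self, BlockFn block) {
  if (self.test_and_clear()) return WAIT_INTERRUPTED;
  block();  // e.g. a wrapper around the low-level condition wait
  if (self.test_and_clear()) return WAIT_INTERRUPTED;
  return WAIT_OK;
}

// Approximation 2: block in short slices and poll the interrupt status
// between slices. timed_block() waits for at most one slice and returns
// true if the monitor was notified during that slice.
template <typename TimedBlockFn>
WaitResult polling_wait(InterruptState& self, TimedBlockFn timed_block) {
  for (;;) {
    if (self.test_and_clear()) return WAIT_INTERRUPTED;
    if (timed_block()) return WAIT_OK;
  }
}
```

The first variant adds no wakeups but can leave an interrupted waiter blocked indefinitely; the second bounds the response time by the slice length at the cost of periodic wakeups.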
Thanks, David [1] https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait From nick.gasson at arm.com Thu Aug 15 06:54:45 2019 From: nick.gasson at arm.com (Nick Gasson) Date: Thu, 15 Aug 2019 14:54:45 +0800 Subject: RFR: 8229118: [TESTBUG] serviceability/sa/ClhsdbFindPC fails on AArch64 In-Reply-To: <4c9f5bb8-2a67-258e-9b65-acc91237055f@redhat.com> References: <4a646485-4230-9632-dfd4-4368d79ba4cd@arm.com> <39f8fd3f-fcce-0200-3922-33405978e297@redhat.com> <5071d6c9-73f8-4c7d-4ae8-3b3d1bcddf82@arm.com> <178290d1-7904-d59e-3f64-3f57a51ac071@arm.com> <04ae9472-05e5-91a9-b986-428806a7ee15@oracle.com> <4c9f5bb8-2a67-258e-9b65-acc91237055f@redhat.com> Message-ID: Thanks Andrew and Chris. Pushed here: https://hg.openjdk.java.net/jdk/jdk/rev/902cef494e66 Nick On 14/08/2019 16:10, Andrew Dinn wrote: > On 14/08/2019 03:28, Chris Plummer wrote: >> On 8/13/19 6:26 PM, Nick Gasson wrote: >>> Hi Chris, >>> >>>> The changes look good, although I think the new file should go in the >>>> serviceability/sa test directory, unless you think this is a generally >>>> useful class that might be used by tests outside of the sa. >>>> >>> >>> The new file is under test/hotspot/jtreg/serviceability/sa/ - the same >>> directory as ClhsdbFindPC.java - did you mean somewhere else? >>> >>> Thanks, >>> Nick >>> >> Oh, sorry. For some reason I thought it was in the lib directory with >> LingeredApp. Yes, it's good the way it is. > I'm still happy with this patch to go in after these changes. > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 
03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > From serguei.spitsyn at oracle.com Thu Aug 15 08:12:16 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 15 Aug 2019 01:12:16 -0700 Subject: PING: RFR: 8226204: SA: Refactoring for option processing in SALauncher In-Reply-To: <2d4bee10-768d-f54d-8bbb-dd70cb25abe1@gmail.com> References: <2d4bee10-768d-f54d-8bbb-dd70cb25abe1@gmail.com> Message-ID: <4a61d2ac-0a27-fc31-307d-4a818c4b4094@oracle.com> An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Thu Aug 15 08:25:36 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 15 Aug 2019 01:25:36 -0700 Subject: RFR: 8229378: jdwp library loader in linker_md.c quietly truncates on buffer overflow In-Reply-To: References: <9bab421c-08a1-b554-6ac9-35290856ee56@oracle.com> Message-ID: <4ea44805-4810-c6a4-b47c-b2ab7bba9d49@oracle.com> An HTML attachment was scrubbed... URL: From adam.farley at uk.ibm.com Thu Aug 15 11:38:34 2019 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Thu, 15 Aug 2019 12:38:34 +0100 Subject: RFR: 8229378: jdwp library loader in linker_md.c quietly truncates on buffer overflow In-Reply-To: <4ea44805-4810-c6a4-b47c-b2ab7bba9d49@oracle.com> References: <9bab421c-08a1-b554-6ac9-35290856ee56@oracle.com> <4ea44805-4810-c6a4-b47c-b2ab7bba9d49@oracle.com> Message-ID: Hi Serguei, Daniel, Good to hear you like the fix. My intention with the testing was to make sure my change didn't break anything else. I didn't do a code paths check before I ran it though; saturation run. As for writing a new test, I'm finding it tricky. Here's the current flow: Step 1) VM initialises. Step 2) VM loads a couple of libraries and shuts down if one or more paths is too long in sun.boot.library.path. Step 3) JDWP initializes Step 4) JDWP loads a library and shuts down if one or more paths is too long in sun.boot.library.path. 
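
Editorial sketch of the length check that Steps 2 and 4 rely on. The original report quoted later in this thread says the jdwp loader appears to compare the total length of sun.boot.library.path against the maximum path length instead of checking each entry. The code below shows a per-entry check only as an illustration; it is not the patch under review. `split_paths` and `first_too_long` are hypothetical helpers, and the separator, the file name and the limit are supplied by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits a search path such as "/a/lib:/b/lib" on the separator `sep`.
std::vector<std::string> split_paths(const std::string& paths, char sep) {
  std::vector<std::string> out;
  std::string::size_type start = 0;
  for (;;) {
    std::string::size_type end = paths.find(sep, start);
    if (end == std::string::npos) {
      out.push_back(paths.substr(start));
      break;
    }
    out.push_back(paths.substr(start, end - start));
    start = end + 1;
  }
  return out;
}

// Returns the index of the first entry that cannot hold
// "<entry>/<file_name>" plus the terminating NUL within max_len bytes,
// or -1 if every entry fits. Each entry is checked on its own, so the
// combined length of all entries is irrelevant.
int first_too_long(const std::string& paths, char sep,
                   const std::string& file_name, std::size_t max_len) {
  assert(max_len > 0);
  std::vector<std::string> entries = split_paths(paths, sep);
  for (std::size_t i = 0; i < entries.size(); i++) {
    if (entries[i].size() + 1 + file_name.size() + 1 > max_len) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

For the case mentioned in the report, two 200-character entries checked against a 260-byte limit both pass, because neither entry alone exceeds the limit.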
As you can see, Step 2 prevents us from reaching Step 4 with a
too-long-path (required to cause failure). I worked around that with my
webrev by disabling the bit in os.cpp that enacts Step 2.

Since my hack will be removed in the final webrev, we need another way
to reach step 4.

So what we need to test this change, I believe, is a way to insert

Step 2.5) Change the property to include a too-long path.

This allows the VM to start up properly, but gives us the excessive path
we need to test the jdwp fix.

Right now, I'm not seeing a way to do this outside of using the JNI.

1) shell script launches cpp file.
2) cpp starts vm without jdwp.
3) change the property.
4) call jdwp library-loading method directly.
5) check the return code.

This seems messy, but I'm not seeing a way to initialise jdwp from
inside java code (which sounds better to me).

I welcome anyone who can think of a better way to do this.

Best Regards

Adam Farley
IBM Runtimes

"serguei.spitsyn at oracle.com" wrote on 15/08/2019 09:25:36:

> From: "serguei.spitsyn at oracle.com"
> To: Adam Farley8
> Cc: Chris Plummer,
> daniel.daugherty at oracle.com, serviceability-dev at openjdk.java.net
> Date: 15/08/2019 09:25
> Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c
> quietly truncates on buffer overflow
>
> Hi Adam,
>
> The fix itself looks Okay to me.
> I'm not sure there is any test case in these test suites which
> provide a coverage for it.
> It looks like you need to develop a unit jtreg unit test for this.
> > Thanks, > Serguei > > > On 8/13/19 09:28, Adam Farley8 wrote: > Hi Serguei, Daniel, > > My testing was limited to the bug specific test case I mentioned, > and the following jdwp tests: > > test/jdk/com/sun/jdi/Jdwp* > test/hotspot/jtreg/serviceability/jdwp > > Best Regards > > Adam Farley > IBM Runtimes > > > "serguei.spitsyn at oracle.com" wrote on > 13/08/2019 17:04:43: > > > From: "serguei.spitsyn at oracle.com" > > To: daniel.daugherty at oracle.com, Adam Farley8 > > , Chris Plummer > > Cc: serviceability-dev at openjdk.java.net > > Date: 13/08/2019 17:08 > > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c > > quietly truncates on buffer overflow > > > > Hi Adam, > > > > I'm looking at your fix. > > Also interested about your testing. > > > > Thanks, > > Serguei > > > > On 8/13/19 08:48, Daniel D. Daugherty wrote: > > I don't see any information about how this change was tested... > > Is there something on another email thread? > > > > Dan > > > > > On 8/13/19 11:41 AM, Adam Farley8 wrote: > > Hi Chris, > > > > Thanks! > > > > I understand we need a second reviewer/sponsor to get this change > > in. Any volunteers? > > > > Best Regards > > > > Adam Farley > > IBM Runtimes > > > > > > Chris Plummer wrote on 12/08/2019 21:35:06: > > > > > From: Chris Plummer > > > To: Adam Farley8 , serviceability- > > dev at openjdk.java.net > > > Date: 12/08/2019 21:35 > > > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c > > > quietly truncates on buffer overflow > > > > > > Hi Adam, > > > > > > It looks good to me. > > > > > > thanks, > > > > > > Chris > > > > > > On 8/12/19 7:34 AM, Adam Farley8 wrote: > > > Hi All, > > > > > > This is a known bug, mentioned in a code comment. > > > > > > Here is the fix for that bug. > > > > > > Reviewers and sponsors requested. 
> > > > > > Short version: if you set sun.boot.library.path to > > > something beyond a system's max path length, the > > > current code will return an empty string (rather than > > > printing a useful error message and shutting down). > > > > > > This is also a problem if you've specified multiple > > > paths with a separator, as this code seems to wrongly > > > assess whether the *total* length exceeds max path > > > length. So two 200 char paths on windows will cause > > > failure, as the total length is 400 (which is beyond > > > max length for windows). > > > > > > Note that the os.cpp bit of the webrev will not be included > > > in the final webrev, it just makes this change trivially > > > testable. > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8229378 > > > Webrev: http://cr.openjdk.java.net/~afarley/8229378/webrev/ > > > > > > > > > Best Regards > > > > > > Adam Farley > > > IBM Runtimes > > > > > > Unless stated otherwise above: > > > IBM United Kingdom Limited - Registered in England and Wales with > > > number 741598. > > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > Unless stated otherwise above: > > IBM United Kingdom Limited - Registered in England and Wales with > > number 741598. > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with > number 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stefan.karlsson at oracle.com Thu Aug 15 11:46:30 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 15 Aug 2019 13:46:30 +0200 Subject: RFR: 8229258: Rework markOop and markOopDesc into a simpler mark word value carrier In-Reply-To: References: <72567e47-3b52-8e30-ea6f-9a64fa043c07@redhat.com> <32ccb4ba-0500-2bf4-3761-63a6285f94dc@oracle.com> <5913EBBD-2A13-43A5-9F37-D36E97397365@oracle.com> Message-ID: <1670ce59-e7b5-0023-b1fd-f9f7c19a0eda@oracle.com> Thanks Kim, Roman, Dan and Coleen for reviews and feedback. I rebased the patch, fixed more alignments, renamed the bug, and rerun the test through tier1-3. https://cr.openjdk.java.net/~stefank/8229258/webrev.valueMarkWord.03.delta/ https://cr.openjdk.java.net/~stefank/8229258/webrev.valueMarkWord.03/ Could I get reviews for this version? I'd also like to ask others to at least partially look at this: 1) Platform maintainers probably want to run this patch through their build system. 2) SA maintainers (CC:ed serviceability-dev) 3) JVMCI maintainers Thanks, StefanK On 2019-08-14 11:11, Roman Kennke wrote: > > Am 14.08.19 um 01:26 schrieb Kim Barrett: >>> On Aug 12, 2019, at 12:19 PM, Stefan Karlsson wrote: >>> >>> Hi Roman, >>> >>> Kim helped me figuring out how to get past the volatile issues I had with the class markWord { uintptr_t value; ... } version. So, I've created a version with that: >>> >>> https://cr.openjdk.java.net/~stefank/8229258/webrev.valueMarkWord.01/ >>> >>> I can go with either approach, so let me now what you all think. >> I've finally had time to look at the first proposed change. >> >> Comparing the first approach (an AllStatic MarkWord class and markWord >> typedef'd to uintptr_t) vs the second approach (markWord is a thin >> class wrapping around uintptr_t), I prefer the second. >> >> * I think the markWord class provides better type safety. 
It still >> involves too many casts sprinkled over the code base, but I think it >> also provides a better basis for further cast reduction and >> prevention. >> >> * I think having one markWord class for the data and behavior is >> better / more natural than having a markWord typedef for the data and >> a MarkWord AllStatic class for the behaviour. >> >> * I like that the markWord class eliminates the markWord vs MarkWord >> homonyms, which I think will be annoying. >> >> * The markWord class is a trivially copyable class, allowing it to be >> efficiently passed around by value, so no disadvantage there. >> >> I haven't found anything that I think argues for the first over the >> second. Other folks might have different priorities or taste. I think >> either is better than the status quo. >> >> I'm still reviewing webrev.valueMarkWord.02, but so far haven't found >> anything that makes me want to suggest backing off from that direction. >> >> Note that the bug summary doesn't describe the second approach. > +1 :-) > > Roman > From yasuenag at gmail.com Thu Aug 15 14:14:02 2019 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Thu, 15 Aug 2019 23:14:02 +0900 Subject: PING: RFR: 8226204: SA: Refactoring for option processing in SALauncher In-Reply-To: <4a61d2ac-0a27-fc31-307d-4a818c4b4094@oracle.com> References: <2d4bee10-768d-f54d-8bbb-dd70cb25abe1@gmail.com> <4a61d2ac-0a27-fc31-307d-4a818c4b4094@oracle.com> Message-ID: Hi Serguei, I added the explanation as a comment in parseOptions what is longOptsMap, and how parseOptions() work in new webrev. http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.01/ I'm not good at English, so comments are welcome :) Thanks, Yasumasa On 2019/08/15 17:12, serguei.spitsyn at oracle.com wrote: > Hi Yasumasa, > > In fact, I have a problem to understand what the parseOptions() method is doing. > Could you add necessary comments explaining what is done in the loop? 
> I'm sure I'll be not alone in having trouble to read this code. > Also, it is not clear the approach with the longOptsMap's. > Why do you need to map "exe=" to "exe" but "mixed" to "-m" and"clstats" to "-clstats"? > It is better to be explained in the parseOptions() method as well. > > Thanks, > Serguei > > > On 8/10/19 04:14, Yasumasa Suenaga wrote: >> PING: Could you review it? >> >>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8226204 >>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.00/ >> >> >> Yasumasa >> >> >> On 2019/07/24 10:18, Yasumasa Suenaga wrote: >>> Hi all, >>> >>> Please review this change: >>> >>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8226204 >>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.00/ >>> >>> This enhancement has been proposed in [1]. >>> >>> SALauncher (jhsdb implementation) processes the option for each subcommand (e.g. jstack, hsdb). >>> But they exist in many place with similar code. >>> So there is some room for refactoring. >>> >>> This change has passed the tests on submit repo and serviceability/sa tests. >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> [1] https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-June/028376.html > From rkennke at redhat.com Thu Aug 15 17:06:56 2019 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 15 Aug 2019 19:06:56 +0200 Subject: RFR: 8229258: Rework markOop and markOopDesc into a simpler mark word value carrier In-Reply-To: <1670ce59-e7b5-0023-b1fd-f9f7c19a0eda@oracle.com> References: <72567e47-3b52-8e30-ea6f-9a64fa043c07@redhat.com> <32ccb4ba-0500-2bf4-3761-63a6285f94dc@oracle.com> <5913EBBD-2A13-43A5-9F37-D36E97397365@oracle.com> <1670ce59-e7b5-0023-b1fd-f9f7c19a0eda@oracle.com> Message-ID: <86e9cb09-d822-7d9a-2f4c-b19c6c5aebe0@redhat.com> Hi Stefan, I looked over the changes again. I like this much better, a huge improvement over current state, and also better than your first proposal. 
I also prefer the explicit value() calls. I also built+tested Shenandoah GC again, seems all fine. Didn't know that C++ has an 'explicit' specifier. Oh man. Still seems to have foobared alignment (it was partly kaputted before already): src/hotspot/share/oops/oopsHierarchy.hpp Out of curiosity, what's with the changes in objectMonitor.inline.hpp to access the markWord atomically?: -inline markOop ObjectMonitor::header() const { - return _header; +inline markWord ObjectMonitor::header() const { + return Atomic::load(&_header); } I guess this is good (equal or stronger than before) but is there a rationale behind these changes? I say ship it! Thanks, Roman > Thanks Kim, Roman, Dan and Coleen for reviews and feedback. > > I rebased the patch, fixed more alignments, renamed the bug, and rerun > the test through tier1-3. > > https://cr.openjdk.java.net/~stefank/8229258/webrev.valueMarkWord.03.delta/ > https://cr.openjdk.java.net/~stefank/8229258/webrev.valueMarkWord.03/ > > Could I get reviews for this version? I'd also like to ask others to at > least partially look at this: > > 1) Platform maintainers probably want to run this patch through their > build system. > 2) SA maintainers (CC:ed serviceability-dev) > 3) JVMCI maintainers > > Thanks, > StefanK > > On 2019-08-14 11:11, Roman Kennke wrote: >> >> Am 14.08.19 um 01:26 schrieb Kim Barrett: >>>> On Aug 12, 2019, at 12:19 PM, Stefan Karlsson >>>> wrote: >>>> >>>> Hi Roman, >>>> >>>> Kim helped me figuring out how to get past the volatile issues I had >>>> with the class markWord { uintptr_t value; ... } version. So, I've >>>> created a version with that: >>>> >>>> https://cr.openjdk.java.net/~stefank/8229258/webrev.valueMarkWord.01/ >>>> >>>> I can go with either approach, so let me now what you all think. >>> I've finally had time to look at the first proposed change. 
>>> >>> Comparing the first approach (an AllStatic MarkWord class and markWord >>> typedef'd to uintptr_t) vs the second approach (markWord is a thin >>> class wrapping around uintptr_t), I prefer the second. >>> >>> * I think the markWord class provides better type safety. It still >>> involves too many casts sprinkled over the code base, but I think it >>> also provides a better basis for further cast reduction and >>> prevention. >>> >>> * I think having one markWord class for the data and behavior is >>> better / more natural than having a markWord typedef for the data and >>> a MarkWord AllStatic class for the behaviour. >>> >>> * I like that the markWord class eliminates the markWord vs MarkWord >>> homonyms, which I think will be annoying. >>> >>> * The markWord class is a trivially copyable class, allowing it to be >>> efficiently passed around by value, so no disadvantage there. >>> >>> I haven't found anything that I think argues for the first over the >>> second. Other folks might have different priorities or taste. I think >>> either is better than the status quo. >>> >>> I'm still reviewing webrev.valueMarkWord.02, but so far haven't found >>> anything that makes me want to suggest backing off from that direction. >>> >>> Note that the bug summary doesn't describe the second approach. >> +1 :-) >> >> Roman >> > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From stefan.karlsson at oracle.com Thu Aug 15 19:26:34 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 15 Aug 2019 21:26:34 +0200 Subject: RFR: 8229258: Rework markOop and markOopDesc into a simpler mark word value carrier In-Reply-To: <86e9cb09-d822-7d9a-2f4c-b19c6c5aebe0@redhat.com> References: <72567e47-3b52-8e30-ea6f-9a64fa043c07@redhat.com> <32ccb4ba-0500-2bf4-3761-63a6285f94dc@oracle.com> <5913EBBD-2A13-43A5-9F37-D36E97397365@oracle.com> <1670ce59-e7b5-0023-b1fd-f9f7c19a0eda@oracle.com> <86e9cb09-d822-7d9a-2f4c-b19c6c5aebe0@redhat.com> Message-ID: <6d432589-afc6-6f70-f6b7-3837e5b2ae39@oracle.com> Hi Roman, On 2019-08-15 19:06, Roman Kennke wrote: > Hi Stefan, > > I looked over the changes again. I like this much better, a huge > improvement over current state, and also better than your first > proposal. I also prefer the explicit value() calls. Great! > > I also built+tested Shenandoah GC again, seems all fine. > > Didn't know that C++ has an 'explicit' specifier. Oh man. > > Still seems to have foobared alignment (it was partly kaputted before > already): > src/hotspot/share/oops/oopsHierarchy.hpp You're right. I removed one stray whitespace: $ hg diff diff --git a/src/hotspot/share/oops/oopsHierarchy.hpp b/src/hotspot/share/oops/oopsHierarchy.hpp --- a/src/hotspot/share/oops/oopsHierarchy.hpp +++ b/src/hotspot/share/oops/oopsHierarchy.hpp @@ -46,7 +46,7 @@ ?typedef class?? instanceOopDesc*??????????? instanceOop; ?typedef class?? arrayOopDesc*?????????????? arrayOop; ?typedef class???? objArrayOopDesc*??????????? objArrayOop; -typedef class???? typeArrayOopDesc*??????????? typeArrayOop; +typedef class???? typeArrayOopDesc*?????????? typeArrayOop; ?#else I think the other indentation is done on purpose: typedef class oopDesc*??????????????????? oop; typedef class?? instanceOopDesc*??????????? instanceOop; typedef class?? 
arrayOopDesc*?????????????? arrayOop; typedef class???? objArrayOopDesc*??????????? objArrayOop; typedef class???? typeArrayOopDesc*?????????? typeArrayOop; to show the oops hierarchy. > > Out of curiosity, what's with the changes in objectMonitor.inline.hpp to > access the markWord atomically?: > > -inline markOop ObjectMonitor::header() const { > - return _header; > +inline markWord ObjectMonitor::header() const { > + return Atomic::load(&_header); > } > > I guess this is good (equal or stronger than before) but is there a > rationale behind these changes? Ahh. Right. That was done to solve the problems I were having with volatiles. For example: src/hotspot/share/runtime/objectMonitor.inline.hpp:38:10: error: binding reference of type 'const markWord&' to 'const volatile markWord' discards qualifiers ?? return _header; and: src/hotspot/share/runtime/basicLock.hpp:40:74: error: implicit dereference will not access object of type ?volatile markWord? in statement [-Werror] ? void???????? set_displaced_header(markWord header) { _displaced_header = header; } Kim suggested that the fact that these fields were volatile was an indication that we should be doing some kind of atomic/ordered operation. By replacing these loads and stores with calls to the Atomic APIs, and providing the PrimitiveConversions::Translate specialization, we could solve that problem. > > I say ship it! Thanks a lot for reviewing this! StefanK > > Thanks, > Roman > > >> Thanks Kim, Roman, Dan and Coleen for reviews and feedback. >> >> I rebased the patch, fixed more alignments, renamed the bug, and rerun >> the test through tier1-3. >> >> https://cr.openjdk.java.net/~stefank/8229258/webrev.valueMarkWord.03.delta/ >> https://cr.openjdk.java.net/~stefank/8229258/webrev.valueMarkWord.03/ >> >> Could I get reviews for this version? I'd also like to ask others to at >> least partially look at this: >> >> 1) Platform maintainers probably want to run this patch through their >> build system. 
>> 2) SA maintainers (CC:ed serviceability-dev) >> 3) JVMCI maintainers >> >> Thanks, >> StefanK >> >> On 2019-08-14 11:11, Roman Kennke wrote: >>> Am 14.08.19 um 01:26 schrieb Kim Barrett: >>>>> On Aug 12, 2019, at 12:19 PM, Stefan Karlsson >>>>> wrote: >>>>> >>>>> Hi Roman, >>>>> >>>>> Kim helped me figuring out how to get past the volatile issues I had >>>>> with the class markWord { uintptr_t value; ... } version. So, I've >>>>> created a version with that: >>>>> >>>>> https://cr.openjdk.java.net/~stefank/8229258/webrev.valueMarkWord.01/ >>>>> >>>>> I can go with either approach, so let me now what you all think. >>>> I've finally had time to look at the first proposed change. >>>> >>>> Comparing the first approach (an AllStatic MarkWord class and markWord >>>> typedef'd to uintptr_t) vs the second approach (markWord is a thin >>>> class wrapping around uintptr_t), I prefer the second. >>>> >>>> * I think the markWord class provides better type safety. It still >>>> involves too many casts sprinkled over the code base, but I think it >>>> also provides a better basis for further cast reduction and >>>> prevention. >>>> >>>> * I think having one markWord class for the data and behavior is >>>> better / more natural than having a markWord typedef for the data and >>>> a MarkWord AllStatic class for the behaviour. >>>> >>>> * I like that the markWord class eliminates the markWord vs MarkWord >>>> homonyms, which I think will be annoying. >>>> >>>> * The markWord class is a trivially copyable class, allowing it to be >>>> efficiently passed around by value, so no disadvantage there. >>>> >>>> I haven't found anything that I think argues for the first over the >>>> second. Other folks might have different priorities or taste. I think >>>> either is better than the status quo. >>>> >>>> I'm still reviewing webrev.valueMarkWord.02, but so far haven't found >>>> anything that makes me want to suggest backing off from that direction. 
>>>>
>>>> Note that the bug summary doesn't describe the second approach.
>>> +1 :-)
>>>
>>> Roman
>>>

From kim.barrett at oracle.com Thu Aug 15 22:59:57 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 15 Aug 2019 18:59:57 -0400
Subject: RFR: 8229258: Rework markOop and markOopDesc into a simpler mark word value carrier
In-Reply-To: <1670ce59-e7b5-0023-b1fd-f9f7c19a0eda@oracle.com>
References: <72567e47-3b52-8e30-ea6f-9a64fa043c07@redhat.com> <32ccb4ba-0500-2bf4-3761-63a6285f94dc@oracle.com> <5913EBBD-2A13-43A5-9F37-D36E97397365@oracle.com> <1670ce59-e7b5-0023-b1fd-f9f7c19a0eda@oracle.com>
Message-ID: <00FC5053-D5B8-4DD8-96A3-95E12ED7C18C@oracle.com>

> On Aug 15, 2019, at 7:46 AM, Stefan Karlsson wrote:
>
> Thanks Kim, Roman, Dan and Coleen for reviews and feedback.
>
> I rebased the patch, fixed more alignments, renamed the bug, and rerun the test through tier1-3.
>
> https://cr.openjdk.java.net/~stefank/8229258/webrev.valueMarkWord.03.delta/
> https://cr.openjdk.java.net/~stefank/8229258/webrev.valueMarkWord.03/
>
> Could I get reviews for this version? I'd also like to ask others to at least partially look at this:

Here's my comments through webrev.valueMarkWord.03.

Looks good. There are a couple of small fixes for which I don't need
a new webrev, which are listed first below. Then there are some
broader items which could be addressed in followup improvements.

------------------------------------------------------------------------------
src/hotspot/share/oops/markOop.hpp
 109   template<typename T> operator T();

My mistake in the earlier review comment. Function should be const
qualified, e.g. that should be

  template<typename T> operator T() const;

------------------------------------------------------------------------------
src/hotspot/share/runtime/biasedLocking.cpp
 695   prototype.bias_epoch() == mark.bias_epoch())) {

I think one more leading space needs to be deleted to get proper
alignment here. Or reformat this long and complex if control
expression.
------------------------------------------------------------------------------
src/hotspot/share/runtime/vframe.cpp
 244   // FIXME: mark is set but not used below.
 245   // Either the comment or the code is broken.

Is there a bug for this?

------------------------------------------------------------------------------

The remainder seem like they could be followup improvements.

------------------------------------------------------------------------------
src/hotspot/share/oops/markOop.hpp
 138   static const uintptr_t zero = 0;

All occurrences of this are either
(1) markWord member initializer
(2) markWord variable initializer
(3) temporary markWord initializer

There don't appear to be any bare uses otherwise. I think nicer is

  static markWord zero() { return markWord(0); }

(For C++11: `static constexpr markWord zero = markWord(0);`)

This seems like it could be a followup improvement.

------------------------------------------------------------------------------

[This is also related to Coleen's comments about to_pointer.]

Looking at the changes, in order to reduce the amount of casting pixie
dust sprinkled on our code base, I now think markWord::to_pointer
should have a template overload, with the template parameter
designating the return type. And I think the return type for the
non-template should be const qualified, and the template should handle
cv qualifiers. Like so (I think, I haven't actually tested this)

  const void* to_pointer() const {
    return reinterpret_cast<const void*>(_value);
  }

  template<typename T>
  T* to_pointer() const {
    typedef typename RemoveCV<T>::type TT;
    return reinterpret_cast<TT*>(_value);
  }

If one wants a void* then use m.to_pointer<void>().

(I almost want to call it "pointer_to" so it reads "pointer to T".)

Coleen and I talked about a possible alternative to the template
overload. Perhaps there is a small and fixed number of pointer types
we want to support conversion to, in which case we could have a small
number of type-specific to_xxx_pointer variants?
But I'm not sure the number is actually small and fixed. This seems like it could be a followup improvement. ------------------------------------------------------------------------------ src/hotspot/share/interpreter/bytecodeInterpreter.cpp In the run() function, there are a lot of casts of markWord::value() results or constants from markWord that I think could be removed. Some of them might want C++11 explicitly typed enums though. This seems like it could be a followup improvement. ------------------------------------------------------------------------------ From serguei.spitsyn at oracle.com Thu Aug 15 23:03:52 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 15 Aug 2019 16:03:52 -0700 Subject: PING: RFR: 8226204: SA: Refactoring for option processing in SALauncher In-Reply-To: References: <2d4bee10-768d-f54d-8bbb-dd70cb25abe1@gmail.com> <4a61d2ac-0a27-fc31-307d-4a818c4b4094@oracle.com> Message-ID: Hi Yasumasa, Thank you for the update! A couple of suggestions: 213 * This method converts jhsdb-style options (oldArgs) to oldfashioned ? Replace: "oldfashioned " => "old fashioned". ? There are several occurrences of it. 214 * style. SALauncher delegates the work to the entry point of each tools. ? Replace: "each tools" => "each tool" 225 * You also can set the options which cannot be map to oldfashioned 226 * arguments. For example, `jhsdb jmap --binaryheap` cannot be map to ? Replace: "cannot be map" => "cannot be mapped" 231 * This method returns the map which the key is oldfashioned option, 232 * the value is its value. ? I'd suggest to say: ?? * This method returns the map of the old fashioned key/val pairs. This loop still needs to be commented: 242 while ((s = sg.next(null, longOpts)) != null) { 243 var val = longOptsMap.get(s); 244 if (val != null) { // What is done here and why? 
245 newArgMap.put(val, null); 246 } else { 247 val = longOptsMap.get(s + "="); // Why the "=" is added 248 if (val != null) { // What is done here and why? 249 newArgMap.put(val, sg.getOptarg()); 250 } // Why there is no else statement, do we just skip the option? ?251 } 252 } ?Such comments will give more context and make this code more readable. Thanks, Serguei On 8/15/19 7:14 AM, Yasumasa Suenaga wrote: > Hi Serguei, > > I added the explanation as a comment in parseOptions what is longOptsMap, > and how parseOptions() work in new webrev. > > ? http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.01/ > > I'm not good at English, so comments are welcome :) > > > Thanks, > > Yasumasa > > > On 2019/08/15 17:12, serguei.spitsyn at oracle.com wrote: >> Hi Yasumasa, >> >> In fact, I have a problem to understand what the parseOptions() >> method is doing. >> Could you add necessary comments explaining what is done in the loop? >> I'm sure I'll be not alone in having trouble to read this code. >> Also, it is not clear the approach with the longOptsMap's. >> Why do you need to map "exe=" to "exe" but "mixed" to "-m" >> and"clstats" to "-clstats"? >> It is better to be explained in the parseOptions() method as well. >> >> Thanks, >> Serguei >> >> >> On 8/10/19 04:14, Yasumasa Suenaga wrote: >>> PING: Could you review it? >>> >>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8226204 >>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.00/ >>> >>> >>> Yasumasa >>> >>> >>> On 2019/07/24 10:18, Yasumasa Suenaga wrote: >>>> Hi all, >>>> >>>> Please review this change: >>>> >>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8226204 >>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.00/ >>>> >>>> This enhancement has been proposed in [1]. >>>> >>>> SALauncher (jhsdb implementation) processes the option for each >>>> subcommand (e.g. jstack, hsdb). >>>> But they exist in many place with similar code. 
>>>> So there is some room for refactoring. >>>> >>>> This change has passed the tests on submit repo and >>>> serviceability/sa tests. >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> [1] >>>> https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-June/028376.html >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From yasuenag at gmail.com Fri Aug 16 01:05:53 2019 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Fri, 16 Aug 2019 10:05:53 +0900 Subject: PING: RFR: 8226204: SA: Refactoring for option processing in SALauncher In-Reply-To: References: <2d4bee10-768d-f54d-8bbb-dd70cb25abe1@gmail.com> <4a61d2ac-0a27-fc31-307d-4a818c4b4094@oracle.com> Message-ID: <7fca26c1-8125-f2fd-d5da-090ba876f86d@gmail.com> Hi Serguei, Thank you for the comment. I fixed / added the comment in new webrev. Could you check again? http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.02/ Yasumasa On 2019/08/16 8:03, serguei.spitsyn at oracle.com wrote: > Hi Yasumasa, > > Thank you for the update! > A couple of suggestions: > > 213 * This method converts jhsdb-style options (oldArgs) to oldfashioned > > ? Replace: "oldfashioned " => "old fashioned". > ? There are several occurrences of it. > > 214 * style. SALauncher delegates the work to the entry point of each tools. > > ? Replace: "each tools" => "each tool" > > 225 * You also can set the options which cannot be map to oldfashioned > 226 * arguments. For example, `jhsdb jmap --binaryheap` cannot be map to > > ? Replace: "cannot be map" => "cannot be mapped" > > 231 * This method returns the map which the key is oldfashioned option, > 232 * the value is its value. > > ? I'd suggest to say: > ?? * This method returns the map of the old fashioned key/val pairs. > > > This loop still needs to be commented: > > 242 while ((s = sg.next(null, longOpts)) != null) { > 243 var val = longOptsMap.get(s); > 244 if (val != null) { > // What is done here and why? 
> 245 newArgMap.put(val, null); > 246 } else { > 247 val = longOptsMap.get(s + "="); // Why the "=" is added > 248 if (val != null) { > // What is done here and why? > 249 newArgMap.put(val, sg.getOptarg()); > 250 } > // Why there is no else statement, do we just skip the option? > > ?251 } > 252 } > > ?Such comments will give more context and make this code more readable. > > Thanks, > Serguei > > > On 8/15/19 7:14 AM, Yasumasa Suenaga wrote: >> Hi Serguei, >> >> I added the explanation as a comment in parseOptions what is longOptsMap, >> and how parseOptions() work in new webrev. >> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.01/ >> >> I'm not good at English, so comments are welcome :) >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2019/08/15 17:12, serguei.spitsyn at oracle.com wrote: >>> Hi Yasumasa, >>> >>> In fact, I have a problem to understand what the parseOptions() method is doing. >>> Could you add necessary comments explaining what is done in the loop? >>> I'm sure I'll be not alone in having trouble to read this code. >>> Also, it is not clear the approach with the longOptsMap's. >>> Why do you need to map "exe=" to "exe" but "mixed" to "-m" and"clstats" to "-clstats"? >>> It is better to be explained in the parseOptions() method as well. >>> >>> Thanks, >>> Serguei >>> >>> >>> On 8/10/19 04:14, Yasumasa Suenaga wrote: >>>> PING: Could you review it? >>>> >>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8226204 >>>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.00/ >>>> >>>> >>>> Yasumasa >>>> >>>> >>>> On 2019/07/24 10:18, Yasumasa Suenaga wrote: >>>>> Hi all, >>>>> >>>>> Please review this change: >>>>> >>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8226204 >>>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.00/ >>>>> >>>>> This enhancement has been proposed in [1]. >>>>> >>>>> SALauncher (jhsdb implementation) processes the option for each subcommand (e.g. jstack, hsdb). 
>>>>> But they exist in many place with similar code. >>>>> So there is some room for refactoring. >>>>> >>>>> This change has passed the tests on submit repo and serviceability/sa tests. >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> [1] https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-June/028376.html >>> > From stefan.karlsson at oracle.com Fri Aug 16 08:20:40 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 16 Aug 2019 10:20:40 +0200 Subject: RFR: 8229258: Rework markOop and markOopDesc into a simpler mark word value carrier In-Reply-To: <00FC5053-D5B8-4DD8-96A3-95E12ED7C18C@oracle.com> References: <72567e47-3b52-8e30-ea6f-9a64fa043c07@redhat.com> <32ccb4ba-0500-2bf4-3761-63a6285f94dc@oracle.com> <5913EBBD-2A13-43A5-9F37-D36E97397365@oracle.com> <1670ce59-e7b5-0023-b1fd-f9f7c19a0eda@oracle.com> <00FC5053-D5B8-4DD8-96A3-95E12ED7C18C@oracle.com> Message-ID: <904604bc-a718-8d44-8e1c-629c8feae840@oracle.com> On 2019-08-16 00:59, Kim Barrett wrote: >> On Aug 15, 2019, at 7:46 AM, Stefan Karlsson wrote: >> >> Thanks Kim, Roman, Dan and Coleen for reviews and feedback. >> >> I rebased the patch, fixed more alignments, renamed the bug, and rerun the test through tier1-3. >> >> https://cr.openjdk.java.net/~stefank/8229258/webrev.valueMarkWord.03.delta/ >> https://cr.openjdk.java.net/~stefank/8229258/webrev.valueMarkWord.03/ >> >> Could I get reviews for this version? I'd also like to ask others to at least partially look at this: > > Here's my comments through webrev.valueMarkWord.03. > > Looks good. There are a couple of small fixes for which I don't need > a new webrev, which are listed first below. Then there are some > broader items which could be addressed in followup improvements. > > ------------------------------------------------------------------------------ > src/hotspot/share/oops/markOop.hpp > 109 template operator T(); > > My mistake in the earlier review comment. 
Function should be const
> qualified, e.g. that should be
>
>   template<typename T> operator T() const;
>

I added this after one of our earlier discussions. However, I don't
think we need it (const or not). We already get sensible compiler errors
without this function when we try to cast markWords to something else:

  void* p0 = m;
  void* p1 = (void*)m;
  int i0 = m;
  int i1 = (int)m;

error: cannot convert 'markWord' to 'void*' in initialization
   void* p0 = m;
              ^
error: invalid cast from type 'markWord' to type 'void*'
   void* p1 = (void*)m;
                     ^
error: cannot convert 'markWord' to 'int' in initialization
   int i0 = m;
            ^
error: invalid cast from type 'markWord' to type 'int'
   int i1 = (int)m;

The poisoned constructor seems to be unnecessary as well, now that we
have simplified markWord. Without it, I get appropriate error messages
when I try to create a markWord from a pointer:

error: invalid conversion from 'void*' to 'uintptr_t' {aka 'long unsigned int'} [-fpermissive]
   markWord p((void*)0x111);
              ^~~~~~~~~~~~
note: initializing argument 1 of 'markWord::markWord(uintptr_t)'
   explicit markWord(uintptr_t value) : _value(value) { }

I've removed both of these.

> ------------------------------------------------------------------------------
> src/hotspot/share/runtime/biasedLocking.cpp
>  695   prototype.bias_epoch() == mark.bias_epoch())) {
>
> I think one more leading space needs to be deleted to get proper
> alignment here. Or reformat this long and complex if control
> expression.
>

OK. I followed the pre-existing alignment, but I can change it anyway.

> ------------------------------------------------------------------------------
> src/hotspot/share/runtime/vframe.cpp
>  244   // FIXME: mark is set but not used below.
>  245   // Either the comment or the code is broken.
>
> Is there a bug for this?
>

Created JDK-8229808.

> ------------------------------------------------------------------------------
>
>
> The remainder seem like they could be followup improvements.
> > ------------------------------------------------------------------------------ > src/hotspot/share/oops/markOop.hpp > 138 static const uintptr_t zero = 0; > > All occurrences of this are either > (1) markWord member initializater > (2) markWord variable initializer > (3) temporary markWord initializer > > There don't appear to be any bare uses otherwise. I think nicer is > > static markWord zero() { return markWord(0); } > > (For C++11: `static constexpr markWord zero = markWord(0);`) > > This seems like it could be a followup improvement. > I had the same thought and then backed away from it, but I can't remember why. This is a small enough change, so I've gone through the few occurrences and cleaned it up. I'll leave the rest of the comments below for follow-up RFEs. This is the last few cleanups: http://cr.openjdk.java.net/~stefank/8229258/webrev.valueMarkWord.04.delta/ http://cr.openjdk.java.net/~stefank/8229258/webrev.valueMarkWord.04/ I ran extended testing on .03 (tier1-7 on linux), checked the markWord functions were inlined, and checked that the generated code for G1ParScanThreadState::copy_to_survivor_space was the same before and after the patch. So I intend to run tier1 testing on .04 and then push this patch. Thanks, StefanK > ------------------------------------------------------------------------------ > > [This is also related to Coleen's comments about to_pointer.] > > Looking at the changes, in order to reduce the amount of casting pixie > dust sprinkled on our code base, I now think markWord::to_pointer > should have a template overload, with the template parameter > designating the return type. And I think the return type for the > non-template should be const qualified, and the template should handle > cv qualifiers. 
Like so (I think, I haven't actually tested this)
>
>   const void* to_pointer() const {
>     return reinterpret_cast<const void*>(_value);
>   }
>
>   template<typename T>
>   T* to_pointer() const {
>     typedef typename RemoveCV<T>::type TT;
>     return reinterpret_cast<TT*>(_value);
>   }
>
> If one wants a void* then use m.to_pointer<void>().
>
> (I almost want to call it "pointer_to" so it reads "pointer to T".)
>
> Coleen and I talked about a possible alternative to the template
> overload. Perhaps there is a small and fixed number of pointer types
> we want to support conversion to, in which case we could have a small
> number of type-specific to_xxx_pointer variants? But I'm not sure the
> number is actually small and fixed.
>
> This seems like it could be a followup improvement.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/interpreter/bytecodeInterpreter.cpp
>
> In the run() function, there are a lot of casts of markWord::value()
> results or constants from markWord that I think could be removed.
> Some of them might want C++11 explicitly typed enums though.
>
> This seems like it could be a followup improvement.
>
> ------------------------------------------------------------------------------
>

From sgehwolf at redhat.com Fri Aug 16 08:38:39 2019
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Fri, 16 Aug 2019 10:38:39 +0200
Subject: RFR 8229420: [Redo] jstat reports incorrect values for OU for CMS GC
In-Reply-To: <573749a6-2389-6de2-4f2f-abc3b0e1dd6d@oracle.com>
References: <573749a6-2389-6de2-4f2f-abc3b0e1dd6d@oracle.com>
Message-ID: <053248daacbd69bd29b4bcc4db7678323784b7f0.camel@redhat.com>

On Wed, 2019-08-14 at 06:13 -0700, Poonam Parhar wrote:
> Please review the webrev with the updated fix:
> http://cr.openjdk.java.net/~poonam/8229420/webrev.00/

As far as the jhat/jstat typo is concerned this looks good. I haven't
reviewed other bits. Thanks for doing this via new bug.
Thanks,
Severin

From kim.barrett at oracle.com Fri Aug 16 18:33:54 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 16 Aug 2019 14:33:54 -0400
Subject: RFR: 8229258: Rework markOop and markOopDesc into a simpler mark word value carrier
In-Reply-To: <904604bc-a718-8d44-8e1c-629c8feae840@oracle.com>
References: <72567e47-3b52-8e30-ea6f-9a64fa043c07@redhat.com> <32ccb4ba-0500-2bf4-3761-63a6285f94dc@oracle.com> <5913EBBD-2A13-43A5-9F37-D36E97397365@oracle.com> <1670ce59-e7b5-0023-b1fd-f9f7c19a0eda@oracle.com> <00FC5053-D5B8-4DD8-96A3-95E12ED7C18C@oracle.com> <904604bc-a718-8d44-8e1c-629c8feae840@oracle.com>
Message-ID: <5C08780B-5DB3-4E6E-A4A4-3257BF90E39B@oracle.com>

> On Aug 16, 2019, at 4:20 AM, Stefan Karlsson wrote:
>
> On 2019-08-16 00:59, Kim Barrett wrote:
>> src/hotspot/share/oops/markOop.hpp
>>  109   template<typename T> operator T();
>> My mistake in the earlier review comment. Function should be const
>> qualified, e.g. that should be
>>   template<typename T> operator T() const;
>
> I added this after one of our earlier discussions. However, I don't think we need it (const or not). We already get sensible compiler errors without this function when we try to cast markWords to something else:
>
>  void* p0 = m;
>  void* p1 = (void*)m;
>  int i0 = m;
>  int i1 = (int)m;
>
> [... various errors ...]

You're right. It seems I need to refresh my recollection of what the
valid conversions are.

> The poisoned constructor seems to be unnecessary as well, now that we have simplified markWord. Without it, I get appropriate error messages when I try to create a markWord from a pointer:
>
> error: invalid conversion from 'void*' to 'uintptr_t' {aka 'long unsigned int'} [-fpermissive]
>    markWord p((void*)0x111);
>               ^~~~~~~~~~~~
> note: initializing argument 1 of 'markWord::markWord(uintptr_t)'
>    explicit markWord(uintptr_t value) : _value(value) { }

I no longer recall why that one was even suggested. But you are right
that it isn't needed.

> I've removed both of these.

Good.

> [...]
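
For illustration only (this is not code from the webrev): a minimal sketch of why neither the poisoned conversion operator nor the poisoned constructor is needed. It assumes a simplified stand-in for markWord that has just an explicit uintptr_t constructor, a value() accessor and the zero() function suggested earlier in the thread; the real class has many more members. The static_asserts restate, via standard type traits, the conversions the compiler already refuses in the error messages quoted above.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical, simplified stand-in for HotSpot's markWord. Only the parts
// discussed in this thread are modelled.
class markWord {
 private:
  uintptr_t _value;

 public:
  // Explicit, so a bare integer never silently becomes a markWord.
  explicit markWord(uintptr_t value) : _value(value) {}

  uintptr_t value() const { return _value; }

  // Function form of "zero", replacing a bare uintptr_t constant.
  static markWord zero() { return markWord(0); }
};

// No conversion operators are declared, so a markWord does not convert
// to a pointer or to an integer.
static_assert(!std::is_convertible<markWord, void*>::value,
              "markWord must not convert to void*");
static_assert(!std::is_convertible<markWord, int>::value,
              "markWord must not convert to int");

// A pointer cannot be used to construct a markWord, because there is no
// implicit conversion from void* to uintptr_t.
static_assert(!std::is_constructible<markWord, void*>::value,
              "markWord must not be constructible from void*");

// An integer can be used, but only through explicit construction.
static_assert(std::is_constructible<markWord, uintptr_t>::value,
              "explicit construction from uintptr_t is allowed");
static_assert(!std::is_convertible<uintptr_t, markWord>::value,
              "implicit uintptr_t -> markWord is not allowed");

// The wrapper stays cheap to pass around by value.
static_assert(std::is_trivially_copyable<markWord>::value,
              "markWord should be trivially copyable");
```

All of the checks run at compile time, so a translation unit containing this sketch only builds if the conversions behave as described.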
>
> I'll leave the rest of the comments below for follow-up RFEs.

OK.

> This is the last few cleanups:
> http://cr.openjdk.java.net/~stefank/8229258/webrev.valueMarkWord.04.delta/
> http://cr.openjdk.java.net/~stefank/8229258/webrev.valueMarkWord.04/

Looks good.

> I ran extended testing on .03 (tier1-7 on linux), checked the markWord functions were inlined, and checked that the generated code for G1ParScanThreadState::copy_to_survivor_space was the same before and after the patch. So I intend to run tier1 testing on .04 and then push this patch.
>

Thanks for checking the generated code. It's what we expected, but
compilers are sometimes surprising.

From alexey.menkov at oracle.com Fri Aug 16 23:46:58 2019
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Fri, 16 Aug 2019 16:46:58 -0700
Subject: RFR: JDK-8228547: accessibility errors in jvmti.html
Message-ID: <8ea3838a-fc66-45ac-c3cd-97da34a852b9@oracle.com>

Hi all,

Please review the change that fixes accessibility issues in generated
jvmti.html

There are 2 "general" accessibility issues ("content outside of a
region") - fixed by replacing
with
and
with
and huge number (5200+) of table issues: - no row or column header for cells; - table has only one column or row. Most of the tables was updated to have row and column headers, the tables which does not contain table data (like "Phase/Callback Safe/Position/Since" block for functions) were converted to use
s. All table headers/descriptions were converted to . All cases when tables can has only one row/column are handled by xsl (if there is no data for the table,
s are used). jira: https://bugs.openjdk.java.net/browse/JDK-8228547 webrev: http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/webrev/ generated doc: - old: http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/0/jvmti.html - new: http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/1/jvmti.html Visually there are minimal changes (checked in Firefox, Chrome, IE) specdiff: http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/spectdiff/diff.html --alex From david.holmes at oracle.com Sat Aug 17 07:04:18 2019 From: david.holmes at oracle.com (David Holmes) Date: Sat, 17 Aug 2019 17:04:18 +1000 Subject: RFR: JDK-8228547: accessibility errors in jvmti.html In-Reply-To: <8ea3838a-fc66-45ac-c3cd-97da34a852b9@oracle.com> References: <8ea3838a-fc66-45ac-c3cd-97da34a852b9@oracle.com> Message-ID: <60a164f8-6e95-7d46-7409-2c4db0929d0d@oracle.com> Hi Alex, Visually this appeared fine to me, so as long as the accessibility checking tool is happy then changes seem good. Thanks, David On 17/08/2019 9:46 am, Alex Menkov wrote: > Hi all, > > Please review the change that fixes accessibility issues in generated > jvmti.html > > There are 2 "general" accessibility issues ("content outside of a > region") - fixed by replacing
with
and
role="main"> with
> and huge number (5200+) of table issues: > - no row or column header for cells; > - table has only one column or row. > Most of the tables was updated to have row and column headers, > the tables which does not contain table data (like "Phase/Callback > Safe/Position/Since" block for functions) were converted to use
s. > All table headers/descriptions were converted to . > All cases when tables can has only one row/column are handled by xsl (if > there is no data for the table,
s are used). > > jira: https://bugs.openjdk.java.net/browse/JDK-8228547 > > webrev: > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/webrev/ > > generated doc: > - old: > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/0/jvmti.html > > - new: > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/1/jvmti.html > > > Visually there are minimal changes (checked in Firefox, Chrome, IE) > > specdiff: > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/spectdiff/diff.html > > > --alex From jcbeyler at google.com Sun Aug 18 04:15:49 2019 From: jcbeyler at google.com (Jean Christophe Beyler) Date: Sat, 17 Aug 2019 21:15:49 -0700 Subject: RFR: JDK-8228547: accessibility errors in jvmti.html In-Reply-To: <60a164f8-6e95-7d46-7409-2c4db0929d0d@oracle.com> References: <8ea3838a-fc66-45ac-c3cd-97da34a852b9@oracle.com> <60a164f8-6e95-7d46-7409-2c4db0929d0d@oracle.com> Message-ID: Hi Alex, Looks good to me as well. What is surprising (or maybe not) is the slight changes that you do see. The vertical alignment is off for the Position / Since columns it seems (it used to be vertically centered and no longer; see the "Allocate" table for example). And the same table seems a bit wider on my machine than the other tables: - The Phase/Callback Safe/Position/Since table seems a few pixels wider than the Capabilities one for example. But these are really small details on my machine that I think we are fine, so looks good to me too :) Jc On Sat, Aug 17, 2019 at 12:05 AM David Holmes wrote: > Hi Alex, > > Visually this appeared fine to me, so as long as the accessibility > checking tool is happy then changes seem good. > > Thanks, > David > > On 17/08/2019 9:46 am, Alex Menkov wrote: > > Hi all, > > > > Please review the change that fixes accessibility issues in generated > > jvmti.html > > > > There are 2 "general" accessibility issues ("content outside of a > > region") - fixed by replacing
with
and
> role="main"> with
> > and huge number (5200+) of table issues: > > - no row or column header for cells; > > - table has only one column or row. > > Most of the tables was updated to have row and column headers, > > the tables which does not contain table data (like "Phase/Callback > > Safe/Position/Since" block for functions) were converted to use
s. > > All table headers/descriptions were converted to . > > All cases when tables can has only one row/column are handled by xsl (if > > there is no data for the table,
s are used). > > > > jira: https://bugs.openjdk.java.net/browse/JDK-8228547 > > > > webrev: > > > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/webrev/ > > > > generated doc: > > - old: > > > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/0/jvmti.html > > > > - new: > > > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/1/jvmti.html > > > > > > Visually there are minimal changes (checked in Firefox, Chrome, IE) > > > > specdiff: > > > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/spectdiff/diff.html > > > > > > --alex > -- Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.holmes at oracle.com Sun Aug 18 05:55:16 2019 From: david.holmes at oracle.com (David Holmes) Date: Sun, 18 Aug 2019 15:55:16 +1000 Subject: RFR: JDK-8228547: accessibility errors in jvmti.html In-Reply-To: References: <8ea3838a-fc66-45ac-c3cd-97da34a852b9@oracle.com> <60a164f8-6e95-7d46-7409-2c4db0929d0d@oracle.com> Message-ID: <15ff1c0b-f5b6-b45a-fd6a-387b2459613c@oracle.com> Hi JC, On 18/08/2019 2:15 pm, Jean Christophe Beyler wrote: > Hi Alex, > > Looks good to me as well. What is surprising (or maybe not) is the > slight changes that you do see. The vertical alignment is off for the > Position / Since columns it seems (it used to be vertically centered and > no longer; see the "Allocate" table for example). FWIW I don't observe any differences in that aspect of the tables (Firefox on Windows 7). The only visual difference I see is that the table lines seem thicker. > And the same table seems a bit wider on my machine than the other tables: > ? - The Phase/Callback Safe/Position/Since table seems a few pixels > wider than the Capabilities one for example. I see that too. To me it appears to be because there is an extra column in the phase/callback/position/Since table and the extra line thickness then makes the overall table wider. 
Cheers, David > > But these are really small details on my machine that I think we are > fine, so looks good to me too :) > Jc > > On Sat, Aug 17, 2019 at 12:05 AM David Holmes > wrote: > > Hi Alex, > > Visually this appeared fine to me, so as long as the accessibility > checking tool is happy then changes seem good. > > Thanks, > David > > On 17/08/2019 9:46 am, Alex Menkov wrote: > > Hi all, > > > > Please review the change that fixes accessibility issues in > generated > > jvmti.html > > > > There are 2 "general" accessibility issues ("content outside of a > > region") - fixed by replacing
with
> and
> role="main"> with
> > and huge number (5200+) of table issues: > > - no row or column header for cells; > > - table has only one column or row. > > Most of the tables was updated to have row and column headers, > > the tables which does not contain table data (like "Phase/Callback > > Safe/Position/Since" block for functions) were converted to use >
s. > > All table headers/descriptions were converted to . > > All cases when tables can has only one row/column are handled by > xsl (if > > there is no data for the table,
s are used). > > > > jira: https://bugs.openjdk.java.net/browse/JDK-8228547 > > > > webrev: > > > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/webrev/ > > > > generated doc: > > - old: > > > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/0/jvmti.html > > > > > - new: > > > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/1/jvmti.html > > > > > > > Visually there are minimal changes (checked in Firefox, Chrome, IE) > > > > specdiff: > > > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/spectdiff/diff.html > > > > > > > --alex > > > > -- > > Thanks, > Jc From serguei.spitsyn at oracle.com Mon Aug 19 04:50:23 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Sun, 18 Aug 2019 21:50:23 -0700 Subject: PING: RFR: 8226204: SA: Refactoring for option processing in SALauncher In-Reply-To: <7fca26c1-8125-f2fd-d5da-090ba876f86d@gmail.com> References: <2d4bee10-768d-f54d-8bbb-dd70cb25abe1@gmail.com> <4a61d2ac-0a27-fc31-307d-4a818c4b4094@oracle.com> <7fca26c1-8125-f2fd-d5da-090ba876f86d@gmail.com> Message-ID: <8abc501d-31e8-fc9f-afc8-d32c4baa838c@oracle.com> An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Mon Aug 19 05:43:28 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Sun, 18 Aug 2019 22:43:28 -0700 Subject: RFR: JDK-8228547: accessibility errors in jvmti.html In-Reply-To: <8ea3838a-fc66-45ac-c3cd-97da34a852b9@oracle.com> References: <8ea3838a-fc66-45ac-c3cd-97da34a852b9@oracle.com> Message-ID: <9beb00e2-877b-30bc-bda1-468b8d19a42a@oracle.com> Hi Alex, It looks great to me. I looked through the old and new documents in sync and did not find any problems. The webrev itself looks good too. Very nice job! I realize it was not easy as many spots in the jvmti.xsl needed an update. It is a big deal to get rid of these accessibility errors/warnings. 
Thanks, Serguei On 8/16/19 16:46, Alex Menkov wrote: > Hi all, > > Please review the change that fixes accessibility issues in generated > jvmti.html > > There are 2 "general" accessibility issues ("content outside of a > region") - fixed by replacing
with
and >
with
> and a huge number (5200+) of table issues: > - no row or column header for cells; > - table has only one column or row. > Most of the tables were updated to have row and column headers, > the tables which do not contain table data (like "Phase/Callback > Safe/Position/Since" block for functions) were converted to use
s. > All table headers/descriptions were converted to . > All cases when tables can have only one row/column are handled by xsl > (if there is no data for the table,
s are used). > > jira: https://bugs.openjdk.java.net/browse/JDK-8228547 > > webrev: > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/webrev/ > > generated doc: > - old: > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/0/jvmti.html > - new: > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/1/jvmti.html > > Visually there are minimal changes (checked in Firefox, Chrome, IE) > > specdiff: > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/spectdiff/diff.html > > --alex From serguei.spitsyn at oracle.com Mon Aug 19 05:52:00 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Sun, 18 Aug 2019 22:52:00 -0700 Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor In-Reply-To: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> Message-ID: <8ba7c4ee-ee19-d42a-0c7b-d26b257df927@oracle.com> Hi David, Just wanted to tell you that I'm looking at this but need more time. Thanks, Serguei On 8/14/19 23:22, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8229160 > > Preliminary webrev (still has rough edges): > http://cr.openjdk.java.net/~dholmes/8229160/webrev.prelim/ > > Background: > > We've had this comment for a long time: > > ?// The raw monitor subsystem is entirely distinct from normal > ?// java-synchronization or jni-synchronization.? raw monitors are not > ?// associated with objects.? They can be implemented in any manner > ?// that makes sense.? The original implementors decided to piggy-back > ?// the raw-monitor implementation on the existing Java objectMonitor > mechanism. > ?// This flaw needs to fixed.? We should reimplement raw monitors as > sui-generis. > ?// Specifically, we should not implement raw monitors via java monitors. 
> ?// Time permitting, we should disentangle and deconvolve the two > implementations > ?// and move the resulting raw monitor implementation over to the > JVMTI directories. > ?// Ideally, the raw monitor implementation would be built on top of > ?// park-unpark and nothing else. > > This is an attempt to do that disentangling so that we can then > consider changes to ObjectMonitor without having to worry about > JvmtiRawMonitors. But rather than building on low-level park/unpark > (which would require the same manual queue management and much of the > same complex code as exists in ObjectMonitor) I decided to try and do > this on top of PlatformMonitor. > > The reason this is just a RFC rather than RFR is that I overlooked a > non-trivial aspect of JvmtiRawMonitors: like Java monitors (as > implemented by ObjectMonitor) they interact with the Thread.interrupt > mechanism. This is not clearly stated in the JVM TI specification [1] > but only in passing by the possible errors for RawMonitorWait: > > JVMTI_ERROR_INTERRUPT??? Wait was interrupted, try again > > As I explain in the bug report there is no way to build in proper > interrupt support using PlatformMonitor as there is no way we can > "interrupt" the low-level pthread_cond_wait. But we can approximate > it. What I've done in this preliminary version is just check interrupt > state before and after the actual "wait" but we won't get woken by the > interrupt once we have actually blocked. Alternatively we could use a > periodic polling approach and wakeup every Nms to check for interruption. > > The only use of JvmtiRawMonitors in the JDK libraries (JDWP) is not > affected by this choice as that code ignores the interrupt until the > real action it was waiting for has occurred. The interrupt is then > reposted later. > > But more generally there could be users of JvmtiRawMonitors that > expect/require that RawMonitorWait is responsive to Thread.interrupt > in a manner similar to Object.wait. 
And if any of them are reading > this then I'd like to know - hence this RFC :) > > FYI testing to date: > ?- tiers 1 -3 all platforms > ?- hotspot: serviceability/jvmti > ????????????????????????? /jdwp > ??????????? vmTestbase/nsk/jvmti > ????????????????????????? /jdwp > ?- JDK: com/sun/jdi > > Comments/opinions appreciated. > > Thanks, > David > > [1] > https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait From yasuenag at gmail.com Mon Aug 19 10:09:04 2019 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Mon, 19 Aug 2019 19:09:04 +0900 Subject: PING: RFR: 8226204: SA: Refactoring for option processing in SALauncher In-Reply-To: <8abc501d-31e8-fc9f-afc8-d32c4baa838c@oracle.com> References: <2d4bee10-768d-f54d-8bbb-dd70cb25abe1@gmail.com> <4a61d2ac-0a27-fc31-307d-4a818c4b4094@oracle.com> <7fca26c1-8125-f2fd-d5da-090ba876f86d@gmail.com> <8abc501d-31e8-fc9f-afc8-d32c4baa838c@oracle.com> Message-ID: <0f85bd02-d91f-a91e-70ad-c3ad964536f2@gmail.com> Thanks Serguei! I will fix it and will push. Yasumasa On 2019/08/19 13:50, serguei.spitsyn at oracle.com wrote: > Hi Yasumasa, > > > Thank you for the update! > > The typo below still is not fixed (replace: "be map" => "be mapped"): > > 225 * You also can set the options which cannot be map to old fashioned > > > 242 * SAGetopt parses and validates the argument. If he user passes invalid > 243 * option, SAGetoptException will be occurred at SAGetopt::next. > 244 * Thus we need not to validate them in here. > > ?A typo: "he user" => "the user". > ?I'd suggest to replace the line 244 with: > ?? "Thus there is no need to validate it here." > > Thumbs up on the webrev in general. > No need for re-review if you fix the above. > > Thanks, > Serguei > > > On 8/15/19 18:05, Yasumasa Suenaga wrote: >> Hi Serguei, >> >> Thank you for the comment. >> I fixed / added the comment in new webrev. Could you check again? 
>> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.02/ >> >> >> Yasumasa >> >> >> On 2019/08/16 8:03, serguei.spitsyn at oracle.com wrote: >>> Hi Yasumasa, >>> >>> Thank you for the update! >>> A couple of suggestions: >>> >>> 213 * This method converts jhsdb-style options (oldArgs) to oldfashioned >>> >>> ?? Replace: "oldfashioned " => "old fashioned". >>> ?? There are several occurrences of it. >>> >>> 214 * style. SALauncher delegates the work to the entry point of each tools. >>> >>> ?? Replace: "each tools" => "each tool" >>> >>> 225 * You also can set the options which cannot be map to oldfashioned >>> 226 * arguments. For example, `jhsdb jmap --binaryheap` cannot be map to >>> >>> ?? Replace: "cannot be map" => "cannot be mapped" >>> >>> 231 * This method returns the map which the key is oldfashioned option, >>> 232 * the value is its value. >>> >>> ?? I'd suggest to say: >>> ??? * This method returns the map of the old fashioned key/val pairs. >>> >>> >>> This loop still needs to be commented: >>> >>> 242 while ((s = sg.next(null, longOpts)) != null) { >>> 243 var val = longOptsMap.get(s); >>> 244 if (val != null) { >>> ????????????????????? // What is done here and why? >>> ? 245 newArgMap.put(val, null); >>> 246 } else { >>> 247 val = longOptsMap.get(s + "=");? // Why the "=" is added >>> 248 if (val != null) { >>> ????????????????????????? // What is done here and why? >>> ? 249 newArgMap.put(val, sg.getOptarg()); >>> ? 250???????????????? } >>> ????????????????????? // Why there is no else statement, do we just skip the option? >>> >>> ??251???????????? } >>> ? 252???????? } >>> >>> ??Such comments will give more context and make this code more readable. >>> >>> Thanks, >>> Serguei >>> >>> >>> On 8/15/19 7:14 AM, Yasumasa Suenaga wrote: >>>> Hi Serguei, >>>> >>>> I added the explanation as a comment in parseOptions what is longOptsMap, >>>> and how parseOptions() work in new webrev. 
>>>> >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.01/ >>>> >>>> I'm not good at English, so comments are welcome :) >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2019/08/15 17:12, serguei.spitsyn at oracle.com wrote: >>>>> Hi Yasumasa, >>>>> >>>>> In fact, I have a problem to understand what the parseOptions() method is doing. >>>>> Could you add necessary comments explaining what is done in the loop? >>>>> I'm sure I'll be not alone in having trouble to read this code. >>>>> Also, it is not clear the approach with the longOptsMap's. >>>>> Why do you need to map "exe=" to "exe" but "mixed" to "-m" and"clstats" to "-clstats"? >>>>> It is better to be explained in the parseOptions() method as well. >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 8/10/19 04:14, Yasumasa Suenaga wrote: >>>>>> PING: Could you review it? >>>>>> >>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8226204 >>>>>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.00/ >>>>>> >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2019/07/24 10:18, Yasumasa Suenaga wrote: >>>>>>> Hi all, >>>>>>> >>>>>>> Please review this change: >>>>>>> >>>>>>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8226204 >>>>>>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8226204/webrev.00/ >>>>>>> >>>>>>> This enhancement has been proposed in [1]. >>>>>>> >>>>>>> SALauncher (jhsdb implementation) processes the option for each subcommand (e.g. jstack, hsdb). >>>>>>> But they exist in many place with similar code. >>>>>>> So there is some room for refactoring. >>>>>>> >>>>>>> This change has passed the tests on submit repo and serviceability/sa tests. 
>>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> [1] https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-June/028376.html >>>>> >>> > From robbin.ehn at oracle.com Mon Aug 19 12:14:44 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 19 Aug 2019 14:14:44 +0200 Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor In-Reply-To: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> Message-ID: <73006ef1-03ff-f9ee-4f8f-758c25aacac4@oracle.com> Hi David, > Comments/opinions appreciated. Had I quick look, seems reasonable. And totally worth it since you removed the dependency on object monitors! -#include "runtime/objectMonitor.hpp" :) Thanks for not giving up! /Robbin > > Thanks, > David > > [1] https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait From alexey.menkov at oracle.com Tue Aug 20 21:26:10 2019 From: alexey.menkov at oracle.com (Alex Menkov) Date: Tue, 20 Aug 2019 14:26:10 -0700 Subject: RFR: JDK-8228547: accessibility errors in jvmti.html In-Reply-To: <15ff1c0b-f5b6-b45a-fd6a-387b2459613c@oracle.com> References: <8ea3838a-fc66-45ac-c3cd-97da34a852b9@oracle.com> <60a164f8-6e95-7d46-7409-2c4db0929d0d@oracle.com> <15ff1c0b-f5b6-b45a-fd6a-387b2459613c@oracle.com> Message-ID: <19edcbb7-f04e-3f97-1d61-1bd94a2ccf62@oracle.com> Hi guys, Thanks for review. Once I noticed that "Phase/Callback Safe/Position/Since" and "Phase/Event Type/Number/Enabling/Since" tables are a bit wider I cannot "unnotice" it back :) Bordered div's (without width specified) and bordered tables (with width=100%) have the same width, but div with "display: table" and "width: 100%" (we need it because otherwise it behaves like a table and does't fill whole width) is 2px wider and this is consistent in different browsers (firefox/chrome/IE). I tried different properties, but was not able to remove this difference. 
So I introduced workaround - one more div with right margin 2px. Also fixed vertical alignment for the same pseudo-tables. webrev (full): http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/webrev.2/ webrev (vs prev. version): http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/webrev2_1/ generated jvmti.html: http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/2/jvmti.html --alex On 08/17/2019 22:55, David Holmes wrote: > Hi JC, > > On 18/08/2019 2:15 pm, Jean Christophe Beyler wrote: >> Hi Alex, >> >> Looks good to me as well. What is surprising (or maybe not) is the >> slight changes that you do see. The vertical alignment is off for the >> Position / Since columns it seems (it used to be vertically centered >> and no longer; see the "Allocate" table for example). > > FWIW I don't observe any differences in that aspect of the tables > (Firefox on Windows 7). The only visual difference I see is that the > table lines seem thicker. > >> And the same table seems a bit wider on my machine than the other tables: >> ?? - The Phase/Callback Safe/Position/Since table seems a few pixels >> wider than the Capabilities one for example. > > I see that too. To me it appears to be because there is an extra column > in the phase/callback/position/Since table and the extra line thickness > then makes the overall table wider. > > Cheers, > David > >> >> But these are really small details on my machine that I think we are >> fine, so looks good to me too :) >> Jc >> >> On Sat, Aug 17, 2019 at 12:05 AM David Holmes > > wrote: >> >> ??? Hi Alex, >> >> ??? Visually this appeared fine to me, so as long as the accessibility >> ??? checking tool is happy then changes seem good. >> >> ??? Thanks, >> ??? David >> >> ??? On 17/08/2019 9:46 am, Alex Menkov wrote: >> ???? > Hi all, >> ???? > >> ???? > Please review the change that fixes accessibility issues in >> ??? generated >> ???? > jvmti.html >> ???? > >> ???? 
> There are 2 "general" accessibility issues ("content outside of a >> ???? > region") - fixed by replacing
with
>> ??? and
> ???? > role="main"> with
>> > and a huge number (5200+) of table issues: >> > - no row or column header for cells; >> > - table has only one column or row. >> > Most of the tables were updated to have row and column headers, >> > the tables which do not contain table data (like "Phase/Callback >> > Safe/Position/Since" block for functions) were converted to use >>
s. >> > All table headers/descriptions were converted to . >> > All cases when tables can have only one row/column are handled by >> xsl (if >> > there is no data for the table,
s are used). >> ???? > >> ???? > jira: https://bugs.openjdk.java.net/browse/JDK-8228547 >> ???? > >> ???? > webrev: >> ???? > >> >> http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/webrev/ >> >> ???? > >> ???? > generated doc: >> ???? > - old: >> ???? > >> >> http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/0/jvmti.html >> >> >> ???? > >> ???? > - new: >> ???? > >> >> http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/1/jvmti.html >> >> >> ???? > >> ???? > >> ???? > Visually there are minimal changes (checked in Firefox, Chrome, >> IE) >> ???? > >> ???? > specdiff: >> ???? > >> >> http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/spectdiff/diff.html >> >> >> ???? > >> ???? > >> ???? > --alex >> >> >> >> -- >> >> Thanks, >> Jc From jcbeyler at google.com Tue Aug 20 23:29:29 2019 From: jcbeyler at google.com (Jean Christophe Beyler) Date: Tue, 20 Aug 2019 16:29:29 -0700 Subject: RFR: JDK-8228547: accessibility errors in jvmti.html In-Reply-To: <19edcbb7-f04e-3f97-1d61-1bd94a2ccf62@oracle.com> References: <8ea3838a-fc66-45ac-c3cd-97da34a852b9@oracle.com> <60a164f8-6e95-7d46-7409-2c4db0929d0d@oracle.com> <15ff1c0b-f5b6-b45a-fd6a-387b2459613c@oracle.com> <19edcbb7-f04e-3f97-1d61-1bd94a2ccf62@oracle.com> Message-ID: Hi Alex, Looks good to me, thanks for looking into it. I admit I could not "unsee" it either ;-) Jc On Tue, Aug 20, 2019 at 2:26 PM Alex Menkov wrote: > Hi guys, > > Thanks for review. > Once I noticed that "Phase/Callback Safe/Position/Since" and > "Phase/Event Type/Number/Enabling/Since" tables are a bit wider I cannot > "unnotice" it back :) > > Bordered div's (without width specified) and bordered tables (with > width=100%) have the same width, but div with "display: table" and > "width: 100%" (we need it because otherwise it behaves like a table and > does't fill whole width) is 2px wider and this is consistent in > different browsers (firefox/chrome/IE). 
I tried different properties, > but was not able to remove this difference. > So I introduced workaround - one more div with right margin 2px. > > Also fixed vertical alignment for the same pseudo-tables. > > webrev (full): > > > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/webrev.2/ > webrev (vs prev. version): > > > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/webrev2_1/ > > generated jvmti.html: > > > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/2/jvmti.html > > --alex > > > On 08/17/2019 22:55, David Holmes wrote: > > Hi JC, > > > > On 18/08/2019 2:15 pm, Jean Christophe Beyler wrote: > >> Hi Alex, > >> > >> Looks good to me as well. What is surprising (or maybe not) is the > >> slight changes that you do see. The vertical alignment is off for the > >> Position / Since columns it seems (it used to be vertically centered > >> and no longer; see the "Allocate" table for example). > > > > FWIW I don't observe any differences in that aspect of the tables > > (Firefox on Windows 7). The only visual difference I see is that the > > table lines seem thicker. > > > >> And the same table seems a bit wider on my machine than the other > tables: > >> - The Phase/Callback Safe/Position/Since table seems a few pixels > >> wider than the Capabilities one for example. > > > > I see that too. To me it appears to be because there is an extra column > > in the phase/callback/position/Since table and the extra line thickness > > then makes the overall table wider. > > > > Cheers, > > David > > > >> > >> But these are really small details on my machine that I think we are > >> fine, so looks good to me too :) > >> Jc > >> > >> On Sat, Aug 17, 2019 at 12:05 AM David Holmes >> > wrote: > >> > >> Hi Alex, > >> > >> Visually this appeared fine to me, so as long as the accessibility > >> checking tool is happy then changes seem good. 
> >> > >> Thanks, > >> David > >> > >> On 17/08/2019 9:46 am, Alex Menkov wrote: > >> > Hi all, > >> > > >> > Please review the change that fixes accessibility issues in > >> generated > >> > jvmti.html > >> > > >> > There are 2 "general" accessibility issues ("content outside of a > >> > region") - fixed by replacing
with
> >> and
>> > role="main"> with
> >> > and a huge number (5200+) of table issues: > >> > - no row or column header for cells; > >> > - table has only one column or row. > >> > Most of the tables were updated to have row and column headers, > >> > the tables which do not contain table data (like > "Phase/Callback > >> > Safe/Position/Since" block for functions) were converted to use > >>
s. > >> > All table headers/descriptions were converted to . > >> > All cases when tables can have only one row/column are handled by > >> xsl (if > >> > there is no data for the table,
s are used). > >> > > >> > jira: https://bugs.openjdk.java.net/browse/JDK-8228547 > >> > > >> > webrev: > >> > > >> > >> > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/webrev/ > >> > >> > > >> > generated doc: > >> > - old: > >> > > >> > >> > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/0/jvmti.html > >> > >> > >> > > >> > - new: > >> > > >> > >> > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/1/jvmti.html > >> > >> > >> > > >> > > >> > Visually there are minimal changes (checked in Firefox, Chrome, > >> IE) > >> > > >> > specdiff: > >> > > >> > >> > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/spectdiff/diff.html > >> > >> > >> > > >> > > >> > --alex > >> > >> > >> > >> -- > >> > >> Thanks, > >> Jc > -- Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From leonid.mesnik at oracle.com Tue Aug 20 23:32:51 2019 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Tue, 20 Aug 2019 16:32:51 -0700 Subject: RFR: 8229957: Harden pid verification in attach mechanism Message-ID: <40f04450-0000-ec38-c950-0422e068f821@oracle.com> Hi Could you review following fix which add sanity check of pid value in attach mechanism on *nix based platforms. PID for java process is always positive on affected OS. Hotspot internally uses signal (SIGQUIT) while attaching. So using negative numbers as pid might cause very unexpected results and should be prevented. webrev: http://cr.openjdk.java.net/~lmesnik/8229957/webrev.00/ bug: https://bugs.openjdk.java.net/browse/JDK-8229957 I checked that jcmd doesn't allow to connect to negative pids. 
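The kind of check described above can be sketched as follows. This is only an illustration under stated assumptions, not the code from the webrev: the class name PidCheck and the helper parsePid are hypothetical. The motivation is that POSIX kill() treats a negative pid as a process group (and -1 as "every process the caller may signal"), so a non-positive value should be rejected before any signal is sent.

```java
public class PidCheck {
    // Hypothetical helper (not taken from the webrev): parse a pid string and
    // reject anything that is not a strictly positive integer.
    static int parsePid(String vmid) {
        final int pid;
        try {
            pid = Integer.parseInt(vmid);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid process identifier: " + vmid, e);
        }
        if (pid < 1) {
            // Zero and negative values have special meaning for kill(2),
            // so they must never reach the signalling code.
            throw new IllegalArgumentException("Invalid process identifier: " + vmid);
        }
        return pid;
    }

    public static void main(String[] args) {
        System.out.println(parsePid("12345"));
        for (String bad : new String[] {"-1", "0", "abc"}) {
            try {
                parsePid(bad);
                System.out.println(bad + " accepted (unexpected)");
            } catch (IllegalArgumentException e) {
                System.out.println(bad + " rejected");
            }
        }
    }
}
```

Rejecting the value at parse time keeps the check in one place, before any platform-specific attach code runs.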
Leonid From serguei.spitsyn at oracle.com Tue Aug 20 23:48:29 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 20 Aug 2019 16:48:29 -0700 Subject: RFR: JDK-8228547: accessibility errors in jvmti.html In-Reply-To: <19edcbb7-f04e-3f97-1d61-1bd94a2ccf62@oracle.com> References: <8ea3838a-fc66-45ac-c3cd-97da34a852b9@oracle.com> <60a164f8-6e95-7d46-7409-2c4db0929d0d@oracle.com> <15ff1c0b-f5b6-b45a-fd6a-387b2459613c@oracle.com> <19edcbb7-f04e-3f97-1d61-1bd94a2ccf62@oracle.com> Message-ID: <32201f64-4072-35ce-329f-a4468eb4a807@oracle.com> Hi Alex, It still looks good to me. Thanks, Serguei On 8/20/19 2:26 PM, Alex Menkov wrote: > Hi guys, > > Thanks for review. > Once I noticed that "Phase/Callback Safe/Position/Since" and > "Phase/Event Type/Number/Enabling/Since" tables are a bit wider I > cannot "unnotice" it back :) > > Bordered div's (without width specified) and bordered tables (with > width=100%) have the same width, but div with "display: table" and > "width: 100%" (we need it because otherwise it behaves like a table > and does't fill whole width) is 2px wider and this is consistent in > different browsers (firefox/chrome/IE). I tried different properties, > but was not able to remove this difference. > So I introduced workaround - one more div with right margin 2px. > > Also fixed vertical alignment for the same pseudo-tables. > > webrev (full): > > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/webrev.2/ > > webrev (vs prev. version): > > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/webrev2_1/ > > > generated jvmti.html: > > http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/2/jvmti.html > > > --alex > > > On 08/17/2019 22:55, David Holmes wrote: >> Hi JC, >> >> On 18/08/2019 2:15 pm, Jean Christophe Beyler wrote: >>> Hi Alex, >>> >>> Looks good to me as well. What is surprising (or maybe not) is the >>> slight changes that you do see. 
The vertical alignment is off for >>> the Position / Since columns it seems (it used to be vertically >>> centered and no longer; see the "Allocate" table for example). >> >> FWIW I don't observe any differences in that aspect of the tables >> (Firefox on Windows 7). The only visual difference I see is that the >> table lines seem thicker. >> >>> And the same table seems a bit wider on my machine than the other >>> tables: >>> ?? - The Phase/Callback Safe/Position/Since table seems a few pixels >>> wider than the Capabilities one for example. >> >> I see that too. To me it appears to be because there is an extra >> column in the phase/callback/position/Since table and the extra line >> thickness then makes the overall table wider. >> >> Cheers, >> David >> >>> >>> But these are really small details on my machine that I think we are >>> fine, so looks good to me too :) >>> Jc >>> >>> On Sat, Aug 17, 2019 at 12:05 AM David Holmes >>> > wrote: >>> >>> ??? Hi Alex, >>> >>> ??? Visually this appeared fine to me, so as long as the accessibility >>> ??? checking tool is happy then changes seem good. >>> >>> ??? Thanks, >>> ??? David >>> >>> ??? On 17/08/2019 9:46 am, Alex Menkov wrote: >>> ???? > Hi all, >>> ???? > >>> ???? > Please review the change that fixes accessibility issues in >>> ??? generated >>> ???? > jvmti.html >>> ???? > >>> ???? > There are 2 "general" accessibility issues ("content outside >>> of a >>> ???? > region") - fixed by replacing
with
>>> ??? and
>> ???? > role="main"> with
>>> > and a huge number (5200+) of table issues: >>> > - no row or column header for cells; >>> > - table has only one column or row. >>> > Most of the tables were updated to have row and column headers, >>> > the tables which do not contain table data (like >>> "Phase/Callback >>> > Safe/Position/Since" block for functions) were converted to use >>>
s. >>> > All table headers/descriptions were converted to . >>> > All cases when tables can have only one row/column are handled by >>> xsl (if >>> > there is no data for the table,
s are used). >>> ???? > >>> ???? > jira: https://bugs.openjdk.java.net/browse/JDK-8228547 >>> ???? > >>> ???? > webrev: >>> ???? > >>> http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/webrev/ >>> >>> ???? > >>> ???? > generated doc: >>> ???? > - old: >>> ???? > >>> http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/0/jvmti.html >>> >>> >>> ???? > >>> ???? > - new: >>> ???? > >>> http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/1/jvmti.html >>> >>> >>> ???? > >>> ???? > >>> ???? > Visually there are minimal changes (checked in Firefox, >>> Chrome, IE) >>> ???? > >>> ???? > specdiff: >>> ???? > >>> http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/spectdiff/diff.html >>> >>> >>> ???? > >>> ???? > >>> ???? > --alex >>> >>> >>> >>> -- >>> >>> Thanks, >>> Jc From serguei.spitsyn at oracle.com Tue Aug 20 23:58:03 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 20 Aug 2019 16:58:03 -0700 Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor In-Reply-To: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> Message-ID: Hi David, The whole approach looks good to me. + if (jSelf != NULL) { + if (interruptible && Thread::is_interrupted(jSelf, true)) { + // We're now interrupted but we may have consumed a notification. + // To avoid lost wakeups we have to re-issue that notification, which + // may result in a spurious wakeup for another thread. Alternatively we + // ignore checking for interruption before returning. + notify(); + return false; // interrupted + } I'm a bit concerned about introduction of new spurious wake ups above. Some tests can be not defensive against it, so we may discover new intermittent failures. 
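For illustration, a waiter that is defensive against spurious (or re-issued) notifications always re-checks its condition in a loop. The sketch below is a Java analogue using Object.wait, not JVM TI code; the class GuardedWait and its methods are hypothetical names chosen for the example. A raw monitor user would presumably follow the same pattern around RawMonitorWait.

```java
public class GuardedWait {
    private final Object lock = new Object();
    private boolean ready = false; // the condition the waiter actually cares about

    // Waits until 'ready' is set. The loop re-checks the condition after every
    // wakeup, so a spurious or re-issued notification is harmless.
    void awaitReady() throws InterruptedException {
        synchronized (lock) {
            while (!ready) {
                lock.wait();
            }
        }
    }

    // Sets the condition and wakes all waiters.
    void signalReady() {
        synchronized (lock) {
            ready = true;
            lock.notifyAll();
        }
    }

    public static void main(String[] args) throws Exception {
        GuardedWait g = new GuardedWait();
        Thread waiter = new Thread(() -> {
            try {
                g.awaitReady();
                System.out.println("condition observed");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        // A notification without the condition being set: the waiter keeps waiting.
        synchronized (g.lock) {
            g.lock.notifyAll();
        }
        g.signalReady();
        waiter.join();
    }
}
```

A test that instead calls wait once and assumes the awaited event has happened is the kind that could start failing intermittently.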
Thanks, Serguei On 8/14/19 11:22 PM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8229160 > > Preliminary webrev (still has rough edges): > http://cr.openjdk.java.net/~dholmes/8229160/webrev.prelim/ > > Background: > > We've had this comment for a long time: > > ?// The raw monitor subsystem is entirely distinct from normal > ?// java-synchronization or jni-synchronization.? raw monitors are not > ?// associated with objects.? They can be implemented in any manner > ?// that makes sense.? The original implementors decided to piggy-back > ?// the raw-monitor implementation on the existing Java objectMonitor > mechanism. > ?// This flaw needs to fixed.? We should reimplement raw monitors as > sui-generis. > ?// Specifically, we should not implement raw monitors via java monitors. > ?// Time permitting, we should disentangle and deconvolve the two > implementations > ?// and move the resulting raw monitor implementation over to the > JVMTI directories. > ?// Ideally, the raw monitor implementation would be built on top of > ?// park-unpark and nothing else. > > This is an attempt to do that disentangling so that we can then > consider changes to ObjectMonitor without having to worry about > JvmtiRawMonitors. But rather than building on low-level park/unpark > (which would require the same manual queue management and much of the > same complex code as exists in ObjectMonitor) I decided to try and do > this on top of PlatformMonitor. > > The reason this is just a RFC rather than RFR is that I overlooked a > non-trivial aspect of JvmtiRawMonitors: like Java monitors (as > implemented by ObjectMonitor) they interact with the Thread.interrupt > mechanism. This is not clearly stated in the JVM TI specification [1] > but only in passing by the possible errors for RawMonitorWait: > > JVMTI_ERROR_INTERRUPT??? 
Wait was interrupted, try again > > As I explain in the bug report there is no way to build in proper > interrupt support using PlatformMonitor as there is no way we can > "interrupt" the low-level pthread_cond_wait. But we can approximate > it. What I've done in this preliminary version is just check interrupt > state before and after the actual "wait" but we won't get woken by the > interrupt once we have actually blocked. Alternatively we could use a > periodic polling approach and wakeup every Nms to check for interruption. > > The only use of JvmtiRawMonitors in the JDK libraries (JDWP) is not > affected by this choice as that code ignores the interrupt until the > real action it was waiting for has occurred. The interrupt is then > reposted later. > > But more generally there could be users of JvmtiRawMonitors that > expect/require that RawMonitorWait is responsive to Thread.interrupt > in a manner similar to Object.wait. And if any of them are reading > this then I'd like to know - hence this RFC :) > > FYI testing to date: > ?- tiers 1 -3 all platforms > ?- hotspot: serviceability/jvmti > ????????????????????????? /jdwp > ??????????? vmTestbase/nsk/jvmti > ????????????????????????? /jdwp > ?- JDK: com/sun/jdi > > Comments/opinions appreciated. > > Thanks, > David > > [1] > https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Wed Aug 21 00:31:24 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 20 Aug 2019 17:31:24 -0700 Subject: RFR: 8229957: Harden pid verification in attach mechanism In-Reply-To: <40f04450-0000-ec38-c950-0422e068f821@oracle.com> References: <40f04450-0000-ec38-c950-0422e068f821@oracle.com> Message-ID: Hi Leonid, It looks good to me. Thank you for discovering and fixing this issue! 
The only concern I have is if the pid=-1 was used for something, so added Alan to the list. Thanks, Serguei On 8/20/19 4:32 PM, Leonid Mesnik wrote: > Hi > > Could you review following fix which add sanity check of pid value in > attach mechanism on *nix based platforms. > > PID for java process is always positive on affected OS. Hotspot > internally uses signal (SIGQUIT) while attaching. So using negative > numbers as pid might cause very unexpected results and should be > prevented. > > webrev: http://cr.openjdk.java.net/~lmesnik/8229957/webrev.00/ > > bug: https://bugs.openjdk.java.net/browse/JDK-8229957 > > I checked that jcmd doesn't allow to connect to negative pids. > > Leonid > From david.holmes at oracle.com Wed Aug 21 05:21:27 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 21 Aug 2019 15:21:27 +1000 Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor In-Reply-To: References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> Message-ID: Hi Serguei, On 21/08/2019 9:58 am, serguei.spitsyn at oracle.com wrote: > Hi David, > > The whole approach looks good to me. Thanks for taking a look. My main concern is about the interrupt semantics, so I really need to get some end-user feedback on that aspect as well. > + if (jSelf != NULL) { > + if (interruptible && Thread::is_interrupted(jSelf, true)) { > + // We're now interrupted but we may have consumed a notification. > + // To avoid lost wakeups we have to re-issue that notification, which > + // may result in a spurious wakeup for another thread. Alternatively we > + // ignore checking for interruption before returning. > + notify(); > + return false; // interrupted > + } > > I'm a bit concerned about introduction of new spurious wake ups above. > Some tests can be not defensive against it, so we may discover new > intermittent failures. That is possible. 
Though given spurious wakeups are already possible any test that is incorrectly using RawMonitorWait() without checking a condition, is technically already broken. Not checking for interruption after the wait will also require some test changes, and it weakens the interrupt semantics even further. Thanks, David ----- > Thanks, > Serguei > > On 8/14/19 11:22 PM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8229160 >> >> Preliminary webrev (still has rough edges): >> http://cr.openjdk.java.net/~dholmes/8229160/webrev.prelim/ >> >> Background: >> >> We've had this comment for a long time: >> >> ?// The raw monitor subsystem is entirely distinct from normal >> ?// java-synchronization or jni-synchronization.? raw monitors are not >> ?// associated with objects.? They can be implemented in any manner >> ?// that makes sense.? The original implementors decided to piggy-back >> ?// the raw-monitor implementation on the existing Java objectMonitor >> mechanism. >> ?// This flaw needs to fixed.? We should reimplement raw monitors as >> sui-generis. >> ?// Specifically, we should not implement raw monitors via java monitors. >> ?// Time permitting, we should disentangle and deconvolve the two >> implementations >> ?// and move the resulting raw monitor implementation over to the >> JVMTI directories. >> ?// Ideally, the raw monitor implementation would be built on top of >> ?// park-unpark and nothing else. >> >> This is an attempt to do that disentangling so that we can then >> consider changes to ObjectMonitor without having to worry about >> JvmtiRawMonitors. But rather than building on low-level park/unpark >> (which would require the same manual queue management and much of the >> same complex code as exists in ObjectMonitor) I decided to try and do >> this on top of PlatformMonitor. 
>> >> The reason this is just a RFC rather than RFR is that I overlooked a >> non-trivial aspect of JvmtiRawMonitors: like Java monitors (as >> implemented by ObjectMonitor) they interact with the Thread.interrupt >> mechanism. This is not clearly stated in the JVM TI specification [1] >> but only in passing by the possible errors for RawMonitorWait: >> >> JVMTI_ERROR_INTERRUPT??? Wait was interrupted, try again >> >> As I explain in the bug report there is no way to build in proper >> interrupt support using PlatformMonitor as there is no way we can >> "interrupt" the low-level pthread_cond_wait. But we can approximate >> it. What I've done in this preliminary version is just check interrupt >> state before and after the actual "wait" but we won't get woken by the >> interrupt once we have actually blocked. Alternatively we could use a >> periodic polling approach and wakeup every Nms to check for interruption. >> >> The only use of JvmtiRawMonitors in the JDK libraries (JDWP) is not >> affected by this choice as that code ignores the interrupt until the >> real action it was waiting for has occurred. The interrupt is then >> reposted later. >> >> But more generally there could be users of JvmtiRawMonitors that >> expect/require that RawMonitorWait is responsive to Thread.interrupt >> in a manner similar to Object.wait. And if any of them are reading >> this then I'd like to know - hence this RFC :) >> >> FYI testing to date: >> ?- tiers 1 -3 all platforms >> ?- hotspot: serviceability/jvmti >> ????????????????????????? /jdwp >> ??????????? vmTestbase/nsk/jvmti >> ????????????????????????? /jdwp >> ?- JDK: com/sun/jdi >> >> Comments/opinions appreciated. 
>> >> Thanks, >> David >> >> [1] >> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait > From david.holmes at oracle.com Wed Aug 21 05:30:00 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 21 Aug 2019 15:30:00 +1000 Subject: RFR: JDK-8228547: accessibility errors in jvmti.html In-Reply-To: <32201f64-4072-35ce-329f-a4468eb4a807@oracle.com> References: <8ea3838a-fc66-45ac-c3cd-97da34a852b9@oracle.com> <60a164f8-6e95-7d46-7409-2c4db0929d0d@oracle.com> <15ff1c0b-f5b6-b45a-fd6a-387b2459613c@oracle.com> <19edcbb7-f04e-3f97-1d61-1bd94a2ccf62@oracle.com> <32201f64-4072-35ce-329f-a4468eb4a807@oracle.com> Message-ID: <1b60f4d5-991b-1a37-0bab-61bc60b24ce9@oracle.com> +1 Thanks, David On 21/08/2019 9:48 am, serguei.spitsyn at oracle.com wrote: > Hi Alex, > > It still looks good to me. > > Thanks, > Serguei > > On 8/20/19 2:26 PM, Alex Menkov wrote: >> Hi guys, >> >> Thanks for review. >> Once I noticed that "Phase/Callback Safe/Position/Since" and >> "Phase/Event Type/Number/Enabling/Since" tables are a bit wider I >> cannot "unnotice" it back :) >> >> Bordered div's (without width specified) and bordered tables (with >> width=100%) have the same width, but div with "display: table" and >> "width: 100%" (we need it because otherwise it behaves like a table >> and does't fill whole width) is 2px wider and this is consistent in >> different browsers (firefox/chrome/IE). I tried different properties, >> but was not able to remove this difference. >> So I introduced workaround - one more div with right margin 2px. >> >> Also fixed vertical alignment for the same pseudo-tables. >> >> webrev (full): >> >> http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/webrev.2/ >> >> webrev (vs prev. 
version): >> >> http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/webrev2_1/ >> >> >> generated jvmti.html: >> >> http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/2/jvmti.html >> >> >> --alex >> >> >> On 08/17/2019 22:55, David Holmes wrote: >>> Hi JC, >>> >>> On 18/08/2019 2:15 pm, Jean Christophe Beyler wrote: >>>> Hi Alex, >>>> >>>> Looks good to me as well. What is surprising (or maybe not) is the >>>> slight changes that you do see. The vertical alignment is off for >>>> the Position / Since columns it seems (it used to be vertically >>>> centered and no longer; see the "Allocate" table for example). >>> >>> FWIW I don't observe any differences in that aspect of the tables >>> (Firefox on Windows 7). The only visual difference I see is that the >>> table lines seem thicker. >>> >>>> And the same table seems a bit wider on my machine than the other >>>> tables: >>>> ?? - The Phase/Callback Safe/Position/Since table seems a few pixels >>>> wider than the Capabilities one for example. >>> >>> I see that too. To me it appears to be because there is an extra >>> column in the phase/callback/position/Since table and the extra line >>> thickness then makes the overall table wider. >>> >>> Cheers, >>> David >>> >>>> >>>> But these are really small details on my machine that I think we are >>>> fine, so looks good to me too :) >>>> Jc >>>> >>>> On Sat, Aug 17, 2019 at 12:05 AM David Holmes >>>> > wrote: >>>> >>>> ??? Hi Alex, >>>> >>>> ??? Visually this appeared fine to me, so as long as the accessibility >>>> ??? checking tool is happy then changes seem good. >>>> >>>> ??? Thanks, >>>> ??? David >>>> >>>> ??? On 17/08/2019 9:46 am, Alex Menkov wrote: >>>> ???? > Hi all, >>>> ???? > >>>> ???? > Please review the change that fixes accessibility issues in >>>> ??? generated >>>> ???? > jvmti.html >>>> ???? > >>>> ???? > There are 2 "general" accessibility issues ("content outside >>>> of a >>>> ???? > region") - fixed by replacing
with
>>>> ??? and
>>> ???? > role="main"> with
>>>> ???? > and huge number (5200+) of table issues: >>>> ???? > - no row or column header for cells; >>>> ???? > - table has only one column or row. >>>> ???? > Most of the tables was updated to have row and column headers, >>>> ???? > the tables which does not contain table data (like >>>> "Phase/Callback >>>> ???? > Safe/Position/Since" block for functions) were converted to use >>>> ???
s. >>>> ???? > All table headers/descriptions were converted to . >>>> ???? > All cases when tables can has only one row/column are handled by >>>> ??? xsl (if >>>> ???? > there is no data for the table,
s are used). >>>> ???? > >>>> ???? > jira: https://bugs.openjdk.java.net/browse/JDK-8228547 >>>> ???? > >>>> ???? > webrev: >>>> ???? > >>>> http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/webrev/ >>>> >>>> ???? > >>>> ???? > generated doc: >>>> ???? > - old: >>>> ???? > >>>> http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/0/jvmti.html >>>> >>>> >>>> ???? > >>>> ???? > - new: >>>> ???? > >>>> http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/1/jvmti.html >>>> >>>> >>>> ???? > >>>> ???? > >>>> ???? > Visually there are minimal changes (checked in Firefox, >>>> Chrome, IE) >>>> ???? > >>>> ???? > specdiff: >>>> ???? > >>>> http://cr.openjdk.java.net/~amenkov/jdk14/jvmti_html_accessibility/spectdiff/diff.html >>>> >>>> >>>> ???? > >>>> ???? > >>>> ???? > --alex >>>> >>>> >>>> >>>> -- >>>> >>>> Thanks, >>>> Jc > From sgehwolf at redhat.com Wed Aug 21 09:08:30 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Wed, 21 Aug 2019 11:08:30 +0200 Subject: RFR: 8229957: Harden pid verification in attach mechanism In-Reply-To: <40f04450-0000-ec38-c950-0422e068f821@oracle.com> References: <40f04450-0000-ec38-c950-0422e068f821@oracle.com> Message-ID: On Tue, 2019-08-20 at 16:32 -0700, Leonid Mesnik wrote: > Hi > > Could you review following fix which add sanity check of pid value in > attach mechanism on *nix based platforms. > > PID for java process is always positive on affected OS. Hotspot > internally uses signal (SIGQUIT) while attaching. So using negative > numbers as pid might cause very unexpected results and should be prevented. > > webrev: http://cr.openjdk.java.net/~lmesnik/8229957/webrev.00/ This looks OK to me. 
Thanks, Severin From yasuenag at gmail.com Wed Aug 21 14:22:06 2019 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Wed, 21 Aug 2019 23:22:06 +0900 Subject: RFR: 8229957: Harden pid verification in attach mechanism In-Reply-To: <40f04450-0000-ec38-c950-0422e068f821@oracle.com> References: <40f04450-0000-ec38-c950-0422e068f821@oracle.com> Message-ID: <893a1163-36b0-6e12-24a9-97c5340d3de1@gmail.com> Hi Leonid, In case of Linux on Docker container, java might be run with PID=1. So I think `pid <= 1` is incorrect. Thanks, Yasumasa On 2019/08/21 8:32, Leonid Mesnik wrote: > Hi > > Could you review following fix which add sanity check of pid value in > attach mechanism on *nix based platforms. > > PID for java process is always positive on affected OS. Hotspot > internally uses signal (SIGQUIT) while attaching. So using negative > numbers as pid might cause very unexpected results and should be prevented. > > webrev: http://cr.openjdk.java.net/~lmesnik/8229957/webrev.00/ > > bug: https://bugs.openjdk.java.net/browse/JDK-8229957 > > I checked that jcmd doesn't allow to connect to negative pids. > > Leonid > From sgehwolf at redhat.com Wed Aug 21 14:45:40 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Wed, 21 Aug 2019 16:45:40 +0200 Subject: RFR: 8229957: Harden pid verification in attach mechanism In-Reply-To: <893a1163-36b0-6e12-24a9-97c5340d3de1@gmail.com> References: <40f04450-0000-ec38-c950-0422e068f821@oracle.com> <893a1163-36b0-6e12-24a9-97c5340d3de1@gmail.com> Message-ID: <563f75af16dbc9359c63c3a1d969948f4ff7eb5b.camel@redhat.com> On Wed, 2019-08-21 at 23:22 +0900, Yasumasa Suenaga wrote: > Hi Leonid, > > In case of Linux on Docker container, java might be run with PID=1. > So I think `pid <= 1` is incorrect. That's a very good point, Yasumasa. Nice catch. I believe changing the condition to < 1 would work. 
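As a rough illustration of the condition being discussed (not the code from the webrev), a pid check along these lines rejects zero and negative values while still accepting pid 1. The class `PidCheck` and the method `validatePid` are hypothetical names; the only assumptions taken from the thread are that a valid target pid is strictly positive and that pid 1 is legitimate inside a container.

```java
// Hypothetical helper, for illustration only.
final class PidCheck {
    private PidCheck() {}

    /**
     * Parses a pid string and rejects anything that is not a positive integer.
     * With POSIX kill(), a pid of 0 or a negative pid addresses a process
     * group (or, for -1, every process the caller may signal), so such
     * values must never reach a signal-based attach path.
     */
    public static int validatePid(String vmid) {
        final int pid;
        try {
            pid = Integer.parseInt(vmid);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid process identifier: " + vmid, e);
        }
        if (pid < 1) {   // deliberately not `pid <= 1`: pid 1 is valid in containers
            throw new IllegalArgumentException("Invalid process identifier: " + vmid);
        }
        return pid;
    }

    public static void main(String[] args) {
        for (String candidate : new String[] {"1", "4711", "0", "-1", "abc"}) {
            try {
                System.out.println(candidate + " accepted as pid " + validatePid(candidate));
            } catch (IllegalArgumentException e) {
                System.out.println(candidate + " rejected: " + e.getMessage());
            }
        }
    }
}
```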
Thanks, Severin
From leonid.mesnik at oracle.com Wed Aug 21 17:40:55 2019 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Wed, 21 Aug 2019 10:40:55 -0700 Subject: RFR: 8229957: Harden pid verification in attach mechanism In-Reply-To: <563f75af16dbc9359c63c3a1d969948f4ff7eb5b.camel@redhat.com> References: <40f04450-0000-ec38-c950-0422e068f821@oracle.com> <893a1163-36b0-6e12-24a9-97c5340d3de1@gmail.com> <563f75af16dbc9359c63c3a1d969948f4ff7eb5b.camel@redhat.com> Message-ID: Yasumasa, Severin Thank you, I updated the condition. Agree that pid might be equal to 1 for java in containers on Linux and also might be for some specific configuration with other OS. So let's just check that it is at least positive. http://cr.openjdk.java.net/~lmesnik/8229957/webrev.01/ Leonid On 8/21/19 7:45 AM, Severin Gehwolf wrote: > On Wed, 2019-08-21 at 23:22 +0900, Yasumasa Suenaga wrote: >> Hi Leonid, >> >> In case of Linux on Docker container, java might be run with PID=1. >> So I think `pid <= 1` is incorrect. > That's a very good point, Yasumasa. Nice catch. > > I believe changing the condition to < 1 would work. > > Thanks, > Severin >
From serguei.spitsyn at oracle.com Wed Aug 21 20:59:33 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 21 Aug 2019 13:59:33 -0700 Subject: RFR: 8229957: Harden pid verification in attach mechanism In-Reply-To: References: <40f04450-0000-ec38-c950-0422e068f821@oracle.com> <893a1163-36b0-6e12-24a9-97c5340d3de1@gmail.com> <563f75af16dbc9359c63c3a1d969948f4ff7eb5b.camel@redhat.com> Message-ID: <1f04836c-830a-a445-e0a6-0359aa0bc81c@oracle.com> I'm okay with this update. Thanks, Serguei On 8/21/19 10:40, Leonid Mesnik wrote: > Yasumasa, Severin > > Thank you, I updated the condition. Agree that pid might be equal to 1 for > java in containers on Linux and also might be for some specific > configuration with other OS. So let's just check that it is at least > positive.
> > http://cr.openjdk.java.net/~lmesnik/8229957/webrev.01/ > > Leonid > > On 8/21/19 7:45 AM, Severin Gehwolf wrote: >> On Wed, 2019-08-21 at 23:22 +0900, Yasumasa Suenaga wrote: >>> Hi Leonid, >>> >>> In case of Linux on Docker container, java might be run with PID=1. >>> So I think `pid <= 1` is incorrect. >> That's a very good point, Yasumasa. Nice catch. >> >> I believe changing the condition to < 1 would work. >> >> Thanks, >> Severin >>
From yasuenag at gmail.com Wed Aug 21 23:47:55 2019 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Thu, 22 Aug 2019 08:47:55 +0900 Subject: RFR: 8229957: Harden pid verification in attach mechanism In-Reply-To: References: <40f04450-0000-ec38-c950-0422e068f821@oracle.com> <893a1163-36b0-6e12-24a9-97c5340d3de1@gmail.com> <563f75af16dbc9359c63c3a1d969948f4ff7eb5b.camel@redhat.com> Message-ID: Looks good! Yasumasa (ysuenaga) On Thu, Aug 22, 2019 at 2:38 Leonid Mesnik wrote: > Yasumasa, Severin > > Thank you, I updated the condition. Agree that pid might be equal to 1 for java > in containers on Linux and also might be for some specific configuration > with other OS. So let's just check that it is at least positive. > > http://cr.openjdk.java.net/~lmesnik/8229957/webrev.01/ > > Leonid > > On 8/21/19 7:45 AM, Severin Gehwolf wrote: > > On Wed, 2019-08-21 at 23:22 +0900, Yasumasa Suenaga wrote: > >> Hi Leonid, > >> > >> In case of Linux on Docker container, java might be run with PID=1. > >> So I think `pid <= 1` is incorrect. > > That's a very good point, Yasumasa. Nice catch. > > > > I believe changing the condition to < 1 would work. > > > > Thanks, > > Severin > > >
From leonid.mesnik at oracle.com Thu Aug 22 00:16:08 2019 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Wed, 21 Aug 2019 17:16:08 -0700 Subject: RFR: 8229957: Harden pid verification in attach mechanism In-Reply-To: References: <40f04450-0000-ec38-c950-0422e068f821@oracle.com> <893a1163-36b0-6e12-24a9-97c5340d3de1@gmail.com> <563f75af16dbc9359c63c3a1d969948f4ff7eb5b.camel@redhat.com> Message-ID: <8f981b02-2a4f-0f35-a006-919ed0678aaa@oracle.com> Thank you. Leonid On 8/21/19 4:47 PM, Yasumasa Suenaga wrote: > Looks good! > > > Yasumasa (ysuenaga) > > On Thu, Aug 22, 2019 at 2:38 Leonid Mesnik wrote: > > Yasumasa, Severin > > Thank you, I updated the condition. Agree that pid might be equal to 1 > for java > in containers on Linux and also might be for some specific > configuration > with other OS. So let's just check that it is at least positive. > > http://cr.openjdk.java.net/~lmesnik/8229957/webrev.01/ > > Leonid > > On 8/21/19 7:45 AM, Severin Gehwolf wrote: > > On Wed, 2019-08-21 at 23:22 +0900, Yasumasa Suenaga wrote: > >> Hi Leonid, > >> > >> In case of Linux on Docker container, java might be run with PID=1. > >> So I think `pid <= 1` is incorrect. > > That's a very good point, Yasumasa. Nice catch. > > > > I believe changing the condition to < 1 would work. > > > > Thanks, > > Severin > > >
From sgehwolf at redhat.com Thu Aug 22 07:41:59 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Thu, 22 Aug 2019 09:41:59 +0200 Subject: RFR: 8229957: Harden pid verification in attach mechanism In-Reply-To: References: <40f04450-0000-ec38-c950-0422e068f821@oracle.com> <893a1163-36b0-6e12-24a9-97c5340d3de1@gmail.com> <563f75af16dbc9359c63c3a1d969948f4ff7eb5b.camel@redhat.com> Message-ID: <5478ce32fc13c581daece70ef95cbe9d751d749e.camel@redhat.com> On Wed, 2019-08-21 at 10:40 -0700, Leonid Mesnik wrote: > Yasumasa, Severin > > Thank you, I updated the condition.
Agree that pid might be equal to 1 for java > in containers on Linux and also might be for some specific configuration > with other OS. So let's just check that it is at least positive. > > http://cr.openjdk.java.net/~lmesnik/8229957/webrev.01/ Looks good. Thanks, Severin
From mikhailo.seledtsov at oracle.com Fri Aug 23 17:21:55 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Fri, 23 Aug 2019 10:21:55 -0700 Subject: RFR(S): 8228960: [TESTBUG] containers/docker/TestJcmdWithSideCar.java: jcmd reports main class as 'Unknown' In-Reply-To: <3143B636-9729-4238-9149-3A562B288643@oracle.com> References: <9c9ec6b1-139a-4e67-8460-b586456fd493@oracle.com> <854c1dfe97937389230599e72c22aa0cbb0bf600.camel@redhat.com> <5D4B58E9.2070002@oracle.com> <9418a5e6-750d-80bf-4060-d023812f37e7@oracle.com> <6cf3a353-e5c5-28e9-113b-98ded3c1f9e7@oracle.com> <3143B636-9729-4238-9149-3A562B288643@oracle.com> Message-ID: Finally got some time to work on this issue. Since I have encountered problems using files for passing messages between a container and a test driver (due to permissions), I looked for alternative solutions. I am using the output of a container process to signal when the main method has started, and it works. This simplifies things quite a bit as well. Normally, we use the OutputAnalyzer test utility to collect the whole output once the process has completed, and then analyze the resulting output for "contains some string", match, etc. However, testutils/ProcessTools provides an API to consume the output as it is produced. I am using this API to detect when the main() method of the container has started. Updated webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.02/ Testing:
Ran the test on Linux-x64, various multiple nodes in a test cluster 50 times - All PASS Thank you, Misha On 8/13/19 2:05 PM, Bob Vandette wrote: > >> On Aug 13, 2019, at 3:28 PM, mikhailo.seledtsov at oracle.com wrote: >> >> >> On 8/13/19 12:06 PM, Bob Vandette wrote: >>>> On Aug 13, 2019, at 2:57 PM, mikhailo.seledtsov at oracle.com wrote: >>>> >>>> Hi Bob, >>>> >>>> The workdir (JTwork/scratch) is created with the "test user" permissions. Let me try to place the "signal" file in /tmp instead, since /tmp should normally have a 777 permission on Linux. >>> Aren't you creating a file inside a docker container and then checking for its existence outside of the container? >> Correct >>> Isn't the root user running inside the container? >> By default it is. But it still fails to create a file, for some reason. Can be related to selinux settings (for instance, see this article: https://stackoverflow.com/questions/24288616/permission-denied-on-accessing-host-directory-in-docker/31334443), I can not change those. > Is your JTWork/scratch on an NFS mounted file system? If this is the case then the problem is that root is equivalent to nobody on > mounted file systems and can't create files unless the directory has 777 permissions. I just confirmed this. You'd have to either run > the container test as test-user or change the scratch directory permission. > > Bob. > >> My hope is that /tmp is configured to be accessed by a container engine as a general purpose directory, hence I was thinking to try it out. >> >>> Both processes don't see the same /tmp right? So that shouldn't help. >> In my next experiment, I will map a /tmp from host to be a /host-tmp inside the container (--volume /tmp:/host-tmp), then write a signal file to /host-tmp. >>> If scratch has 777 permissions, anyone can create a file. >> scratch has "rwxr-xr-x" >>> You have to be careful that you can clean up the >>> file from outside the container. I'd make sure to create it with 777.
>> I do use deleteOnExit(), so it should work (unless the JVM crashes). I guess I could add extra layer of safety here, and set the permissions to 777. Thank you for advice. >> >> >> Thank you, >> >> Misha >> >>> Bob. >>> >>>> If this works, I will have to add some unique number to the file name, perhaps a PID of a child process. >>>> >>>> I will try this, and let you know how it works. >>>> >>>> >>>> Thank you, >>>> >>>> Misha >>>> >>>> On 8/13/19 6:34 AM, Bob Vandette wrote: >>>>> Sorry, I just looked at the webrev and you are trying the approach I suggested. I thought you >>>>> were trying to use file change notification. >>>>> >>>>> Where does the workdir get created? Does it have 777 permissions? >>>>> >>>>> Bob. >>>>> >>>>> >>>>>> On Aug 13, 2019, at 9:29 AM, Bob Vandette wrote: >>>>>> >>>>>> What if you just poll for the creation of the file waiting some small amount of time between polling with a maximum timeout. >>>>>> >>>>>> Bob. >>>>>> >>>>>> >>>>>>> On Aug 12, 2019, at 8:22 PM, mikhailo.seledtsov at oracle.com wrote: >>>>>>> >>>>>>> Unfortunately, this approach does not seem to work on many of our test cluster machines. The creation of a "signal" file results in "PermissionDenied". >>>>>>> >>>>>>> The possible reason is the selinux configuration, or some other permission related stuff. The container tries to create a new file on a mounted volume on a host system, but host system denies it. I will look a bit deeper into this, but I think this type of issue can be encountered on any automated test system. Hence, we may have to abandon this approach. >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Misha >>>>>>> >>>>>>> >>>>>>> On 8/12/19 3:59 PM, mikhailo.seledtsov at oracle.com wrote: >>>>>>>> Here is an updated webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.01/ >>>>>>>> >>>>>>>> I am using a simple file-based mechanism to communicate between the processes. 
The "EventGeneratorLoop" process creates a specific "signal" file on a shared mounted volume, while the main test process waits for the file to exist before running the test cases. >>>>>>>> >>>>>>>> Passes on Linux-x64 Docker-enabled host. Testing in the test cluster is in progress. >>>>>>>> >>>>>>>> >>>>>>>> Thank you, >>>>>>>> >>>>>>>> Misha >>>>>>>> >>>>>>>> On 8/7/19 5:11 PM, David Holmes wrote: >>>>>>>>> On 8/08/2019 9:04 am, Mikhailo Seledtsov wrote: >>>>>>>>>> Hi Severin, Bob, >>>>>>>>>> >>>>>>>>>> Thank you for reviewing the code. >>>>>>>>>> >>>>>>>>>> On 8/7/19, 11:38 AM, Bob Vandette wrote: >>>>>>>>>>> Can't you come up with a better way of synchronizing the test by possibly writing a >>>>>>>>>>> file and waiting for it to exist with a timeout? >>>>>>>>>> I will try out this approach. >>>>>>>>> This seems like a fundamental problem with jcmd - so cc'ing serviceability-dev. >>>>>>>>> >>>>>>>>> But I'm pretty sure they recently addressed a similar issue with the premature sending of the attach signal? >>>>>>>>> >>>>>>>>> David >>>>>>>>> ----- >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Misha >>>>>>>>>>> Isn't there a shared volume between the two >>>>>>>>>>> processes? >>>>>>>>>>> >>>>>>>>>>> We've been fighting test reliability for a while now. I can only hope we're getting >>>>>>>>>>> to the end. >>>>>>>>>>> >>>>>>>>>>> Bob. >>>>>>>>>>> >>>>>>>>>>>> On Aug 7, 2019, at 2:18 PM, Severin Gehwolf wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hi Misha, >>>>>>>>>>>> >>>>>>>>>>>> On Tue, 2019-08-06 at 20:17 -0700, mikhailo.seledtsov at oracle.com wrote: >>>>>>>>>>>>> Please review this change that fixes a container test TestJcmdWithSideCar. >>>>>>>>>>>>> >>>>>>>>>>>>> My investigation indicated that a root cause for this failure is: >>>>>>>>>>>>> JCMD -l shows 'Unknown' for class name because the main class has not >>>>>>>>>>>>> been loaded yet. >>>>>>>>>>>>> The target test JVM has started, it is initializing, but has not loaded >>>>>>>>>>>>> the main test class.
>>>>>>>>>>>> That's what I've found too. >>>>>>>>>>>> >>>>>>>>>>>>> The proposed solution is to try 'jcmd -l' several times, with a short >>>>>>>>>>>>> sleep in between. >>>>>>>>>>>> Thread.sleep() isn't great, but I'm not sure there is an alternative. >>>>>>>>>>>> >>>>>>>>>>>>> Also I have commented out the testCase02() due to another bug: >>>>>>>>>>>>> "JDK-8228850: jhsdb jinfo fails with ClassCastException: >>>>>>>>>>>>> s.j.h.oops.TypeArray cannot be cast to s.j.h.oops.Instance", >>>>>>>>>>>>> which is not a test bug. IMO, it is better to run the test and skip a >>>>>>>>>>>>> sub-case than to skip the entire test. >>>>>>>>>>>>> >>>>>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8228960 >>>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.00/ >>>>>>>>>>>> Looks OK to me. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Severin >>>>>>>>>>>> From daniil.x.titov at oracle.com Fri Aug 23 22:52:53 2019 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Fri, 23 Aug 2019 15:52:53 -0700 Subject: RFR: 8182119: jdk.hotspot.agent's META-INF/services/com.sun.jdi.connect.Connector no longer needed Message-ID: Please review a trivial fix that removes the service configuration file. It is no longer needed since these SA JDI providers were removed in JDK 9 [3]. Tier1-tier3 tests successfully passed. [1] Bug: https://bugs.openjdk.java.net/browse/JDK-8182119 [2] Webrev: https://cr.openjdk.java.net/~dtitov/8182119/webrev.01/ [3] https://bugs.openjdk.java.net/browse/JDK-8158050 Thanks! 
-Daniil From alexey.menkov at oracle.com Fri Aug 23 23:54:35 2019 From: alexey.menkov at oracle.com (Alex Menkov) Date: Fri, 23 Aug 2019 16:54:35 -0700 Subject: RFR: JDK-8228554: Accessibility errors in jdwp-protocol.html Message-ID: <0be12a6e-74b8-4191-c812-b3f8b195b41f@oracle.com> Hi all, Please review the fix for https://bugs.openjdk.java.net/browse/JDK-8228554 webrev: http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/webrev/ generated docs: old: http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/0/jdwp-protocol.html new: http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/1/jdwp-protocol.html specdiff: http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/spectdiff/diff.html Summary: - "content outside of a region" issues: -
replaced with
, -
replaced with
; - table issues: - added column headers to all tables; - for every row specified row header; - indentation with table "colspan" reimplemented by using CSS. --alex From alexey.menkov at oracle.com Fri Aug 23 23:56:10 2019 From: alexey.menkov at oracle.com (Alex Menkov) Date: Fri, 23 Aug 2019 16:56:10 -0700 Subject: RFR: 8182119: jdk.hotspot.agent's META-INF/services/com.sun.jdi.connect.Connector no longer needed In-Reply-To: References: Message-ID: <06ce6c11-6fcb-993e-013e-ec257168e2f4@oracle.com> Looks good. --alex On 08/23/2019 15:52, Daniil Titov wrote: > Please review a trivial fix that removes the service configuration file. It is no longer needed > since these SA JDI providers were removed in JDK 9 [3]. > > Tier1-tier3 tests successfully passed. > > [1] Bug: https://bugs.openjdk.java.net/browse/JDK-8182119 > [2] Webrev: https://cr.openjdk.java.net/~dtitov/8182119/webrev.01/ > [3] https://bugs.openjdk.java.net/browse/JDK-8158050 > > Thanks! > -Daniil > > From serguei.spitsyn at oracle.com Sat Aug 24 00:29:24 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 23 Aug 2019 17:29:24 -0700 Subject: RFR: 8182119: jdk.hotspot.agent's META-INF/services/com.sun.jdi.connect.Connector no longer needed In-Reply-To: <06ce6c11-6fcb-993e-013e-ec257168e2f4@oracle.com> References: <06ce6c11-6fcb-993e-013e-ec257168e2f4@oracle.com> Message-ID: <56e29f3b-95e4-66bc-e30d-a08dcc46b9c0@oracle.com> Hi Daniil, +1 Thanks, Serguei On 8/23/19 4:56 PM, Alex Menkov wrote: > Looks good. > > --alex > > On 08/23/2019 15:52, Daniil Titov wrote: >> Please review a trivial fix that removes the service configuration >> file. It is no longer needed >> since these SA JDI providers were removed in JDK 9 [3]. >> >> Tier1-tier3 tests successfully passed. >> >> [1] Bug: https://bugs.openjdk.java.net/browse/JDK-8182119 >> [2] Webrev: https://cr.openjdk.java.net/~dtitov/8182119/webrev.01/ >> [3] https://bugs.openjdk.java.net/browse/JDK-8158050 >> >> Thanks! 
>> -Daniil >> >> From Alan.Bateman at oracle.com Sat Aug 24 06:30:00 2019 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Sat, 24 Aug 2019 07:30:00 +0100 Subject: RFR: 8182119: jdk.hotspot.agent's META-INF/services/com.sun.jdi.connect.Connector no longer needed In-Reply-To: References: Message-ID: On 23/08/2019 23:52, Daniil Titov wrote: > Please review a trivial fix that removes the service configuration file. It is no longer needed > since these SA JDI providers were removed in JDK 9 [3]. > > Looks good, surprised it wasn't noticed before. -Alan From mandrikov at gmail.com Sun Aug 25 18:36:12 2019 From: mandrikov at gmail.com (Evgeny Mandrikov) Date: Sun, 25 Aug 2019 20:36:12 +0200 Subject: RFR: JDK-8199136: Dead code in src/jdk.jcmd/share/classes/sun/tools/common/ProcessArgumentMatcher.java Message-ID: Hello! Please review patch [1] for JDK-8199136 [2]. Also it needs a sponsor since I have only author status in OpenJDK Census [3]. After this change tier1 tests pass on my machine. With best regards, Evgeny Mandrikov [1] http://cr.openjdk.java.net/~godin/8199136/webrev.00/ [2] https://bugs.openjdk.java.net/browse/JDK-8199136 [3] https://openjdk.java.net/census#godin -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Mon Aug 26 07:27:18 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 26 Aug 2019 00:27:18 -0700 Subject: RFR: 8229378: jdwp library loader in linker_md.c quietly truncates on buffer overflow In-Reply-To: References: <9bab421c-08a1-b554-6ac9-35290856ee56@oracle.com> <4ea44805-4810-c6a4-b47c-b2ab7bba9d49@oracle.com> Message-ID: An HTML attachment was scrubbed... 
URL: From serguei.spitsyn at oracle.com Mon Aug 26 07:30:42 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 26 Aug 2019 00:30:42 -0700 Subject: RFR: 8229378: jdwp library loader in linker_md.c quietly truncates on buffer overflow In-Reply-To: References: <9bab421c-08a1-b554-6ac9-35290856ee56@oracle.com> <4ea44805-4810-c6a4-b47c-b2ab7bba9d49@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From david.holmes at oracle.com Mon Aug 26 07:57:46 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 26 Aug 2019 17:57:46 +1000 Subject: RFR(S): 8228960: [TESTBUG] containers/docker/TestJcmdWithSideCar.java: jcmd reports main class as 'Unknown' In-Reply-To: References: <9c9ec6b1-139a-4e67-8460-b586456fd493@oracle.com> <854c1dfe97937389230599e72c22aa0cbb0bf600.camel@redhat.com> <5D4B58E9.2070002@oracle.com> <9418a5e6-750d-80bf-4060-d023812f37e7@oracle.com> <6cf3a353-e5c5-28e9-113b-98ded3c1f9e7@oracle.com> <3143B636-9729-4238-9149-3A562B288643@oracle.com> Message-ID: Hi Misha, On 24/08/2019 3:21 am, mikhailo.seledtsov at oracle.com wrote: > Finally got some time to work on this issue. > Since I have encountered problem using files for passing messages > between a container and a test driver (due to permissions), I looked for > alternative solutions. I am using the output of a container process to > signal when the main method has started, and it works. This simplifies > things quite a bit as well. > > Normally, we use OutputAnalyzer test utility to collect the whole output > once the process has completed, and then analyze the resulting output > for "contains some string", match, etc. However, testutils/ProcessTools > provides an API to consume the output as it is produced. I am using this > API to detect when the main() method of the container has started. That seems reasonable. Do we want to make the following change to minimise unneeded output processing: private Consumer outputConsumer = s -> { ! 
if (!mainMethodStarted && s.contains(EventGeneratorLoop.MAIN_METHOD_STARTED)) { System.out.println("MainContainer: setting mainMethodStarted"); mainMethodStarted = true; } }; > Updated webrev: > ??? http://cr.openjdk.java.net/~mseledtsov/8228960.02/ Otherwise looks okay. Hopefully those other test cases will be enabled in the not too distant future. Thanks, David ----- > > Testing: > > ? Ran the test on Linux-x64, various multiple nodes in a test cluster > 50 times - All PASS > > > Thank you, > > Misha > > On 8/13/19 2:05 PM, Bob Vandette wrote: >> >>> On Aug 13, 2019, at 3:28 PM, mikhailo.seledtsov at oracle.com wrote: >>> >>> >>> On 8/13/19 12:06 PM, Bob Vandette wrote: >>>>> On Aug 13, 2019, at 2:57 PM, mikhailo.seledtsov at oracle.com wrote: >>>>> >>>>> Hi Bob, >>>>> >>>>> ?? The workdir (JTwork/scratch) is created with the "test user" >>>>> permissions. Let me try to place the "signal" file in /tmp instead, >>>>> since /tmp should normally have a 777 permission on Linux. >>>> Aren?t you creating a file inside a docker container and then >>>> checking for its existence outside of the container? >>> Correct >>>> Isn?t the root user running inside the container? >>> By default it is. But it still fails to create a file, for some >>> reason. Can be related to selinux settings (for instance, see this >>> article: >>> https://stackoverflow.com/questions/24288616/permission-denied-on-accessing-host-directory-in-docker/31334443), >>> I can not change those. >> Is your JTWork/scratch on an NFS mounted file system?? If this is the >> case then the problem is that root is equivalent to nobody on >> mounted file systems and can?t create files unless the directory has >> 777 permissions.? I just confirmed this.? You?d have to either run >> the container test as test-user or change the scratch directory >> permission. >> >> Bob. 
>> >>> My hope is that /tmp is configured to be accessed by a container >>> engine as a general purpose directory, hence I was thinking to try it >>> out. >>> >>>> Both processes don?t see the same /tmp right??? So that shouldn?t help. >>> In my next experiment, I will map a /tmp from host to be a /host-tmp >>> inside the container (--volume /tmp:/host-tmp), then write a signal >>> file to /host-tmp. >>>> If scratch has 777 permissions, anyone can create a file. >>> scratch has? "rwxr-xr-x" >>>> You have to be careful that you can clean up the >>>> file from outside the container.? I?d make sure to create it with 777. >>> I do use deleteOnExit(), so it should work (unless the JVM crashes). >>> I guess I could add extra layer of safety here, and set the >>> permissions to 777. Thank you for advice. >>> >>> >>> Thank you, >>> >>> Misha >>> >>>> Bob. >>>> >>>>> If this works, I will have to add some unique number to the file >>>>> name, perhaps a PID of a child process. >>>>> >>>>> I will try this, and let you know how it works. >>>>> >>>>> >>>>> Thank you, >>>>> >>>>> Misha >>>>> >>>>> On 8/13/19 6:34 AM, Bob Vandette wrote: >>>>>> Sorry, I just looked at the webrev and you are trying the approach >>>>>> I suggested.? I thought you >>>>>> were trying to use file change notification. >>>>>> >>>>>> Where does the workdir get created?? Does it have 777 permissions? >>>>>> >>>>>> Bob. >>>>>> >>>>>> >>>>>>> On Aug 13, 2019, at 9:29 AM, Bob Vandette >>>>>>> wrote: >>>>>>> >>>>>>> What if you just poll for the creation of the file waiting some >>>>>>> small amount of time between polling with a maximum timeout. >>>>>>> >>>>>>> Bob. >>>>>>> >>>>>>> >>>>>>>> On Aug 12, 2019, at 8:22 PM, mikhailo.seledtsov at oracle.com wrote: >>>>>>>> >>>>>>>> Unfortunately, this approach does not seem to work on many of >>>>>>>> our test cluster machines. The creation of a "signal" file >>>>>>>> results in "PermissionDenied". 
>>>>>>>> >>>>>>>> The possible reason is the selinux configuration, or some other >>>>>>>> permission related stuff. The container tries to create a new >>>>>>>> file on a mounted volume on a host system, but host system >>>>>>>> denies it. I will look a bit deeper into this, but I think this >>>>>>>> type of issue can be encountered on any automated test system. >>>>>>>> Hence, we may have to abandon this approach. >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Misha >>>>>>>> >>>>>>>> >>>>>>>> On 8/12/19 3:59 PM, mikhailo.seledtsov at oracle.com wrote: >>>>>>>>> Here is an updated webrev: >>>>>>>>> http://cr.openjdk.java.net/~mseledtsov/8228960.01/ >>>>>>>>> >>>>>>>>> I am using a simple file-based mechanism to communicate between >>>>>>>>> the processes. The "EventGeneratorLoop" process creates a >>>>>>>>> specific "signal" file on a shared mounted volume, while the >>>>>>>>> main test process waits? for the file to exist before running >>>>>>>>> the test cases. >>>>>>>>> >>>>>>>>> Passes on Linux-x64 Docker-enabled host. Testing in the test >>>>>>>>> cluster is in progress. >>>>>>>>> >>>>>>>>> >>>>>>>>> Thank you, >>>>>>>>> >>>>>>>>> Misha >>>>>>>>> >>>>>>>>> On 8/7/19 5:11 PM, David Holmes wrote: >>>>>>>>>> On 8/08/2019 9:04 am, Mikhailo Seledtsov wrote: >>>>>>>>>>> Hi Severin, Bob, >>>>>>>>>>> >>>>>>>>>>> ?? Thank you for reviewing the code. >>>>>>>>>>> >>>>>>>>>>> On 8/7/19, 11:38 AM, Bob Vandette wrote: >>>>>>>>>>>> Can?t you come up with a better way of synchronizing the >>>>>>>>>>>> test by possibly writing a >>>>>>>>>>>> file and waiting for it to exist with a timeout? >>>>>>>>>>> I will try out this approach. >>>>>>>>>> This seems like a fundamental problem with jcmd - so cc'ing >>>>>>>>>> serviceability-dev. >>>>>>>>>> >>>>>>>>>> But I'm pretty sure they recently addressed a similar issue >>>>>>>>>> with the premature sending of the attach signal? 
>>>>>>>>>> >>>>>>>>>> David >>>>>>>>>> ----- >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Misha >>>>>>>>>>>> Isn?t there a shared volume between the two >>>>>>>>>>>> processes? >>>>>>>>>>>> >>>>>>>>>>>> We?ve been fighting test reliability for a while now.? I can >>>>>>>>>>>> only hope we?re getting >>>>>>>>>>>> to the end. >>>>>>>>>>>> >>>>>>>>>>>> Bob. >>>>>>>>>>>> >>>>>>>>>>>>> On Aug 7, 2019, at 2:18 PM, Severin >>>>>>>>>>>>> Gehwolf? wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Hi Misha, >>>>>>>>>>>>> >>>>>>>>>>>>> On Tue, 2019-08-06 at 20:17 -0700, >>>>>>>>>>>>> mikhailo.seledtsov at oracle.com wrote: >>>>>>>>>>>>>> Please review this change that fixes a container test >>>>>>>>>>>>>> TestJcmdWithSideCar. >>>>>>>>>>>>>> >>>>>>>>>>>>>> My investigation indicated that a root cause for this >>>>>>>>>>>>>> failure is: >>>>>>>>>>>>>> JCMD -l shows 'Unknown' for class name because the main >>>>>>>>>>>>>> class has not >>>>>>>>>>>>>> been loaded yet. >>>>>>>>>>>>>> The target test JVM has started, it is initializing, but >>>>>>>>>>>>>> has not loaded >>>>>>>>>>>>>> the main test class. >>>>>>>>>>>>> That's what I've found too. >>>>>>>>>>>>> >>>>>>>>>>>>>> The proposed solution is to try 'jcmd -l' several times, >>>>>>>>>>>>>> with a short >>>>>>>>>>>>>> sleep in between. >>>>>>>>>>>>> Thread.sleep() isn't great, but I'm not sure there is an >>>>>>>>>>>>> alternative. >>>>>>>>>>>>> >>>>>>>>>>>>>> Also I have commented out the testCase02() due to another >>>>>>>>>>>>>> bug: >>>>>>>>>>>>>> "JDK-8228850: jhsdb jinfo fails with ClassCastException: >>>>>>>>>>>>>> s.j.h.oops.TypeArray cannot be cast to s.j.h.oops.Instance", >>>>>>>>>>>>>> which is not a test bug. IMO, it is better to run the test >>>>>>>>>>>>>> and skip a >>>>>>>>>>>>>> sub-case than to skip the entire test. >>>>>>>>>>>>>> >>>>>>>>>>>>>> ???? JBS: https://bugs.openjdk.java.net/browse/JDK-8228960 >>>>>>>>>>>>>> ???? Webrev: >>>>>>>>>>>>>> http://cr.openjdk.java.net/~mseledtsov/8228960.00/ >>>>>>>>>>>>> Looks OK to me. 
>>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Severin >>>>>>>>>>>>> From serguei.spitsyn at oracle.com Mon Aug 26 07:58:13 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 26 Aug 2019 00:58:13 -0700 Subject: RFR: JDK-8228554: Accessibility errors in jdwp-protocol.html In-Reply-To: <0be12a6e-74b8-4191-c812-b3f8b195b41f@oracle.com> References: <0be12a6e-74b8-4191-c812-b3f8b195b41f@oracle.com> Message-ID: <12824420-15de-2131-9211-60fc6cabc8fa@oracle.com> Hi Alex, I see one issue with new table format. For instance look at the table for "DisposeObjects Command (14)". Even a better example is "RedefineClasses Command (18)". In the old tables the indentation was highlighted with the vertical lines. It is missed in your version. Thanks, Serguei On 8/23/19 16:54, Alex Menkov wrote: > Hi all, > > Please review the fix for > https://bugs.openjdk.java.net/browse/JDK-8228554 > > webrev: > http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/webrev/ > > generated docs: > old: > http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/0/jdwp-protocol.html > new: > http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/1/jdwp-protocol.html > > specdiff: > http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/spectdiff/diff.html > > > Summary: > - "content outside of a region" issues: > ? -
replaced with
, >   -
replaced with
; > - table issues: > ? - added column headers to all tables; > ? - for every row specified row header; > ? - indentation with table "colspan" reimplemented by using CSS. > > --alex From serguei.spitsyn at oracle.com Mon Aug 26 08:25:14 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 26 Aug 2019 01:25:14 -0700 Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor In-Reply-To: References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> Message-ID: <9946e675-6e1e-3644-e0e9-e96c54313e3f@oracle.com> Hi David, On 8/20/19 22:21, David Holmes wrote: > Hi Serguei, > > On 21/08/2019 9:58 am, serguei.spitsyn at oracle.com wrote: >> Hi David, >> >> The whole approach looks good to me. > > Thanks for taking a look. My main concern is about the interrupt > semantics, so I really need to get some end-user feedback on that > aspect as well. I don't have any opinion yet on what interrupt semantics tool developers really need. Yes, we may need to request some feedback. My gut feeling tells me it is not good to break the original semantics. :) But let me think about it a little bit more. Also, we need to file a CSR for this. >> + if (jSelf != NULL) { >> + if (interruptible && Thread::is_interrupted(jSelf, true)) { >> + // We're now interrupted but we may have consumed a notification. >> + // To avoid lost wakeups we have to re-issue that notification, which >> + // may result in a spurious wakeup for another thread. >> Alternatively we >> + // ignore checking for interruption before returning. >> + notify(); >> + return false; // interrupted >> + } >> >> I'm a bit concerned about introduction of new spurious wake ups above. >> Some tests can be not defensive against it, so we may discover new >> intermittent failures. > > That is possible. Though given spurious wakeups are already possible > any test that is incorrectly using RawMonitorWait() without checking a > condition, is technically already broken. Agreed. 
I think it is even better if spurious wakeups happen more frequently.
It should help to identify and fix such spots in the test base.

>
> Not checking for interruption after the wait will also require some
> test changes, and it weakens the interrupt semantics even further.

I'm thinking about a small investigation into how this is used in our tests.

Thanks,
Serguei

>
> Thanks,
> David
> -----
>
>> Thanks,
>> Serguei
>>
>> On 8/14/19 11:22 PM, David Holmes wrote:
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229160
>>>
>>> Preliminary webrev (still has rough edges):
>>> http://cr.openjdk.java.net/~dholmes/8229160/webrev.prelim/
>>>
>>> Background:
>>>
>>> We've had this comment for a long time:
>>>
>>>  // The raw monitor subsystem is entirely distinct from normal
>>>  // java-synchronization or jni-synchronization.  raw monitors are not
>>>  // associated with objects.  They can be implemented in any manner
>>>  // that makes sense.  The original implementors decided to piggy-back
>>>  // the raw-monitor implementation on the existing Java objectMonitor mechanism.
>>>  // This flaw needs to fixed.  We should reimplement raw monitors as sui-generis.
>>>  // Specifically, we should not implement raw monitors via java monitors.
>>>  // Time permitting, we should disentangle and deconvolve the two implementations
>>>  // and move the resulting raw monitor implementation over to the JVMTI directories.
>>>  // Ideally, the raw monitor implementation would be built on top of
>>>  // park-unpark and nothing else.
>>>
>>> This is an attempt to do that disentangling so that we can then
>>> consider changes to ObjectMonitor without having to worry about
>>> JvmtiRawMonitors. But rather than building on low-level park/unpark
>>> (which would require the same manual queue management and much of
>>> the same complex code as exists in ObjectMonitor) I decided to try
>>> and do this on top of PlatformMonitor.
>>> >>> The reason this is just a RFC rather than RFR is that I overlooked a >>> non-trivial aspect of JvmtiRawMonitors: like Java monitors (as >>> implemented by ObjectMonitor) they interact with the >>> Thread.interrupt mechanism. This is not clearly stated in the JVM TI >>> specification [1] but only in passing by the possible errors for >>> RawMonitorWait: >>> >>> JVMTI_ERROR_INTERRUPT??? Wait was interrupted, try again >>> >>> As I explain in the bug report there is no way to build in proper >>> interrupt support using PlatformMonitor as there is no way we can >>> "interrupt" the low-level pthread_cond_wait. But we can approximate >>> it. What I've done in this preliminary version is just check >>> interrupt state before and after the actual "wait" but we won't get >>> woken by the interrupt once we have actually blocked. Alternatively >>> we could use a periodic polling approach and wakeup every Nms to >>> check for interruption. >>> >>> The only use of JvmtiRawMonitors in the JDK libraries (JDWP) is not >>> affected by this choice as that code ignores the interrupt until the >>> real action it was waiting for has occurred. The interrupt is then >>> reposted later. >>> >>> But more generally there could be users of JvmtiRawMonitors that >>> expect/require that RawMonitorWait is responsive to Thread.interrupt >>> in a manner similar to Object.wait. And if any of them are reading >>> this then I'd like to know - hence this RFC :) >>> >>> FYI testing to date: >>> ?- tiers 1 -3 all platforms >>> ?- hotspot: serviceability/jvmti >>> ????????????????????????? /jdwp >>> ??????????? vmTestbase/nsk/jvmti >>> ????????????????????????? /jdwp >>> ?- JDK: com/sun/jdi >>> >>> Comments/opinions appreciated. >>> >>> Thanks, >>> David >>> >>> [1] >>> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait >> From daniel.daugherty at oracle.com Mon Aug 26 13:00:06 2019 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Mon, 26 Aug 2019 09:00:06 -0400 Subject: RFR: 8229378: jdwp library loader in linker_md.c quietly truncates on buffer overflow In-Reply-To: References: <9bab421c-08a1-b554-6ac9-35290856ee56@oracle.com> <4ea44805-4810-c6a4-b47c-b2ab7bba9d49@oracle.com> Message-ID: <49b6764b-274c-ec86-3b2f-2449193201f5@oracle.com> I haven't reviewed the change. I just noticed that we needed info on how the fix was tested. Dan On 8/26/19 3:27 AM, serguei.spitsyn at oracle.com wrote: > Hi Adam, > > Thank you for the explanation below! > Then I'm Okay with the fix as it is. > > Dan, > > Do you have any suggestions or objections? > If not, then do I need to add your name to the list of reviewers? > > Thanks, > Serguei > > > On 8/15/19 04:38, Adam Farley8 wrote: >> Hi Serguei, Daniel, >> >> Good to hear you like the fix. >> >> My intention with the testing was to make sure my change didn't break >> anything else. I didn't do a code paths check before I ran it though; >> saturation run. >> >> As for writing a new test, I'm finding it tricky. >> >> Here's the current flow: >> >> Step 1) VM initialises. >> Step 2) VM loads a couple of libraries and shuts down if one or more >> paths is too long in sun.boot.library.path. >> Step 3) JDWP initializes >> Step 4) JDWP loads a library and shuts down if one or more paths is >> too long in sun.boot.library.path. >> >> As you can see, Step 2 prevents us from reaching Step 4 with a >> too-long-path (required to cause failure). >> >> I worked around that with my webrev by disabling the bit in os.cpp >> that enacts Step 2. >> >> Since my hack will be removed in the final webrev, we need another >> way to reach step 4. >> >> So what we need to test this change, I believe, is a way to insert >> Step 2.5) Change the property to include a too-long path. >> >> This allows the VM to start up properly, but gives us the excessive >> path we need to test the jdwp fix. >> >> Right now, I'm not seeing a way to do this outside of using the JNI. 
>> >> 1) shell script launches cpp file. >> 2) cpp starts vm without jdwp. >> 3) change the property. >> 4) call jdwp library-loading method directly. >> 5) check the return code. >> >> This seems messy, but I'm not seeing a way to initialise jdwp from >> inside java code (which sounds better to me). >> >> I welcome anyone who can think of a better way to do this. >> >> Best Regards >> >> Adam Farley >> IBM Runtimes >> >> >> "serguei.spitsyn at oracle.com" wrote on >> 15/08/2019 09:25:36: >> >> > From: "serguei.spitsyn at oracle.com" >> > To: Adam Farley8 >> > Cc: Chris Plummer , >> > daniel.daugherty at oracle.com, serviceability-dev at openjdk.java.net >> > Date: 15/08/2019 09:25 >> > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c >> > quietly truncates on buffer overflow >> > >> > Hi Adam, >> > >> > The fix itself looks Okay to me. >> > I'm not sure there is any test case in these test suites which >> > provide a coverage for it. >> > It looks like you need to develop a unit jtreg unit test for this. >> > >> > Thanks, >> > Serguei >> > >> > >> > On 8/13/19 09:28, Adam Farley8 wrote: >> > Hi Serguei, Daniel, >> > >> > My testing was limited to the bug specific test case I mentioned, >> > and the following jdwp tests: >> > >> > test/jdk/com/sun/jdi/Jdwp* >> > test/hotspot/jtreg/serviceability/jdwp >> > >> > Best Regards >> > >> > Adam Farley >> > IBM Runtimes >> > >> > >> > "serguei.spitsyn at oracle.com" wrote on >> > 13/08/2019 17:04:43: >> > >> > > From: "serguei.spitsyn at oracle.com" >> > > To: daniel.daugherty at oracle.com, Adam Farley8 >> > > , Chris Plummer >> > > Cc: serviceability-dev at openjdk.java.net >> > > Date: 13/08/2019 17:08 >> > > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c >> > > quietly truncates on buffer overflow >> > > >> > > Hi Adam, >> > > >> > > I'm looking at your fix. >> > > Also interested about your testing. >> > > >> > > Thanks, >> > > Serguei >> > > >> > > On 8/13/19 08:48, Daniel D. 
Daugherty wrote: >> > > I don't see any information about how this change was tested... >> > > Is there something on another email thread? >> > > >> > > Dan >> > > >> > >> > > On 8/13/19 11:41 AM, Adam Farley8 wrote: >> > > Hi Chris, >> > > >> > > Thanks! >> > > >> > > I understand we need a second reviewer/sponsor to get this change >> > > in. Any volunteers? >> > > >> > > Best Regards >> > > >> > > Adam Farley >> > > IBM Runtimes >> > > >> > > >> > > Chris Plummer wrote on 12/08/2019 >> 21:35:06: >> > > >> > > > From: Chris Plummer >> > > > To: Adam Farley8 , serviceability- >> > > dev at openjdk.java.net >> > > > Date: 12/08/2019 21:35 >> > > > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c >> > > > quietly truncates on buffer overflow >> > > > >> > > > Hi Adam, >> > > > >> > > > It looks good to me. >> > > > >> > > > thanks, >> > > > >> > > > Chris >> > > > >> > > > On 8/12/19 7:34 AM, Adam Farley8 wrote: >> > > > Hi All, >> > > > >> > > > This is a known bug, mentioned in a code comment. >> > > > >> > > > Here is the fix for that bug. >> > > > >> > > > Reviewers and sponsors requested. >> > > > >> > > > Short version: if you set sun.boot.library.path to >> > > > something beyond a system's max path length, the >> > > > current code will return an empty string (rather than >> > > > printing a useful error message and shutting down). >> > > > >> > > > This is also a problem if you've specified multiple >> > > > paths with a separator, as this code seems to wrongly >> > > > assess whether the *total* length exceeds max path >> > > > length. So two 200 char paths on windows will cause >> > > > failure, as the total length is 400 (which is beyond >> > > > max length for windows). >> > > > >> > > > Note that the os.cpp bit of the webrev will not be included >> > > > in the final webrev, it just makes this change trivially >> > > > testable. 
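The distinction described in the quoted message above (checking each path entry against the limit rather than the total length of the whole list, and failing loudly rather than returning an empty string) can be sketched roughly as follows. This is only an illustration in Java, not the linker_md.c fix from the webrev; the class name, method name, parameters and the limit of 260 used in the usage note are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the JDK: builds candidate library file
// names from a separator-delimited path list.
final class LibraryPathCheck {

    // Returns one candidate file name per directory in pathList.
    // The length limit is applied to each candidate on its own, so a long
    // list of short directories is accepted. An over-long entry raises an
    // error with a message instead of being silently dropped or truncated.
    static List<String> candidates(String pathList, String separator,
                                   String libFileName, int maxPathLen) {
        List<String> result = new ArrayList<>();
        for (String dir : pathList.split(Pattern.quote(separator))) {
            if (dir.isEmpty()) {
                continue; // ignore empty entries such as "a;;b"
            }
            String candidate = dir + "/" + libFileName;
            if (candidate.length() > maxPathLen) {
                throw new IllegalArgumentException(
                    "library path entry too long (" + candidate.length()
                    + " > " + maxPathLen + "): " + dir);
            }
            result.add(candidate);
        }
        return result;
    }
}
```

With a limit of 260, two 200-character directories joined by ";" are both accepted, because each candidate is checked separately, while a single 300-character directory is rejected with an explicit error.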
>> > > > >> > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8229378 >> > > > Webrev: http://cr.openjdk.java.net/~afarley/8229378/webrev/ >> > > > >> > > > >> > > > Best Regards >> > > > >> > > > Adam Farley >> > > > IBM Runtimes >> > > > >> > > > Unless stated otherwise above: >> > > > IBM United Kingdom Limited - Registered in England and Wales with >> > > > number 741598. >> > > > Registered office: PO Box 41, North Harbour, Portsmouth, >> Hampshire PO6 3AU >> > > Unless stated otherwise above: >> > > IBM United Kingdom Limited - Registered in England and Wales with >> > > number 741598. >> > > Registered office: PO Box 41, North Harbour, Portsmouth, >> Hampshire PO6 3AU >> > Unless stated otherwise above: >> > IBM United Kingdom Limited - Registered in England and Wales with >> > number 741598. >> > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire >> PO6 3AU >> Unless stated otherwise above: >> IBM United Kingdom Limited - Registered in England and Wales with >> number 741598. >> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire >> PO6 3AU > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sgehwolf at redhat.com Mon Aug 26 14:20:10 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Mon, 26 Aug 2019 16:20:10 +0200 Subject: [8u] [RFR] 8140482: Various minor code improvements (runtime) In-Reply-To: <5784a2dbe5944f07b1feb03b7b1f87e7@sap.com> References: <5784a2dbe5944f07b1feb03b7b1f87e7@sap.com> Message-ID: Hi, On Thu, 2018-11-22 at 08:51 +0000, Lindenmaier, Goetz wrote: > Hi, > > Doesn't this have to be posted to jdk8u-dev? > > I had a look at the backport. > Including 7127191 confused me a bit. Is it good to hide the fact that > this was backported in the repository? > In os_linux one fix is missing, is this on purpose? I don't think this is a > critical issue, though, so leaving it out is fine. > > > the dropping of the changes to ... 
> > src/share/vm/runtime/task.cpp and
> > src/os/windows/vm/attachListener_windows.cpp
> These changes are included in the webrev ...?
>
> The webrev looks good to me.
>
> Best regards,
> Goetz.
>
>
> > -----Original Message-----
> > From: hotspot-dev On Behalf Of Andrew Hughes
> > Sent: Wednesday, 21 November 2018 07:45
> > To: serviceability-dev; hotspot-dev
> > Subject: [8u] [RFR] 8140482: Various minor code improvements (runtime)
> >
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8140482
> > Original changeset:
> > https://hg.openjdk.java.net/jdk-updates/jdk9u/hotspot/rev/cd86b5699825
> > Webrev:
> > https://cr.openjdk.java.net/~andrew/openjdk8/8140482/webrev.01/
> >
> > The patch largely applies as is, with some adjustment for context and
> > the dropping of the changes to src/cpu/x86/vm/stubRoutines_x86.cpp,
> > src/share/vm/runtime/task.cpp and
> > src/os/windows/vm/attachListener_windows.cpp
> > which don't exist in 8u. A clean backport of 7127191 is included, which
> > allows the changes to agent/src/os/linux/libproc_impl.c to apply as-is.
> >
> > Applying the change to 8u improves the code quality there and aids
> > in backporting other changes, such as 8210836 [0].
> >
> > Ok for 8u?
> >
> > [0] https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-November/025991.html
> >
> > Thanks,

Reviving this old thread.

Andrew, could you please rebase this patch to the latest 8u? AFAIK, 7127191
has since been included in 8u, and the review would be easier if the
webrev didn't show it.

I need this backport to go in so that I can proceed with JDK-8210836.
Thanks,
Severin

From alexey.menkov at oracle.com Mon Aug 26 17:01:30 2019
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Mon, 26 Aug 2019 10:01:30 -0700
Subject: RFR: JDK-8228554: Accessibility errors in jdwp-protocol.html
In-Reply-To: <12824420-15de-2131-9211-60fc6cabc8fa@oracle.com>
References: <0be12a6e-74b8-4191-c812-b3f8b195b41f@oracle.com>
 <12824420-15de-2131-9211-60fc6cabc8fa@oracle.com>
Message-ID: <087be63e-8751-3d7c-4614-ecd903f252f5@oracle.com>

Hi Serguei,

The change is intentional - it seems to me that there were too many
borders in the struct description tables.
I thought about removing some of them (or making them thinner or changing
their color to gray).
I don't think the absence of the lines makes the structures harder to
comprehend.

--alex

On 08/26/2019 00:58, serguei.spitsyn at oracle.com wrote:
> Hi Alex,
>
> I see one issue with new table format.
> For instance look at the table for "DisposeObjects Command (14)".
> Even a better example is "RedefineClasses Command (18)".
> In the old tables the indentation was highlighted with the vertical lines.
> It is missed in your version.
>
> Thanks,
> Serguei
>
>
> On 8/23/19 16:54, Alex Menkov wrote:
>> Hi all,
>>
>> Please review the fix for
>> https://bugs.openjdk.java.net/browse/JDK-8228554
>>
>> webrev:
>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/webrev/
>>
>>
>> generated docs:
>> old:
>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/0/jdwp-protocol.html
>>
>> new:
>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/1/jdwp-protocol.html
>>
>>
>> specdiff:
>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/spectdiff/diff.html
>>
>>
>> Summary:
>> - "content outside of a region" issues:
>>   -
replaced with
, >>   -
replaced with
; >> - table issues: >> ? - added column headers to all tables; >> ? - for every row specified row header; >> ? - indentation with table "colspan" reimplemented by using CSS. >> >> --alex > From mikhailo.seledtsov at oracle.com Mon Aug 26 19:32:36 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Mon, 26 Aug 2019 12:32:36 -0700 Subject: RFR(S): 8228960: [TESTBUG] containers/docker/TestJcmdWithSideCar.java: jcmd reports main class as 'Unknown' In-Reply-To: References: <9c9ec6b1-139a-4e67-8460-b586456fd493@oracle.com> <854c1dfe97937389230599e72c22aa0cbb0bf600.camel@redhat.com> <5D4B58E9.2070002@oracle.com> <9418a5e6-750d-80bf-4060-d023812f37e7@oracle.com> <6cf3a353-e5c5-28e9-113b-98ded3c1f9e7@oracle.com> <3143B636-9729-4238-9149-3A562B288643@oracle.com> Message-ID: <5877bb3c-ede3-ce33-9df9-391b3997a6ac@oracle.com> Hi David, ? Thank you for review. On 8/26/19 12:57 AM, David Holmes wrote: > Hi Misha, > > On 24/08/2019 3:21 am, mikhailo.seledtsov at oracle.com wrote: >> Finally got some time to work on this issue. >> Since I have encountered problem using files for passing messages >> between a container and a test driver (due to permissions), I looked >> for alternative solutions. I am using the output of a container >> process to signal when the main method has started, and it works. >> This simplifies things quite a bit as well. >> >> Normally, we use OutputAnalyzer test utility to collect the whole >> output once the process has completed, and then analyze the resulting >> output for "contains some string", match, etc. However, >> testutils/ProcessTools provides an API to consume the output as it is >> produced. I am using this API to detect when the main() method of the >> container has started. > > That seems reasonable. Do we want to make the following change to > minimise unneeded output processing: > > ???????? private Consumer outputConsumer = s -> { > !??????????? 
if (!mainMethodStarted && > s.contains(EventGeneratorLoop.MAIN_METHOD_STARTED)) { > ???????????????? System.out.println("MainContainer: setting > mainMethodStarted"); > ???????????????? mainMethodStarted = true; > ???????????? } > ???????? }; Thank you for the suggestion. I will update the code accordingly. > >> Updated webrev: >> ???? http://cr.openjdk.java.net/~mseledtsov/8228960.02/ > > Otherwise looks okay. Hopefully those other test cases will be enabled > in the not too distant future. I hope so as well. Thank you, Misha > > Thanks, > David > ----- > >> >> Testing: >> >> ?? Ran the test on Linux-x64, various multiple nodes in a test >> cluster 50 times - All PASS >> >> >> Thank you, >> >> Misha >> >> On 8/13/19 2:05 PM, Bob Vandette wrote: >>> >>>> On Aug 13, 2019, at 3:28 PM, mikhailo.seledtsov at oracle.com wrote: >>>> >>>> >>>> On 8/13/19 12:06 PM, Bob Vandette wrote: >>>>>> On Aug 13, 2019, at 2:57 PM, mikhailo.seledtsov at oracle.com wrote: >>>>>> >>>>>> Hi Bob, >>>>>> >>>>>> ?? The workdir (JTwork/scratch) is created with the "test user" >>>>>> permissions. Let me try to place the "signal" file in /tmp >>>>>> instead, since /tmp should normally have a 777 permission on Linux. >>>>> Aren?t you creating a file inside a docker container and then >>>>> checking for its existence outside of the container? >>>> Correct >>>>> Isn?t the root user running inside the container? >>>> By default it is. But it still fails to create a file, for some >>>> reason. Can be related to selinux settings (for instance, see this >>>> article: >>>> https://stackoverflow.com/questions/24288616/permission-denied-on-accessing-host-directory-in-docker/31334443), >>>> I can not change those. >>> Is your JTWork/scratch on an NFS mounted file system?? If this is >>> the case then the problem is that root is equivalent to nobody on >>> mounted file systems and can?t create files unless the directory has >>> 777 permissions.? I just confirmed this.? 
You?d have to either run >>> the container test as test-user or change the scratch directory >>> permission. >>> >>> Bob. >>> >>>> My hope is that /tmp is configured to be accessed by a container >>>> engine as a general purpose directory, hence I was thinking to try >>>> it out. >>>> >>>>> Both processes don?t see the same /tmp right??? So that shouldn?t >>>>> help. >>>> In my next experiment, I will map a /tmp from host to be a >>>> /host-tmp inside the container (--volume /tmp:/host-tmp), then >>>> write a signal file to /host-tmp. >>>>> If scratch has 777 permissions, anyone can create a file. >>>> scratch has? "rwxr-xr-x" >>>>> You have to be careful that you can clean up the >>>>> file from outside the container.? I?d make sure to create it with >>>>> 777. >>>> I do use deleteOnExit(), so it should work (unless the JVM >>>> crashes). I guess I could add extra layer of safety here, and set >>>> the permissions to 777. Thank you for advice. >>>> >>>> >>>> Thank you, >>>> >>>> Misha >>>> >>>>> Bob. >>>>> >>>>>> If this works, I will have to add some unique number to the file >>>>>> name, perhaps a PID of a child process. >>>>>> >>>>>> I will try this, and let you know how it works. >>>>>> >>>>>> >>>>>> Thank you, >>>>>> >>>>>> Misha >>>>>> >>>>>> On 8/13/19 6:34 AM, Bob Vandette wrote: >>>>>>> Sorry, I just looked at the webrev and you are trying the >>>>>>> approach I suggested.? I thought you >>>>>>> were trying to use file change notification. >>>>>>> >>>>>>> Where does the workdir get created?? Does it have 777 permissions? >>>>>>> >>>>>>> Bob. >>>>>>> >>>>>>> >>>>>>>> On Aug 13, 2019, at 9:29 AM, Bob Vandette >>>>>>>> wrote: >>>>>>>> >>>>>>>> What if you just poll for the creation of the file waiting some >>>>>>>> small amount of time between polling with a maximum timeout. >>>>>>>> >>>>>>>> Bob. 
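The polling idea in the quoted suggestion above (check for the file, wait a short interval, give up after a maximum timeout) could be sketched roughly as below. This is a minimal illustration and not code from any of the webrevs in this thread; the class name, method name and parameters are hypothetical, and it says nothing about the permission problems discussed elsewhere in the thread.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;

// Hypothetical helper, not taken from the test under review.
final class SignalFilePoller {

    // Polls until 'file' exists or 'timeout' has elapsed.
    // Returns true if the file was seen in time, false on timeout.
    static boolean waitForFile(Path file, Duration timeout, Duration interval)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (Files.exists(file)) {
                return true;
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false; // maximum timeout reached
            }
            // Sleep for the polling interval, but never past the deadline.
            long remainingMillis = Math.max(1, remainingNanos / 1_000_000);
            Thread.sleep(Math.min(interval.toMillis(), remainingMillis));
        }
    }
}
```

A caller would pass the path of the "signal" file on the shared volume, for example waitForFile(signalPath, Duration.ofSeconds(30), Duration.ofMillis(100)), where both durations are arbitrary example values.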
>>>>>>>> >>>>>>>> >>>>>>>>> On Aug 12, 2019, at 8:22 PM, mikhailo.seledtsov at oracle.com wrote: >>>>>>>>> >>>>>>>>> Unfortunately, this approach does not seem to work on many of >>>>>>>>> our test cluster machines. The creation of a "signal" file >>>>>>>>> results in "PermissionDenied". >>>>>>>>> >>>>>>>>> The possible reason is the selinux configuration, or some >>>>>>>>> other permission related stuff. The container tries to create >>>>>>>>> a new file on a mounted volume on a host system, but host >>>>>>>>> system denies it. I will look a bit deeper into this, but I >>>>>>>>> think this type of issue can be encountered on any automated >>>>>>>>> test system. Hence, we may have to abandon this approach. >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Misha >>>>>>>>> >>>>>>>>> >>>>>>>>> On 8/12/19 3:59 PM, mikhailo.seledtsov at oracle.com wrote: >>>>>>>>>> Here is an updated webrev: >>>>>>>>>> http://cr.openjdk.java.net/~mseledtsov/8228960.01/ >>>>>>>>>> >>>>>>>>>> I am using a simple file-based mechanism to communicate >>>>>>>>>> between the processes. The "EventGeneratorLoop" process >>>>>>>>>> creates a specific "signal" file on a shared mounted volume, >>>>>>>>>> while the main test process waits? for the file to exist >>>>>>>>>> before running the test cases. >>>>>>>>>> >>>>>>>>>> Passes on Linux-x64 Docker-enabled host. Testing in the test >>>>>>>>>> cluster is in progress. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thank you, >>>>>>>>>> >>>>>>>>>> Misha >>>>>>>>>> >>>>>>>>>> On 8/7/19 5:11 PM, David Holmes wrote: >>>>>>>>>>> On 8/08/2019 9:04 am, Mikhailo Seledtsov wrote: >>>>>>>>>>>> Hi Severin, Bob, >>>>>>>>>>>> >>>>>>>>>>>> ?? Thank you for reviewing the code. >>>>>>>>>>>> >>>>>>>>>>>> On 8/7/19, 11:38 AM, Bob Vandette wrote: >>>>>>>>>>>>> Can?t you come up with a better way of synchronizing the >>>>>>>>>>>>> test by possibly writing a >>>>>>>>>>>>> file and waiting for it to exist with a timeout? >>>>>>>>>>>> I will try out this approach. 
>>>>>>>>>>> This seems like a fundamental problem with jcmd - so cc'ing >>>>>>>>>>> serviceability-dev. >>>>>>>>>>> >>>>>>>>>>> But I'm pretty sure they recently addressed a similar issue >>>>>>>>>>> with the premature sending of the attach signal? >>>>>>>>>>> >>>>>>>>>>> David >>>>>>>>>>> ----- >>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Misha >>>>>>>>>>>>> Isn?t there a shared volume between the two >>>>>>>>>>>>> processes? >>>>>>>>>>>>> >>>>>>>>>>>>> We?ve been fighting test reliability for a while now.? I >>>>>>>>>>>>> can only hope we?re getting >>>>>>>>>>>>> to the end. >>>>>>>>>>>>> >>>>>>>>>>>>> Bob. >>>>>>>>>>>>> >>>>>>>>>>>>>> On Aug 7, 2019, at 2:18 PM, Severin >>>>>>>>>>>>>> Gehwolf wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Misha, >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Tue, 2019-08-06 at 20:17 -0700, >>>>>>>>>>>>>> mikhailo.seledtsov at oracle.com wrote: >>>>>>>>>>>>>>> Please review this change that fixes a container test >>>>>>>>>>>>>>> TestJcmdWithSideCar. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> My investigation indicated that a root cause for this >>>>>>>>>>>>>>> failure is: >>>>>>>>>>>>>>> JCMD -l shows 'Unknown' for class name because the main >>>>>>>>>>>>>>> class has not >>>>>>>>>>>>>>> been loaded yet. >>>>>>>>>>>>>>> The target test JVM has started, it is initializing, but >>>>>>>>>>>>>>> has not loaded >>>>>>>>>>>>>>> the main test class. >>>>>>>>>>>>>> That's what I've found too. >>>>>>>>>>>>>> >>>>>>>>>>>>>>> The proposed solution is to try 'jcmd -l' several times, >>>>>>>>>>>>>>> with a short >>>>>>>>>>>>>>> sleep in between. >>>>>>>>>>>>>> Thread.sleep() isn't great, but I'm not sure there is an >>>>>>>>>>>>>> alternative. >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Also I have commented out the testCase02() due to >>>>>>>>>>>>>>> another bug: >>>>>>>>>>>>>>> "JDK-8228850: jhsdb jinfo fails with ClassCastException: >>>>>>>>>>>>>>> s.j.h.oops.TypeArray cannot be cast to >>>>>>>>>>>>>>> s.j.h.oops.Instance", >>>>>>>>>>>>>>> which is not a test bug. 
IMO, it is better to run the >>>>>>>>>>>>>>> test and skip a >>>>>>>>>>>>>>> sub-case than to skip the entire test. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ???? JBS: https://bugs.openjdk.java.net/browse/JDK-8228960 >>>>>>>>>>>>>>> ???? Webrev: >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~mseledtsov/8228960.00/ >>>>>>>>>>>>>> Looks OK to me. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Severin >>>>>>>>>>>>>> From david.holmes at oracle.com Mon Aug 26 21:54:27 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 27 Aug 2019 07:54:27 +1000 Subject: RFR: JDK-8228554: Accessibility errors in jdwp-protocol.html In-Reply-To: <087be63e-8751-3d7c-4614-ecd903f252f5@oracle.com> References: <0be12a6e-74b8-4191-c812-b3f8b195b41f@oracle.com> <12824420-15de-2131-9211-60fc6cabc8fa@oracle.com> <087be63e-8751-3d7c-4614-ecd903f252f5@oracle.com> Message-ID: On 27/08/2019 3:01 am, Alex Menkov wrote: > Hi Serguei, > > The change is intentional - it seems to me that there were too many > borders in the struct description tables. I thought about removing some > of them (or making them thiner or changing color to gray). > I don't think absence of the lines makes comprehension of the structures > harder. I like the new look - especially now we have proper headers and no more strange looking empty cells! My only suggestion is to make the first column of each table the same width (were possible) so that the tables line up better - specifically the "Error Data" table's "Value" column should be the same width as the "Reply Data" table's "Type" column. Thanks, David > --alex > > On 08/26/2019 00:58, serguei.spitsyn at oracle.com wrote: >> Hi Alex, >> >> I see one issue with new table format. >> For instance look at the table for "DisposeObjects Command (14)". >> Even a better example is "RedefineClasses Command (18)". >> In the old tables the indentation was highlighted with the vertical >> lines. >> It is missed in your version. 
>> >> Thanks, >> Serguei >> >> >> On 8/23/19 16:54, Alex Menkov wrote: >>> Hi all, >>> >>> Please review the fix for >>> https://bugs.openjdk.java.net/browse/JDK-8228554 >>> >>> webrev: >>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/webrev/ >>> >>> >>> generated docs: >>> old: >>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/0/jdwp-protocol.html >>> >>> new: >>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/1/jdwp-protocol.html >>> >>> >>> specdiff: >>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/spectdiff/diff.html >>> >>> >>> Summary: >>> - "content outside of a region" issues: >>> ? -
replaced with
, >>> ? -
replaced with
; >>> - table issues: >>> ? - added column headers to all tables; >>> ? - for every row specified row header; >>> ? - indentation with table "colspan" reimplemented by using CSS. >>> >>> --alex >> From david.holmes at oracle.com Mon Aug 26 22:21:27 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 27 Aug 2019 08:21:27 +1000 Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor In-Reply-To: <9946e675-6e1e-3644-e0e9-e96c54313e3f@oracle.com> References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> <9946e675-6e1e-3644-e0e9-e96c54313e3f@oracle.com> Message-ID: <2d34d72b-2d2b-3907-2b69-86507664c697@oracle.com> On 26/08/2019 6:25 pm, serguei.spitsyn at oracle.com wrote: > Hi David, > > > On 8/20/19 22:21, David Holmes wrote: >> Hi Serguei, >> >> On 21/08/2019 9:58 am, serguei.spitsyn at oracle.com wrote: >>> Hi David, >>> >>> The whole approach looks good to me. >> >> Thanks for taking a look. My main concern is about the interrupt >> semantics, so I really need to get some end-user feedback on that >> aspect as well. > > > I don't have any opinion yet on what interrupt semantics tool developers > really need. > Yes, we may need to request some feedback. I've now explicitly added JC, Yasumasa, Severin and Martin, to this email thread to try and solicit feedback from all the major players that seem interested in this serviceability area. Folks I'd really appreciate any feedback you may have here on the usecases for JvmtiRawMonitors, and in particular the use RawMonitorWait and its interaction with Thread.interrupt. > My gut feeling tells me it is not good to break the original semantics. :) > But let me think about it a little bit more. Me too, but I wanted to start simple. I suspect I will have to at least implement time-based polling of the interrupt state. > Also, we need to file a CSR for this. Depending on how this proceeds, yes. 
>
>>> +   if (jSelf != NULL) {
>>> +     if (interruptible && Thread::is_interrupted(jSelf, true)) {
>>> +       // We're now interrupted but we may have consumed a notification.
>>> +       // To avoid lost wakeups we have to re-issue that notification, which
>>> +       // may result in a spurious wakeup for another thread. Alternatively we
>>> +       // ignore checking for interruption before returning.
>>> +       notify();
>>> +       return false; // interrupted
>>> +     }
>>>
>>> I'm a bit concerned about introduction of new spurious wake ups above.
>>> Some tests can be not defensive against it, so we may discover new
>>> intermittent failures.
>>
>> That is possible. Though given spurious wakeups are already possible
>> any test that is incorrectly using RawMonitorWait() without checking a
>> condition, is technically already broken.
>
> Agreed.
> I even think it is even better if spurious wakeups will happen more
> frequently.
> It should help to identify and fix such spots in the test base.

Yes, it is good for tests. Alas not so good for production code :)

>>
>> Not checking for interruption after the wait will also require some
>> test changes, and it weakens the interrupt semantics even further.
>
> I'm thinking about a small investigation on how this is used in our tests.

There seem to be a few uses that are susceptible to spurious wakeup
errors, but those tests don't use interrupt.

Thanks,
David

> Thanks,
> Serguei
>
>>
>> Thanks,
>> David
>> -----
>>
>>> Thanks,
>>> Serguei
>>>
>>> On 8/14/19 11:22 PM, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229160
>>>>
>>>> Preliminary webrev (still has rough edges):
>>>> http://cr.openjdk.java.net/~dholmes/8229160/webrev.prelim/
>>>>
>>>> Background:
>>>>
>>>> We've had this comment for a long time:
>>>>
>>>>  // The raw monitor subsystem is entirely distinct from normal
>>>>  // java-synchronization or jni-synchronization.  raw monitors are not
>>>>  // associated with objects.
They can be implemented in any manner >>>> ?// that makes sense.? The original implementors decided to piggy-back >>>> ?// the raw-monitor implementation on the existing Java >>>> objectMonitor mechanism. >>>> ?// This flaw needs to fixed.? We should reimplement raw monitors as >>>> sui-generis. >>>> ?// Specifically, we should not implement raw monitors via java >>>> monitors. >>>> ?// Time permitting, we should disentangle and deconvolve the two >>>> implementations >>>> ?// and move the resulting raw monitor implementation over to the >>>> JVMTI directories. >>>> ?// Ideally, the raw monitor implementation would be built on top of >>>> ?// park-unpark and nothing else. >>>> >>>> This is an attempt to do that disentangling so that we can then >>>> consider changes to ObjectMonitor without having to worry about >>>> JvmtiRawMonitors. But rather than building on low-level park/unpark >>>> (which would require the same manual queue management and much of >>>> the same complex code as exists in ObjectMonitor) I decided to try >>>> and do this on top of PlatformMonitor. >>>> >>>> The reason this is just a RFC rather than RFR is that I overlooked a >>>> non-trivial aspect of JvmtiRawMonitors: like Java monitors (as >>>> implemented by ObjectMonitor) they interact with the >>>> Thread.interrupt mechanism. This is not clearly stated in the JVM TI >>>> specification [1] but only in passing by the possible errors for >>>> RawMonitorWait: >>>> >>>> JVMTI_ERROR_INTERRUPT??? Wait was interrupted, try again >>>> >>>> As I explain in the bug report there is no way to build in proper >>>> interrupt support using PlatformMonitor as there is no way we can >>>> "interrupt" the low-level pthread_cond_wait. But we can approximate >>>> it. What I've done in this preliminary version is just check >>>> interrupt state before and after the actual "wait" but we won't get >>>> woken by the interrupt once we have actually blocked. 
Alternatively >>>> we could use a periodic polling approach and wakeup every Nms to >>>> check for interruption. >>>> >>>> The only use of JvmtiRawMonitors in the JDK libraries (JDWP) is not >>>> affected by this choice as that code ignores the interrupt until the >>>> real action it was waiting for has occurred. The interrupt is then >>>> reposted later. >>>> >>>> But more generally there could be users of JvmtiRawMonitors that >>>> expect/require that RawMonitorWait is responsive to Thread.interrupt >>>> in a manner similar to Object.wait. And if any of them are reading >>>> this then I'd like to know - hence this RFC :) >>>> >>>> FYI testing to date: >>>> ?- tiers 1 -3 all platforms >>>> ?- hotspot: serviceability/jvmti >>>> ????????????????????????? /jdwp >>>> ??????????? vmTestbase/nsk/jvmti >>>> ????????????????????????? /jdwp >>>> ?- JDK: com/sun/jdi >>>> >>>> Comments/opinions appreciated. >>>> >>>> Thanks, >>>> David >>>> >>>> [1] >>>> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait >>>> >>> > From alexey.menkov at oracle.com Mon Aug 26 22:47:21 2019 From: alexey.menkov at oracle.com (Alex Menkov) Date: Mon, 26 Aug 2019 15:47:21 -0700 Subject: RFR: JDK-8228554: Accessibility errors in jdwp-protocol.html In-Reply-To: References: <0be12a6e-74b8-4191-c812-b3f8b195b41f@oracle.com> <12824420-15de-2131-9211-60fc6cabc8fa@oracle.com> <087be63e-8751-3d7c-4614-ecd903f252f5@oracle.com> Message-ID: On 08/26/2019 14:54, David Holmes wrote: > On 27/08/2019 3:01 am, Alex Menkov wrote: >> Hi Serguei, >> >> The change is intentional - it seems to me that there were too many >> borders in the struct description tables. I thought about removing >> some of them (or making them thiner or changing color to gray). >> I don't think absence of the lines makes comprehension of the >> structures harder. > > I like the new look - especially now we have proper headers and no more > strange looking empty cells! 
>
> My only suggestion is to make the first column of each table the same
> width (where possible) so that the tables line up better - specifically
> the "Error Data" table's "Value" column should be the same width as the
> "Reply Data" table's "Type" column.

Maybe then make the 1st column of "Error Data" the same width as the (Type +
Name) columns in Out Data/Reply Data?
Then the "Description" column in all tables will be 65%.

BTW just discovered an error in the Constants tables - they have columns of 20%,
5% and 65% - going to update the 1st column to be 30%

--alex

>
> Thanks,
> David
>
>> --alex
>>
>> On 08/26/2019 00:58, serguei.spitsyn at oracle.com wrote:
>>> Hi Alex,
>>>
>>> I see one issue with new table format.
>>> For instance look at the table for "DisposeObjects Command (14)".
>>> Even a better example is "RedefineClasses Command (18)".
>>> In the old tables the indentation was highlighted with the vertical
>>> lines.
>>> It is missed in your version.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 8/23/19 16:54, Alex Menkov wrote:
>>>> Hi all,
>>>>
>>>> Please review the fix for
>>>> https://bugs.openjdk.java.net/browse/JDK-8228554
>>>>
>>>> webrev:
>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/webrev/
>>>>
>>>>
>>>> generated docs:
>>>> old:
>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/0/jdwp-protocol.html
>>>>
>>>> new:
>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/1/jdwp-protocol.html
>>>>
>>>>
>>>> specdiff:
>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/spectdiff/diff.html
>>>>
>>>>
>>>> Summary:
>>>> - "content outside of a region" issues:
>>>>   -
replaced with
, >>>> ? -
replaced with
; >>>> - table issues: >>>> ? - added column headers to all tables; >>>> ? - for every row specified row header; >>>> ? - indentation with table "colspan" reimplemented by using CSS. >>>> >>>> --alex >>> From david.holmes at oracle.com Mon Aug 26 23:00:51 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 27 Aug 2019 09:00:51 +1000 Subject: RFR: JDK-8228554: Accessibility errors in jdwp-protocol.html In-Reply-To: References: <0be12a6e-74b8-4191-c812-b3f8b195b41f@oracle.com> <12824420-15de-2131-9211-60fc6cabc8fa@oracle.com> <087be63e-8751-3d7c-4614-ecd903f252f5@oracle.com> Message-ID: On 27/08/2019 8:47 am, Alex Menkov wrote: > On 08/26/2019 14:54, David Holmes wrote: >> On 27/08/2019 3:01 am, Alex Menkov wrote: >>> Hi Serguei, >>> >>> The change is intentional - it seems to me that there were too many >>> borders in the struct description tables. I thought about removing >>> some of them (or making them thiner or changing color to gray). >>> I don't think absence of the lines makes comprehension of the >>> structures harder. >> >> I like the new look - especially now we have proper headers and no >> more strange looking empty cells! >> >> My only suggestion is to make the first column of each table the same >> width (were possible) so that the tables line up better - specifically >> the "Error Data" table's "Value" column should be the same width as >> the "Reply Data" table's "Type" column. > > Maybe then make 1st column of "Error Data" the same width as (Type + > Name) columns in Out Data/Reply Data? That would not look very good IMHO. David ----- > Then "Description" column in all tables will be 65%. > > BTW just discovered at error in Constants tables - they have column 20%, > 5% and 65% - going to update the 2st column to be 30% > > --alex > >> >> Thanks, >> David >> >>> --alex >>> >>> On 08/26/2019 00:58, serguei.spitsyn at oracle.com wrote: >>>> Hi Alex, >>>> >>>> I see one issue with new table format. 
>>>> For instance look at the table for "DisposeObjects Command (14)". >>>> Even a better example is "RedefineClasses Command (18)". >>>> In the old tables the indentation was highlighted with the vertical >>>> lines. >>>> It is missed in your version. >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 8/23/19 16:54, Alex Menkov wrote: >>>>> Hi all, >>>>> >>>>> Please review the fix for >>>>> https://bugs.openjdk.java.net/browse/JDK-8228554 >>>>> >>>>> webrev: >>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/webrev/ >>>>> >>>>> >>>>> generated docs: >>>>> old: >>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/0/jdwp-protocol.html >>>>> >>>>> new: >>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/1/jdwp-protocol.html >>>>> >>>>> >>>>> specdiff: >>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/spectdiff/diff.html >>>>> >>>>> >>>>> Summary: >>>>> - "content outside of a region" issues: >>>>> ? -
replaced with
, >>>>> ? -
replaced with
; >>>>> - table issues: >>>>> ? - added column headers to all tables; >>>>> ? - for every row specified row header; >>>>> ? - indentation with table "colspan" reimplemented by using CSS. >>>>> >>>>> --alex >>>> From alexey.menkov at oracle.com Mon Aug 26 23:44:56 2019 From: alexey.menkov at oracle.com (Alex Menkov) Date: Mon, 26 Aug 2019 16:44:56 -0700 Subject: RFR: JDK-8228554: Accessibility errors in jdwp-protocol.html In-Reply-To: References: <0be12a6e-74b8-4191-c812-b3f8b195b41f@oracle.com> <12824420-15de-2131-9211-60fc6cabc8fa@oracle.com> <087be63e-8751-3d7c-4614-ecd903f252f5@oracle.com> Message-ID: <5a434ae0-4561-9dcd-fc9c-92fb203b54ac@oracle.com> Ok. Updated webrev: http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/webrev.2/ The difference vs v.1 is: - ErrorSetNode.java - added 'style="width: 20%"' for the 1st column; - ConstantSetNode.java - fixed width of the 1st column (20% -> 30%) generated doc: http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/2/jdwp-protocol.html --alex On 08/26/2019 16:00, David Holmes wrote: > On 27/08/2019 8:47 am, Alex Menkov wrote: >> On 08/26/2019 14:54, David Holmes wrote: >>> On 27/08/2019 3:01 am, Alex Menkov wrote: >>>> Hi Serguei, >>>> >>>> The change is intentional - it seems to me that there were too many >>>> borders in the struct description tables. I thought about removing >>>> some of them (or making them thiner or changing color to gray). >>>> I don't think absence of the lines makes comprehension of the >>>> structures harder. >>> >>> I like the new look - especially now we have proper headers and no >>> more strange looking empty cells! >>> >>> My only suggestion is to make the first column of each table the same >>> width (were possible) so that the tables line up better - >>> specifically the "Error Data" table's "Value" column should be the >>> same width as the "Reply Data" table's "Type" column. 
>> >> Maybe then make 1st column of "Error Data" the same width as (Type + >> Name) columns in Out Data/Reply Data? > > That would not look very good IMHO. > > David > ----- > >> Then "Description" column in all tables will be 65%. >> >> BTW just discovered at error in Constants tables - they have column >> 20%, 5% and 65% - going to update the 2st column to be 30% >> >> --alex >> >>> >>> Thanks, >>> David >>> >>>> --alex >>>> >>>> On 08/26/2019 00:58, serguei.spitsyn at oracle.com wrote: >>>>> Hi Alex, >>>>> >>>>> I see one issue with new table format. >>>>> For instance look at the table for "DisposeObjects Command (14)". >>>>> Even a better example is "RedefineClasses Command (18)". >>>>> In the old tables the indentation was highlighted with the vertical >>>>> lines. >>>>> It is missed in your version. >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 8/23/19 16:54, Alex Menkov wrote: >>>>>> Hi all, >>>>>> >>>>>> Please review the fix for >>>>>> https://bugs.openjdk.java.net/browse/JDK-8228554 >>>>>> >>>>>> webrev: >>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/webrev/ >>>>>> >>>>>> >>>>>> generated docs: >>>>>> old: >>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/0/jdwp-protocol.html >>>>>> >>>>>> new: >>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/1/jdwp-protocol.html >>>>>> >>>>>> >>>>>> specdiff: >>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/spectdiff/diff.html >>>>>> >>>>>> >>>>>> Summary: >>>>>> - "content outside of a region" issues: >>>>>> ? -
replaced with
, >>>>>> ? -
replaced with
; >>>>>> - table issues: >>>>>> ? - added column headers to all tables; >>>>>> ? - for every row specified row header; >>>>>> ? - indentation with table "colspan" reimplemented by using CSS. >>>>>> >>>>>> --alex >>>>> From david.holmes at oracle.com Tue Aug 27 00:55:21 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 27 Aug 2019 10:55:21 +1000 Subject: RFR: JDK-8228554: Accessibility errors in jdwp-protocol.html In-Reply-To: <5a434ae0-4561-9dcd-fc9c-92fb203b54ac@oracle.com> References: <0be12a6e-74b8-4191-c812-b3f8b195b41f@oracle.com> <12824420-15de-2131-9211-60fc6cabc8fa@oracle.com> <087be63e-8751-3d7c-4614-ecd903f252f5@oracle.com> <5a434ae0-4561-9dcd-fc9c-92fb203b54ac@oracle.com> Message-ID: <7b40068a-b595-fe1f-6bb7-153e96e618ad@oracle.com> Ship it! :) Thanks, David On 27/08/2019 9:44 am, Alex Menkov wrote: > Ok. > > Updated webrev: > http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/webrev.2/ > > > The difference vs v.1 is: > - ErrorSetNode.java - added 'style="width: 20%"' for the 1st column; > - ConstantSetNode.java - fixed width of the 1st column (20% -> 30%) > > generated doc: > http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/2/jdwp-protocol.html > > > --alex > > On 08/26/2019 16:00, David Holmes wrote: >> On 27/08/2019 8:47 am, Alex Menkov wrote: >>> On 08/26/2019 14:54, David Holmes wrote: >>>> On 27/08/2019 3:01 am, Alex Menkov wrote: >>>>> Hi Serguei, >>>>> >>>>> The change is intentional - it seems to me that there were too many >>>>> borders in the struct description tables. I thought about removing >>>>> some of them (or making them thiner or changing color to gray). >>>>> I don't think absence of the lines makes comprehension of the >>>>> structures harder. >>>> >>>> I like the new look - especially now we have proper headers and no >>>> more strange looking empty cells! 
>>>> >>>> My only suggestion is to make the first column of each table the >>>> same width (were possible) so that the tables line up better - >>>> specifically the "Error Data" table's "Value" column should be the >>>> same width as the "Reply Data" table's "Type" column. >>> >>> Maybe then make 1st column of "Error Data" the same width as (Type + >>> Name) columns in Out Data/Reply Data? >> >> That would not look very good IMHO. >> >> David >> ----- >> >>> Then "Description" column in all tables will be 65%. >>> >>> BTW just discovered at error in Constants tables - they have column >>> 20%, 5% and 65% - going to update the 2st column to be 30% >>> >>> --alex >>> >>>> >>>> Thanks, >>>> David >>>> >>>>> --alex >>>>> >>>>> On 08/26/2019 00:58, serguei.spitsyn at oracle.com wrote: >>>>>> Hi Alex, >>>>>> >>>>>> I see one issue with new table format. >>>>>> For instance look at the table for "DisposeObjects Command (14)". >>>>>> Even a better example is "RedefineClasses Command (18)". >>>>>> In the old tables the indentation was highlighted with the >>>>>> vertical lines. >>>>>> It is missed in your version. >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 8/23/19 16:54, Alex Menkov wrote: >>>>>>> Hi all, >>>>>>> >>>>>>> Please review the fix for >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8228554 >>>>>>> >>>>>>> webrev: >>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/webrev/ >>>>>>> >>>>>>> >>>>>>> generated docs: >>>>>>> old: >>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/0/jdwp-protocol.html >>>>>>> >>>>>>> new: >>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/1/jdwp-protocol.html >>>>>>> >>>>>>> >>>>>>> specdiff: >>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/spectdiff/diff.html >>>>>>> >>>>>>> >>>>>>> Summary: >>>>>>> - "content outside of a region" issues: >>>>>>> ? -
replaced with
, >>>>>>> ? -
replaced with
; >>>>>>> - table issues: >>>>>>> ? - added column headers to all tables; >>>>>>> ? - for every row specified row header; >>>>>>> ? - indentation with table "colspan" reimplemented by using CSS. >>>>>>> >>>>>>> --alex >>>>>> From andrew_m_leonard at uk.ibm.com Tue Aug 27 08:22:57 2019 From: andrew_m_leonard at uk.ibm.com (Andrew Leonard) Date: Tue, 27 Aug 2019 09:22:57 +0100 Subject: RFR JDK-8225474: JDI connector accept fails "Address already in use" with concurrent listeners In-Reply-To: References: <87f67d80-03be-ada9-59ca-71de7bf422d0@oracle.com> Message-ID: Hi Alan, Was wondering if you had had a chance to look at this please? Thanks Andrew Andrew Leonard Java Runtimes Development IBM Hursley IBM United Kingdom Ltd internet email: andrew_m_leonard at uk.ibm.com From: Andrew Leonard/UK/IBM To: Alan Bateman Cc: serviceability-dev at openjdk.java.net Date: 02/07/2019 08:37 Subject: Re: RFR JDK-8225474: JDI connector accept fails "Address already in use" with concurrent listeners Thanks Alan, much appreciated. Andrew Leonard Java Runtimes Development IBM Hursley IBM United Kingdom Ltd internet email: andrew_m_leonard at uk.ibm.com From: Alan Bateman To: Andrew Leonard Cc: serviceability-dev at openjdk.java.net Date: 02/07/2019 07:45 Subject: Re: RFR JDK-8225474: JDI connector accept fails "Address already in use" with concurrent listeners On 01/07/2019 20:41, Andrew Leonard wrote: Any one able to review please? This one will take a significant time to review. We also need to figure out if an @implNote if needed as nobody using these connectors will know (from the javadoc) that some of the implementations are thread safe. I hope to get to it in the next few weeks. -Alan. Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin.doerr at sap.com Tue Aug 27 15:28:22 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 27 Aug 2019 15:28:22 +0000 Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor In-Reply-To: <2d34d72b-2d2b-3907-2b69-86507664c697@oracle.com> References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> <9946e675-6e1e-3644-e0e9-e96c54313e3f@oracle.com> <2d34d72b-2d2b-3907-2b69-86507664c697@oracle.com> Message-ID: Hi David, I think this change makes sense. We'll test it and take a closer look at it. First impression is good. Best regards, Martin > -----Original Message----- > From: David Holmes > Sent: Dienstag, 27. August 2019 00:21 > To: serguei.spitsyn at oracle.com; serviceability-dev dev at openjdk.java.net>; jcbeyler at google.com; yasuenag at gmail.com; > sgehwolf at redhat.com; Doerr, Martin > Subject: Re: RFC: 8229160: Reimplement JvmtiRawMonitor to use > PlatformMonitor > > On 26/08/2019 6:25 pm, serguei.spitsyn at oracle.com wrote: > > Hi David, > > > > > > On 8/20/19 22:21, David Holmes wrote: > >> Hi Serguei, > >> > >> On 21/08/2019 9:58 am, serguei.spitsyn at oracle.com wrote: > >>> Hi David, > >>> > >>> The whole approach looks good to me. > >> > >> Thanks for taking a look. My main concern is about the interrupt > >> semantics, so I really need to get some end-user feedback on that > >> aspect as well. > > > > > > I don't have any opinion yet on what interrupt semantics tool developers > > really need. > > Yes, we may need to request some feedback. > > I've now explicitly added JC, Yasumasa, Severin and Martin, to this > email thread to try and solicit feedback from all the major players that > seem interested in this serviceability area. 
Folks I'd really appreciate > any feedback you may have here on the usecases for JvmtiRawMonitors, > and > in particular the use RawMonitorWait and its interaction with > Thread.interrupt. > > > My gut feeling tells me it is not good to break the original semantics. :) > > But let me think about it a little bit more. > > Me too, but I wanted to start simple. I suspect I will have to at least > implement time-based polling of the interrupt state. > > > Also, we need to file a CSR for this. > > Depending on how this proceeds, yes. > > > > >>> + if (jSelf != NULL) { > >>> + if (interruptible && Thread::is_interrupted(jSelf, true)) { > >>> + // We're now interrupted but we may have consumed a notification. > >>> + // To avoid lost wakeups we have to re-issue that notification, which > >>> + // may result in a spurious wakeup for another thread. > >>> Alternatively we > >>> + // ignore checking for interruption before returning. > >>> + notify(); > >>> + return false; // interrupted > >>> + } > >>> > >>> I'm a bit concerned about introduction of new spurious wake ups above. > >>> Some tests can be not defensive against it, so we may discover new > >>> intermittent failures. > >> > >> That is possible. Though given spurious wakeups are already possible > >> any test that is incorrectly using RawMonitorWait() without checking a > >> condition, is technically already broken. > > > > Agreed. > > I even think it is even better if spurious wakeups will happen more > > frequently. > > It should help to identify and fix such spots in the test base. > > Yes it is good tests. Alas not so good for production code :) > > >> > >> Not checking for interruption after the wait will also require some > >> test changes, and it weakens the interrupt semantics even further. > > > > I'm thinking about a small investigation on how this is used in our tests. > > There seem to be a few uses that are susceptible to spurious wakeup > errors, but those tests don't use interrupt. 
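The remark above, that a waiter which does not re-check its condition is already broken, can be illustrated with ordinary Java monitors, whose Object.wait() documents the same spurious-wakeup caveat. The sketch below is not taken from any JDK test; the class GuardedFlag and its methods are hypothetical names used only for this example.

```java
// Hypothetical example class; shows the "wait in a loop" pattern only.
final class GuardedFlag {
    private final Object lock = new Object();
    private boolean ready; // guarded by lock

    /** Blocks until markReady() has been called, tolerating stray wakeups. */
    void awaitReady() throws InterruptedException {
        synchronized (lock) {
            while (!ready) {   // re-check: a wakeup on its own proves nothing
                lock.wait();
            }
        }
    }

    /** Sets the condition and wakes all waiters. */
    void markReady() {
        synchronized (lock) {
            ready = true;
            lock.notifyAll();
        }
    }

    /** Wakes waiters without changing the condition, imitating a spurious wakeup. */
    void strayNotify() {
        synchronized (lock) {
            lock.notifyAll();
        }
    }

    public static void main(String[] args) throws Exception {
        GuardedFlag flag = new GuardedFlag();
        Thread waiter = new Thread(() -> {
            try {
                flag.awaitReady();
                System.out.println("waiter: condition observed");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        flag.strayNotify(); // must not release the waiter
        flag.markReady();   // releases the waiter
        waiter.join();
    }
}
```

A waiter written with a plain "if" instead of the "while" would return after strayNotify(), which is the kind of latent test bug that more frequent spurious wakeups would expose.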
> > Thanks, > David > > > Thanks, > > Serguei > > > >> > >> Thanks, > >> David > >> ----- > >> > >>> Thanks, > >>> Serguei > >>> > >>> On 8/14/19 11:22 PM, David Holmes wrote: > >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229160 > >>>> > >>>> Preliminary webrev (still has rough edges): > >>>> http://cr.openjdk.java.net/~dholmes/8229160/webrev.prelim/ > >>>> > >>>> Background: > >>>> > >>>> We've had this comment for a long time: > >>>> > >>>> ?// The raw monitor subsystem is entirely distinct from normal > >>>> ?// java-synchronization or jni-synchronization.? raw monitors are not > >>>> ?// associated with objects.? They can be implemented in any manner > >>>> ?// that makes sense.? The original implementors decided to piggy-back > >>>> ?// the raw-monitor implementation on the existing Java > >>>> objectMonitor mechanism. > >>>> ?// This flaw needs to fixed.? We should reimplement raw monitors as > >>>> sui-generis. > >>>> ?// Specifically, we should not implement raw monitors via java > >>>> monitors. > >>>> ?// Time permitting, we should disentangle and deconvolve the two > >>>> implementations > >>>> ?// and move the resulting raw monitor implementation over to the > >>>> JVMTI directories. > >>>> ?// Ideally, the raw monitor implementation would be built on top of > >>>> ?// park-unpark and nothing else. > >>>> > >>>> This is an attempt to do that disentangling so that we can then > >>>> consider changes to ObjectMonitor without having to worry about > >>>> JvmtiRawMonitors. But rather than building on low-level park/unpark > >>>> (which would require the same manual queue management and much > of > >>>> the same complex code as exists in ObjectMonitor) I decided to try > >>>> and do this on top of PlatformMonitor. 
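As a rough sketch of why building on a mutex plus condition variable avoids manual queue management, the following shows a recursive monitor whose only bookkeeping is an owner and a recursion count. Everything here is an assumption made for illustration: `SketchRawMonitor` is a hypothetical name, the standard C++ primitives stand in for PlatformMonitor, and the code is not taken from the webrev.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical recursive monitor; illustrative only, not HotSpot code.
class SketchRawMonitor {
 public:
  void enter() {
    const std::thread::id self = std::this_thread::get_id();
    std::unique_lock<std::mutex> guard(state_);
    if (owner_ != self) {
      // The condition variable queues blocked threads for us, so no
      // hand-written wait queue is needed here.
      released_.wait(guard, [this] { return recursions_ == 0; });
      owner_ = self;
    }
    ++recursions_;
  }

  // Returns false if the calling thread does not own the monitor.
  bool exit() {
    std::unique_lock<std::mutex> guard(state_);
    if (recursions_ == 0 || owner_ != std::this_thread::get_id()) {
      return false;
    }
    if (--recursions_ == 0) {
      owner_ = std::thread::id();  // mark as unowned
      guard.unlock();
      released_.notify_one();
    }
    return true;
  }

 private:
  std::mutex state_;
  std::condition_variable released_;
  std::thread::id owner_;  // guarded by state_
  int recursions_ = 0;     // guarded by state_
};
```

The sketch deliberately omits wait/notify and any interrupt handling, which is the difficult part discussed in the rest of this message.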
> >>>> > >>>> The reason this is just a RFC rather than RFR is that I overlooked a > >>>> non-trivial aspect of JvmtiRawMonitors: like Java monitors (as > >>>> implemented by ObjectMonitor) they interact with the > >>>> Thread.interrupt mechanism. This is not clearly stated in the JVM TI > >>>> specification [1] but only in passing by the possible errors for > >>>> RawMonitorWait: > >>>> > >>>> JVMTI_ERROR_INTERRUPT    Wait was interrupted, try again > >>>> > >>>> As I explain in the bug report there is no way to build in proper > >>>> interrupt support using PlatformMonitor as there is no way we can > >>>> "interrupt" the low-level pthread_cond_wait. But we can approximate > >>>> it. What I've done in this preliminary version is just check > >>>> interrupt state before and after the actual "wait" but we won't get > >>>> woken by the interrupt once we have actually blocked. Alternatively > >>>> we could use a periodic polling approach and wakeup every Nms to > >>>> check for interruption. > >>>> > >>>> The only use of JvmtiRawMonitors in the JDK libraries (JDWP) is not > >>>> affected by this choice as that code ignores the interrupt until the > >>>> real action it was waiting for has occurred. The interrupt is then > >>>> reposted later. > >>>> > >>>> But more generally there could be users of JvmtiRawMonitors that > >>>> expect/require that RawMonitorWait is responsive to Thread.interrupt > >>>> in a manner similar to Object.wait. And if any of them are reading > >>>> this then I'd like to know - hence this RFC :) > >>>> > >>>> FYI testing to date: > >>>> - tiers 1-3 all platforms > >>>> - hotspot: serviceability/jvmti, serviceability/jdwp, vmTestbase/nsk/jvmti, vmTestbase/nsk/jdwp > >>>> - JDK: com/sun/jdi > >>>> > >>>> Comments/opinions appreciated.
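The periodic polling alternative mentioned above could look roughly like the following sketch. It is written under several assumptions: `PollingMonitor`, `WaitResult` and the explicit `interrupt_flag` parameter are hypothetical, `std::condition_variable` stands in for the platform primitive, and notification is broadcast-only so that returning early on an interrupt cannot swallow a wakeup meant for another waiter. A single-target notify would need the re-issue step discussed elsewhere in this thread.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

enum class WaitResult { Notified, TimedOut, Interrupted };

// Hypothetical monitor for illustration; not the HotSpot PlatformMonitor
// or JvmtiRawMonitor API.
class PollingMonitor {
 public:
  using Lock = std::unique_lock<std::mutex>;
  using Millis = std::chrono::milliseconds;

  Lock enter() { return Lock(mutex_); }

  // Wakes every current waiter (broadcast only, to keep the sketch short).
  void notify_all(Lock& held) {
    assert(held.owns_lock());
    ++generation_;
    cond_.notify_all();
  }

  // Blocks until notified, interrupted, or timed out. The interrupt flag is
  // sampled before blocking and again after each wakeup, and the thread
  // wakes at least once per `poll` interval, so a posted interrupt is
  // noticed after roughly `poll` at the latest.
  WaitResult wait(Lock& held, std::atomic<bool>& interrupt_flag,
                  Millis timeout, Millis poll) {
    assert(held.owns_lock());
    const unsigned long seen = generation_;
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;) {
      // exchange(false) reads and clears the flag in one step.
      if (interrupt_flag.exchange(false)) return WaitResult::Interrupted;
      if (generation_ != seen) return WaitResult::Notified;
      const auto now = std::chrono::steady_clock::now();
      if (now >= deadline) return WaitResult::TimedOut;
      // Sleep for one polling slice, or less if the deadline is closer.
      cond_.wait_until(held, std::min(deadline, now + poll));
    }
  }

 private:
  std::mutex mutex_;
  std::condition_variable cond_;
  unsigned long generation_ = 0;  // guarded by mutex_
};
```

The trade-off is the usual one for polling: a shorter `poll` interval makes the wait more responsive to interrupts but causes more wakeups while nothing is happening.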
> >>>> > >>>> Thanks, > >>>> David > >>>> > >>>> [1] > >>>> https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait > >>>> > >>> > > From daniil.x.titov at oracle.com Tue Aug 27 21:08:57 2019 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Tue, 27 Aug 2019 14:08:57 -0700 Subject: 8195600: [Graal] jdi tests timeouts with Graal because debuggee vm is not resumed In-Reply-To: <14CB21E2-FD4F-4424-B1F5-97F82A17E36C@oracle.com> References: <855244c9-014c-59d1-cda0-b5f38057f588@oracle.com> <8df66d4e-df9d-c502-3510-30c22ba58445@oracle.com> <15699850-37f5-93f6-6a55-525a4d099bd3@oracle.com> <95eb05da-cf13-f14b-74a6-9e8bf604b29b@oracle.com> <14CB21E2-FD4F-4424-B1F5-97F82A17E36C@oracle.com> Message-ID: Hi Dean and Chris, Just wanted to check with you would it be OK now to add this issue to Graal-specific problem list, as Dean suggested in one of the previous emails, while the proposal about introducing new options for @requires is being discussed? -Thanks! --Daniil On 8/9/19, 3:37 PM, "hotspot-compiler-dev-bounces at openjdk.java.net on behalf of dean.long at oracle.com" wrote: Good question. When we have libgraal, there will still be an option (at least for debugging) to turn it off and use Graal the same way we do now, so it seems like the @requires would need to take that into account once we have libgraal. Maybe we will need a new "vm.libgraal.enabled" or make "vm.graal.enabled" be false for libgraal? It does seem a little backwards to require tests to know about the OOM handling details of different JVM features. Instead, how about if we let the test assert that it requires "vm.no-background-oom" or whatever, and let the JVM decide if it supports it. CC'ing hotspot-compiler-dev.
dl On 8/8/19 7:42 PM, Chris Plummer wrote: > Actually looking at JDK-8207267 a little closer, it looks like it's > job is to re-enable tests that have been disabled with @requires > !vm.graal.enabled, so it looks like we have two different approaches > going in here. Which is preferred? If the preference is to problem > list, do we want to undo JDK-8207261 (except use JDK-8196611 as the CR). > > Chris > > On 8/8/19 5:08 PM, Chris Plummer wrote: >> That sounds like a better approach to me. >> >> thanks, >> >> Chris >> >> On 8/8/19 4:33 PM, dean.long at oracle.com wrote: >>> This is the kind of failure that is expected to go away with >>> libgraal. You can add the tests to the Graal-specific problem list >>> (see JDK-8196611) and they should be re-enabled with libgraal (see >>> JDK-JDK-8207267). >>> >>> dl >>> >>> On 8/8/19 10:21 AM, Chris Plummer wrote: >>>> Hi Daniil, >>>> >>>> My only objection is at some point it seems we need to be able to >>>> run these tests with graal (and other tests that have been disabled >>>> due to graal) because graal might be the only compiler, and we'll >>>> lose test coverage without these tests. Currently we have 260 jtreg >>>> tests disabled due to graal. I'm not sure to what extent they are >>>> waiting on graal fixes or otherwise have a bug filed to eventually >>>> fix them. Would be nice if we had a process in place to make sure >>>> these issues are eventually addressed. That fact that tests that >>>> exhaust memory in general seem to be incompatible with graal would >>>> to be the bigger issue that needs to be addressed. >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 8/7/19 3:38 PM, Daniil Titov wrote: >>>>> Please review the change that fixes the failing tests when running >>>>> with Graal. 
The issue originally >>>>> included several vmTestbase/nsk/jdi tests but only 2 of them still >>>>> fail: >>>>> - >>>>> vmTestbase/nsk/jdi/VirtualMachine/instanceCounts/instancecounts003/instancecounts003.java >>>>> - >>>>> vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects002/referringObjects002.java >>>>> >>>>> The problem with these two tests is that they consume all memory >>>>> to force the class unloading that >>>>> results in the exception during JVMCI compiler initialization and >>>>> the test failure. >>>>> The fix filters these tests out to not run with Graal compiler. >>>>> >>>>> Webrev: http://cr.openjdk.java.net/~dtitov/8195600/webrev.01/ >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8195600 >>>>> >>>>> Thanks, >>>>> Daniil >>>>> >>>>> >>>> >>> >> >> > > From chris.plummer at oracle.com Tue Aug 27 21:50:28 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 27 Aug 2019 14:50:28 -0700 Subject: 8195600: [Graal] jdi tests timeouts with Graal because debuggee vm is not resumed In-Reply-To: References: <855244c9-014c-59d1-cda0-b5f38057f588@oracle.com> <8df66d4e-df9d-c502-3510-30c22ba58445@oracle.com> <15699850-37f5-93f6-6a55-525a4d099bd3@oracle.com> <95eb05da-cf13-f14b-74a6-9e8bf604b29b@oracle.com> <14CB21E2-FD4F-4424-B1F5-97F82A17E36C@oracle.com> Message-ID: I'm not sure. You could problem list it, but then the question is which bug to problem list it under, JDK-8195600 or JDK-8207267 (in which case JDK-8195600 would be closed). I'd hate to see a separate CR for every test that fails due to graal unexpectedly executing java code. But then JDK-8207267 seems to be more about getting rid of the use of @requires once libgraal is added, not going through the graal problemlist. 
Chris On 8/27/19 2:08 PM, Daniil Titov wrote: > Hi Dean and Chris, > > Just wanted to check with you would it be OK now to add this issue to > Graal-specific problem list, as Dean suggested in one of the previous emails, > while the proposal about introducing new options for @requires is being discussed? > > -Thanks! > --Daniil > > > > ?On 8/9/19, 3:37 PM, "hotspot-compiler-dev-bounces at openjdk.java.net on behalf of dean.long at oracle.com" wrote: > > Good question When we have libgraal, there will still be an option (at > least for debugging) to turn it off and use Graal the same way we do > now, so it seems like the @requires would need to take that into account > once we have libgraal. Maybe we will need a new "vm.libgraal.enabled" > or make "vm.graal.enabled" be false for libgraal? > > It does seem a little backwards to require tests to know about the OOM > handling details of different JVM features. Instead, how about if we > let the test assert that it requires "vm.no-background-oom" or whatever, > and let the JVM decide if it supports it. > > CC'ing hotspot-compiler-dev. > > dl > > On 8/8/19 7:42 PM, Chris Plummer wrote: > > Actually looking at JDK-8207267 a little closer, it looks like it's > > job is to re-enable tests that have been disabled with @requires > > !vm.graal.enabled, so it looks like we have two different approaches > > going in here. Which is preferred? If the preference is to problem > > list, do we want to undo JDK-8207261 (except use JDK-8196611 as the CR). > > > > Chris > > > > On 8/8/19 5:08 PM, Chris Plummer wrote: > >> That sounds like a better approach to me. > >> > >> thanks, > >> > >> Chris > >> > >> On 8/8/19 4:33 PM, dean.long at oracle.com wrote: > >>> This is the kind of failure that is expected to go away with > >>> libgraal. You can add the tests to the Graal-specific problem list > >>> (see JDK-8196611) and they should be re-enabled with libgraal (see > >>> JDK-JDK-8207267). 
> >>> > >>> dl > >>> > >>> On 8/8/19 10:21 AM, Chris Plummer wrote: > >>>> Hi Daniil, > >>>> > >>>> My only objection is at some point it seems we need to be able to > >>>> run these tests with graal (and other tests that have been disabled > >>>> due to graal) because graal might be the only compiler, and we'll > >>>> lose test coverage without these tests. Currently we have 260 jtreg > >>>> tests disabled due to graal. I'm not sure to what extent they are > >>>> waiting on graal fixes or otherwise have a bug filed to eventually > >>>> fix them. Would be nice if we had a process in place to make sure > >>>> these issues are eventually addressed. That fact that tests that > >>>> exhaust memory in general seem to be incompatible with graal would > >>>> to be the bigger issue that needs to be addressed. > >>>> > >>>> thanks, > >>>> > >>>> Chris > >>>> > >>>> On 8/7/19 3:38 PM, Daniil Titov wrote: > >>>>> Please review the change that fixes the failing tests when running > >>>>> with Graal. The issue originally > >>>>> included several vmTestbase/nsk/jdi tests but only 2 of them still > >>>>> fail: > >>>>> - > >>>>> vmTestbase/nsk/jdi/VirtualMachine/instanceCounts/instancecounts003/instancecounts003.java > >>>>> - > >>>>> vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects002/referringObjects002.java > >>>>> > >>>>> The problem with these two tests is that they consume all memory > >>>>> to force the class unloading that > >>>>> results in the exception during JVMCI compiler initialization and > >>>>> the test failure. > >>>>> The fix filters these tests out to not run with Graal compiler. 
> >>>>> > >>>>> Webrev: http://cr.openjdk.java.net/~dtitov/8195600/webrev.01/ > >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8195600 > >>>>> > >>>>> Thanks, > >>>>> Daniil > >>>>> > >>>>> > >>>> > >>> > >> > >> > > > > > > > > > From dean.long at oracle.com Tue Aug 27 22:14:36 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Tue, 27 Aug 2019 15:14:36 -0700 Subject: 8195600: [Graal] jdi tests timeouts with Graal because debuggee vm is not resumed In-Reply-To: References: <855244c9-014c-59d1-cda0-b5f38057f588@oracle.com> <8df66d4e-df9d-c502-3510-30c22ba58445@oracle.com> <15699850-37f5-93f6-6a55-525a4d099bd3@oracle.com> <95eb05da-cf13-f14b-74a6-9e8bf604b29b@oracle.com> <14CB21E2-FD4F-4424-B1F5-97F82A17E36C@oracle.com> Message-ID: I don't have a strong opinion either way.? It's too bad we don't have resource management features that would allow setting a memory limit on the test or app while allowing the rest of the JVM to use memory unrestricted.? That might solve a lot of these OOM problems. Even after we move to libgraal, that still doesn't guarantee that reaching an OOM state is a recoverable state for the JVM. Is there a way to rewrite these tests to cover what they need to cover, without consuming all of memory (maybe a class unloading WhiteBox API)?? Or maybe some kind of hint, like a command-line flag, that says the test will consume all of memory.? That way any critical services like the JVMCI compiler could use that hint to eagerly initialize before the test starts. dl On 8/27/19 2:08 PM, Daniil Titov wrote: > Hi Dean and Chris, > > Just wanted to check with you would it be OK now to add this issue to > Graal-specific problem list, as Dean suggested in one of the previous emails, > while the proposal about introducing new options for @requires is being discussed? > > -Thanks! 
> --Daniil > > > > ?On 8/9/19, 3:37 PM, "hotspot-compiler-dev-bounces at openjdk.java.net on behalf of dean.long at oracle.com" wrote: > > Good question When we have libgraal, there will still be an option (at > least for debugging) to turn it off and use Graal the same way we do > now, so it seems like the @requires would need to take that into account > once we have libgraal. Maybe we will need a new "vm.libgraal.enabled" > or make "vm.graal.enabled" be false for libgraal? > > It does seem a little backwards to require tests to know about the OOM > handling details of different JVM features. Instead, how about if we > let the test assert that it requires "vm.no-background-oom" or whatever, > and let the JVM decide if it supports it. > > CC'ing hotspot-compiler-dev. > > dl > > On 8/8/19 7:42 PM, Chris Plummer wrote: > > Actually looking at JDK-8207267 a little closer, it looks like it's > > job is to re-enable tests that have been disabled with @requires > > !vm.graal.enabled, so it looks like we have two different approaches > > going in here. Which is preferred? If the preference is to problem > > list, do we want to undo JDK-8207261 (except use JDK-8196611 as the CR). > > > > Chris > > > > On 8/8/19 5:08 PM, Chris Plummer wrote: > >> That sounds like a better approach to me. > >> > >> thanks, > >> > >> Chris > >> > >> On 8/8/19 4:33 PM, dean.long at oracle.com wrote: > >>> This is the kind of failure that is expected to go away with > >>> libgraal. You can add the tests to the Graal-specific problem list > >>> (see JDK-8196611) and they should be re-enabled with libgraal (see > >>> JDK-JDK-8207267). > >>> > >>> dl > >>> > >>> On 8/8/19 10:21 AM, Chris Plummer wrote: > >>>> Hi Daniil, > >>>> > >>>> My only objection is at some point it seems we need to be able to > >>>> run these tests with graal (and other tests that have been disabled > >>>> due to graal) because graal might be the only compiler, and we'll > >>>> lose test coverage without these tests. 
Currently we have 260 jtreg > >>>> tests disabled due to graal. I'm not sure to what extent they are > >>>> waiting on graal fixes or otherwise have a bug filed to eventually > >>>> fix them. Would be nice if we had a process in place to make sure > >>>> these issues are eventually addressed. That fact that tests that > >>>> exhaust memory in general seem to be incompatible with graal would > >>>> to be the bigger issue that needs to be addressed. > >>>> > >>>> thanks, > >>>> > >>>> Chris > >>>> > >>>> On 8/7/19 3:38 PM, Daniil Titov wrote: > >>>>> Please review the change that fixes the failing tests when running > >>>>> with Graal. The issue originally > >>>>> included several vmTestbase/nsk/jdi tests but only 2 of them still > >>>>> fail: > >>>>> - > >>>>> vmTestbase/nsk/jdi/VirtualMachine/instanceCounts/instancecounts003/instancecounts003.java > >>>>> - > >>>>> vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects002/referringObjects002.java > >>>>> > >>>>> The problem with these two tests is that they consume all memory > >>>>> to force the class unloading that > >>>>> results in the exception during JVMCI compiler initialization and > >>>>> the test failure. > >>>>> The fix filters these tests out to not run with Graal compiler. 
> >>>>> > >>>>> Webrev: http://cr.openjdk.java.net/~dtitov/8195600/webrev.01/ > >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8195600 > >>>>> > >>>>> Thanks, > >>>>> Daniil > >>>>> > >>>>> > >>>> > >>> > >> > >> > > > > > > > > > From serguei.spitsyn at oracle.com Wed Aug 28 07:15:29 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 28 Aug 2019 00:15:29 -0700 Subject: RFR: JDK-8228554: Accessibility errors in jdwp-protocol.html In-Reply-To: <5a434ae0-4561-9dcd-fc9c-92fb203b54ac@oracle.com> References: <0be12a6e-74b8-4191-c812-b3f8b195b41f@oracle.com> <12824420-15de-2131-9211-60fc6cabc8fa@oracle.com> <087be63e-8751-3d7c-4614-ecd903f252f5@oracle.com> <5a434ae0-4561-9dcd-fc9c-92fb203b54ac@oracle.com> Message-ID: <9f25cb46-4d97-c9e2-2406-902ac1cad236@oracle.com> Hi Alex, Thank you for the update! The most interesting case of a table with a multilevel indent is: http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/2/jdwp-protocol.html#JDWP_EventRequest I still have a doubt as new variant looks a little less clear/comprehensive. What do you think? Thanks, Serguei On 8/26/19 16:44, Alex Menkov wrote: > Ok. > > Updated webrev: > http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/webrev.2/ > > > The difference vs v.1 is: > - ErrorSetNode.java - added 'style="width: 20%"' for the 1st column; > - ConstantSetNode.java - fixed width of the 1st column (20% -> 30%) > > generated doc: > http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/2/jdwp-protocol.html > > > --alex > > On 08/26/2019 16:00, David Holmes wrote: >> On 27/08/2019 8:47 am, Alex Menkov wrote: >>> On 08/26/2019 14:54, David Holmes wrote: >>>> On 27/08/2019 3:01 am, Alex Menkov wrote: >>>>> Hi Serguei, >>>>> >>>>> The change is intentional - it seems to me that there were too >>>>> many borders in the struct description tables. 
I thought about >>>>> removing some of them (or making them thiner or changing color to >>>>> gray). >>>>> I don't think absence of the lines makes comprehension of the >>>>> structures harder. >>>> >>>> I like the new look - especially now we have proper headers and no >>>> more strange looking empty cells! >>>> >>>> My only suggestion is to make the first column of each table the >>>> same width (were possible) so that the tables line up better - >>>> specifically the "Error Data" table's "Value" column should be the >>>> same width as the "Reply Data" table's "Type" column. >>> >>> Maybe then make 1st column of "Error Data" the same width as (Type + >>> Name) columns in Out Data/Reply Data? >> >> That would not look very good IMHO. >> >> David >> ----- >> >>> Then "Description" column in all tables will be 65%. >>> >>> BTW just discovered at error in Constants tables - they have column >>> 20%, 5% and 65% - going to update the 2st column to be 30% >>> >>> --alex >>> >>>> >>>> Thanks, >>>> David >>>> >>>>> --alex >>>>> >>>>> On 08/26/2019 00:58, serguei.spitsyn at oracle.com wrote: >>>>>> Hi Alex, >>>>>> >>>>>> I see one issue with new table format. >>>>>> For instance look at the table for "DisposeObjects Command (14)". >>>>>> Even a better example is "RedefineClasses Command (18)". >>>>>> In the old tables the indentation was highlighted with the >>>>>> vertical lines. >>>>>> It is missed in your version. 
>>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 8/23/19 16:54, Alex Menkov wrote: >>>>>>> Hi all, >>>>>>> >>>>>>> Please review the fix for >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8228554 >>>>>>> >>>>>>> webrev: >>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/webrev/ >>>>>>> >>>>>>> >>>>>>> generated docs: >>>>>>> old: >>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/0/jdwp-protocol.html >>>>>>> >>>>>>> new: >>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/1/jdwp-protocol.html >>>>>>> >>>>>>> >>>>>>> specdiff: >>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/spectdiff/diff.html >>>>>>> >>>>>>> >>>>>>> Summary: >>>>>>> - "content outside of a region" issues: >>>>>>> ? -
replaced with
, >>>>>>> ? -
replaced with
; >>>>>>> - table issues: >>>>>>> ? - added column headers to all tables; >>>>>>> ? - for every row specified row header; >>>>>>> ? - indentation with table "colspan" reimplemented by using CSS. >>>>>>> >>>>>>> --alex >>>>>> From david.holmes at oracle.com Wed Aug 28 07:23:28 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 28 Aug 2019 17:23:28 +1000 Subject: RFR: JDK-8228554: Accessibility errors in jdwp-protocol.html In-Reply-To: <9f25cb46-4d97-c9e2-2406-902ac1cad236@oracle.com> References: <0be12a6e-74b8-4191-c812-b3f8b195b41f@oracle.com> <12824420-15de-2131-9211-60fc6cabc8fa@oracle.com> <087be63e-8751-3d7c-4614-ecd903f252f5@oracle.com> <5a434ae0-4561-9dcd-fc9c-92fb203b54ac@oracle.com> <9f25cb46-4d97-c9e2-2406-902ac1cad236@oracle.com> Message-ID: <0d3b492b-9f62-11f4-db6f-6ed8c7fd1e81@oracle.com> Hi Serguei, On 28/08/2019 5:15 pm, serguei.spitsyn at oracle.com wrote: > Hi Alex, > > Thank you for the update! > > The most interesting case of a table with a multilevel indent is: > http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/2/jdwp-protocol.html#JDWP_EventRequest > > > I still have a doubt as new variant looks a little less > clear/comprehensive. > What do you think? I think the new version looks fine - the extra cell boundary markers in the original do not add anything to the clarity IMHO. I find both forms equally difficult to understand. Cheers, David > Thanks, > Serguei > > > > On 8/26/19 16:44, Alex Menkov wrote: >> Ok. 
>> >> Updated webrev: >> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/webrev.2/ >> >> >> The difference vs v.1 is: >> - ErrorSetNode.java - added 'style="width: 20%"' for the 1st column; >> - ConstantSetNode.java - fixed width of the 1st column (20% -> 30%) >> >> generated doc: >> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/2/jdwp-protocol.html >> >> >> --alex >> >> On 08/26/2019 16:00, David Holmes wrote: >>> On 27/08/2019 8:47 am, Alex Menkov wrote: >>>> On 08/26/2019 14:54, David Holmes wrote: >>>>> On 27/08/2019 3:01 am, Alex Menkov wrote: >>>>>> Hi Serguei, >>>>>> >>>>>> The change is intentional - it seems to me that there were too >>>>>> many borders in the struct description tables. I thought about >>>>>> removing some of them (or making them thiner or changing color to >>>>>> gray). >>>>>> I don't think absence of the lines makes comprehension of the >>>>>> structures harder. >>>>> >>>>> I like the new look - especially now we have proper headers and no >>>>> more strange looking empty cells! >>>>> >>>>> My only suggestion is to make the first column of each table the >>>>> same width (were possible) so that the tables line up better - >>>>> specifically the "Error Data" table's "Value" column should be the >>>>> same width as the "Reply Data" table's "Type" column. >>>> >>>> Maybe then make 1st column of "Error Data" the same width as (Type + >>>> Name) columns in Out Data/Reply Data? >>> >>> That would not look very good IMHO. >>> >>> David >>> ----- >>> >>>> Then "Description" column in all tables will be 65%. >>>> >>>> BTW just discovered at error in Constants tables - they have column >>>> 20%, 5% and 65% - going to update the 2st column to be 30% >>>> >>>> --alex >>>> >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> --alex >>>>>> >>>>>> On 08/26/2019 00:58, serguei.spitsyn at oracle.com wrote: >>>>>>> Hi Alex, >>>>>>> >>>>>>> I see one issue with new table format. 
>>>>>>> For instance look at the table for "DisposeObjects Command (14)". >>>>>>> Even a better example is "RedefineClasses Command (18)". >>>>>>> In the old tables the indentation was highlighted with the >>>>>>> vertical lines. >>>>>>> It is missed in your version. >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> On 8/23/19 16:54, Alex Menkov wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> Please review the fix for >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8228554 >>>>>>>> >>>>>>>> webrev: >>>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/webrev/ >>>>>>>> >>>>>>>> >>>>>>>> generated docs: >>>>>>>> old: >>>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/0/jdwp-protocol.html >>>>>>>> >>>>>>>> new: >>>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/1/jdwp-protocol.html >>>>>>>> >>>>>>>> >>>>>>>> specdiff: >>>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/spectdiff/diff.html >>>>>>>> >>>>>>>> >>>>>>>> Summary: >>>>>>>> - "content outside of a region" issues: >>>>>>>> ? -
replaced with
, >>>>>>>> ? -
replaced with
; >>>>>>>> - table issues: >>>>>>>> ? - added column headers to all tables; >>>>>>>> ? - for every row specified row header; >>>>>>>> ? - indentation with table "colspan" reimplemented by using CSS. >>>>>>>> >>>>>>>> --alex >>>>>>> > From martin.doerr at sap.com Wed Aug 28 10:53:42 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 28 Aug 2019 10:53:42 +0000 Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor In-Reply-To: <2d34d72b-2d2b-3907-2b69-86507664c697@oracle.com> References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> <9946e675-6e1e-3644-e0e9-e96c54313e3f@oracle.com> <2d34d72b-2d2b-3907-2b69-86507664c697@oracle.com> Message-ID: Hi David, I've run it through our nightly tests and got an assertion on Windows. Test: vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t001 Assertion: # Internal Error (src/hotspot/share/prims/jvmtiRawMonitor.cpp:173), pid=17824, tid=7808 # assert(__the_thread__->is_VM_thread()) failed: must be VM thread Current thread (0x00000213a5dbb800): GCTaskThread "GC Thread#0" [stack: 0x0000001bfbe00000,0x0000001bfbf00000] [id=7808] Stack: [0x0000001bfbe00000,0x0000001bfbf00000] Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) ... V [jvm.dll+0x50d1a2] report_vm_error+0x102 (debug.cpp:264) V [jvm.dll+0x8f3996] JvmtiRawMonitor::raw_enter+0x1e6 (jvmtirawmonitor.cpp:173) V [jvm.dll+0x8d6121] JvmtiEnv::RawMonitorEnter+0x211 (jvmtienv.cpp:3345) C [ap04t001.dll+0x30ef] So this assumption is not true: // No other non-Java threads besides VM thread would acquire // a raw monitor. This is the only issue I've seen so far. Best regards, Martin > -----Original Message----- > From: David Holmes > Sent: Dienstag, 27. 
August 2019 00:21 > To: serguei.spitsyn at oracle.com; serviceability-dev dev at openjdk.java.net>; jcbeyler at google.com; yasuenag at gmail.com; > sgehwolf at redhat.com; Doerr, Martin > Subject: Re: RFC: 8229160: Reimplement JvmtiRawMonitor to use > PlatformMonitor > > On 26/08/2019 6:25 pm, serguei.spitsyn at oracle.com wrote: > > Hi David, > > > > > > On 8/20/19 22:21, David Holmes wrote: > >> Hi Serguei, > >> > >> On 21/08/2019 9:58 am, serguei.spitsyn at oracle.com wrote: > >>> Hi David, > >>> > >>> The whole approach looks good to me. > >> > >> Thanks for taking a look. My main concern is about the interrupt > >> semantics, so I really need to get some end-user feedback on that > >> aspect as well. > > > > > > I don't have any opinion yet on what interrupt semantics tool developers > > really need. > > Yes, we may need to request some feedback. > > I've now explicitly added JC, Yasumasa, Severin and Martin, to this > email thread to try and solicit feedback from all the major players that > seem interested in this serviceability area. Folks I'd really appreciate > any feedback you may have here on the usecases for JvmtiRawMonitors, > and > in particular the use RawMonitorWait and its interaction with > Thread.interrupt. > > > My gut feeling tells me it is not good to break the original semantics. :) > > But let me think about it a little bit more. > > Me too, but I wanted to start simple. I suspect I will have to at least > implement time-based polling of the interrupt state. > > > Also, we need to file a CSR for this. > > Depending on how this proceeds, yes. > > > > >>> + if (jSelf != NULL) { > >>> + if (interruptible && Thread::is_interrupted(jSelf, true)) { > >>> + // We're now interrupted but we may have consumed a notification. > >>> + // To avoid lost wakeups we have to re-issue that notification, which > >>> + // may result in a spurious wakeup for another thread. > >>> Alternatively we > >>> + // ignore checking for interruption before returning. 
> >>> + notify(); > >>> + return false; // interrupted > >>> + } > >>> > >>> I'm a bit concerned about introduction of new spurious wake ups above. > >>> Some tests can be not defensive against it, so we may discover new > >>> intermittent failures. > >> > >> That is possible. Though given spurious wakeups are already possible > >> any test that is incorrectly using RawMonitorWait() without checking a > >> condition, is technically already broken. > > > > Agreed. > > I even think it is even better if spurious wakeups will happen more > > frequently. > > It should help to identify and fix such spots in the test base. > > Yes it is good tests. Alas not so good for production code :) > > >> > >> Not checking for interruption after the wait will also require some > >> test changes, and it weakens the interrupt semantics even further. > > > > I'm thinking about a small investigation on how this is used in our tests. > > There seem to be a few uses that are susceptible to spurious wakeup > errors, but those tests don't use interrupt. > > Thanks, > David > > > Thanks, > > Serguei > > > >> > >> Thanks, > >> David > >> ----- > >> > >>> Thanks, > >>> Serguei > >>> > >>> On 8/14/19 11:22 PM, David Holmes wrote: > >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229160 > >>>> > >>>> Preliminary webrev (still has rough edges): > >>>> http://cr.openjdk.java.net/~dholmes/8229160/webrev.prelim/ > >>>> > >>>> Background: > >>>> > >>>> We've had this comment for a long time: > >>>> > >>>> ?// The raw monitor subsystem is entirely distinct from normal > >>>> ?// java-synchronization or jni-synchronization.? raw monitors are not > >>>> ?// associated with objects.? They can be implemented in any manner > >>>> ?// that makes sense.? The original implementors decided to piggy-back > >>>> ?// the raw-monitor implementation on the existing Java > >>>> objectMonitor mechanism. > >>>> ?// This flaw needs to fixed.? We should reimplement raw monitors as > >>>> sui-generis. 
> >>>> ?// Specifically, we should not implement raw monitors via java > >>>> monitors. > >>>> ?// Time permitting, we should disentangle and deconvolve the two > >>>> implementations > >>>> ?// and move the resulting raw monitor implementation over to the > >>>> JVMTI directories. > >>>> ?// Ideally, the raw monitor implementation would be built on top of > >>>> ?// park-unpark and nothing else. > >>>> > >>>> This is an attempt to do that disentangling so that we can then > >>>> consider changes to ObjectMonitor without having to worry about > >>>> JvmtiRawMonitors. But rather than building on low-level park/unpark > >>>> (which would require the same manual queue management and much > of > >>>> the same complex code as exists in ObjectMonitor) I decided to try > >>>> and do this on top of PlatformMonitor. > >>>> > >>>> The reason this is just a RFC rather than RFR is that I overlooked a > >>>> non-trivial aspect of JvmtiRawMonitors: like Java monitors (as > >>>> implemented by ObjectMonitor) they interact with the > >>>> Thread.interrupt mechanism. This is not clearly stated in the JVM TI > >>>> specification [1] but only in passing by the possible errors for > >>>> RawMonitorWait: > >>>> > >>>> JVMTI_ERROR_INTERRUPT??? Wait was interrupted, try again > >>>> > >>>> As I explain in the bug report there is no way to build in proper > >>>> interrupt support using PlatformMonitor as there is no way we can > >>>> "interrupt" the low-level pthread_cond_wait. But we can approximate > >>>> it. What I've done in this preliminary version is just check > >>>> interrupt state before and after the actual "wait" but we won't get > >>>> woken by the interrupt once we have actually blocked. Alternatively > >>>> we could use a periodic polling approach and wakeup every Nms to > >>>> check for interruption. 
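The periodic polling alternative mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the webrev code: `PollingMonitor`, `WaitResult`, the `interrupted` flag and the persistent `notified_` flag are all hypothetical stand-ins, and `std::condition_variable` plays the role of PlatformMonitor.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical result type; not part of JVM TI or HotSpot.
enum class WaitResult { Notified, Interrupted };

// Hypothetical monitor whose wait() cannot be interrupted directly,
// so it wakes up periodically to poll an interrupt flag instead.
class PollingMonitor {
 public:
  // Blocks until notify() has been called or the flag is observed set.
  // The flag is only checked once per poll_interval, so an interrupt
  // may be noticed with up to that much delay.
  WaitResult wait(const std::atomic<bool>& interrupted,
                  std::chrono::milliseconds poll_interval) {
    assert(poll_interval.count() > 0);
    std::unique_lock<std::mutex> lock(mutex_);
    for (;;) {
      // Check the interrupt first and leave any pending notification
      // in place, so that it is not consumed and lost.
      if (interrupted.load()) return WaitResult::Interrupted;
      if (notified_) {
        notified_ = false;
        return WaitResult::Notified;
      }
      // Timed wait: returns on notify, on timeout, or spuriously.
      cond_.wait_for(lock, poll_interval);
    }
  }

  void notify() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      notified_ = true;
    }
    cond_.notify_one();
  }

 private:
  std::mutex mutex_;
  std::condition_variable cond_;
  bool notified_ = false;  // simplification: a real monitor keeps no such flag
};
```

The trade-off is the one raised in the mail: a shorter interval gives better responsiveness to the interrupt at the cost of more wakeups.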
> >>>>
> >>>> The only use of JvmtiRawMonitors in the JDK libraries (JDWP) is not
> >>>> affected by this choice as that code ignores the interrupt until the
> >>>> real action it was waiting for has occurred. The interrupt is then
> >>>> reposted later.
> >>>>
> >>>> But more generally there could be users of JvmtiRawMonitors that
> >>>> expect/require that RawMonitorWait is responsive to Thread.interrupt
> >>>> in a manner similar to Object.wait. And if any of them are reading
> >>>> this then I'd like to know - hence this RFC :)
> >>>>
> >>>> FYI testing to date:
> >>>>  - tiers 1-3 all platforms
> >>>>  - hotspot: serviceability/jvmti
> >>>>                           /jdwp
> >>>>             vmTestbase/nsk/jvmti
> >>>>                           /jdwp
> >>>>  - JDK: com/sun/jdi
> >>>>
> >>>> Comments/opinions appreciated.
> >>>>
> >>>> Thanks,
> >>>> David
> >>>>
> >>>> [1] https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait
> >>>>
> >>>
> >

From david.holmes at oracle.com Wed Aug 28 12:33:14 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 28 Aug 2019 22:33:14 +1000
Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor
In-Reply-To:
References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> <9946e675-6e1e-3644-e0e9-e96c54313e3f@oracle.com> <2d34d72b-2d2b-3907-2b69-86507664c697@oracle.com>
Message-ID: <0c0dce1f-27c8-415e-4157-29108b05eb4e@oracle.com>

Hi Martin,

On 28/08/2019 8:53 pm, Doerr, Martin wrote:
> Hi David,
>
> I've run it through our nightly tests and got an assertion on Windows.
>
> Test:
> vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t001
>
> Assertion:
> # Internal Error (src/hotspot/share/prims/jvmtiRawMonitor.cpp:173), pid=17824, tid=7808
> # assert(__the_thread__->is_VM_thread()) failed: must be VM thread

That's very interesting - the assertion exists in the current code as well.

> Current thread (0x00000213a5dbb800): GCTaskThread "GC Thread#0" [stack: 0x0000001bfbe00000,0x0000001bfbf00000] [id=7808]

Now why would a GCTaskThread be executing code that accesses JvmtiRawMonitors? Are we in some kind of event callback?

> Stack: [0x0000001bfbe00000,0x0000001bfbf00000]
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> ...
> V [jvm.dll+0x50d1a2] report_vm_error+0x102 (debug.cpp:264)
> V [jvm.dll+0x8f3996] JvmtiRawMonitor::raw_enter+0x1e6 (jvmtirawmonitor.cpp:173)
> V [jvm.dll+0x8d6121] JvmtiEnv::RawMonitorEnter+0x211 (jvmtienv.cpp:3345)
> C [ap04t001.dll+0x30ef]

Is there any more stack? What is that dll?

> So this assumption is not true:
> // No other non-Java threads besides VM thread would acquire
> // a raw monitor.
>
> This is the only issue I've seen so far.

As I said, this is not a new assertion, so it's interesting that you found this. Can you tell me what test this was and how to reproduce?

Thanks,
David
-----

> Best regards,
> Martin
>
>
>> -----Original Message-----
>> From: David Holmes
>> Sent: Tuesday, 27 August 2019 00:21
>> To: serguei.spitsyn at oracle.com; serviceability-dev
> dev at openjdk.java.net>; jcbeyler at google.com; yasuenag at gmail.com;
>> sgehwolf at redhat.com; Doerr, Martin
>> Subject: Re: RFC: 8229160: Reimplement JvmtiRawMonitor to use
>> PlatformMonitor
>>
>> On 26/08/2019 6:25 pm, serguei.spitsyn at oracle.com wrote:
>>> Hi David,
>>>
>>>
>>> On 8/20/19 22:21, David Holmes wrote:
>>>> Hi Serguei,
>>>>
>>>> On 21/08/2019 9:58 am, serguei.spitsyn at oracle.com wrote:
>>>>> Hi David,
>>>>>
>>>>> The whole approach looks good to me.
>>>>
>>>> Thanks for taking a look. My main concern is about the interrupt
>>>> semantics, so I really need to get some end-user feedback on that
>>>> aspect as well.
>>>
>>>
>>> I don't have any opinion yet on what interrupt semantics tool developers
>>> really need.
>>> Yes, we may need to request some feedback.
>> >> I've now explicitly added JC, Yasumasa, Severin and Martin, to this >> email thread to try and solicit feedback from all the major players that >> seem interested in this serviceability area. Folks I'd really appreciate >> any feedback you may have here on the usecases for JvmtiRawMonitors, >> and >> in particular the use RawMonitorWait and its interaction with >> Thread.interrupt. >> >>> My gut feeling tells me it is not good to break the original semantics. :) >>> But let me think about it a little bit more. >> >> Me too, but I wanted to start simple. I suspect I will have to at least >> implement time-based polling of the interrupt state. >> >>> Also, we need to file a CSR for this. >> >> Depending on how this proceeds, yes. >> >>> >>>>> + if (jSelf != NULL) { >>>>> + if (interruptible && Thread::is_interrupted(jSelf, true)) { >>>>> + // We're now interrupted but we may have consumed a notification. >>>>> + // To avoid lost wakeups we have to re-issue that notification, which >>>>> + // may result in a spurious wakeup for another thread. >>>>> Alternatively we >>>>> + // ignore checking for interruption before returning. >>>>> + notify(); >>>>> + return false; // interrupted >>>>> + } >>>>> >>>>> I'm a bit concerned about introduction of new spurious wake ups above. >>>>> Some tests can be not defensive against it, so we may discover new >>>>> intermittent failures. >>>> >>>> That is possible. Though given spurious wakeups are already possible >>>> any test that is incorrectly using RawMonitorWait() without checking a >>>> condition, is technically already broken. >>> >>> Agreed. >>> I even think it is even better if spurious wakeups will happen more >>> frequently. >>> It should help to identify and fix such spots in the test base. >> >> Yes it is good tests. Alas not so good for production code :) >> >>>> >>>> Not checking for interruption after the wait will also require some >>>> test changes, and it weakens the interrupt semantics even further. 
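The point about spurious wakeups in the exchange above comes down to always re-checking a condition around the wait. A minimal sketch of that defensive pattern follows, assuming `std::condition_variable` as a stand-in for a raw monitor; `ReadyGate` and its members are hypothetical names, and the analogous JVM TI agent code would presumably loop on `RawMonitorWait` while its own condition is false.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical one-shot gate used only to illustrate the wait loop.
class ReadyGate {
 public:
  // Sets the condition under the lock, then wakes all waiters.
  void open() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      ready_ = true;
    }
    cond_.notify_all();
  }

  // Waits in a loop: a wakeup alone proves nothing, so the condition
  // is re-tested every time wait() returns. A spurious or re-issued
  // notification therefore just causes one more trip round the loop.
  void await_open() {
    std::unique_lock<std::mutex> lock(mutex_);
    while (!ready_) {
      cond_.wait(lock);
    }
    assert(ready_);
  }

  bool is_open() {
    std::lock_guard<std::mutex> lock(mutex_);
    return ready_;
  }

 private:
  std::mutex mutex_;
  std::condition_variable cond_;
  bool ready_ = false;
};
```

Code written this way should be unaffected by the extra `notify()` discussed above; code that treats a return from the wait as proof of the condition is the kind described as already broken.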
>>> >>> I'm thinking about a small investigation on how this is used in our tests. >> >> There seem to be a few uses that are susceptible to spurious wakeup >> errors, but those tests don't use interrupt. >> >> Thanks, >> David >> >>> Thanks, >>> Serguei >>> >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> On 8/14/19 11:22 PM, David Holmes wrote: >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229160 >>>>>> >>>>>> Preliminary webrev (still has rough edges): >>>>>> http://cr.openjdk.java.net/~dholmes/8229160/webrev.prelim/ >>>>>> >>>>>> Background: >>>>>> >>>>>> We've had this comment for a long time: >>>>>> >>>>>> ?// The raw monitor subsystem is entirely distinct from normal >>>>>> ?// java-synchronization or jni-synchronization.? raw monitors are not >>>>>> ?// associated with objects.? They can be implemented in any manner >>>>>> ?// that makes sense.? The original implementors decided to piggy-back >>>>>> ?// the raw-monitor implementation on the existing Java >>>>>> objectMonitor mechanism. >>>>>> ?// This flaw needs to fixed.? We should reimplement raw monitors as >>>>>> sui-generis. >>>>>> ?// Specifically, we should not implement raw monitors via java >>>>>> monitors. >>>>>> ?// Time permitting, we should disentangle and deconvolve the two >>>>>> implementations >>>>>> ?// and move the resulting raw monitor implementation over to the >>>>>> JVMTI directories. >>>>>> ?// Ideally, the raw monitor implementation would be built on top of >>>>>> ?// park-unpark and nothing else. >>>>>> >>>>>> This is an attempt to do that disentangling so that we can then >>>>>> consider changes to ObjectMonitor without having to worry about >>>>>> JvmtiRawMonitors. But rather than building on low-level park/unpark >>>>>> (which would require the same manual queue management and much >> of >>>>>> the same complex code as exists in ObjectMonitor) I decided to try >>>>>> and do this on top of PlatformMonitor. 
>>>>>> >>>>>> The reason this is just a RFC rather than RFR is that I overlooked a >>>>>> non-trivial aspect of JvmtiRawMonitors: like Java monitors (as >>>>>> implemented by ObjectMonitor) they interact with the >>>>>> Thread.interrupt mechanism. This is not clearly stated in the JVM TI >>>>>> specification [1] but only in passing by the possible errors for >>>>>> RawMonitorWait: >>>>>> >>>>>> JVMTI_ERROR_INTERRUPT??? Wait was interrupted, try again >>>>>> >>>>>> As I explain in the bug report there is no way to build in proper >>>>>> interrupt support using PlatformMonitor as there is no way we can >>>>>> "interrupt" the low-level pthread_cond_wait. But we can approximate >>>>>> it. What I've done in this preliminary version is just check >>>>>> interrupt state before and after the actual "wait" but we won't get >>>>>> woken by the interrupt once we have actually blocked. Alternatively >>>>>> we could use a periodic polling approach and wakeup every Nms to >>>>>> check for interruption. >>>>>> >>>>>> The only use of JvmtiRawMonitors in the JDK libraries (JDWP) is not >>>>>> affected by this choice as that code ignores the interrupt until the >>>>>> real action it was waiting for has occurred. The interrupt is then >>>>>> reposted later. >>>>>> >>>>>> But more generally there could be users of JvmtiRawMonitors that >>>>>> expect/require that RawMonitorWait is responsive to Thread.interrupt >>>>>> in a manner similar to Object.wait. And if any of them are reading >>>>>> this then I'd like to know - hence this RFC :) >>>>>> >>>>>> FYI testing to date: >>>>>> ?- tiers 1 -3 all platforms >>>>>> ?- hotspot: serviceability/jvmti >>>>>> ????????????????????????? /jdwp >>>>>> ??????????? vmTestbase/nsk/jvmti >>>>>> ????????????????????????? /jdwp >>>>>> ?- JDK: com/sun/jdi >>>>>> >>>>>> Comments/opinions appreciated. 
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>> [1] https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#RawMonitorWait
>>>>>>
>>>>>
>>>

From martin.doerr at sap.com Wed Aug 28 12:57:13 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Wed, 28 Aug 2019 12:57:13 +0000
Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor
In-Reply-To: <0c0dce1f-27c8-415e-4157-29108b05eb4e@oracle.com>
References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> <9946e675-6e1e-3644-e0e9-e96c54313e3f@oracle.com> <2d34d72b-2d2b-3907-2b69-86507664c697@oracle.com> <0c0dce1f-27c8-415e-4157-29108b05eb4e@oracle.com>
Message-ID:

Hi David,

> Now why would a GCTaskThread be executing code that accesses
> JvmtiRawMonitors? Are we in some kind of event callback?

I believe so. The test registers the following callbacks:
    callbacks.GarbageCollectionStart = &GarbageCollectionStart;
    callbacks.GarbageCollectionFinish = &GarbageCollectionFinish;
And these callback functions use jvmti->RawMonitorEnter.

Note that the spec for "Raw Monitor Enter" allows this:
"This function may be called from the callbacks to the Heap iteration functions, or from the event handlers for the GarbageCollectionStart, GarbageCollectionFinish, and ObjectFree events."

> Is there any more stack? What is that dll?

The dll belongs to the test. I guess it was built without debug info so there's no native stack trace available.

> Can you tell me what test this was and how to reproduce?

make run-test TEST="vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t001"
I haven't checked how reliably it reproduces. It may be sporadic.

At least, I can confirm that the following comment is true:
// FIXME: this is broken - raw_enter only accepts the VMThread

Best regards,
Martin

From yasuenag at gmail.com Wed Aug 28 14:14:03 2019
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Wed, 28 Aug 2019 23:14:03 +0900
Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor
In-Reply-To:
References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> <9946e675-6e1e-3644-e0e9-e96c54313e3f@oracle.com> <2d34d72b-2d2b-3907-2b69-86507664c697@oracle.com> <0c0dce1f-27c8-415e-4157-29108b05eb4e@oracle.com>
Message-ID: <6886d18c-00c1-b027-dee1-a161c9bf4a63@gmail.com>

Hi,

I guess this issue occurred on the ObjectFree JVMTI event.
It is registered in ap04t001.cpp [1]. I think it might be called on a GC worker because it relates to JvmtiExport::weak_oops_do().

In the current code, JvmtiRawMonitor::_owner is handled with atomic operations.
However, at JvmtiRawMonitor::quick_enter() in the new webrev, it is just referenced normally.
Should it be handled atomically?

At least, I think this assert might fail in the current code because a raw monitor might be handled on a GC worker. So the "FIXME" comment Martin found is valid...

Thanks,

Yasumasa

[1] hg.openjdk.java.net/jdk/jdk/file/4f38fcd65577/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t001/ap04t001.cpp#l219

On 2019/08/28 21:57, Doerr, Martin wrote:
> Hi David,
>
>> Now why would a GCTaskThread be executing code that accesses
>> JvmtiRawMonitors? Are we in some kind of event callback?
> I believe so. The test registers the following callbacks:
>     callbacks.GarbageCollectionStart = &GarbageCollectionStart;
>     callbacks.GarbageCollectionFinish = &GarbageCollectionFinish;
> And these callback functions use jvmti->RawMonitorEnter.
>
> Note that the spec for "Raw Monitor Enter" allows this:
> "This function may be called from the callbacks to the Heap iteration functions, or from the event handlers for the GarbageCollectionStart, GarbageCollectionFinish, and ObjectFree events."
>
>> Is there any more stack?
What is that dll? > The dll belongs to the test. I guess it was built without debug info so there's no native stack trace available. > >> Can you tell me what test this was and how to reproduce? > make run-test TEST="vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t001" > I haven't tried if it can be reproduced well. May be sporadic. > > At least, I can confirm that the following comment is true ?? > // FIXME: this is broken - raw_enter only accepts the VMThread > > Best regards, > Martin > From serguei.spitsyn at oracle.com Wed Aug 28 16:54:33 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 28 Aug 2019 09:54:33 -0700 Subject: RFR: JDK-8228554: Accessibility errors in jdwp-protocol.html In-Reply-To: <0d3b492b-9f62-11f4-db6f-6ed8c7fd1e81@oracle.com> References: <0be12a6e-74b8-4191-c812-b3f8b195b41f@oracle.com> <12824420-15de-2131-9211-60fc6cabc8fa@oracle.com> <087be63e-8751-3d7c-4614-ecd903f252f5@oracle.com> <5a434ae0-4561-9dcd-fc9c-92fb203b54ac@oracle.com> <9f25cb46-4d97-c9e2-2406-902ac1cad236@oracle.com> <0d3b492b-9f62-11f4-db6f-6ed8c7fd1e81@oracle.com> Message-ID: Hi David, Thank you for sharing your opinion. I do not have any good suggestion how to improve it. Alex, I'm fine with the fix as it is. Thanks, Serguei On 8/28/19 00:23, David Holmes wrote: > Hi Serguei, > > On 28/08/2019 5:15 pm, serguei.spitsyn at oracle.com wrote: >> Hi Alex, >> >> Thank you for the update! >> >> The most interesting case of a table with a multilevel indent is: >> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/2/jdwp-protocol.html#JDWP_EventRequest >> >> >> I still have a doubt as new variant looks a little less >> clear/comprehensive. >> What do you think? > > I think the new version looks fine - the extra cell boundary markers > in the original do not add anything to the clarity IMHO. I find both > forms equally difficult to understand. 
> > Cheers, > David > > >> Thanks, >> Serguei >> >> >> >> On 8/26/19 16:44, Alex Menkov wrote: >>> Ok. >>> >>> Updated webrev: >>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/webrev.2/ >>> >>> >>> The difference vs v.1 is: >>> - ErrorSetNode.java - added 'style="width: 20%"' for the 1st column; >>> - ConstantSetNode.java - fixed width of the 1st column (20% -> 30%) >>> >>> generated doc: >>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/2/jdwp-protocol.html >>> >>> >>> --alex >>> >>> On 08/26/2019 16:00, David Holmes wrote: >>>> On 27/08/2019 8:47 am, Alex Menkov wrote: >>>>> On 08/26/2019 14:54, David Holmes wrote: >>>>>> On 27/08/2019 3:01 am, Alex Menkov wrote: >>>>>>> Hi Serguei, >>>>>>> >>>>>>> The change is intentional - it seems to me that there were too >>>>>>> many borders in the struct description tables. I thought about >>>>>>> removing some of them (or making them thiner or changing color >>>>>>> to gray). >>>>>>> I don't think absence of the lines makes comprehension of the >>>>>>> structures harder. >>>>>> >>>>>> I like the new look - especially now we have proper headers and >>>>>> no more strange looking empty cells! >>>>>> >>>>>> My only suggestion is to make the first column of each table the >>>>>> same width (were possible) so that the tables line up better - >>>>>> specifically the "Error Data" table's "Value" column should be >>>>>> the same width as the "Reply Data" table's "Type" column. >>>>> >>>>> Maybe then make 1st column of "Error Data" the same width as (Type >>>>> + Name) columns in Out Data/Reply Data? >>>> >>>> That would not look very good IMHO. >>>> >>>> David >>>> ----- >>>> >>>>> Then "Description" column in all tables will be 65%. 
>>>>> >>>>> BTW just discovered at error in Constants tables - they have >>>>> column 20%, 5% and 65% - going to update the 2st column to be 30% >>>>> >>>>> --alex >>>>> >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> --alex >>>>>>> >>>>>>> On 08/26/2019 00:58, serguei.spitsyn at oracle.com wrote: >>>>>>>> Hi Alex, >>>>>>>> >>>>>>>> I see one issue with new table format. >>>>>>>> For instance look at the table for "DisposeObjects Command (14)". >>>>>>>> Even a better example is "RedefineClasses Command (18)". >>>>>>>> In the old tables the indentation was highlighted with the >>>>>>>> vertical lines. >>>>>>>> It is missed in your version. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>> On 8/23/19 16:54, Alex Menkov wrote: >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> Please review the fix for >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8228554 >>>>>>>>> >>>>>>>>> webrev: >>>>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/webrev/ >>>>>>>>> >>>>>>>>> >>>>>>>>> generated docs: >>>>>>>>> old: >>>>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/0/jdwp-protocol.html >>>>>>>>> >>>>>>>>> new: >>>>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/1/jdwp-protocol.html >>>>>>>>> >>>>>>>>> >>>>>>>>> specdiff: >>>>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/jdwp-protocol_accessibility/spectdiff/diff.html >>>>>>>>> >>>>>>>>> >>>>>>>>> Summary: >>>>>>>>> - "content outside of a region" issues: >>>>>>>>> ? -
replaced with
, >>>>>>>>> ? -
replaced with
;
>>>>>>>>> - table issues:
>>>>>>>>>   - added column headers to all tables;
>>>>>>>>>   - for every row specified row header;
>>>>>>>>>   - indentation with table "colspan" reimplemented by using CSS.
>>>>>>>>>
>>>>>>>>> --alex
>>>>>>>>
>>

From hohensee at amazon.com Wed Aug 28 19:22:19 2019
From: hohensee at amazon.com (Hohensee, Paul)
Date: Wed, 28 Aug 2019 19:22:19 +0000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
Message-ID:

Please review a performance improvement for ThreadMXBean.getThreadAllocatedBytes and the addition of getCurrentThreadAllocatedBytes.

JBS issue: https://bugs.openjdk.java.net/browse/JDK-8207266
Webrev: http://cr.openjdk.java.net/~phh/8207266/webrev.00/
CSR: https://bugs.openjdk.java.net/browse/JDK-8230311

Previous email threads:
https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024441.html
https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-August/024763.html

The CSR is for adding ThreadMXBean.getCurrentThreadAllocatedBytes. It'd be great for someone to review it.

I took Mandy's advice and put the fast paths in the library code. I added a new JMM method GetOneThreadsAllocatedBytes that works the same as GetThreadCpuTime: it uses a thread_id value of zero to distinguish the current thread. On my Mac laptop, the result runs 47x faster for the current thread than the old implementation.

The 3 tests in test/jdk/com/sun/management/ThreadMXBean all pass. I added code to ThreadAllocatedMemory.java to test getCurrentThreadAllocatedBytes as well as variations on getThreadAllocatedBytes(id). A submit repo job is in progress.

Thanks,

Paul

From Alan.Bateman at oracle.com Wed Aug 28 20:21:00 2019
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Wed, 28 Aug 2019 21:21:00 +0100
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To:
References:
Message-ID: <4e4a608d-8074-5ac3-3cbc-d141934f27bb@oracle.com>

On 28/08/2019 20:22, Hohensee, Paul wrote:
>
> Please review a performance improvement for
> ThreadMXBean.getThreadAllocatedBytes and the addition of
> getCurrentThreadAllocatedBytes.
>
> JBS issue: https://bugs.openjdk.java.net/browse/JDK-8207266
>
> Webrev: http://cr.openjdk.java.net/~phh/8207266/webrev.00/
>
> CSR: https://bugs.openjdk.java.net/browse/JDK-8230311
>
> Previous email threads:
> https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024441.html
> https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-August/024763.html
>
> The CSR is for adding ThreadMXBean.getCurrentThreadAllocatedBytes. It'd
> be great for someone to review it.
>

I suspect the new method needs to be specified to be for the local management case only, and it might be useful to specify it to be the equivalent of getThreadAllocatedBytes(Thread.currentThread().getId()).

-Alan

From hohensee at amazon.com Wed Aug 28 20:40:45 2019
From: hohensee at amazon.com (Hohensee, Paul)
Date: Wed, 28 Aug 2019 20:40:45 +0000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To: <4e4a608d-8074-5ac3-3cbc-d141934f27bb@oracle.com>
References: <4e4a608d-8074-5ac3-3cbc-d141934f27bb@oracle.com>
Message-ID: <835FCE11-B2DC-4481-8FEB-436A0043B5E4@amazon.com>

Thanks for taking a look, Alan. What does the "local management case" mean?

I can certainly specify getCurrentThreadAllocatedBytes as being equivalent to getThreadAllocatedBytes(Thread.currentThread().getId()).

Paul

From: Alan Bateman
Date: Wednesday, August 28, 2019 at 1:24 PM
To: "Hohensee, Paul" , OpenJDK Serviceability
Subject: Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread

On 28/08/2019 20:22, Hohensee, Paul wrote:

Please review a performance improvement for ThreadMXBean.getThreadAllocatedBytes and the addition of getCurrentThreadAllocatedBytes.

JBS issue: https://bugs.openjdk.java.net/browse/JDK-8207266
Webrev: http://cr.openjdk.java.net/~phh/8207266/webrev.00/
CSR: https://bugs.openjdk.java.net/browse/JDK-8230311

Previous email threads:
https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024441.html
https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-August/024763.html

The CSR is for adding ThreadMXBean.getCurrentThreadAllocatedBytes. It'd be great for someone to review it.

I suspect the new method needs to be specified to be for the local management case only, and it might be useful to specify it to be the equivalent of getThreadAllocatedBytes(Thread.currentThread().getId()).

-Alan

From mandy.chung at oracle.com Wed Aug 28 22:58:43 2019
From: mandy.chung at oracle.com (Mandy Chung)
Date: Wed, 28 Aug 2019 15:58:43 -0700
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To:
References:
Message-ID:

Hi Paul,

The CSR proposes this method in java.lang.management.ThreadMXBean as a Java SE feature.

Has this been discussed with the GC team, to commit to measuring the current thread's allocated bytes as a Java SE feature? Can this be supported by all JVM implementations? What is the overhead if this is enabled by default? Does it need to be disabled? This metric comes from the TLAB, so that might be okay. This needs advice/discussion with GC experts.

I see that the CSR mentions it can be disabled and links to the isThreadAllocatedMemoryEnabled() and setThreadAllocatedMemoryEnabled() methods, but these methods are defined in com.sun.management.ThreadMXBean.

As Alan points out, the current thread makes sense only in local VM management. When this is monitored from a JMX client (e.g. jconsole connecting to a running JVM), is the "currentThreadAllocatedBytes" attribute that of the current thread in the jconsole process, the one invoking Thread::currentThread?

Mandy

On 8/28/19 12:22 PM, Hohensee, Paul wrote:
>
> Please review a performance improvement for
> ThreadMXBean.getThreadAllocatedBytes and the addition of
> getCurrentThreadAllocatedBytes.
>
> JBS issue: https://bugs.openjdk.java.net/browse/JDK-8207266
>
> Webrev: http://cr.openjdk.java.net/~phh/8207266/webrev.00/
>
> CSR: https://bugs.openjdk.java.net/browse/JDK-8230311
>
> Previous email threads:
> https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024441.html
> https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-August/024763.html
>
> The CSR is for adding ThreadMXBean.getCurrentThreadAllocatedBytes. It'd
> be great for someone to review it.
>
> I took Mandy's advice and put the fast paths in the library code. I
> added a new JMM method GetOneThreadsAllocatedBytes that works the same
> as GetThreadCpuTime: it uses a thread_id value of zero to distinguish
> the current thread. On my Mac laptop, the result runs 47x faster for
> the current thread than the old implementation.
>
> The 3 tests in test/jdk/com/sun/management/ThreadMXBean all pass. I
> added code to ThreadAllocatedMemory.java to test
> getCurrentThreadAllocatedBytes as well as variations on
> getThreadAllocatedBytes(id). A submit repo job is in progress.
>
> Thanks,
>
> Paul
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From david.holmes at oracle.com Wed Aug 28 23:19:36 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 29 Aug 2019 09:19:36 +1000
Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor
In-Reply-To:
References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> <9946e675-6e1e-3644-e0e9-e96c54313e3f@oracle.com> <2d34d72b-2d2b-3907-2b69-86507664c697@oracle.com> <0c0dce1f-27c8-415e-4157-29108b05eb4e@oracle.com>
Message-ID: <4201ba6d-0fb2-16cb-c375-26560c15e936@oracle.com>

On 28/08/2019 10:57 pm, Doerr, Martin wrote:
> Hi David,
>
>> Now why would a GCTaskThread be executing code that accesses
>> JvmtiRawMonitors? Are we in some kind of event callback?
> I believe so. The test registers the following callbacks:
>     callbacks.GarbageCollectionStart = &GarbageCollectionStart;
>     callbacks.GarbageCollectionFinish = &GarbageCollectionFinish;
> And these callback functions use jvmti->RawMonitorEnter.
>
> Note that the spec for "Raw Monitor Enter" allows this:
> "This function may be called from the callbacks to the Heap iteration functions, or from the event handlers for the GarbageCollectionStart, GarbageCollectionFinish, and ObjectFree events."

Yep, it is allowed.

>> Is there any more stack? What is that dll?
> The dll belongs to the test. I guess it was built without debug info so there's no native stack trace available.
>
>> Can you tell me what test this was and how to reproduce?
> make run-test TEST="vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t001"
> I haven't tried if it can be reproduced well. May be sporadic.

Doesn't seem to reproduce on Linux.

> At least, I can confirm that the following comment is true:
> // FIXME: this is broken - raw_enter only accepts the VMThread

As I noted, I had yet to reconcile the fact that the outer raw monitor code seems to allow non-JavaThreads while the inner code only allows the VMThread!

I'm quite surprised we have not encountered this bug before now - unless there is some really subtle code difference I'm missing here.

Anyway, this is an aside to the question of interrupt semantics that needs to be addressed before this can proceed. :)

Thanks,
David

> Best regards,
> Martin
>

From david.holmes at oracle.com Wed Aug 28 23:30:40 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 29 Aug 2019 09:30:40 +1000
Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor
In-Reply-To: <6886d18c-00c1-b027-dee1-a161c9bf4a63@gmail.com>
References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> <9946e675-6e1e-3644-e0e9-e96c54313e3f@oracle.com> <2d34d72b-2d2b-3907-2b69-86507664c697@oracle.com> <0c0dce1f-27c8-415e-4157-29108b05eb4e@oracle.com> <6886d18c-00c1-b027-dee1-a161c9bf4a63@gmail.com>
Message-ID: <9a36be5b-fdbd-b5b3-9759-3cf36a48c145@oracle.com>

Hi Yasumasa,

On 29/08/2019 12:14 am, Yasumasa Suenaga wrote:
> Hi,
>
> I guess this issue was occurred on ObjectFree JVMTI event.
> It is registered in ap04t001.cpp [1]. I think it might be called on GC
> worker because it relates to JvmtiExport::weak_oops_do().
>
> In current code, JvmtiRawMonitor::_owner is handled with atomic operation.
> However at JvmtiRawMonitor::quick_enter() in new webrev, it just
> referent normally.
> Should it be handled atomically?

No. In the new code "_owner" is only an informational field used by the current thread to detect its own recursive lock usage. Actual ownership is determined by the successful lock or tryLock of the underlying PlatformMonitor.

However, I had been thinking about memory ordering and there are missing membars. We have to do:

   lock(); _owner=THREAD;
   _owner=NULL; unlock();

in the order specified, so I need to insert storestore barriers.

>
> At least, I think this assert might be failed in current code because
> raw monitor might be handled on GC worker. So "FIXME" comment Martin
> found is valid...

Yes. Though I'd still like to understand why we don't see the same assertion failure in existing code.

But moving on ... what are your thoughts on the interrupt semantics?

Thanks,
David

>
> Thanks,
>
> Yasumasa
>
>
> [1]
> hg.openjdk.java.net/jdk/jdk/file/4f38fcd65577/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t001/ap04t001.cpp#l219
>
>
>
> On 2019/08/28 21:57, Doerr, Martin wrote:
>> Hi David,
>>
>>> Now why would a GCTaskThread be executing code that accesses
>>> JvmtiRawMonitors? Are we in some kind of event callback?
>> I believe so. The test registers the following callbacks:
>>      callbacks.GarbageCollectionStart = &GarbageCollectionStart;
>>      callbacks.GarbageCollectionFinish = &GarbageCollectionFinish;
>> And these callback functions use jvmti->RawMonitorEnter.
>>
>> Note that the spec for "Raw Monitor Enter" allows this:
>> "This function may be called from the callbacks to the Heap iteration
>> functions, or from the event handlers for the GarbageCollectionStart,
>> GarbageCollectionFinish, and ObjectFree events."
>>
>>> Is there any more stack? What is that dll?
>> The dll belongs to the test. I guess it was built without debug info
>> so there's no native stack trace available.
>>
>>> Can you tell me what test this was and how to reproduce?
>> make run-test
>> TEST="vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t001"
>> I haven't tried if it can be reproduced well. May be sporadic.
>>
>> At least, I can confirm that the following comment is true:
>> // FIXME: this is broken - raw_enter only accepts the VMThread
>>
>> Best regards,
>> Martin
>>

From yasuenag at gmail.com Thu Aug 29 02:47:51 2019
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Thu, 29 Aug 2019 11:47:51 +0900
Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor
In-Reply-To: <9a36be5b-fdbd-b5b3-9759-3cf36a48c145@oracle.com>
References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com>
 <9946e675-6e1e-3644-e0e9-e96c54313e3f@oracle.com>
 <2d34d72b-2d2b-3907-2b69-86507664c697@oracle.com>
 <0c0dce1f-27c8-415e-4157-29108b05eb4e@oracle.com>
 <6886d18c-00c1-b027-dee1-a161c9bf4a63@gmail.com>
 <9a36be5b-fdbd-b5b3-9759-3cf36a48c145@oracle.com>
Message-ID:

Hi David,

On Thu, Aug 29, 2019 at 8:30, David Holmes wrote:

> Hi Yasumasa,
>
> On 29/08/2019 12:14 am, Yasumasa Suenaga wrote:
> > Hi,
> >
> > I guess this issue was occurred on ObjectFree JVMTI event.
> > It is registered in a04t001.cpp [1]. I think it might be called on GC
> > worker because it relates to JvmtiExport::weak_oops_do().
> >
> > In current code, JvmtiRawMonitor::_owner is handled with atomic operation.
> > However at JvmtiRawMonitor::quick_enter() in new webrev, it just
> > referent normally.
> > Should it be handled atomically?
>
> No. In the new code "_owner" is only an informational field used by the
> current thread to detect its own recursive lock usage. Actual ownership
> is determined by the successful lock or tryLock of the underlying
> PlatformMonitor.
>
> However I had been thinking about memory ordering and there are missing
> membars. We have to do:
>
>    lock(); _owner=THREAD;
>    _owner=NULL; unlock();
>
> in the order specified so I need to insert storestore barriers.
>

Sounds good.

> > At least, I think this assert might be failed in current code because
> > raw monitor might be handled on GC worker. So "FIXME" comment Martin
> > found is valid...
>
> Yes. Though I'd still like to understand why we don't see the same
> assertion failure in existing code.
>
> But moving on ... what are your thoughts on the interrupt semantics?
>

I think interruption support for raw monitors can be optional, and the behavioral change would not have a large impact on JVMTI native agent developers.

I have not used raw monitors with Thread::interrupt because the raw monitor API is C/C++, so in my experience their use stays within JVMTI agent code.

I used a raw monitor in a HeapStats [1] prototype in the past, to kick the callback functions for GarbageCollectionStart/Finish, much like a semaphore. I did not need interruption for that. In JVMTI event callbacks especially, raw monitors do not need to be exposed to the Java layer.

Thanks,

Yasumasa

[1] https://icedtea.classpath.org/wiki/HeapStats

> Thanks,
> David
>
> >
> > Thanks,
> >
> > Yasumasa
> >
> >
> > [1]
> > hg.openjdk.java.net/jdk/jdk/file/4f38fcd65577/test/hotspot/jtreg/vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t001/ap04t001.cpp#l219
> >
> >
> >
> > On 2019/08/28 21:57, Doerr, Martin wrote:
> >> Hi David,
> >>
> >>> Now why would a GCTaskThread being executing code that accesses
> >>> JvmtiRawMonitors? Are we in some kind of event callback?
> >> I believe so. The test registers the following callbacks:
> >>      callbacks.GarbageCollectionStart = &GarbageCollectionStart;
> >>      callbacks.GarbageCollectionFinish = &GarbageCollectionFinish;
> >> And these callback functions use jvmti->RawMonitorEnter.
> >>
> >> Note that the spec for "Raw Monitor Enter" allows this:
> >> "This function may be called from the callbacks to the Heap iteration
> >> functions, or from the event handlers for the GarbageCollectionStart,
> >> GarbageCollectionFinish, and ObjectFree events."
> >>
> >>> Is there any more stack? What is that dll?
> >> The dll belongs to the test. I guess it was built without debug info
> >> so there's no native stack trace available.
> >>
> >>> Can you tell me what test this was and how to reproduce?
> >> make run-test
> >> TEST="vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t001"
> >> I haven't tried if it can be reproduced well. May be sporadic.
> >>
> >> At least, I can confirm that the following comment is true
> >> // FIXME: this is broken - raw_enter only accepts the VMThread
> >>
> >> Best regards,
> >> Martin
> >>
>

From serguei.spitsyn at oracle.com Thu Aug 29 02:54:56 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 28 Aug 2019 19:54:56 -0700
Subject: RFR: 8229378: jdwp library loader in linker_md.c quietly truncates on buffer overflow
In-Reply-To:
References: <9bab421c-08a1-b554-6ac9-35290856ee56@oracle.com>
Message-ID: <0c01c46b-3ff9-3016-8791-868019459d13@oracle.com>

An HTML attachment was scrubbed...

From Alan.Bateman at oracle.com Thu Aug 29 07:18:49 2019
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Thu, 29 Aug 2019 08:18:49 +0100
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To:
References:
Message-ID: <9b62699c-0aac-a8fd-27a6-a8d4d3820dd2@oracle.com>

On 28/08/2019 23:58, Mandy Chung wrote:
> Hi Paul,
>
> The CSR proposes this method in java.lang.management.ThreadMXBean as a
> Java SE feature.
>
> Has this been discussed with the GC team to commit measuring current
> thread's allocated bytes as Java SE feature? Can this be supported
> by all JVM implementation? What is the overhead if this is enabled
> by default? Does it need to be disabled? This metric is from TLAB
> that might be okay. This needs advice/discussion with GC experts.

The webrev adds it to jdk.management/com.sun.management.ThreadMXBean so
I suspect it is a typo in the CSR and the proposal is for it to be
JDK-specific.

-Alan.
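
As a rough illustration of the JDK-specific interface discussed above, the sketch below reads the calling thread's allocated bytes through com.sun.management.ThreadMXBean. It assumes a HotSpot-based JDK that exposes that interface, and it deliberately uses the existing getThreadAllocatedBytes(long) call rather than the getCurrentThreadAllocatedBytes() method still under review in this thread. The class name AllocatedBytesSketch and the helper currentThreadAllocatedBytes are hypothetical.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper class, not part of the webrev under review.
final class AllocatedBytesSketch {

    /**
     * Returns an approximation of the bytes allocated so far by the calling
     * thread, or -1 if the JDK-specific bean is missing, unsupported or disabled.
     */
    static long currentThreadAllocatedBytes() {
        java.lang.management.ThreadMXBean platformBean =
                ManagementFactory.getThreadMXBean();
        // The allocation counters live on the JDK-specific subinterface,
        // not on the Java SE java.lang.management.ThreadMXBean.
        if (!(platformBean instanceof com.sun.management.ThreadMXBean)) {
            return -1L;
        }
        com.sun.management.ThreadMXBean bean =
                (com.sun.management.ThreadMXBean) platformBean;
        if (!bean.isThreadAllocatedMemorySupported()
                || !bean.isThreadAllocatedMemoryEnabled()) {
            return -1L;
        }
        // Existing API: looks the thread up by id, even for the caller itself.
        return bean.getThreadAllocatedBytes(Thread.currentThread().getId());
    }

    public static void main(String[] args) {
        long before = currentThreadAllocatedBytes();
        byte[] chunk = new byte[1 << 20]; // allocate roughly 1 MiB on this thread
        long after = currentThreadAllocatedBytes();
        System.out.println("before=" + before + " after=" + after
                + " chunk=" + chunk.length);
    }
}
```

Returning -1 for the unsupported case is a choice made for this sketch only; a caller that prefers the bean's own behaviour can drop the guards and call getThreadAllocatedBytes directly.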
From martin.doerr at sap.com Thu Aug 29 11:06:46 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 29 Aug 2019 11:06:46 +0000 Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor In-Reply-To: <4201ba6d-0fb2-16cb-c375-26560c15e936@oracle.com> References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> <9946e675-6e1e-3644-e0e9-e96c54313e3f@oracle.com> <2d34d72b-2d2b-3907-2b69-86507664c697@oracle.com> <0c0dce1f-27c8-415e-4157-29108b05eb4e@oracle.com> <4201ba6d-0fb2-16cb-c375-26560c15e936@oracle.com> Message-ID: Hi David, shouldn't _recursions get set to 0 before unlocking in raw_wait? Best regards, Martin > -----Original Message----- > From: David Holmes > Sent: Donnerstag, 29. August 2019 01:20 > To: Doerr, Martin ; serguei.spitsyn at oracle.com; > serviceability-dev ; > jcbeyler at google.com; yasuenag at gmail.com; sgehwolf at redhat.com > Subject: Re: RFC: 8229160: Reimplement JvmtiRawMonitor to use > PlatformMonitor > > On 28/08/2019 10:57 pm, Doerr, Martin wrote: > > Hi David, > > > >> Now why would a GCTaskThread being executing code that accesses > >> JvmtiRawMonitors? Are we in some kind of event callback? > > I believe so. The test registers the following callbacks: > > callbacks.GarbageCollectionStart = &GarbageCollectionStart; > > callbacks.GarbageCollectionFinish = &GarbageCollectionFinish; > > And these callback functions use jvmti->RawMonitorEnter. > > > > Note that the spec for "Raw Monitor Enter" allows this: > > "This function may be called from the callbacks to the Heap iteration > functions, or from the event handlers for the GarbageCollectionStart, > GarbageCollectionFinish, and ObjectFree events." > > Yep it is allowed. > > >> Is there any more stack? What is that dll? > > The dll belongs to the test. I guess it was built without debug info so there's > no native stack trace available. > > > >> Can you tell me what test this was and how to reproduce? 
> > make run-test > TEST="vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t001" > > I haven't tried if it can be reproduced well. May be sporadic. > > Doesn't seem to reproduce on Linux. > > > At least, I can confirm that the following comment is true ?? > > // FIXME: this is broken - raw_enter only accepts the VMThread > > As I noted I had yet to reconcile the fact the outer raw monitor code > seems to allow non-JavaThreads while the inner code only allows the > VMThread! I'm quite surprised we have not encountered this bug before > now - unless there is some really subtle code difference I'm missing here. > > Anyway this is an aside to the question of interrupt semantics that > needs to be addressed before this can proceed. :) > > Thanks, > David > > > Best regards, > > Martin > > From david.holmes at oracle.com Thu Aug 29 11:53:20 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 29 Aug 2019 21:53:20 +1000 Subject: RFC: 8229160: Reimplement JvmtiRawMonitor to use PlatformMonitor In-Reply-To: References: <842a1d43-bdcc-3345-2731-d92b477e3ad3@oracle.com> <9946e675-6e1e-3644-e0e9-e96c54313e3f@oracle.com> <2d34d72b-2d2b-3907-2b69-86507664c697@oracle.com> <0c0dce1f-27c8-415e-4157-29108b05eb4e@oracle.com> <4201ba6d-0fb2-16cb-c375-26560c15e936@oracle.com> Message-ID: <5eecfb2d-a1de-1b78-2215-5d2a21ee5c2c@oracle.com> Hi Martin, On 29/08/2019 9:06 pm, Doerr, Martin wrote: > Hi David, > > shouldn't _recursions get set to 0 before unlocking in raw_wait? Yes. And there's a missing assertion that _recursions==0 in quick-enter. Thanks, David > Best regards, > Martin > > >> -----Original Message----- >> From: David Holmes >> Sent: Donnerstag, 29. 
August 2019 01:20 >> To: Doerr, Martin ; serguei.spitsyn at oracle.com; >> serviceability-dev ; >> jcbeyler at google.com; yasuenag at gmail.com; sgehwolf at redhat.com >> Subject: Re: RFC: 8229160: Reimplement JvmtiRawMonitor to use >> PlatformMonitor >> >> On 28/08/2019 10:57 pm, Doerr, Martin wrote: >>> Hi David, >>> >>>> Now why would a GCTaskThread being executing code that accesses >>>> JvmtiRawMonitors? Are we in some kind of event callback? >>> I believe so. The test registers the following callbacks: >>> callbacks.GarbageCollectionStart = &GarbageCollectionStart; >>> callbacks.GarbageCollectionFinish = &GarbageCollectionFinish; >>> And these callback functions use jvmti->RawMonitorEnter. >>> >>> Note that the spec for "Raw Monitor Enter" allows this: >>> "This function may be called from the callbacks to the Heap iteration >> functions, or from the event handlers for the GarbageCollectionStart, >> GarbageCollectionFinish, and ObjectFree events." >> >> Yep it is allowed. >> >>>> Is there any more stack? What is that dll? >>> The dll belongs to the test. I guess it was built without debug info so there's >> no native stack trace available. >>> >>>> Can you tell me what test this was and how to reproduce? >>> make run-test >> TEST="vmTestbase/nsk/jvmti/scenarios/allocation/AP04/ap04t001" >>> I haven't tried if it can be reproduced well. May be sporadic. >> >> Doesn't seem to reproduce on Linux. >> >>> At least, I can confirm that the following comment is true ?? >>> // FIXME: this is broken - raw_enter only accepts the VMThread >> >> As I noted I had yet to reconcile the fact the outer raw monitor code >> seems to allow non-JavaThreads while the inner code only allows the >> VMThread! I'm quite surprised we have not encountered this bug before >> now - unless there is some really subtle code difference I'm missing here. >> >> Anyway this is an aside to the question of interrupt semantics that >> needs to be addressed before this can proceed. 
:)
>>
>> Thanks,
>> David
>>
>>> Best regards,
>>> Martin
>>>

From adam.farley at uk.ibm.com Thu Aug 29 13:26:01 2019
From: adam.farley at uk.ibm.com (Adam Farley8)
Date: Thu, 29 Aug 2019 14:26:01 +0100
Subject: RFR: 8229378: jdwp library loader in linker_md.c quietly truncates on buffer overflow
In-Reply-To: <0c01c46b-3ff9-3016-8791-868019459d13@oracle.com>
References: <9bab421c-08a1-b554-6ac9-35290856ee56@oracle.com>
 <0c01c46b-3ff9-3016-8791-868019459d13@oracle.com>
Message-ID:

Hi Serguei,

I haven't actually run a fastdebug build before. Will do that now and address the issues.

Once done, I'll re-run the tests I ran, and also the tests you've listed below. Can you advise on how "good coverage" is determined, so I know for future bug fixes?

As for the up-to-date-ness, I'll update the build before doing the above.

Expect a webrev once all of this is complete.

Best Regards

Adam Farley
IBM Runtimes


"serguei.spitsyn at oracle.com" wrote on 29/08/2019 03:54:56:

> From: "serguei.spitsyn at oracle.com"
> To: Adam Farley8
> Cc: Chris Plummer, daniel.daugherty at oracle.com, serviceability-dev at openjdk.java.net
> Date: 29/08/2019 04:23
> Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c
> quietly truncates on buffer overflow
>
> Hi Adam,
>
> Sorry for the latency.
> I was in process to build, test and push your fix and got the
> fastdebug build errors below.
>
> So, my question is if you've ever built the fastdebug version.
> This change is in the system-dependent code, so it has to be tested
> on both Unix and Windows.
>
> > My testing was limited to the bug specific test case I mentioned,
> > and the following jdwp tests:
> >
> > test/jdk/com/sun/jdi/Jdwp*
> > test/hotspot/jtreg/serviceability/jdwp
>
> This set of tests does not provide a good coverage.
> To make sure nothing is broken you need to run the test/jdk/com/sun/jdi
> and also the following vmTestbase tests:
>
> test/hotspot/jtreg/vmTestbase/nsk/jdi
> test/hotspot/jtreg/vmTestbase/nsk/jdb
> test/hotspot/jtreg/vmTestbase/nsk/jdwp
>
> BTW, your current webrev is not up-to-date:
> http://cr.openjdk.java.net/~afarley/8229378/webrev/
>
> I guess, the change in the src/hotspot/share/runtime/os.cpp became
> obsolete after your previous fix that was already pushed.
>
> Thanks,
> Serguei
>
> . . .
> In file included from /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/unix/native/libjdwp/linker_md.c:37:0:
> /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/unix/native/libjdwp/linker_md.c: In function ‘dll_build_name’:
> /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/share/native/libjdwp/util.h:46:23: error: ‘Do’ undeclared (first use in this function)
> #define strdup(p) Do not use this interface.
> ^
> /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/unix/native/libjdwp/linker_md.c:51:18: note: in expansion of macro ‘strdup’
> paths_copy = strdup(paths);
> ^
> /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/share/native/libjdwp/util.h:46:23: note: each undeclared identifier is reported only once for each function it appears in
> #define strdup(p) Do not use this interface.
> ^
> /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/unix/native/libjdwp/linker_md.c:51:18: note: in expansion of macro ‘strdup’
> paths_copy = strdup(paths);
> ^
> /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/share/native/libjdwp/util.h:46:26: error: expected ‘;’ before ‘not’
> #define strdup(p) Do not use this interface.
> ^
> /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/unix/native/libjdwp/linker_md.c:51:18: note: in expansion of macro ‘strdup’
> paths_copy = strdup(paths);
> ^
> /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/share/native/libjdwp/util.h:38:24: error: expected ‘;’ before ‘not’
> #define free(p) Do not use this interface.
> ^
> /scratch/sspitsyn/jdk14.1/open/src/jdk.jdwp.agent/unix/native/libjdwp/linker_md.c:71:5: note: in expansion of macro ‘free’
> free(paths_copy);
> ^
> gmake[3]: *** [/scratch/sspitsyn/jdk14.1/build/linux-x86_64-server-fastdebug/support/native/jdk.jdwp.agent/libjdwp/linker_md.o] Error 1
> gmake[2]: *** [jdk.jdwp.agent-libs] Error 1
> gmake[2]: *** Waiting for unfinished jobs....
>
> ERROR: Build failed for target 'images' in configuration 'linux-x86_64-server-fastdebug' (exit code 2)
>
>
>
> On 8/13/19 09:28, Adam Farley8 wrote:
> Hi Serguei, Daniel,
>
> My testing was limited to the bug specific test case I mentioned,
> and the following jdwp tests:
>
> test/jdk/com/sun/jdi/Jdwp*
> test/hotspot/jtreg/serviceability/jdwp
>
> Best Regards
>
> Adam Farley
> IBM Runtimes
>
>
> "serguei.spitsyn at oracle.com" wrote on 13/08/2019 17:04:43:
>
> > From: "serguei.spitsyn at oracle.com"
> > To: daniel.daugherty at oracle.com, Adam Farley8, Chris Plummer
> > Cc: serviceability-dev at openjdk.java.net
> > Date: 13/08/2019 17:08
> > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c
> > quietly truncates on buffer overflow
> >
> > Hi Adam,
> >
> > I'm looking at your fix.
> > Also interested about your testing.
> >
> > Thanks,
> > Serguei
> >
> > On 8/13/19 08:48, Daniel D. Daugherty wrote:
> > I don't see any information about how this change was tested...
> > Is there something on another email thread?
> >
> > Dan
> >
>
> > On 8/13/19 11:41 AM, Adam Farley8 wrote:
> > Hi Chris,
> >
> > Thanks!
> >
> > I understand we need a second reviewer/sponsor to get this change
> > in. Any volunteers?
> > > > Best Regards > > > > Adam Farley > > IBM Runtimes > > > > > > Chris Plummer wrote on 12/08/2019 21:35:06: > > > > > From: Chris Plummer > > > To: Adam Farley8 , serviceability- > > dev at openjdk.java.net > > > Date: 12/08/2019 21:35 > > > Subject: Re: RFR: 8229378: jdwp library loader in linker_md.c > > > quietly truncates on buffer overflow > > > > > > Hi Adam, > > > > > > It looks good to me. > > > > > > thanks, > > > > > > Chris > > > > > > On 8/12/19 7:34 AM, Adam Farley8 wrote: > > > Hi All, > > > > > > This is a known bug, mentioned in a code comment. > > > > > > Here is the fix for that bug. > > > > > > Reviewers and sponsors requested. > > > > > > Short version: if you set sun.boot.library.path to > > > something beyond a system's max path length, the > > > current code will return an empty string (rather than > > > printing a useful error message and shutting down). > > > > > > This is also a problem if you've specified multiple > > > paths with a separator, as this code seems to wrongly > > > assess whether the *total* length exceeds max path > > > length. So two 200 char paths on windows will cause > > > failure, as the total length is 400 (which is beyond > > > max length for windows). > > > > > > Note that the os.cpp bit of the webrev will not be included > > > in the final webrev, it just makes this change trivially > > > testable. > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8229378 > > > Webrev: http://cr.openjdk.java.net/~afarley/8229378/webrev/ > > > > > > > > > Best Regards > > > > > > Adam Farley > > > IBM Runtimes > > > > > > Unless stated otherwise above: > > > IBM United Kingdom Limited - Registered in England and Wales with > > > number 741598. > > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > > Unless stated otherwise above: > > IBM United Kingdom Limited - Registered in England and Wales with > > number 741598. 
> > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with > number 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikhailo.seledtsov at oracle.com Thu Aug 29 15:41:57 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Thu, 29 Aug 2019 08:41:57 -0700 Subject: RFR(S): 8228960: [TESTBUG] containers/docker/TestJcmdWithSideCar.java: jcmd reports main class as 'Unknown' In-Reply-To: <5877bb3c-ede3-ce33-9df9-391b3997a6ac@oracle.com> References: <9c9ec6b1-139a-4e67-8460-b586456fd493@oracle.com> <854c1dfe97937389230599e72c22aa0cbb0bf600.camel@redhat.com> <5D4B58E9.2070002@oracle.com> <9418a5e6-750d-80bf-4060-d023812f37e7@oracle.com> <6cf3a353-e5c5-28e9-113b-98ded3c1f9e7@oracle.com> <3143B636-9729-4238-9149-3A562B288643@oracle.com> <5877bb3c-ede3-ce33-9df9-391b3997a6ac@oracle.com> Message-ID: <2a8d9d95-3ca9-5792-b4ea-d8a800ce51ce@oracle.com> I believe I need a second reviewer for this change. Could someone, please, review this change version 2 ? (David already reviewed it). http://cr.openjdk.java.net/~mseledtsov/8228960.02/ Thank you in advance, Misha On 8/26/19 12:32 PM, mikhailo.seledtsov at oracle.com wrote: > Hi David, > > ? Thank you for review. > > On 8/26/19 12:57 AM, David Holmes wrote: >> Hi Misha, >> >> On 24/08/2019 3:21 am, mikhailo.seledtsov at oracle.com wrote: >>> Finally got some time to work on this issue. >>> Since I have encountered problem using files for passing messages >>> between a container and a test driver (due to permissions), I looked >>> for alternative solutions. 
I am using the output of a container
>>> process to signal when the main method has started, and it works.
>>> This simplifies things quite a bit as well.
>>>
>>> Normally, we use OutputAnalyzer test utility to collect the whole
>>> output once the process has completed, and then analyze the
>>> resulting output for "contains some string", match, etc. However,
>>> testutils/ProcessTools provides an API to consume the output as it
>>> is produced. I am using this API to detect when the main() method of
>>> the container has started.
>>
>> That seems reasonable. Do we want to make the following change to
>> minimise unneeded output processing:
>>
>>         private Consumer<String> outputConsumer = s -> {
>> !           if (!mainMethodStarted &&
>>                 s.contains(EventGeneratorLoop.MAIN_METHOD_STARTED)) {
>>                 System.out.println("MainContainer: setting mainMethodStarted");
>>                 mainMethodStarted = true;
>>             }
>>         };
> Thank you for the suggestion. I will update the code accordingly.
>>
>>> Updated webrev:
>>>      http://cr.openjdk.java.net/~mseledtsov/8228960.02/
>>
>> Otherwise looks okay. Hopefully those other test cases will be
>> enabled in the not too distant future.
>
> I hope so as well.
>
>
> Thank you,
>
> Misha
>
>>
>> Thanks,
>> David
>> -----
>>
>>>
>>> Testing:
>>>
>>>    Ran the test on Linux-x64, various multiple nodes in a test
>>> cluster 50 times - All PASS
>>>
>>>
>>> Thank you,
>>>
>>> Misha
>>>
>>> On 8/13/19 2:05 PM, Bob Vandette wrote:
>>>>
>>>>> On Aug 13, 2019, at 3:28 PM, mikhailo.seledtsov at oracle.com wrote:
>>>>>
>>>>>
>>>>> On 8/13/19 12:06 PM, Bob Vandette wrote:
>>>>>>> On Aug 13, 2019, at 2:57 PM, mikhailo.seledtsov at oracle.com wrote:
>>>>>>>
>>>>>>> Hi Bob,
>>>>>>>
>>>>>>>    The workdir (JTwork/scratch) is created with the "test user"
>>>>>>> permissions. Let me try to place the "signal" file in /tmp
>>>>>>> instead, since /tmp should normally have a 777 permission on Linux.
>>>>>> Aren't you creating a file inside a docker container and then
>>>>>> checking for its existence outside of the container?
>>>>> Correct
>>>>>> Isn't the root user running inside the container?
>>>>> By default it is. But it still fails to create a file, for some
>>>>> reason. Can be related to selinux settings (for instance, see this
>>>>> article:
>>>>> https://stackoverflow.com/questions/24288616/permission-denied-on-accessing-host-directory-in-docker/31334443),
>>>>> I can not change those.
>>>> Is your JTWork/scratch on an NFS mounted file system? If this is
>>>> the case then the problem is that root is equivalent to nobody on
>>>> mounted file systems and can't create files unless the directory
>>>> has 777 permissions. I just confirmed this. You'd have to either run
>>>> the container test as test-user or change the scratch directory
>>>> permission.
>>>>
>>>> Bob.
>>>>
>>>>> My hope is that /tmp is configured to be accessed by a container
>>>>> engine as a general purpose directory, hence I was thinking to try
>>>>> it out.
>>>>>
>>>>>> Both processes don't see the same /tmp right? So that shouldn't
>>>>>> help.
>>>>> In my next experiment, I will map a /tmp from host to be a
>>>>> /host-tmp inside the container (--volume /tmp:/host-tmp), then
>>>>> write a signal file to /host-tmp.
>>>>>> If scratch has 777 permissions, anyone can create a file.
>>>>> scratch has "rwxr-xr-x"
>>>>>> You have to be careful that you can clean up the
>>>>>> file from outside the container. I'd make sure to create it with
>>>>>> 777.
>>>>> I do use deleteOnExit(), so it should work (unless the JVM
>>>>> crashes). I guess I could add extra layer of safety here, and set
>>>>> the permissions to 777. Thank you for advice.
>>>>>
>>>>>
>>>>> Thank you,
>>>>>
>>>>> Misha
>>>>>
>>>>>> Bob.
>>>>>>
>>>>>>> If this works, I will have to add some unique number to the file
>>>>>>> name, perhaps a PID of a child process.
>>>>>>> >>>>>>> I will try this, and let you know how it works. >>>>>>> >>>>>>> >>>>>>> Thank you, >>>>>>> >>>>>>> Misha >>>>>>> >>>>>>> On 8/13/19 6:34 AM, Bob Vandette wrote: >>>>>>>> Sorry, I just looked at the webrev and you are trying the >>>>>>>> approach I suggested. I thought you >>>>>>>> were trying to use file change notification. >>>>>>>> >>>>>>>> Where does the workdir get created?? Does it have 777 permissions? >>>>>>>> >>>>>>>> Bob. >>>>>>>> >>>>>>>> >>>>>>>>> On Aug 13, 2019, at 9:29 AM, Bob Vandette >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> What if you just poll for the creation of the file waiting >>>>>>>>> some small amount of time between polling with a maximum timeout. >>>>>>>>> >>>>>>>>> Bob. >>>>>>>>> >>>>>>>>> >>>>>>>>>> On Aug 12, 2019, at 8:22 PM, mikhailo.seledtsov at oracle.com >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Unfortunately, this approach does not seem to work on many of >>>>>>>>>> our test cluster machines. The creation of a "signal" file >>>>>>>>>> results in "PermissionDenied". >>>>>>>>>> >>>>>>>>>> The possible reason is the selinux configuration, or some >>>>>>>>>> other permission related stuff. The container tries to create >>>>>>>>>> a new file on a mounted volume on a host system, but host >>>>>>>>>> system denies it. I will look a bit deeper into this, but I >>>>>>>>>> think this type of issue can be encountered on any automated >>>>>>>>>> test system. Hence, we may have to abandon this approach. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Misha >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 8/12/19 3:59 PM, mikhailo.seledtsov at oracle.com wrote: >>>>>>>>>>> Here is an updated webrev: >>>>>>>>>>> http://cr.openjdk.java.net/~mseledtsov/8228960.01/ >>>>>>>>>>> >>>>>>>>>>> I am using a simple file-based mechanism to communicate >>>>>>>>>>> between the processes. The "EventGeneratorLoop" process >>>>>>>>>>> creates a specific "signal" file on a shared mounted volume, >>>>>>>>>>> while the main test process waits? 
for the file to exist >>>>>>>>>>> before running the test cases. >>>>>>>>>>> >>>>>>>>>>> Passes on Linux-x64 Docker-enabled host. Testing in the test >>>>>>>>>>> cluster is in progress. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thank you, >>>>>>>>>>> >>>>>>>>>>> Misha >>>>>>>>>>> >>>>>>>>>>> On 8/7/19 5:11 PM, David Holmes wrote: >>>>>>>>>>>> On 8/08/2019 9:04 am, Mikhailo Seledtsov wrote: >>>>>>>>>>>>> Hi Severin, Bob, >>>>>>>>>>>>> >>>>>>>>>>>>> ?? Thank you for reviewing the code. >>>>>>>>>>>>> >>>>>>>>>>>>> On 8/7/19, 11:38 AM, Bob Vandette wrote: >>>>>>>>>>>>>> Can?t you come up with a better way of synchronizing the >>>>>>>>>>>>>> test by possibly writing a >>>>>>>>>>>>>> file and waiting for it to exist with a timeout? >>>>>>>>>>>>> I will try out this approach. >>>>>>>>>>>> This seems like a fundamental problem with jcmd - so cc'ing >>>>>>>>>>>> serviceability-dev. >>>>>>>>>>>> >>>>>>>>>>>> But I'm pretty sure they recently addressed a similar issue >>>>>>>>>>>> with the premature sending of the attach signal? >>>>>>>>>>>> >>>>>>>>>>>> David >>>>>>>>>>>> ----- >>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Misha >>>>>>>>>>>>>> Isn?t there a shared volume between the two >>>>>>>>>>>>>> processes? >>>>>>>>>>>>>> >>>>>>>>>>>>>> We?ve been fighting test reliability for a while now.? I >>>>>>>>>>>>>> can only hope we?re getting >>>>>>>>>>>>>> to the end. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Bob. >>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Aug 7, 2019, at 2:18 PM, Severin >>>>>>>>>>>>>>> Gehwolf wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Misha, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Tue, 2019-08-06 at 20:17 -0700, >>>>>>>>>>>>>>> mikhailo.seledtsov at oracle.com wrote: >>>>>>>>>>>>>>>> Please review this change that fixes a container test >>>>>>>>>>>>>>>> TestJcmdWithSideCar. 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> My investigation indicated that a root cause for this >>>>>>>>>>>>>>>> failure is: >>>>>>>>>>>>>>>> JCMD -l shows 'Unknown' for class name because the main >>>>>>>>>>>>>>>> class has not >>>>>>>>>>>>>>>> been loaded yet. >>>>>>>>>>>>>>>> The target test JVM has started, it is initializing, >>>>>>>>>>>>>>>> but has not loaded >>>>>>>>>>>>>>>> the main test class. >>>>>>>>>>>>>>> That's what I've found too. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The proposed solution is to try 'jcmd -l' several >>>>>>>>>>>>>>>> times, with a short >>>>>>>>>>>>>>>> sleep in between. >>>>>>>>>>>>>>> Thread.sleep() isn't great, but I'm not sure there is an >>>>>>>>>>>>>>> alternative. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Also I have commented out the testCase02() due to >>>>>>>>>>>>>>>> another bug: >>>>>>>>>>>>>>>> "JDK-8228850: jhsdb jinfo fails with ClassCastException: >>>>>>>>>>>>>>>> s.j.h.oops.TypeArray cannot be cast to >>>>>>>>>>>>>>>> s.j.h.oops.Instance", >>>>>>>>>>>>>>>> which is not a test bug. IMO, it is better to run the >>>>>>>>>>>>>>>> test and skip a >>>>>>>>>>>>>>>> sub-case than to skip the entire test. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ???? JBS: https://bugs.openjdk.java.net/browse/JDK-8228960 >>>>>>>>>>>>>>>> ???? Webrev: >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~mseledtsov/8228960.00/ >>>>>>>>>>>>>>> Looks OK to me. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Severin >>>>>>>>>>>>>>> From hohensee at amazon.com Thu Aug 29 17:01:17 2019 From: hohensee at amazon.com (Hohensee, Paul) Date: Thu, 29 Aug 2019 17:01:17 +0000 Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread In-Reply-To: References: Message-ID: My bad, Mandy. The webrev puts getCurrentThreadAllocatedBytes in com.sun.management.ThreadMXBean along with the current two getThreadAllocatedBytes methods for the reasons you list. I?ve updated the CSR to specify com.sun.management and added a rationale. 
AllocatedBytes is currently enabled by HotSpot by default because the overhead of recording TLAB occupancy is negligible. There's no new GC code, nor will there be, so imo we don't have to involve the GC folks. I.e., the new JMM method GetOneThreadsAllocatedBytes uses the existing cooked_allocated_bytes JavaThread method, and getCurrentThreadAllocatedBytes is the same as getThreadAllocatedBytes: it just bypasses the thread lookup code.

I hadn't previously tracked down what happens when getCurrentThreadUserTime and getCurrentThreadCpuTime are called, but if I'm not mistaken, the code in jcmd() in attachListener.cpp will call GetThreadCpuTimeWithKind in management.cpp, and it will ultimately use Thread::current() as the subject of the call, see os::current_thread_cpu_time in os_linux.cpp. That means that the CurrentThread methods should work remotely the same way they do locally. GetOneThreadsAllocatedBytes in management.cpp uses THREAD as its subject when called on behalf of getCurrentThreadAllocatedBytes, so it will also use the current remote Java thread.

Even if these methods only worked locally, there are many setups where apps are self-monitoring that could use the performance improvement.

Thanks,

Paul

From: Mandy Chung
Date: Wednesday, August 28, 2019 at 3:59 PM
To: "Hohensee, Paul"
Cc: OpenJDK Serviceability, "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread

Hi Paul,

The CSR proposes this method in java.lang.management.ThreadMXBean as a Java SE feature.

Has this been discussed with the GC team to commit measuring current thread's allocated bytes as Java SE feature? Can this be supported by all JVM implementation? What is the overhead if this is enabled by default? Does it need to be disabled? This metric is from TLAB that might be okay. This needs advice/discussion with GC experts.
I see that the CSR mentions it can be disabled and links to the isThreadAllocatedMemoryEnabled() and setThreadAllocatedMemoryEnabled() methods, but these methods are defined in com.sun.management.ThreadMXBean. As Alan points out, current thread makes sense only in local VM management. When this is monitored from a JMX client (e.g. jconsole connecting to a running JVM), is the "currentThreadAllocatedBytes" attribute that of the current thread in the jconsole process, which is invoking Thread::currentThread? Mandy On 8/28/19 12:22 PM, Hohensee, Paul wrote: Please review a performance improvement for ThreadMXBean.getThreadAllocatedBytes and the addition of getCurrentThreadAllocatedBytes. JBS issue: https://bugs.openjdk.java.net/browse/JDK-8207266 Webrev: http://cr.openjdk.java.net/~phh/8207266/webrev.00/ CSR: https://bugs.openjdk.java.net/browse/JDK-8230311 Previous email threads: https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024441.html https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-August/024763.html The CSR is for adding ThreadMXBean.getCurrentThreadAllocatedBytes. It'd be great for someone to review it. I took Mandy's advice and put the fast paths in the library code. I added a new JMM method GetOneThreadsAllocatedBytes that works the same as GetThreadCpuTime: it uses a thread_id value of zero to distinguish the current thread. On my Mac laptop, the result runs 47x faster for the current thread than the old implementation. The 3 tests in test/jdk/com/sun/management/ThreadMXBean all pass. I added code to ThreadAllocatedMemory.java to test getCurrentThreadAllocatedBytes as well as variations on getThreadAllocatedBytes(id). A submit repo job is in progress. Thanks, Paul -------------- next part -------------- An HTML attachment was scrubbed...
URL: From bob.vandette at oracle.com Thu Aug 29 17:55:20 2019 From: bob.vandette at oracle.com (Bob Vandette) Date: Thu, 29 Aug 2019 13:55:20 -0400 Subject: RFR(S): 8228960: [TESTBUG] containers/docker/TestJcmdWithSideCar.java: jcmd reports main class as 'Unknown' In-Reply-To: <2a8d9d95-3ca9-5792-b4ea-d8a800ce51ce@oracle.com> References: <9c9ec6b1-139a-4e67-8460-b586456fd493@oracle.com> <854c1dfe97937389230599e72c22aa0cbb0bf600.camel@redhat.com> <5D4B58E9.2070002@oracle.com> <9418a5e6-750d-80bf-4060-d023812f37e7@oracle.com> <6cf3a353-e5c5-28e9-113b-98ded3c1f9e7@oracle.com> <3143B636-9729-4238-9149-3A562B288643@oracle.com> <5877bb3c-ede3-ce33-9df9-391b3997a6ac@oracle.com> <2a8d9d95-3ca9-5792-b4ea-d8a800ce51ce@oracle.com> Message-ID: Misha, Looks good. A couple of nits. 1. You might want to remove hotspot_containers section of ProblemList.txt since there are no bugs listed. 2. Can you make this timeout a constant just like TIME_TO_RUN_MAIN_PROCESS 73 mainContainer.waitForMainMethodStart(5*1000); 3. assertIsAlive() is not used except for the commented out tests. Do you think you'll ultimately use this method or is this left over from previous attempts? 222 public void assertIsAlive() throws Exception { Bob. > On Aug 29, 2019, at 11:41 AM, mikhailo.seledtsov at oracle.com wrote: > > I believe I need a second reviewer for this change. Could someone, please, review this change version 2 ? (David already reviewed it). > > http://cr.openjdk.java.net/~mseledtsov/8228960.02/ > > > Thank you in advance, > > Misha > > > On 8/26/19 12:32 PM, mikhailo.seledtsov at oracle.com wrote: >> Hi David, >> >> Thank you for review. >> >> On 8/26/19 12:57 AM, David Holmes wrote: >>> Hi Misha, >>> >>> On 24/08/2019 3:21 am, mikhailo.seledtsov at oracle.com wrote: >>>> Finally got some time to work on this issue. >>>> Since I have encountered problem using files for passing messages between a container and a test driver (due to permissions), I looked for alternative solutions.
I am using the output of a container process to signal when the main method has started, and it works. This simplifies things quite a bit as well. >>>> >>>> Normally, we use OutputAnalyzer test utility to collect the whole output once the process has completed, and then analyze the resulting output for "contains some string", match, etc. However, testutils/ProcessTools provides an API to consume the output as it is produced. I am using this API to detect when the main() method of the container has started. >>> >>> That seems reasonable. Do we want to make the following change to minimise unneeded output processing: >>> >>> private Consumer outputConsumer = s -> { >>> ! if (!mainMethodStarted && s.contains(EventGeneratorLoop.MAIN_METHOD_STARTED)) { >>> System.out.println("MainContainer: setting mainMethodStarted"); >>> mainMethodStarted = true; >>> } >>> }; >> Thank you for the suggestion. I will update the code accordingly. >>> >>>> Updated webrev: >>>> http://cr.openjdk.java.net/~mseledtsov/8228960.02/ >>> >>> Otherwise looks okay. Hopefully those other test cases will be enabled in the not too distant future. >> >> I hope so as well. >> >> >> Thank you, >> >> Misha >> >>> >>> Thanks, >>> David >>> ----- >>> >>>> >>>> Testing: >>>> >>>> Ran the test on Linux-x64, various multiple nodes in a test cluster 50 times - All PASS >>>> >>>> >>>> Thank you, >>>> >>>> Misha >>>> >>>> On 8/13/19 2:05 PM, Bob Vandette wrote: >>>>> >>>>>> On Aug 13, 2019, at 3:28 PM, mikhailo.seledtsov at oracle.com wrote: >>>>>> >>>>>> >>>>>> On 8/13/19 12:06 PM, Bob Vandette wrote: >>>>>>>> On Aug 13, 2019, at 2:57 PM, mikhailo.seledtsov at oracle.com wrote: >>>>>>>> >>>>>>>> Hi Bob, >>>>>>>> >>>>>>>> The workdir (JTwork/scratch) is created with the "test user" permissions. Let me try to place the "signal" file in /tmp instead, since /tmp should normally have a 777 permission on Linux. 
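The output-based signalling described earlier in this exchange (a consumer that watches the container's output as it is produced and records when a start marker appears, with the guard David suggests so that later lines are not scanned) might look roughly like the minimal sketch below. It is not the code from the webrev: the class name `MainMethodStartDetector`, the marker string, and the use of a `CountDownLatch` instead of a boolean flag are assumptions made only for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class MainMethodStartDetector {
    // Hypothetical marker text; the real test refers to
    // EventGeneratorLoop.MAIN_METHOD_STARTED, whose value is not shown in the thread.
    static final String MAIN_METHOD_STARTED = "MAIN_METHOD_STARTED";

    // Released once, the first time the marker is seen in the output.
    private final CountDownLatch started = new CountDownLatch(1);

    // Fed one output line at a time, as the process produces it.
    final Consumer<String> outputConsumer = line -> {
        // Once the marker has been seen, skip the substring scan entirely.
        if (started.getCount() > 0 && line.contains(MAIN_METHOD_STARTED)) {
            started.countDown();
        }
    };

    // Returns true if the marker was seen before the timeout elapsed.
    boolean waitForMainMethodStart(long timeoutMillis) throws InterruptedException {
        return started.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        MainMethodStartDetector detector = new MainMethodStartDetector();
        // Stand-in for the container process: emits two lines of "output".
        Thread producer = new Thread(() -> {
            detector.outputConsumer.accept("JVM initializing");
            detector.outputConsumer.accept("EventGeneratorLoop: " + MAIN_METHOD_STARTED);
        });
        producer.start();
        System.out.println("main method started: " + detector.waitForMainMethodStart(5_000));
        producer.join();
    }
}
```

A latch gives the waiting side a bounded wait without a sleep loop; a volatile boolean polled by the test driver, closer to what the thread describes, would serve the same purpose.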
>>>>>>> Aren't you creating a file inside a docker container and then checking for its existence outside of the container? >>>>>> Correct >>>>>>> Isn't the root user running inside the container? >>>>>> By default it is. But it still fails to create a file, for some reason. Can be related to selinux settings (for instance, see this article: https://stackoverflow.com/questions/24288616/permission-denied-on-accessing-host-directory-in-docker/31334443), I can not change those. >>>>> Is your JTWork/scratch on an NFS mounted file system? If this is the case then the problem is that root is equivalent to nobody on >>>>> mounted file systems and can't create files unless the directory has 777 permissions. I just confirmed this. You'd have to either run >>>>> the container test as test-user or change the scratch directory permission. >>>>> >>>>> Bob. >>>>> >>>>>> My hope is that /tmp is configured to be accessed by a container engine as a general purpose directory, hence I was thinking to try it out. >>>>>> >>>>>>> Both processes don't see the same /tmp right? So that shouldn't help. >>>>>> In my next experiment, I will map a /tmp from host to be a /host-tmp inside the container (--volume /tmp:/host-tmp), then write a signal file to /host-tmp. >>>>>>> If scratch has 777 permissions, anyone can create a file. >>>>>> scratch has "rwxr-xr-x" >>>>>>> You have to be careful that you can clean up the >>>>>>> file from outside the container. I'd make sure to create it with 777. >>>>>> I do use deleteOnExit(), so it should work (unless the JVM crashes). I guess I could add extra layer of safety here, and set the permissions to 777. Thank you for advice. >>>>>> >>>>>> >>>>>> Thank you, >>>>>> >>>>>> Misha >>>>>> >>>>>>> Bob. >>>>>>> >>>>>>>> If this works, I will have to add some unique number to the file name, perhaps a PID of a child process. >>>>>>>> >>>>>>>> I will try this, and let you know how it works.
>>>>>>>> >>>>>>>> >>>>>>>> Thank you, >>>>>>>> >>>>>>>> Misha >>>>>>>> >>>>>>>> On 8/13/19 6:34 AM, Bob Vandette wrote: >>>>>>>>> Sorry, I just looked at the webrev and you are trying the approach I suggested. I thought you >>>>>>>>> were trying to use file change notification. >>>>>>>>> >>>>>>>>> Where does the workdir get created? Does it have 777 permissions? >>>>>>>>> >>>>>>>>> Bob. >>>>>>>>> >>>>>>>>> >>>>>>>>>> On Aug 13, 2019, at 9:29 AM, Bob Vandette wrote: >>>>>>>>>> >>>>>>>>>> What if you just poll for the creation of the file waiting some small amount of time between polling with a maximum timeout. >>>>>>>>>> >>>>>>>>>> Bob. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> On Aug 12, 2019, at 8:22 PM, mikhailo.seledtsov at oracle.com wrote: >>>>>>>>>>> >>>>>>>>>>> Unfortunately, this approach does not seem to work on many of our test cluster machines. The creation of a "signal" file results in "PermissionDenied". >>>>>>>>>>> >>>>>>>>>>> The possible reason is the selinux configuration, or some other permission related stuff. The container tries to create a new file on a mounted volume on a host system, but host system denies it. I will look a bit deeper into this, but I think this type of issue can be encountered on any automated test system. Hence, we may have to abandon this approach. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Misha >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 8/12/19 3:59 PM, mikhailo.seledtsov at oracle.com wrote: >>>>>>>>>>>> Here is an updated webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.01/ >>>>>>>>>>>> >>>>>>>>>>>> I am using a simple file-based mechanism to communicate between the processes. The "EventGeneratorLoop" process creates a specific "signal" file on a shared mounted volume, while the main test process waits for the file to exist before running the test cases. >>>>>>>>>>>> >>>>>>>>>>>> Passes on Linux-x64 Docker-enabled host. Testing in the test cluster is in progress. 
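The file-based mechanism discussed above (one process creates a "signal" file on a shared volume, the other polls for it with a short pause between checks and an overall timeout) can be sketched as follows. This is only an illustration under stated assumptions: the class name `SignalFileWaiter`, the file name, and the timeout and poll values are hypothetical and do not come from the webrev.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SignalFileWaiter {
    // Polls until the file exists or the timeout elapses.
    // Returns true if the file appeared in time, false otherwise.
    static boolean waitForFile(Path file, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!Files.exists(file)) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up: maximum timeout reached
            }
            Thread.sleep(pollMillis); // short pause between polls
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        // A temporary directory stands in for the shared mounted volume.
        Path dir = Files.createTempDirectory("signal-demo");
        Path signal = dir.resolve("main-started.signal");

        // Stand-in for the container process that creates the signal file.
        Thread writer = new Thread(() -> {
            try {
                Thread.sleep(200);
                Files.createFile(signal);
            } catch (IOException | InterruptedException e) {
                throw new RuntimeException(e);
            }
        });
        writer.start();

        System.out.println("signal seen: " + waitForFile(signal, 5_000, 50));
        writer.join();
        Files.deleteIfExists(signal);
        Files.deleteIfExists(dir);
    }
}
```

As the thread notes, this only works when the writing process is actually permitted to create the file on the shared volume, which is the part that failed on the test cluster.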
>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thank you, >>>>>>>>>>>> >>>>>>>>>>>> Misha >>>>>>>>>>>> >>>>>>>>>>>> On 8/7/19 5:11 PM, David Holmes wrote: >>>>>>>>>>>>> On 8/08/2019 9:04 am, Mikhailo Seledtsov wrote: >>>>>>>>>>>>>> Hi Severin, Bob, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thank you for reviewing the code. >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 8/7/19, 11:38 AM, Bob Vandette wrote: >>>>>>>>>>>>>>> Can't you come up with a better way of synchronizing the test by possibly writing a >>>>>>>>>>>>>>> file and waiting for it to exist with a timeout? >>>>>>>>>>>>>> I will try out this approach. >>>>>>>>>>>>> This seems like a fundamental problem with jcmd - so cc'ing serviceability-dev. >>>>>>>>>>>>> >>>>>>>>>>>>> But I'm pretty sure they recently addressed a similar issue with the premature sending of the attach signal? >>>>>>>>>>>>> >>>>>>>>>>>>> David >>>>>>>>>>>>> ----- >>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Misha >>>>>>>>>>>>>>> Isn't there a shared volume between the two >>>>>>>>>>>>>>> processes? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> We've been fighting test reliability for a while now. I can only hope we're getting >>>>>>>>>>>>>>> to the end. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Bob. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Aug 7, 2019, at 2:18 PM, Severin Gehwolf wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi Misha, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Tue, 2019-08-06 at 20:17 -0700, mikhailo.seledtsov at oracle.com wrote: >>>>>>>>>>>>>>>>> Please review this change that fixes a container test TestJcmdWithSideCar. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> My investigation indicated that a root cause for this failure is: >>>>>>>>>>>>>>>>> JCMD -l shows 'Unknown' for class name because the main class has not >>>>>>>>>>>>>>>>> been loaded yet. >>>>>>>>>>>>>>>>> The target test JVM has started, it is initializing, but has not loaded >>>>>>>>>>>>>>>>> the main test class. >>>>>>>>>>>>>>>> That's what I've found too.
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The proposed solution is to try 'jcmd -l' several times, with a short >>>>>>>>>>>>>>>>> sleep in between. >>>>>>>>>>>>>>>> Thread.sleep() isn't great, but I'm not sure there is an alternative. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Also I have commented out the testCase02() due to another bug: >>>>>>>>>>>>>>>>> "JDK-8228850: jhsdb jinfo fails with ClassCastException: >>>>>>>>>>>>>>>>> s.j.h.oops.TypeArray cannot be cast to s.j.h.oops.Instance", >>>>>>>>>>>>>>>>> which is not a test bug. IMO, it is better to run the test and skip a >>>>>>>>>>>>>>>>> sub-case than to skip the entire test. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8228960 >>>>>>>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8228960.00/ >>>>>>>>>>>>>>>> Looks OK to me. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Severin >>>>>>>>>>>>>>>> From hohensee at amazon.com Thu Aug 29 18:05:04 2019 From: hohensee at amazon.com (Hohensee, Paul) Date: Thu, 29 Aug 2019 18:05:04 +0000 Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread In-Reply-To: <9b62699c-0aac-a8fd-27a6-a8d4d3820dd2@oracle.com> References: <9b62699c-0aac-a8fd-27a6-a8d4d3820dd2@oracle.com> Message-ID: Yes. See previous email. Thanks, On 8/29/19, 12:19 AM, "Alan Bateman" wrote: On 28/08/2019 23:58, Mandy Chung wrote: > Hi Paul, > > The CSR proposes this method in java.lang.management.ThreadMXBean as a > Java SE feature. > > Has this been discussed with the GC team to commit measuring current > thread's allocated bytes as Java SE feature? Can this be supported > by all JVM implementation? What is the overhead if this is enabled > by default? Does it need to be disabled? This metric is from TLAB > that might be okay. This needs advice/discussion with GC experts.
The webrev adds it to jdk.management/com.sun.management.ThreadMXBean so I suspect it is a typo in the CSR and the proposal is for it to be JDK-specific. -Alan. From serguei.spitsyn at oracle.com Thu Aug 29 18:38:02 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 29 Aug 2019 11:38:02 -0700 Subject: RFR: 8229378: jdwp library loader in linker_md.c quietly truncates on buffer overflow In-Reply-To: References: <9bab421c-08a1-b554-6ac9-35290856ee56@oracle.com> <0c01c46b-3ff9-3016-8791-868019459d13@oracle.com> Message-ID: <3ea2e6ad-a32d-603c-258c-985da4e2f50a@oracle.com> An HTML attachment was scrubbed... URL: From mikhailo.seledtsov at oracle.com Thu Aug 29 21:13:42 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Thu, 29 Aug 2019 14:13:42 -0700 Subject: RFR(S): 8228960: [TESTBUG] containers/docker/TestJcmdWithSideCar.java: jcmd reports main class as 'Unknown' In-Reply-To: References: <9c9ec6b1-139a-4e67-8460-b586456fd493@oracle.com> <854c1dfe97937389230599e72c22aa0cbb0bf600.camel@redhat.com> <5D4B58E9.2070002@oracle.com> <9418a5e6-750d-80bf-4060-d023812f37e7@oracle.com> <6cf3a353-e5c5-28e9-113b-98ded3c1f9e7@oracle.com> <3143B636-9729-4238-9149-3A562B288643@oracle.com> <5877bb3c-ede3-ce33-9df9-391b3997a6ac@oracle.com> <2a8d9d95-3ca9-5792-b4ea-d8a800ce51ce@oracle.com> Message-ID: <52e8ad59-e7f4-586c-72fd-8a32a897ba95@oracle.com> Hi Bob, Thank you for review. On 8/29/19 10:55 AM, Bob Vandette wrote: > Misha, > > Looks good. A couple of nits. > > 1. You might want to remove hotspot_containers section of ProblemList.txt since there are no bugs listed. I can remove that section. > 2. Can you make this timeout a constant just like TIME_TO_RUN_MAIN_PROCESS > > 73 mainContainer.waitForMainMethodStart(5*1000); Will do. > > 3. assertIsAlive() is not used except for the commented out tests. Do you think you'll ultimately use > this method or is this left over from previous attempts?
> > 222 public void assertIsAlive() throws Exception { Currently it is not in use. However when test cases 02 and 03 are back online, this check will be used. Since the changes you suggest are minor, I will not post an updated webrev (let me know if you would like to see the updated webrev). I will make the updates recommended by you, run another test cycle, and integrate the changes. Thank you, Misha > > > Bob. From hohensee at amazon.com Fri Aug 30 14:11:43 2019 From: hohensee at amazon.com (Hohensee, Paul) Date: Fri, 30 Aug 2019 14:11:43 +0000 Subject: FW: Project Loom Is Moving to GitHub In-Reply-To: <2035130483.1697521.1567111411562.JavaMail.zimbra@u-pem.fr> References: <2035130483.1697521.1567111411562.JavaMail.zimbra@u-pem.fr> Message-ID: It's there. git clone https://github.com/openjdk/loom.git On 8/29/19, 1:47 PM, "loom-dev on behalf of Remi Forax" wrote: Works for me, It's way faster at home and even in CI env (on AWS us-west), doing a clone gets a boost from ~2m10s to ~1m10s. Rémi ----- Original Message ----- > From: "Ron Pressler" > To: "loom-dev" > Sent: Tuesday, August 27, 2019 11:16:50 > Subject: Re: Project Loom Is Moving to GitHub > The transition is done. Project Loom is now on GitHub. > > - R > > > > On August 23, 2019 at 1:01:09 PM, Ron Pressler > (ron.pressler at oracle.com(mailto:ron.pressler at oracle.com)) wrote: > >> Next week, Project Loom will transition, on a trial basis, to a Git repository >> hosted on GitHub as part of Project Skara [1]. >> >> The GitHub Loom repository is at [2] (which is currently a read-only mirror). >> >> Loom was chosen as a Skara early adopter because of the relatively small number >> of active committers, as well as the fact that most of them are in the European >> time zones, like the Skara team, who will then be able to provide close support. >> >> Developers who are interested in learning more about the Skara tools and >> workflow >> can find information on the Skara Wiki [3], GitHub project [4], and mailing list >> [5].
>> >> At the moment, Project Loom will not be accepting unsolicited pull-requests. Any >> contribution from a non-committer must first be discussed on the Loom mailing >> list [6]. >> >> Ron >> >> [1] https://openjdk.java.net/jeps/357 >> [2] https://github.com/openjdk/loom >> [3] https://wiki.openjdk.java.net/display/skara >> [4] https://github.com/openjdk/skara >> [5] https://mail.openjdk.java.net/mailman/listinfo/skara-dev >> [6] https://mail.openjdk.java.net/mailman/listinfo/loom-dev >> >> >> From mandy.chung at oracle.com Fri Aug 30 17:21:50 2019 From: mandy.chung at oracle.com (Mandy Chung) Date: Fri, 30 Aug 2019 10:21:50 -0700 Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread In-Reply-To: References: Message-ID: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> OK. That's better. Some review comments: The javadoc of getCurrentThreadAllocatedBytes() can simply say: "Returns an approximation of the total amount of memory, in bytes, allocated in heap memory for the current thread. This is a convenient method for local management use and is equivalent to calling getThreadAllocatedBytes(Thread.currentThread().getId())." src/hotspot/share/include/jmm.h GetOneThreadsAllocatedMemory: s/OneThreads/OneThread/ sun/management/ThreadImpl.java 43 private static final String THREAD_ALLOCATED_MEMORY_NOT_SUPPORTED = 44 "Thread allocated memory measurement is not supported."; if (!isThreadAllocatedMemorySupported()) { throw new UnsupportedOperationException(THREAD_ALLOCATED_MEMORY_NOT_SUPPORTED); } Perhaps the above can be refactored as throwIfAllocatedMemoryUnsupported() method. 391 if (ids.length == 1) { 392 sizes[0] = -1; : 398 if (ids.length == 1) { 399 long id = ids[0]; 400 sizes[0] = getThreadAllocatedMemory0( 401 Thread.currentThread().getId() == id ? 0 : id); 402
} else { It seems cleaner to handle the 1-element array case at the beginning of this method: ?? if (ids.length == 1) { ?????? long size = getThreadAllocatedBytes(ids[0]); ?????? return new long[] { size }; ?? } I didn't review the hotspot implementation and the test. Mandy On 8/29/19 10:01 AM, Hohensee, Paul wrote: > > My bad, Mandy. The webrev puts getCurrentThreadAllocatedBytes in > com.sun.management.ThreadMXBean along with the current two > getThreadAllocatedBytes methods for the reasons you list. I?ve updated > the CSR to specify com.sun.management and added a rationale. > AllocatedBytes is currently enabled by Hotspot by default because the > overhead of recording TLAB occupancy is negligible. > > There?s no new GC code, nor will there be, so imo we don?t have to > involve the GC folks. I.e., the new JMM method > GetOneThreadsAllocatedBytes uses the existing cooked_allocated_bytes > JavaThread method, and getCurrentThreadAllocatedBytes is the same as > getThreadAllocatedBytes: it just bypasses the thread lookup code. > > I hadn?t tracked down what happens when getCurrentThreadUserTime and > getCurrentThreadCpuTime are called before, but if I?m not mistaken, it > the code in jcmd() in attachListener.cpp will call > GetThreadCpuTimeWithKind in management.cpp, and it will ultimately use > Thread::current() as the subject of the call, see > os::current_thread_cpu_time in os_linux.cpp. That means that the > CurrentThread methods should work remotely the same way they do > locally. GetOneThreadsAllocatedBytes in management.cpp uses THREAD as > its subject when called on behalf of getCurrentThreadAllocatedBytes, > so it will also uses the current remote Java thread. Even if these > methods only worked locally, there are many setups where apps are > self-monitoring that could use the performance improvement. 
> > Thanks, > > Paul > > *From: *Mandy Chung > *Date: *Wednesday, August 28, 2019 at 3:59 PM > *To: *"Hohensee, Paul" > *Cc: *OpenJDK Serviceability , > "hotspot-gc-dev at openjdk.java.net" > *Subject: *Re: RFR (M): 8207266: > ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread > > Hi Paul, > > The CSR proposes this method in java.lang.management.ThreadMXBean as a > Java SE feature. > > Has this been discussed with the GC team to commit measuring current > thread's allocated bytes as Java SE feature??? Can this be supported > by all JVM implementation??? What is the overhead if this is enabled > by default?? Does it need to be disabled??? This metric is from TLAB > that might be okay.? This needs advice/discussion with GC experts. > > I see that CSR mentions it can be disabled and link to > isThreadAllocatedMemoryEnabled() and setThreadAllocatedMemoryEnabled() > methods but these methods are defined in com.sun.management.ThreadMXBean. > > As Alan points out, current thread makes sense only in local VM > management.? When this is monitored from a JMX client (e.g. jconsole > to connect to a running JVM, "currentThreadAllowcatedBytes" attribute > is the current thread in jconsole process which invoking > Thread::currentThread? > > Mandy > > On 8/28/19 12:22 PM, Hohensee, Paul wrote: > > Please review a performance improvement for > ThreadMXBean.getThreadAllocatedBytes and the addition of > getCurrentThreadAllocatedBytes. > > JBS issue:https://bugs.openjdk.java.net/browse/JDK-8207266 > > Webrev:http://cr.openjdk.java.net/~phh/8207266/webrev.00/ > > CSR:https://bugs.openjdk.java.net/browse/JDK-8230311 > > Previous email threads: > https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024441.html > https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-August/024763.html > > The CSR is for adding ThreadMXBean.getCurrentThreadAllocatedBytes. > I?d be great for someone to review it. 
>
> I took Mandy's advice and put the fast paths in the library code.
> I added a new JMM method GetOneThreadsAllocatedBytes that works
> the same as GetThreadCpuTime: it uses a thread_id value of zero to
> distinguish the current thread. On my Mac laptop, the result runs
> 47x faster for the current thread than the old implementation.
>
> The 3 tests in test/jdk/com/sun/management/ThreadMXBean all pass. I
> added code to ThreadAllocatedMemory.java to test
> getCurrentThreadAllocatedBytes as well as variations on
> getThreadAllocatedBytes(id). A submit repo job is in progress.
>
> Thanks,
>
> Paul

From hohensee at amazon.com  Fri Aug 30 22:33:05 2019
From: hohensee at amazon.com (Hohensee, Paul)
Date: Fri, 30 Aug 2019 22:33:05 +0000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com>
References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com>
Message-ID: <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com>

Thanks for your review, Mandy. Revised webrev at
http://cr.openjdk.java.net/~phh/8207266/webrev.02/.

I updated the CSR with your suggested javadoc for
getCurrentThreadAllocatedBytes. It now matches that for
getCurrentThreadUserTime and getCurrentThreadCpuTime. I also fixed the
"convenient" -> "convenience" typos in j.l.m.ThreadMXBean.java.

I meant GetOneThreads to be the possessive, but don't feel strongly
either way so I'm fine with GetOneThread.

I updated ThreadImpl.java as you suggested, though in
getThreadAllocatedBytes(long[] ids) I had to add a
redundant-in-the-not-length-1-case check for a null ids reference.

Would someone take a look at the Hotspot side and the test please?

Paul

From: Mandy Chung
Date: Friday, August 30, 2019 at 10:22 AM
To: "Hohensee, Paul"
Cc: OpenJDK Serviceability, "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread

OK. That's better. Some review comments:

The javadoc of getCurrentThreadAllocatedBytes() can simply say:

"Returns an approximation of the total amount of memory, in bytes,
allocated in heap memory for the current thread.

This is a convenient method for local management use and is equivalent
to calling getThreadAllocatedBytes(Thread.currentThread().getId())."

src/hotspot/share/include/jmm.h

GetOneThreadsAllocatedMemory: s/OneThreads/OneThread/

sun/management/ThreadImpl.java

  43     private static final String THREAD_ALLOCATED_MEMORY_NOT_SUPPORTED =
  44         "Thread allocated memory measurement is not supported.";

if (!isThreadAllocatedMemorySupported()) {
   throw new UnsupportedOperationException(THREAD_ALLOCATED_MEMORY_NOT_SUPPORTED);
}

Perhaps the above can be refactored as a
throwIfAllocatedMemoryUnsupported() method.

 391         if (ids.length == 1) {
 392             sizes[0] = -1;
 :
 398             if (ids.length == 1) {
 399                 long id = ids[0];
 400                 sizes[0] = getThreadAllocatedMemory0(
 401                     Thread.currentThread().getId() == id ? 0 : id);
 402             } else {

It seems cleaner to handle the 1-element array case at the beginning of this method:
   if (ids.length == 1) {
       long size = getThreadAllocatedBytes(ids[0]);
       return new long[] { size };
   }

I didn't review the hotspot implementation and the test.

Mandy

On 8/29/19 10:01 AM, Hohensee, Paul wrote:

My bad, Mandy. The webrev puts getCurrentThreadAllocatedBytes in
com.sun.management.ThreadMXBean along with the current two
getThreadAllocatedBytes methods for the reasons you list. I've updated
the CSR to specify com.sun.management and added a rationale.
AllocatedBytes is currently enabled by Hotspot by default because the
overhead of recording TLAB occupancy is negligible.

There's no new GC code, nor will there be, so imo we don't have to
involve the GC folks. I.e., the new JMM method
GetOneThreadsAllocatedBytes uses the existing cooked_allocated_bytes
JavaThread method, and getCurrentThreadAllocatedBytes is the same as
getThreadAllocatedBytes: it just bypasses the thread lookup code.

I hadn't tracked down what happens when getCurrentThreadUserTime and
getCurrentThreadCpuTime are called before, but if I'm not mistaken,
the code in jcmd() in attachListener.cpp will call
GetThreadCpuTimeWithKind in management.cpp, and it will ultimately use
Thread::current() as the subject of the call, see
os::current_thread_cpu_time in os_linux.cpp. That means that the
CurrentThread methods should work remotely the same way they do
locally. GetOneThreadsAllocatedBytes in management.cpp uses THREAD as
its subject when called on behalf of getCurrentThreadAllocatedBytes,
so it will also use the current remote Java thread. Even if these
methods only worked locally, there are many setups where apps are
self-monitoring that could use the performance improvement.

Thanks,

Paul

From: Mandy Chung
Date: Wednesday, August 28, 2019 at 3:59 PM
To: "Hohensee, Paul"
Cc: OpenJDK Serviceability, "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread

Hi Paul,

The CSR proposes this method in java.lang.management.ThreadMXBean as a
Java SE feature.

Has this been discussed with the GC team to commit measuring current
thread's allocated bytes as Java SE feature? Can this be supported by
all JVM implementations? What is the overhead if this is enabled by
default? Does it need to be disabled? This metric is from TLAB that
might be okay. This needs advice/discussion with GC experts.

I see that CSR mentions it can be disabled and link to
isThreadAllocatedMemoryEnabled() and setThreadAllocatedMemoryEnabled()
methods but these methods are defined in com.sun.management.ThreadMXBean.

As Alan points out, current thread makes sense only in local VM
management. When this is monitored from a JMX client (e.g. jconsole
connecting to a running JVM), is the "currentThreadAllocatedBytes"
attribute the one for the current thread in the jconsole process,
which is invoking Thread::currentThread?

Mandy

On 8/28/19 12:22 PM, Hohensee, Paul wrote:

Please review a performance improvement for
ThreadMXBean.getThreadAllocatedBytes and the addition of
getCurrentThreadAllocatedBytes.

JBS issue: https://bugs.openjdk.java.net/browse/JDK-8207266

Webrev: http://cr.openjdk.java.net/~phh/8207266/webrev.00/

CSR: https://bugs.openjdk.java.net/browse/JDK-8230311

Previous email threads:
https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024441.html
https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-August/024763.html

The CSR is for adding ThreadMXBean.getCurrentThreadAllocatedBytes.
It'd be great for someone to review it.

I took Mandy's advice and put the fast paths in the library code. I
added a new JMM method GetOneThreadsAllocatedBytes that works the same
as GetThreadCpuTime: it uses a thread_id value of zero to distinguish
the current thread. On my Mac laptop, the result runs 47x faster for
the current thread than the old implementation.

The 3 tests in test/jdk/com/sun/management/ThreadMXBean all pass. I
added code to ThreadAllocatedMemory.java to test
getCurrentThreadAllocatedBytes as well as variations on
getThreadAllocatedBytes(id). A submit repo job is in progress.

Thanks,

Paul
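
For readers following the ThreadImpl.java discussion in this thread, the suggestion to handle the 1-element array case at the beginning of getThreadAllocatedBytes(long[] ids) could look roughly like the sketch below. It is only an illustration, not the webrev code: the class SingleIdFastPath and the LongUnaryOperator field are hypothetical stand-ins for sun.management.ThreadImpl and its native query, and the general path is simplified to a plain loop.

```java
import java.util.Objects;
import java.util.function.LongUnaryOperator;

// Hypothetical stand-in for sun.management.ThreadImpl; not the JDK code.
final class SingleIdFastPath {
    // Assumed placeholder for the native per-thread allocated-bytes query.
    private final LongUnaryOperator allocatedBytesOf;

    SingleIdFastPath(LongUnaryOperator allocatedBytesOf) {
        this.allocatedBytesOf = Objects.requireNonNull(allocatedBytesOf);
    }

    // Single-id variant: validates the id, then asks the backend.
    long getThreadAllocatedBytes(long id) {
        if (id <= 0) {
            throw new IllegalArgumentException("Invalid thread ID parameter: " + id);
        }
        return allocatedBytesOf.applyAsLong(id);
    }

    // Array variant: the 1-element case is handled first by delegating
    // to the single-id method, as suggested in the review.
    long[] getThreadAllocatedBytes(long[] ids) {
        Objects.requireNonNull(ids);
        if (ids.length == 1) {
            return new long[] { getThreadAllocatedBytes(ids[0]) };
        }
        // Simplified general path; the real code is assumed to use a bulk query.
        long[] sizes = new long[ids.length];
        for (int i = 0; i < ids.length; i++) {
            sizes[i] = getThreadAllocatedBytes(ids[i]);
        }
        return sizes;
    }
}
```

Delegating up front keeps any current-thread shortcut in one place (the single-id method) instead of repeating the length check inside the array path.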

From mandy.chung at oracle.com  Fri Aug 30 23:25:30 2019
From: mandy.chung at oracle.com (Mandy Chung)
Date: Fri, 30 Aug 2019 16:25:30 -0700
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
In-Reply-To: <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com>
References: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com> <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com>
Message-ID: <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com>

CSR reviewed.

management.cpp

2083     java_thread = (JavaThread*)THREAD;
2084     if (java_thread->is_Java_thread()) {
2085       return java_thread->cooked_allocated_bytes();
2086     }

The cast should be done after the is_Java_thread() test.

ThreadImpl.java

 162     private void throwIfNullThreadIds(long[] ids) {

Even better: simply use Objects::requireNonNull and this method can be
removed.

This suggests a positive naming alternative to
throwIfThreadAllocatedMemoryNotSupported -
"ensureThreadAllocatedMemorySupported" (sorry I should have suggested that)

ThreadMXBean.java

 130      * @throws java.lang.UnsupportedOperationException if the Java virtual

Nit: "java.lang." can be dropped.

@since 14 is missing.

Mandy

On 8/30/19 3:33 PM, Hohensee, Paul wrote:
>
> Thanks for your review, Mandy. Revised webrev at
> http://cr.openjdk.java.net/~phh/8207266/webrev.02/.
>
> I updated the CSR with your suggested javadoc for
> getCurrentThreadAllocatedBytes. It now matches that for
> getCurrentThreadUserTime and getCurrentThreadCpuTime. I also fixed the
> "convenient" -> "convenience" typos in j.l.m.ThreadMXBean.java.
>
> I meant GetOneThreads to be the possessive, but don't feel strongly
> either way so I'm fine with GetOneThread.
>
> I updated ThreadImpl.java as you suggested, though in
> getThreadAllocatedBytes(long[] ids) I had to add a
> redundant-in-the-not-length-1-case check for a null ids reference.
> Would someone take a look at the Hotspot side and the test please?
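
As a rough illustration of the two ThreadImpl.java comments in the review above (use Objects.requireNonNull rather than a hand-written null check, and name the support check positively), the argument checks might be arranged as in this sketch. Everything other than the message constant and the ensureThreadAllocatedMemorySupported name is an assumption made for the example: ThreadImplChecksSketch, the boolean flag, and checkedIdCount are hypothetical and do not appear in the webrev.

```java
import java.util.Objects;

// Hypothetical stand-in for sun.management.ThreadImpl; only the checks are modeled.
final class ThreadImplChecksSketch {
    private static final String THREAD_ALLOCATED_MEMORY_NOT_SUPPORTED =
        "Thread allocated memory measurement is not supported.";

    // Assumed flag replacing the real VM capability query.
    private final boolean supported;

    ThreadImplChecksSketch(boolean supported) {
        this.supported = supported;
    }

    boolean isThreadAllocatedMemorySupported() {
        return supported;
    }

    // Positively named check, replacing a throwIf...NotSupported style helper.
    private void ensureThreadAllocatedMemorySupported() {
        if (!isThreadAllocatedMemorySupported()) {
            throw new UnsupportedOperationException(THREAD_ALLOCATED_MEMORY_NOT_SUPPORTED);
        }
    }

    // Hypothetical method: runs both argument checks, then reports how many ids passed.
    int checkedIdCount(long[] ids) {
        Objects.requireNonNull(ids); // replaces a dedicated throwIfNullThreadIds helper
        ensureThreadAllocatedMemorySupported();
        return ids.length;
    }
}
```

With Objects.requireNonNull a null ids array surfaces as a NullPointerException before the support check runs, which is the ordering assumed here.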