From coleen.phillimore at oracle.com Mon Dec 2 13:42:09 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 2 Dec 2019 08:42:09 -0500
Subject: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load

Thanks Erik!
Coleen

On 11/25/19 9:37 AM, Erik Österlund wrote:
> Hi Coleen,
>
> Still good BTW!
>
> Thanks,
> /Erik
>
> On 2019-11-25 14:47, coleen.phillimore at oracle.com wrote:
>> Thanks for the code review, Serguei!
>> Coleen
>>
>> On 11/22/19 6:34 PM, serguei.spitsyn at oracle.com wrote:
>>> Hi Coleen,
>>>
>>> +1
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 11/22/19 14:53, Daniel D. Daugherty wrote:
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.04.incr/webrev
>>>>
>>>> src/hotspot/share/prims/jvmtiImpl.cpp
>>>>     No comments.
>>>>
>>>> src/hotspot/share/prims/jvmtiImpl.hpp
>>>>     No comments.
>>>>
>>>> src/hotspot/share/runtime/serviceThread.cpp
>>>>     No comments.
>>>>
>>>> Thumbs up.
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 11/22/19 2:15 PM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> Dan, Thank you for reviewing this!
>>>>>
>>>>> On 11/22/19 12:49 PM, Daniel D. Daugherty wrote:
>>>>>> Hi Coleen,
>>>>>>
>>>>>> Sorry for the delay in getting back to this re-review.
>>>>>>
>>>>>>
>>>>>> On 11/21/19 9:12 AM, coleen.phillimore at oracle.com wrote:
>>>>>>>
>>>>>>> Please review a new version of this change that keeps the
>>>>>>> nmethod from being unloaded, after it is added to the deferred
>>>>>>> event queue:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/index.html
>>>>>>
>>>>>> src/hotspot/share/code/nmethod.cpp
>>>>>>     No comments.
>>>>>>
>>>>>> src/hotspot/share/oops/instanceKlass.cpp
>>>>>>     No comments.
>>>>>>
>>>>>> src/hotspot/share/prims/jvmtiExport.cpp
>>>>>>     No comments.
>>>>>>
>>>>>> src/hotspot/share/prims/jvmtiImpl.cpp
>>>>>>     Nice solution with the new oops_do() and nmethods_do() functions!
>>>>> Erik's insistence!
>>>>>>
>>>>>>     old L988: void JvmtiDeferredEventQueue::enqueue(const JvmtiDeferredEvent& event) {
>>>>>>     new L998: void JvmtiDeferredEventQueue::enqueue(JvmtiDeferredEvent event) {
>>>>>>         Not sure why this was changed.
>>>>>>
>>>>>>         Update: Looks like Serguei raised the issue and Coleen has already
>>>>>>         resolved it.
>>>>>
>>>>> Yes.
>>>>>>
>>>>>> src/hotspot/share/prims/jvmtiImpl.hpp
>>>>>>     old L494:     QueueNode(const JvmtiDeferredEvent& event)
>>>>>>     new L498:     QueueNode(JvmtiDeferredEvent& event)
>>>>>>         Why was this changed?
>>>>>>
>>>>>>         Update: Not clear if this was covered by Coleen's reply to Serguei.
>>>>>>
>>>>>>     old L497:     const JvmtiDeferredEvent& event() const { return _event; }
>>>>>>     new L501:     JvmtiDeferredEvent& event() { return _event; }
>>>>>>         Why was this changed?
>>>>>>
>>>>>>         Update: Coleen's reply to Serguei explained this. Perhaps add:
>>>>>>                   // Not const because of oops_do() and nmethods_do().
>>>>>>
>>>>>>     old L509:   static void enqueue(const JvmtiDeferredEvent& event) NOT_JVMTI_RETURN;
>>>>>>     new L513:   static void enqueue(JvmtiDeferredEvent event) NOT_JVMTI_RETURN;
>>>>>>         Why was this changed?
>>>>>>
>>>>>>         Update: Looks like Serguei raised the issue and Coleen has already
>>>>>>         resolved it.
>>>>>
>>>>> Yes, I fixed these.
>>>>>>
>>>>>> src/hotspot/share/runtime/mutexLocker.cpp
>>>>>>     This change is going to require some testing to make sure we don't
>>>>>>     have any new deadlock scenarios.
>>>>>
>>>>> Luckily, I've previously added an implicit NoSafepointVerifier to
>>>>> locks that are _allow_vm_block = true, like this one.
>>>>> + def(JmethodIdCreation_lock , PaddedMutex , leaf, true, _safepoint_check_never); // used for creating jmethodIDs.
>>>>> which prevents one class of deadlock. If we take out another lock
>>>>> with a higher rank, we'll get the ranking assert.
>>>>>
>>>>> This lock prevents insertion into an array, and has few outside
>>>>> calls.
>>>>>
>>>>> I'm running tests in tier 1-6 but any code that travels through
>>>>> this should get these assertion checks, rather than deadlocking.
>>>>>
>>>>>>
>>>>>> src/hotspot/share/runtime/serviceThread.cpp
>>>>>>     L50 - nit - why the extra blank line?
>>>>>
>>>>> To separate static data member definitions from functions. I
>>>>> removed it.
>>>>>>
>>>>>> src/hotspot/share/runtime/serviceThread.hpp
>>>>>>     Thanks for cleaning up the static:
>>>>>>
>>>>>>       ServiceThread::is_service_thread(Thread* thread)
>>>>>>
>>>>>>     stuff. Having it be different than the other threads was
>>>>>>     a bit jarring.
>>>>>>
>>>>>> src/hotspot/share/runtime/thread.hpp
>>>>>>     No comments.
>>>>>>
>>>>>> Thumbs up. My only comments are nits so I don't need to see a
>>>>>> new webrev if you decide to fix them.
>>>>>
>>>>> So it turns out that in stress testing my fix
>>>>> for https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>>
>>>>> Because I was in the area and thought this was a duplicate of that
>>>>> bug (it is not). I found that calling oops_do and nmethods_do
>>>>> in the ServiceThread needs to hold the Service_lock, because other
>>>>> threads can be adding things to the global queue while the sweeper
>>>>> thread is calling this in a handshake.
>>>>>
>>>>> I am now retesting this change with the changes above, and with
>>>>> the Service_lock. So far my stress tests for JDK-8212160 and
>>>>> the stress test for this bug pass, but I'm going to run through
>>>>> all the tiers 1-6 over the weekend.
>>>>>
>>>>> Please have a look at the changes in the meantime.
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.04.incr/webrev
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.04/webrev
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>>
>>>>>>> Ran the test that failed 100 times without failure, tier1 on
>>>>>>> Oracle supported platforms, and tier2-3 including jvmti and jdi
>>>>>>> tests locally.
>>>>>>>
>>>>>>> See bug for more details about the crash.
>>>>>>>
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8173361
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>>>
>>>>>>> On 11/18/19 10:09 PM, coleen.phillimore at oracle.com wrote:
>>>>>>>>
>>>>>>>> Hi Serguei,
>>>>>>>>
>>>>>>>> Sorry for not sending an update. I talked to Erik and am
>>>>>>>> working on a version that keeps the nmethod from being unloaded
>>>>>>>> while it's in the deferred event queue, with a version that the
>>>>>>>> GC people will like, and I like. I'm testing it out now.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>> Coleen
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/18/19 10:03 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Coleen,
>>>>>>>>>
>>>>>>>>> Sorry for the latency, I had to investigate it a little bit.
>>>>>>>>> I still have some doubt your fix is the right thing to do.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 11/16/19 04:55, coleen.phillimore at oracle.com wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 11/15/19 11:17 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>> Hi Coleen,
>>>>>>>>>>>
>>>>>>>>>>> On 11/15/19 2:12 PM, coleen.phillimore at oracle.com wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi, I've been working on answers to these questions, so
>>>>>>>>>>>> I'll start with this one.
>>>>>>>>>>>>
>>>>>>>>>>>> The nmethodLocker keeps the nmethod from being reclaimed
>>>>>>>>>>>> (made_zombie or memory released) by the sweeper, but the
>>>>>>>>>>>> nmethod could be unloaded. Unloading the nmethod clears
>>>>>>>>>>>> the Method* _method field.
>>>>>>>>>>>
>>>>>>>>>>> Yes, I see it is done in the nmethod::make_unloaded().
>>>>>>>>>>>
>>>>>>>>>>>> The post_compiled_method_load event needs the _method field
>>>>>>>>>>>> to look at things like inlining and ScopeDesc fields. If
>>>>>>>>>>>> the nmethod is unloaded, some of the oops are dead. There
>>>>>>>>>>>> are "holder" oops that correspond to the metadata in the
>>>>>>>>>>>> nmethod. If these oops are dead, causing the nmethod to get
>>>>>>>>>>>> unloaded, then the metadata may not be valid.
>>>>>>>>>>>>
>>>>>>>>>>>> So my change 02 looks for a NULL nmethod._method field to
>>>>>>>>>>>> tell whether we can post information about the nmethod.
>>>>>>>>>>>>
>>>>>>>>>>>> There's code in nmethod.cpp like:
>>>>>>>>>>>>
>>>>>>>>>>>> jmethodID nmethod::get_and_cache_jmethod_id() {
>>>>>>>>>>>>   if (_jmethod_id == NULL) {
>>>>>>>>>>>>     // Cache the jmethod_id since it can no longer be looked up once the
>>>>>>>>>>>>     // method itself has been marked for unloading.
>>>>>>>>>>>>     _jmethod_id = method()->jmethod_id();
>>>>>>>>>>>>   }
>>>>>>>>>>>>   return _jmethod_id;
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> Which was added when post_method_load and unload were
>>>>>>>>>>>> turned into deferred events.
>>>>>>>>>>>
>>>>>>>>>>> Could we cache the jmethodID in the
>>>>>>>>>>> JvmtiDeferredEvent::compiled_method_load_event
>>>>>>>>>>> similarly as we do in the
>>>>>>>>>>> JvmtiDeferredEvent::compiled_method_unload_event?
>>>>>>>>>>> This would help to get rid of the dependency on the
>>>>>>>>>>> nmethod::_method.
>>>>>>>>>>> Do we depend on any other nmethod fields?
>>>>>>>>>>
>>>>>>>>>> Yes, there are other nmethod metadata that we rely on to
>>>>>>>>>> print inline information, and this function
>>>>>>>>>> JvmtiCodeBlobEvents::build_jvmti_addr_location_map because it
>>>>>>>>>> uses the ScopeDesc data in the nmethod.
>>>>>>>>>
>>>>>>>>> One possible approach is to prepare and cache all this information
>>>>>>>>> in the nmethod::post_compiled_method_load_event() before the
>>>>>>>>> JvmtiDeferredEvent::compiled_method_load_event() is called.
>>>>>>>>> The event parameters are:
>>>>>>>>> typedef struct {
>>>>>>>>>     const void* start_address;
>>>>>>>>>     jlocation location;
>>>>>>>>> } jvmtiAddrLocationMap;
>>>>>>>>> CompiledMethodLoad(jvmtiEnv *jvmti_env,
>>>>>>>>>     jmethodID method,
>>>>>>>>>     jint code_size,
>>>>>>>>>     const void* code_addr,
>>>>>>>>>     jint map_length,
>>>>>>>>>     const jvmtiAddrLocationMap* map,
>>>>>>>>>     const void* compile_info)
>>>>>>>>> Some of these addresses above might not be accessible when an
>>>>>>>>> event is posted.
>>>>>>>>> Not sure yet if it is Okay.
>>>>>>>>> The question is if this kind of refactoring is worthwhile and the right
>>>>>>>>> thing to do.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> We do cache the jmethodID but that's not good enough. See my
>>>>>>>>>> last comment in the bug report. The jmethodID can point to an
>>>>>>>>>> unloaded method.
>>>>>>>>>
>>>>>>>>> This looks like it is done a little bit late.
>>>>>>>>> It'd be better to do it before the event is deferred (see above).
>>>>>>>>>
>>>>>>>>>> I tried a version of keeping the nmethod alive, but the GC
>>>>>>>>>> folks will hate it. And it doesn't work and I hate it.
>>>>>>>>>
>>>>>>>>> From a serviceability point of view this is the best and most
>>>>>>>>> consistent approach.
>>>>>>>>> It seems to me, it was initially designed this way.
>>>>>>>>> The downside is it adds some extra complexity to the GC.
>>>>>>>>>
>>>>>>>>>> My version 01 is the best, with the caveat that maybe it
>>>>>>>>>> should check for _method == NULL instead of
>>>>>>>>>> nmethod->is_alive(). I have to talk to Erik to see if
>>>>>>>>>> there's a race with concurrent class unloading.
>>>>>>>>>>
>>>>>>>>>> Any application that depends on a compiled method loading
>>>>>>>>>> event on a class that could be unloaded is a buggy
>>>>>>>>>> application. Applications should not rely on when the JIT
>>>>>>>>>> compiler decides to compile a method! This happens to us for
>>>>>>>>>> a stress test. Most applications will get most of their
>>>>>>>>>> compiled method loading events as they normally do.
>>>>>>>>>
>>>>>>>>> It is not an application that relies on the compiled method
>>>>>>>>> loading event.
>>>>>>>>> It is about profiling tools to be able to get correct
>>>>>>>>> information about what is going on with compilations.
>>>>>>>>> My concern is that if we skip such compiled method load events
>>>>>>>>> then profilers have no way
>>>>>>>>> to find out there are many unneeded compilations that are thrown
>>>>>>>>> away without any real use.
>>>>>>>>> Also, it is not clear what happens with the subsequent
>>>>>>>>> compiled method unload events.
>>>>>>>>> Are they going to be skipped as well or can they appear and
>>>>>>>>> confuse profilers?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Coleen
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>>> I put more debugging in the bug to show this crash was from
>>>>>>>>>>>> an unloaded nmethod.
>>>>>>>>>>>>
>>>>>>>>>>>> Coleen
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 11/15/19 4:45 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>> Hi Coleen,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have some questions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Both the compiled method load and unload are posted as
>>>>>>>>>>>>> deferred events.
>>>>>>>>>>>>> Both events keep the nmethod alive until the ServiceThread
>>>>>>>>>>>>> processes the event.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The implementation is:
>>>>>>>>>>>>>
>>>>>>>>>>>>> JvmtiDeferredEvent
>>>>>>>>>>>>> JvmtiDeferredEvent::compiled_method_load_event(nmethod* nm) {
>>>>>>>>>>>>>   . . .
>>>>>>>>>>>>>   // Keep the nmethod alive until the ServiceThread can process
>>>>>>>>>>>>>   // this deferred event.
>>>>>>>>>>>>>   nmethodLocker::lock_nmethod(nm);
>>>>>>>>>>>>>   return event;
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> JvmtiDeferredEvent
>>>>>>>>>>>>> JvmtiDeferredEvent::compiled_method_unload_event(nmethod* nm, jmethodID id, const void* code) {
>>>>>>>>>>>>>   . . .
>>>>>>>>>>>>>   // Keep the nmethod alive until the ServiceThread can process
>>>>>>>>>>>>>   // this deferred event. This will keep the memory for the
>>>>>>>>>>>>>   // generated code from being reused too early. We pass
>>>>>>>>>>>>>   // zombie_ok == true here so that our nmethod that was just
>>>>>>>>>>>>>   // made into a zombie can be locked.
>>>>>>>>>>>>>   nmethodLocker::lock_nmethod(nm, true /* zombie_ok */);
>>>>>>>>>>>>>   return event;
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> void JvmtiDeferredEvent::post() {
>>>>>>>>>>>>>   assert(ServiceThread::is_service_thread(Thread::current()),
>>>>>>>>>>>>>          "Service thread must post enqueued events");
>>>>>>>>>>>>>   switch(_type) {
>>>>>>>>>>>>>     case TYPE_COMPILED_METHOD_LOAD: {
>>>>>>>>>>>>>       nmethod* nm = _event_data.compiled_method_load;
>>>>>>>>>>>>>       JvmtiExport::post_compiled_method_load(nm);
>>>>>>>>>>>>>       // done with the deferred event so unlock the nmethod
>>>>>>>>>>>>>       nmethodLocker::unlock_nmethod(nm);
>>>>>>>>>>>>>       break;
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     case TYPE_COMPILED_METHOD_UNLOAD: {
>>>>>>>>>>>>>       nmethod* nm = _event_data.compiled_method_unload.nm;
>>>>>>>>>>>>>       JvmtiExport::post_compiled_method_unload(
>>>>>>>>>>>>>         _event_data.compiled_method_unload.method_id,
>>>>>>>>>>>>>         _event_data.compiled_method_unload.code_begin);
>>>>>>>>>>>>>       // done with the deferred event so unlock the nmethod
>>>>>>>>>>>>>       nmethodLocker::unlock_nmethod(nm);
>>>>>>>>>>>>>       break;
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     . . .
>>>>>>>>>>>>>   }
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> Then I wonder how is it possible for the nmethod to be not
>>>>>>>>>>>>> alive here?:
>>>>>>>>>>>>> void JvmtiExport::post_compiled_method_load(nmethod *nm) {
>>>>>>>>>>>>>   . . .
>>>>>>>>>>>>>   // It's not safe to look at metadata for unloaded methods.
>>>>>>>>>>>>>   if (!nm->is_alive()) {
>>>>>>>>>>>>>     return;
>>>>>>>>>>>>>   }
>>>>>>>>>>>>> At least, it looks like something else is broken.
>>>>>>>>>>>>> Do I miss something important here?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote:
>>>>>>>>>>>>>> Summary: Don't post information which uses metadata from
>>>>>>>>>>>>>> unloaded nmethods
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Tested tier1-3 and 100 times with test that failed
>>>>>>>>>>>>>> (reproduced failure without the fix).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> open webrev at
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev
>>>>>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Coleen
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

From coleen.phillimore at oracle.com Mon Dec 2 14:43:38 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 2 Dec 2019 09:43:38 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value"

On 11/26/19 7:03 PM, David Holmes wrote:
> (adding runtime as well)
>
> Hi Coleen,
>
> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote:
>> Summary: Add local deferred event list to thread to post events
>> outside CodeCache_lock.
>>
>> This patch builds on the patch for JDK-8173361. With this patch, I
>> made the JvmtiDeferredEventQueue an instance class (not AllStatic)
>> and have one per thread. The CodeBlob event that used to drop the
>> CodeCache_lock and raced with the sweeper thread, adds the events it
>> wants to post to its thread local list, and processes it outside the
>> lock. The list is walked in GC and by the sweeper to keep the
>> nmethods from being unloaded and zombied, respectively.
>
> Sorry I don't understand why we would want/need a deferred event queue
> for every JavaThread? Isn't this only relevant for non-JavaThreads
> that need to have the ServiceThread process the deferred event?

I thought I'd written this in the bug but I had only discussed this with Erik.
I've added a comment to the bug to explain why I added the per-JavaThread queue. In order to process these events after the CodeCache_lock is dropped, I have to queue them somewhere safe. The ServiceThread queue is safe, *but* the ServiceThread can't keep up with the events, especially from this test case. So the test case gets a native OOM.

So I've added the safe queue as a field to each JavaThread because multiple JavaThreads could be posting these events at the same time, and there didn't seem to be a better safe place to cache them, without adding another layer of queuing code.

I did write comments to this effect here:

http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html

Thanks,
Coleen

>
> David
>
>> Also, the jmethod_id field in nmethod was only used as a boolean so
>> don't create a jmethod_id until needed for post_compiled_method_unload.
>>
>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed in
>> the original bug report.
>>
>> open webrev at
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>
>> Thanks,
>> Coleen

From david.holmes at oracle.com Tue Dec 3 04:52:28 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 3 Dec 2019 14:52:28 +1000
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value"

Hi Coleen,

On 3/12/2019 12:43 am, coleen.phillimore at oracle.com wrote:
>
>
> On 11/26/19 7:03 PM, David Holmes wrote:
>> (adding runtime as well)
>>
>> Hi Coleen,
>>
>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote:
>>> Summary: Add local deferred event list to thread to post events
>>> outside CodeCache_lock.
>>>
>>> This patch builds on the patch for JDK-8173361. With this patch, I
>>> made the JvmtiDeferredEventQueue an instance class (not AllStatic)
>>> and have one per thread. The CodeBlob event that used to drop the
>>> CodeCache_lock and raced with the sweeper thread, adds the events it
>>> wants to post to its thread local list, and processes it outside the
>>> lock. The list is walked in GC and by the sweeper to keep the
>>> nmethods from being unloaded and zombied, respectively.
>>
>> Sorry I don't understand why we would want/need a deferred event queue
>> for every JavaThread? Isn't this only relevant for non-JavaThreads
>> that need to have the ServiceThread process the deferred event?
>
> I thought I'd written this in the bug but I had only discussed this with
> Erik. I've added a comment to the bug to explain why I added the
> per-JavaThread queue. In order to process these events after the
> CodeCache_lock is dropped, I have to queue them somewhere safe. The
> ServiceThread queue is safe, *but* the ServiceThread can't keep up with
> the events, especially from this test case. So the test case gets a
> native OOM.
>
> So I've added the safe queue as a field to each JavaThread because
> multiple JavaThreads could be posting these events at the same time, and
> there didn't seem to be a better safe place to cache them, without
> adding another layer of queuing code.

I think I'm getting the picture now. At the time the events are generated we can't post them directly because the current thread is inside compiler code. Hence the events must be deferred. Using the ServiceThread to handle the deferred events is one way to deal with this - but it can't keep up in this scenario. So instead we store the events in the current thread and when the current thread returns to code where it is safe to post the events, it does so itself. Is that generally correct?

I admit I'm not keen on adding this additional field per-thread just for a temporary usage. Some kind of stack allocated helper would be preferable, but would need to be passed through the call chain so that the events could be added to it.

Also I'm not clear why we aggressively delete the _jvmti_event_queue after posting the events. I'd be worried about the overhead we are introducing for creating and deleting this queue. When the JvmtiDeferredEventQueue data structure was intended only for use by the ServiceThread its dynamic node allocation may have made more sense. But now that seems like a liability to me - if JvmtiDeferredEvents could be linked directly we wouldn't need dynamic nodes, nor dynamic per-thread queues (just a per-thread pointer).

Just some thoughts.
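
A minimal sketch of the pattern discussed above: events are gathered in a local queue while a lock is held, and are posted only after the lock has been released. This is plain C++, not HotSpot code. `DeferredEvent`, `LocalEventQueue`, `code_cache_lock` and `collect_then_post` are hypothetical names invented for illustration, and the sketch assumes a function-local queue instead of a JavaThread field, with no GC or sweeper walking of the queue.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a deferred event (not JvmtiDeferredEvent).
struct DeferredEvent {
  std::string description;
};

// Hypothetical FIFO queue of events that are waiting to be posted.
class LocalEventQueue {
 public:
  void enqueue(DeferredEvent event) { events_.push_back(std::move(event)); }
  bool has_events() const { return !events_.empty(); }

  // Posts events in FIFO order. Each event is removed from the queue
  // before its callback runs, so code that inspects the queue during a
  // callback never sees an event that has already been handed out.
  void post_all(const std::function<void(const DeferredEvent&)>& callback) {
    while (!events_.empty()) {
      DeferredEvent event = std::move(events_.front());
      events_.pop_front();
      callback(event);
    }
  }

 private:
  std::deque<DeferredEvent> events_;
};

// Assumption: a plain mutex standing in for CodeCache_lock.
std::mutex code_cache_lock;

// Collects one event per name while holding the lock, then posts them
// after the lock is released. Returns the descriptions in posting order.
std::vector<std::string> collect_then_post(const std::vector<std::string>& names) {
  LocalEventQueue queue;
  {
    std::lock_guard<std::mutex> guard(code_cache_lock);
    for (const std::string& name : names) {
      queue.enqueue(DeferredEvent{name});  // no callbacks run under the lock
    }
  }  // lock released here

  std::vector<std::string> posted;
  queue.post_all([&posted](const DeferredEvent& event) {
    posted.push_back(event.description);
  });
  return posted;
}
```

The point of removing each event before invoking its callback mirrors the concern raised in this thread: a callback may reach a safepoint, and anything walking the queue at that moment should not revisit an event that has already been posted.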

Thanks,
David

> I did write comments to this effect here:
>
> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html
>
>
> Thanks,
> Coleen
>
>>
>> David
>>
>>> Also, the jmethod_id field in nmethod was only used as a boolean so
>>> don't create a jmethod_id until needed for post_compiled_method_unload.
>>>
>>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed in
>>> the original bug report.
>>>
>>> open webrev at
>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>
>>> Thanks,
>>> Coleen
>

From erik.osterlund at oracle.com Tue Dec 3 12:48:21 2019
From: erik.osterlund at oracle.com (erik.osterlund at oracle.com)
Date: Tue, 3 Dec 2019 13:48:21 +0100
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value"

Hi Coleen,

This looks great. Thanks for sorting this out!

/Erik

On 12/2/19 3:43 PM, coleen.phillimore at oracle.com wrote:
>
>
> On 11/26/19 7:03 PM, David Holmes wrote:
>> (adding runtime as well)
>>
>> Hi Coleen,
>>
>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote:
>>> Summary: Add local deferred event list to thread to post events
>>> outside CodeCache_lock.
>>>
>>> This patch builds on the patch for JDK-8173361. With this patch, I
>>> made the JvmtiDeferredEventQueue an instance class (not AllStatic)
>>> and have one per thread. The CodeBlob event that used to drop the
>>> CodeCache_lock and raced with the sweeper thread, adds the events it
>>> wants to post to its thread local list, and processes it outside the
>>> lock. The list is walked in GC and by the sweeper to keep the
>>> nmethods from being unloaded and zombied, respectively.
>>
>> Sorry I don't understand why we would want/need a deferred event
>> queue for every JavaThread? Isn't this only relevant for
>> non-JavaThreads that need to have the ServiceThread process the
>> deferred event?
>
> I thought I'd written this in the bug but I had only discussed this
> with Erik. I've added a comment to the bug to explain why I added the
> per-JavaThread queue. In order to process these events after the
> CodeCache_lock is dropped, I have to queue them somewhere safe. The
> ServiceThread queue is safe, *but* the ServiceThread can't keep up
> with the events, especially from this test case. So the test case
> gets a native OOM.
>
> So I've added the safe queue as a field to each JavaThread because
> multiple JavaThreads could be posting these events at the same time,
> and there didn't seem to be a better safe place to cache them, without
> adding another layer of queuing code.
>
> I did write comments to this effect here:
>
> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html
>
>
> Thanks,
> Coleen
>
>>
>> David
>>
>>> Also, the jmethod_id field in nmethod was only used as a boolean so
>>> don't create a jmethod_id until needed for post_compiled_method_unload.
>>>
>>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed
>>> in the original bug report.
>>>
>>> open webrev at
>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>
>>> Thanks,
>>> Coleen
>

From coleen.phillimore at oracle.com Tue Dec 3 13:08:25 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 3 Dec 2019 08:08:25 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value"

On 12/2/19 11:52 PM, David Holmes wrote:
> Hi Coleen,
>
> On 3/12/2019 12:43 am, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 11/26/19 7:03 PM, David Holmes wrote:
>>> (adding runtime as well)
>>>
>>> Hi Coleen,
>>>
>>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote:
>>>> Summary: Add local deferred event list to thread to post events
>>>> outside CodeCache_lock.
>>>>
>>>> This patch builds on the patch for JDK-8173361. With this patch, I
>>>> made the JvmtiDeferredEventQueue an instance class (not AllStatic)
>>>> and have one per thread. The CodeBlob event that used to drop the
>>>> CodeCache_lock and raced with the sweeper thread, adds the events
>>>> it wants to post to its thread local list, and processes it outside
>>>> the lock. The list is walked in GC and by the sweeper to keep the
>>>> nmethods from being unloaded and zombied, respectively.
>>>
>>> Sorry I don't understand why we would want/need a deferred event
>>> queue for every JavaThread? Isn't this only relevant for
>>> non-JavaThreads that need to have the ServiceThread process the
>>> deferred event?
>>
>> I thought I'd written this in the bug but I had only discussed this
>> with Erik. I've added a comment to the bug to explain why I added
>> the per-JavaThread queue. In order to process these events after the
>> CodeCache_lock is dropped, I have to queue them somewhere safe. The
>> ServiceThread queue is safe, *but* the ServiceThread can't keep up
>> with the events, especially from this test case. So the test case
>> gets a native OOM.
>>
>> So I've added the safe queue as a field to each JavaThread because
>> multiple JavaThreads could be posting these events at the same time,
>> and there didn't seem to be a better safe place to cache them,
>> without adding another layer of queuing code.
>
> I think I'm getting the picture now. At the time the events are
> generated we can't post them directly because the current thread is
> inside compiler code. Hence the events must be deferred. Using the
> ServiceThread to handle the deferred events is one way to deal with
> this - but it can't keep up in this scenario. So instead we store the
> events in the current thread and when the current thread returns to
> code where it is safe to post the events, it does so itself. Is that
> generally correct?

Yes.

>
> I admit I'm not keen on adding this additional field per-thread just
> for a temporary usage. Some kind of stack allocated helper would be
> preferable, but would need to be passed through the call chain so that
> the events could be added to it.

Right, and the GC and nmethods_do have to find it somehow. It wasn't my first choice of where to put it also because there are too many things in JavaThread. Might be time for a future cleanup of Thread.

>
> Also I'm not clear why we aggressively delete the _jvmti_event_queue
> after posting the events. I'd be worried about the overhead we are
> introducing for creating and deleting this queue. When the
> JvmtiDeferredEventQueue data structure was intended only for use by
> the ServiceThread its dynamic node allocation may have made more
> sense. But now that seems like a liability to me - if
> JvmtiDeferredEvents could be linked directly we wouldn't need dynamic
> nodes, nor dynamic per-thread queues (just a per-thread pointer).

I'm not following. The queue is for multiple events that might be posted while in the CodeCache_lock, so they need to be in order and linked together. While we post them and take them off, if the callback safepoints (maybe calls back into the JVM), we don't want to have GC or nmethods_do walk the one that's been posted already. So a queue seems to make sense.

One thing that I experimented with was to have the ServiceThread take ownership of the queue in its local thread queue and post them all, which could be a future enhancement. It didn't help my OOM situation.

Deleting the queue after all the events are posted means JavaThread::oops_do and nmethods_do need only a null check to deal with this jvmti wart.

Thanks,
Coleen

>
> Just some thoughts.
>
> Thanks,
> David
>
>> I did write comments to this effect here:
>>
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html
>>
>>
>> Thanks,
>> Coleen
>>
>>>
>>> David
>>>
>>>> Also, the jmethod_id field in nmethod was only used as a boolean so
>>>> don't create a jmethod_id until needed for
>>>> post_compiled_method_unload.
>>>>
>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed
>>>> in the original bug report.
>>>> >>>> open webrev at >>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160 >>>> >>>> Thanks, >>>> Coleen >> From coleen.phillimore at oracle.com Tue Dec 3 13:11:12 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 3 Dec 2019 08:11:12 -0500 Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value" In-Reply-To: <30df2087-a20d-f745-8c19-fb0173a50421@oracle.com> References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com> <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com> <30df2087-a20d-f745-8c19-fb0173a50421@oracle.com> Message-ID: <71c63f79-a893-6e3d-f418-bc16670c65ca@oracle.com> Thanks Erik! Coleen On 12/3/19 7:48 AM, erik.osterlund at oracle.com wrote: > Hi Coleen, > > This looks great. Thanks for sorting this out! > > /Erik > > On 12/2/19 3:43 PM, coleen.phillimore at oracle.com wrote: >> >> >> On 11/26/19 7:03 PM, David Holmes wrote: >>> (adding runtime as well) >>> >>> Hi Coleen, >>> >>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote: >>>> Summary: Add local deferred event list to thread to post events >>>> outside CodeCache_lock. >>>> >>>> This patch builds on the patch for JDK-8173361.? With this patch, I >>>> made the JvmtiDeferredEventQueue an instance class (not AllStatic) >>>> and have one per thread.? The CodeBlob event that used to drop the >>>> CodeCache_lock and raced with the sweeper thread, adds the events >>>> it wants to post to its thread local list, and processes it outside >>>> the lock.? The list is walked in GC and by the sweeper to keep the >>>> nmethods from being unloaded and zombied, respectively. >>> >>> Sorry I don't understand why we would want/need a deferred event >>> queue for every JavaThread? Isn't this only relevant for >>> non-JavaThreads that need to have the ServiceThread process the >>> deferred event? 
>> >> I thought I'd written this in the bug but I had only discussed this >> with Erik.? I've added a comment to the bug to explain why I added >> the per-JavaThread queue.? In order to process these events after the >> CodeCache_lock is dropped, I have to queue them somewhere safe. The >> ServiceThread queue is safe, *but* the ServiceThread can't keep up >> with the events, especially from this test case.? So the test case >> gets a native OOM. >> >> So I've added the safe queue as a field to each JavaThread because >> multiple JavaThreads could be posting these events at the same time, >> and there didn't seem to be a better safe place to cache them, >> without adding another layer of queuing code. >> >> I did write comments to this effect here: >> >> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html >> >> >> Thanks, >> Coleen >> >>> >>> David >>> >>>> Also, the jmethod_id field in nmethod was only used as a boolean so >>>> don't create a jmethod_id until needed for >>>> post_compiled_method_unload. >>>> >>>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed >>>> in the original bug report. 
>>>> >>>> open webrev at >>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160 >>>> >>>> Thanks, >>>> Coleen >> > From david.holmes at oracle.com Tue Dec 3 13:31:22 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 3 Dec 2019 23:31:22 +1000 Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value" In-Reply-To: References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com> <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com> Message-ID: On 3/12/2019 11:08 pm, coleen.phillimore at oracle.com wrote: > > > On 12/2/19 11:52 PM, David Holmes wrote: >> Hi Coleen, >> >> On 3/12/2019 12:43 am, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 11/26/19 7:03 PM, David Holmes wrote: >>>> (adding runtime as well) >>>> >>>> Hi Coleen, >>>> >>>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote: >>>>> Summary: Add local deferred event list to thread to post events >>>>> outside CodeCache_lock. >>>>> >>>>> This patch builds on the patch for JDK-8173361.? With this patch, I >>>>> made the JvmtiDeferredEventQueue an instance class (not AllStatic) >>>>> and have one per thread.? The CodeBlob event that used to drop the >>>>> CodeCache_lock and raced with the sweeper thread, adds the events >>>>> it wants to post to its thread local list, and processes it outside >>>>> the lock.? The list is walked in GC and by the sweeper to keep the >>>>> nmethods from being unloaded and zombied, respectively. >>>> >>>> Sorry I don't understand why we would want/need a deferred event >>>> queue for every JavaThread? Isn't this only relevant for >>>> non-JavaThreads that need to have the ServiceThread process the >>>> deferred event? >>> >>> I thought I'd written this in the bug but I had only discussed this >>> with Erik.? I've added a comment to the bug to explain why I added >>> the per-JavaThread queue.? 
In order to process these events after the >>> CodeCache_lock is dropped, I have to queue them somewhere safe. The >>> ServiceThread queue is safe, *but* the ServiceThread can't keep up >>> with the events, especially from this test case.? So the test case >>> gets a native OOM. >>> >>> So I've added the safe queue as a field to each JavaThread because >>> multiple JavaThreads could be posting these events at the same time, >>> and there didn't seem to be a better safe place to cache them, >>> without adding another layer of queuing code. >> >> I think I'm getting the picture now. At the time the events are >> generated we can't post them directly because the current thread is >> inside compiler code. Hence the events must be deferred. Using the >> ServiceThread to handle the deferred events is one way to deal with >> this - but it can't keep up in this scenario. So instead we store the >> events in the current thread and when the current thread returns to >> code where it is safe to post the events, it does so itself. Is that >> generally correct? > > Yes. >> >> I admit I'm not keen on adding this additional field per-thread just >> for a temporary usage. Some kind of stack allocated helper would be >> preferable, but would need to be passed through the call chain so that >> the events could be added to it. > > Right, and the GC and nmethods_do has to find it somehow.? It wasn't my > first choice of where to put it also because there is too many things in > JavaThread.? Might be time for a future cleanup of Thread. I see. >> >> Also I'm not clear why we aggressively delete the _jvmti_event_queue >> after posting the events. I'd be worried about the overhead we are >> introducing for creating and deleting this queue. When the >> JvmtiDeferredEventQueue data structure was intended only for use by >> the ServiceThread its dynamic node allocation may have made more >> sense. 
But now that seems like a liability to me - if >> JvmtiDeferredEvents could be linked directly we wouldn't need dynamic >> nodes, nor dynamic per-thread queues (just a per-thread pointer). > > I'm not following. The queue is for multiple events that might be > posted while in the CodeCache_lock, so they need to be in order and > linked together. While we post them and take them off, if the callback > safepoints (maybe calls back into the JVM), we don't want to have GC or > nmethods_do walk the one that's been posted already. So a queue seems to > make sense. Yes, but you can make a queue just by having each event have a _next pointer, rather than dynamically creating nodes to hold the event. Each event is its own queue node implicitly. > One thing that I experimented with was to have the ServiceThread take > ownership of the queue in it's local thread queue and post them all, > which could be a future enhancement. It didn't help my OOM situation. Your OOM situation seems to be a basic case of overwhelming the ServiceThread. A single ServiceThread will always have a limit on how many events it can handle. Maybe this test is being unrealistic in its expectations of the current design? > Deleting the queue after all the events are posted allows > JavaThread::oops_do and nmethods_do only a null check to deal with this > jvmti wart. If the nodes are not dynamically allocated you don't need to delete anything; you just set the queue-head pointer to NULL - actually it will already be NULL once the last event has been processed. David ----- > Thanks, > Coleen >> >> Just some thoughts.
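David's suggestion above, that each event carries its own _next pointer so no separate node objects are needed, can be sketched roughly as follows. This is only an illustration written in Java for brevity; DeferredEvent and IntrusiveEventQueue are hypothetical names, not the HotSpot JvmtiDeferredEvent or JvmtiDeferredEventQueue classes, and the real code would also have to deal with GC and nmethods_do walking the list.

```java
// Hypothetical illustration only: these are not HotSpot classes.
final class DeferredEvent {
    final String description;  // stand-in for the real event payload
    DeferredEvent next;        // intrusive link: the event is its own queue node

    DeferredEvent(String description) {
        this.description = description;
    }
}

final class IntrusiveEventQueue {
    private DeferredEvent head;  // null means there are no pending events
    private DeferredEvent tail;

    boolean isEmpty() {
        return head == null;
    }

    // Append at the tail so events are later posted in the order they were queued.
    void enqueue(DeferredEvent e) {
        e.next = null;
        if (tail == null) {
            head = e;
        } else {
            tail.next = e;
        }
        tail = e;
    }

    // Unlink and return the oldest event, or null if the queue is empty.
    DeferredEvent dequeue() {
        DeferredEvent e = head;
        if (e == null) {
            return null;
        }
        head = e.next;
        if (head == null) {
            tail = null;
        }
        e.next = null;
        return e;
    }
}
```

With this shape, a walker only needs to check whether the head is null, and the head is null again as soon as the last event has been taken off, which is the property David describes.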
>> >> Thanks, >> David >> >>> I did write comments to this effect here: >>> >>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html >>> >>> >>> Thanks, >>> Coleen >>> >>>> >>>> David >>>> >>>>> Also, the jmethod_id field in nmethod was only used as a boolean so >>>>> don't create a jmethod_id until needed for >>>>> post_compiled_method_unload. >>>>> >>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed >>>>> in the original bug report. >>>>> >>>>> open webrev at >>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160 >>>>> >>>>> Thanks, >>>>> Coleen >>> > From coleen.phillimore at oracle.com Tue Dec 3 13:35:58 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 3 Dec 2019 08:35:58 -0500 Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value" In-Reply-To: References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com> <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com> Message-ID: On 12/3/19 8:31 AM, David Holmes wrote: > On 3/12/2019 11:08 pm, coleen.phillimore at oracle.com wrote: >> >> >> On 12/2/19 11:52 PM, David Holmes wrote: >>> Hi Coleen, >>> >>> On 3/12/2019 12:43 am, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> On 11/26/19 7:03 PM, David Holmes wrote: >>>>> (adding runtime as well) >>>>> >>>>> Hi Coleen, >>>>> >>>>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote: >>>>>> Summary: Add local deferred event list to thread to post events >>>>>> outside CodeCache_lock. >>>>>> >>>>>> This patch builds on the patch for JDK-8173361.? With this patch, >>>>>> I made the JvmtiDeferredEventQueue an instance class (not >>>>>> AllStatic) and have one per thread. 
The CodeBlob event that used >>>>>> to drop the CodeCache_lock and raced with the sweeper thread, >>>>>> adds the events it wants to post to its thread local list, and >>>>>> processes it outside the lock.? The list is walked in GC and by >>>>>> the sweeper to keep the nmethods from being unloaded and zombied, >>>>>> respectively. >>>>> >>>>> Sorry I don't understand why we would want/need a deferred event >>>>> queue for every JavaThread? Isn't this only relevant for >>>>> non-JavaThreads that need to have the ServiceThread process the >>>>> deferred event? >>>> >>>> I thought I'd written this in the bug but I had only discussed this >>>> with Erik.? I've added a comment to the bug to explain why I added >>>> the per-JavaThread queue.? In order to process these events after >>>> the CodeCache_lock is dropped, I have to queue them somewhere safe. >>>> The ServiceThread queue is safe, *but* the ServiceThread can't keep >>>> up with the events, especially from this test case.? So the test >>>> case gets a native OOM. >>>> >>>> So I've added the safe queue as a field to each JavaThread because >>>> multiple JavaThreads could be posting these events at the same >>>> time, and there didn't seem to be a better safe place to cache >>>> them, without adding another layer of queuing code. >>> >>> I think I'm getting the picture now. At the time the events are >>> generated we can't post them directly because the current thread is >>> inside compiler code. Hence the events must be deferred. Using the >>> ServiceThread to handle the deferred events is one way to deal with >>> this - but it can't keep up in this scenario. So instead we store >>> the events in the current thread and when the current thread returns >>> to code where it is safe to post the events, it does so itself. Is >>> that generally correct? >> >> Yes. >>> >>> I admit I'm not keen on adding this additional field per-thread just >>> for a temporary usage. 
Some kind of stack allocated helper would be >>> preferable, but would need to be passed through the call chain so >>> that the events could be added to it. >> >> Right, and the GC and nmethods_do has to find it somehow.? It wasn't >> my first choice of where to put it also because there is too many >> things in JavaThread.? Might be time for a future cleanup of Thread. > > I see. > >>> >>> Also I'm not clear why we aggressively delete the _jvmti_event_queue >>> after posting the events. I'd be worried about the overhead we are >>> introducing for creating and deleting this queue. When the >>> JvmtiDeferredEventQueue data structure was intended only for use by >>> the ServiceThread its dynamic node allocation may have made more >>> sense. But now that seems like a liability to me - if >>> JvmtiDeferredEvents could be linked directly we wouldn't need >>> dynamic nodes, nor dynamic per-thread queues (just a per-thread >>> pointer). >> >> I'm not following.? The queue is for multiple events that might be >> posted while in the CodeCache_lock, so they need to be in order and >> linked together.? While we post them and take them off, if the >> callback safepoints (maybe calls back into the JVM), we don't want to >> have GC or nmethods_do walk the one that's been posted already. So a >> queue seems to make sense. > > Yes but you can make a queue just by having each event have a _next > pointer, rather than dynamically creating nodes to hold the event. > Each event is its own queue node implicitly. > >> One thing that I experimented with was to have the ServiceThread take >> ownership of the queue in it's local thread queue and post them all, >> which could be a future enhancement.? It didn't help my OOM situation. > > Your OOM situation seems to be a basic case of overwhelming the > ServiceThread. A single serviceThread will always have a limit on how > many events it can handle. Maybe this test is being too unrealistic in > its expectations of the current design? 
I think the JVMTI API where you can generate a COMPILED_METHOD_LOAD for all the events in the queue is going to be overwhelming unless it waits for the events to be posted. > >> Deleting the queue after all the events are posted allows >> JavaThread::oops_do and nmethods_do only a null check to deal with >> this jvmti wart. > > If the nodes are not dynamically allocated you don't need to delete > you just set the queue-head pointer to NULL - actually it will already > be NULL once the last event has been processed. I could revisit the data structure as a future RFE. The goal was to reuse code that's already there, and I don't think there's a significant difference in performance. I did some measurement of the stress case and the times were equivalent, or actually better, in the new code. Thanks, Coleen > > David > ----- > >> Thanks, >> Coleen >>> >>> Just some thoughts. >>> >>> Thanks, >>> David >>> >>>> I did write comments to this effect here: >>>> >>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html >>>> >>>> >>>> Thanks, >>>> Coleen >>>> >>>>> >>>>> David >>>>> >>>>>> Also, the jmethod_id field in nmethod was only used as a boolean >>>>>> so don't create a jmethod_id until needed for >>>>>> post_compiled_method_unload. >>>>>> >>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that >>>>>> crashed in the original bug report.
>>>>>> >>>>>> open webrev at >>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160 >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>> >> From coleen.phillimore at oracle.com Tue Dec 3 18:21:15 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 3 Dec 2019 13:21:15 -0500 Subject: RFR (XS) 8235273: nmethodLocker not needed for COMPILED_METHOD_UNLOAD events Message-ID: <3630a545-2b2c-e17c-bbe5-98ae8508ae38@oracle.com> Summary: remove unnecessary nmethodLocker See bug for more details.? Tested with tier2-8. open webrev at http://cr.openjdk.java.net/~coleenp/2019/8235273.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8235273 (Note, this has a trivial merge with the change for JDK-8212160). Thanks, Coleen From serguei.spitsyn at oracle.com Tue Dec 3 19:29:16 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 3 Dec 2019 11:29:16 -0800 Subject: RFR (XXS): 8235280: UnProblemList vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java Message-ID: <40efdaca-7402-a809-bd81-ae806a9b3f9b@oracle.com> Please, review a trivial fix for sub-task: ? 
https://bugs.openjdk.java.net/browse/JDK-8235280 The fix is to remove the test from the ProblemList.txt: diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt --- a/test/hotspot/jtreg/ProblemList.txt +++ b/test/hotspot/jtreg/ProblemList.txt @@ -182,7 +182,6 @@ vmTestbase/nsk/jvmti/scenarios/jni_interception/JI05/ji05t001/TestDescription.java 8219652 aix-ppc64 vmTestbase/nsk/jvmti/scenarios/jni_interception/JI06/ji06t001/TestDescription.java 8219652 aix-ppc64 vmTestbase/nsk/jvmti/SetJNIFunctionTable/setjniftab001/TestDescription.java 8219652 aix-ppc64 -vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java 8221372 windows-x64 vmTestbase/gc/lock/jni/jnilock002/TestDescription.java 8208243,8192647 generic-all Thanks, Serguei From igor.ignatyev at oracle.com Tue Dec 3 19:32:17 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 3 Dec 2019 11:32:17 -0800 Subject: RFR (XXS): 8235280: UnProblemList vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java In-Reply-To: <40efdaca-7402-a809-bd81-ae806a9b3f9b@oracle.com> References: <40efdaca-7402-a809-bd81-ae806a9b3f9b@oracle.com> Message-ID: <83A886BE-4F8A-4E4D-AD4A-7073AA7058A1@oracle.com> LGTM -- Igor > On Dec 3, 2019, at 11:29 AM, serguei.spitsyn at oracle.com wrote: > > Please, review a trivial fix for sub-task: > https://bugs.openjdk.java.net/browse/JDK-8235280 > > The fix is to remove the test from the ProblemList.txt: > > diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt > --- a/test/hotspot/jtreg/ProblemList.txt > +++ b/test/hotspot/jtreg/ProblemList.txt > @@ -182,7 +182,6 @@ > vmTestbase/nsk/jvmti/scenarios/jni_interception/JI05/ji05t001/TestDescription.java 8219652 aix-ppc64 > vmTestbase/nsk/jvmti/scenarios/jni_interception/JI06/ji06t001/TestDescription.java 8219652 aix-ppc64 > vmTestbase/nsk/jvmti/SetJNIFunctionTable/setjniftab001/TestDescription.java 8219652 aix-ppc64 >
-vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java 8221372 windows-x64 > > vmTestbase/gc/lock/jni/jnilock002/TestDescription.java 8208243,8192647 generic-all > > > Thanks, > Serguei From serguei.spitsyn at oracle.com Tue Dec 3 19:33:02 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 3 Dec 2019 11:33:02 -0800 Subject: RFR (XXS): 8235280: UnProblemList vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java In-Reply-To: <83A886BE-4F8A-4E4D-AD4A-7073AA7058A1@oracle.com> References: <40efdaca-7402-a809-bd81-ae806a9b3f9b@oracle.com> <83A886BE-4F8A-4E4D-AD4A-7073AA7058A1@oracle.com> Message-ID: <0597c486-1905-97ba-8f81-81a9923db6e5@oracle.com> Thanks, Igor! Serguei On 12/3/19 11:32 AM, Igor Ignatyev wrote: > LGTM > -- Igor > >> On Dec 3, 2019, at 11:29 AM, serguei.spitsyn at oracle.com wrote: >> >> Please, review a trivial fix for sub-task: >> https://bugs.openjdk.java.net/browse/JDK-8235280 >> >> The fix is to remove the test from the ProblemList.txt: >> >> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt >> --- a/test/hotspot/jtreg/ProblemList.txt >> +++ b/test/hotspot/jtreg/ProblemList.txt >> @@ -182,7 +182,6 @@ >> vmTestbase/nsk/jvmti/scenarios/jni_interception/JI05/ji05t001/TestDescription.java 8219652 aix-ppc64 >> vmTestbase/nsk/jvmti/scenarios/jni_interception/JI06/ji06t001/TestDescription.java 8219652 aix-ppc64 >> vmTestbase/nsk/jvmti/SetJNIFunctionTable/setjniftab001/TestDescription.java 8219652 aix-ppc64 >> -vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java 8221372 windows-x64 >> >> vmTestbase/gc/lock/jni/jnilock002/TestDescription.java 8208243,8192647 generic-all >> >> >> Thanks, >> Serguei From daniil.x.titov at oracle.com Tue Dec 3 19:42:54 2019 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Tue, 03 Dec 2019 11:42:54 -0800 Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware Message-ID: 
<99DD810A-E931-47D9-9968-065385C91752@oracle.com> Please review the change that makes OperatingSystemMXBean methods return container specific information rather than the host based data. The webrev [1] is based on the code Andrew and Severin initially provided with some additional changes and combined with the spec update David made [3]. The webrev corrects the implementation for the free/total swap methods as Bob noted to subtract the total and free memory from the returned values. It also corrects getCpuLoad() implementation, as Bob advised, to cover the case when CPU quotas are not active. The webrev also takes into account the case when java.security.AccessControlException exception is thrown during the initialization of the container subsystem ( e.g. when java.policy doesn?t grant "read" access to "/proc/self/mountinfo" file). CSR for the spec changes [3] is approved. Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed . [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ [2] Jira issue :https://bugs.openjdk.java.net/browse/JDK-8226575 [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 Thank you, -Daniil From chris.plummer at oracle.com Tue Dec 3 20:45:55 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 3 Dec 2019 12:45:55 -0800 Subject: RFR(S): 8234277:ClhsdbLauncher should enable verbose exceptions and do a better job of detecting SA failures Message-ID: Hello, Please review the following: https://bugs.openjdk.java.net/browse/JDK-8234277 http://cr.openjdk.java.net/~cjplummer/8234277/webrev.00/ No longer redirect stderr for the jhsdb/clhsdb process. It results in not seeing attach failures in the output, so OutputAnalyer can't check for them. Execute "verbose true" as the first clhsdb command after launching. This will result in verboseExceptions being true in CommandProcessor.java, so full exception traces will appear in the output. 
This will make debugging future SA test failures a lot easier. Add an extra check for any DebuggerException. This is mainly for detecting that the attach failed. This previously was going unnoticed, and instead the test would later fail because it noticed some other issue, like missing output, which isn't very informative. Add checks for other unexpected SA exceptions that are caught and printed by CommandProcessor. These will always have an "Error: " prefix, making them easy to detect. Problem list ClhsdbScanOops.java. With the new error checking, it will now always fail on Windows due to JDK-8230731 and on macOS and Linux due to JDK-8235220. These failures are not "new" per se, but are just now being properly detected. thanks, Chris From chris.plummer at oracle.com Tue Dec 3 20:56:34 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 3 Dec 2019 12:56:34 -0800 Subject: RFR(XS): 8235221: Fix ProblemList.txt for sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java Message-ID: Hello, Please review the following: https://bugs.openjdk.java.net/browse/JDK-8235221 diff --git a/test/jdk/ProblemList.txt b/test/jdk/ProblemList.txt --- a/test/jdk/ProblemList.txt +++ b/test/jdk/ProblemList.txt @@ -914,8 +914,7 @@ sun/tools/jhsdb/BasicLauncherTest.java 8193639,8211767 solaris-all,linux-ppc64,linux-ppc64le sun/tools/jhsdb/HeapDumpTest.java 8193639 solaris-all -sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8230731,8001227 windows-all -sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8231635,8231634 generic-all +sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8231634,8230731,8001227 generic-all,windows-all Listing the same test on multiple lines just results in the last entry being used, so merge them into one line. Also, JDK-8231635 has been fixed.
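To illustrate why the duplicate entries had to be merged, here is a minimal sketch of a "last entry wins" lookup. It is not jtreg's actual ProblemList parser; ProblemListSketch and parse are hypothetical names, and the only behaviour assumed is the one stated in the message above, that a later line for the same test replaces an earlier one.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: NOT the real jtreg ProblemList parser.
final class ProblemListSketch {
    // Maps each test path to the remainder of its line ("<bug ids> <platforms>").
    // A later line for the same test overwrites the earlier one, mirroring the
    // "last entry being used" behaviour described in the message (an assumption).
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> entries = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;  // skip blank lines and comments
            }
            String[] parts = trimmed.split("\\s+", 2);
            entries.put(parts[0], parts.length > 1 ? parts[1] : "");
        }
        return entries;
    }
}
```

Under that assumption, feeding the two old HeapDumpTestWithActiveProcess.java lines to parse keeps only the generic-all entry and silently drops the windows-all one, while the single merged line keeps every bug id and platform.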
thanks, Chris From igor.ignatyev at oracle.com Tue Dec 3 21:00:00 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 3 Dec 2019 13:00:00 -0800 Subject: RFR(XS): 8235221: Fix ProblemList.txt for sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java In-Reply-To: References: Message-ID: LGTM, -- Igor > On Dec 3, 2019, at 12:56 PM, Chris Plummer wrote: > > Hello, > > Please review the following: > > https://bugs.openjdk.java.net/browse/JDK-8235221 > > diff --git a/test/jdk/ProblemList.txt b/test/jdk/ProblemList.txt > --- a/test/jdk/ProblemList.txt > +++ b/test/jdk/ProblemList.txt > @@ -914,8 +914,7 @@ > > sun/tools/jhsdb/BasicLauncherTest.java 8193639,8211767 solaris-all,linux-ppc64,linux-ppc64le > sun/tools/jhsdb/HeapDumpTest.java 8193639 solaris-all > -sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8230731,8001227 windows-all > -sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8231635,8231634 generic-all > +sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8231634,8230731,8001227 generic-all,windows-all > > > Listing the same test on multiple lines just result in the last entry being used, so merge into one line. Also JDK-8231635 has been fixed. > > thanks, > > Chris > From serguei.spitsyn at oracle.com Tue Dec 3 21:10:07 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 3 Dec 2019 13:10:07 -0800 Subject: RFR(S): 8234277:ClhsdbLauncher should enable verbose exceptions and do a better job of detecting SA failures In-Reply-To: References: Message-ID: <8a972120-8ba3-a35e-b73f-e3d5faf68ce6@oracle.com> Hi Chris, It looks good. Thanks, Serguei On 12/3/19 12:45 PM, Chris Plummer wrote: > Hello, > > Please review the following: > > https://bugs.openjdk.java.net/browse/JDK-8234277 > http://cr.openjdk.java.net/~cjplummer/8234277/webrev.00/ > > No longer redirect stderr for the jhsdb/clhsdb process. It results in > not seeing attach failures in the output, so OutputAnalyer can't check > for them. 
> > Execute "verbose true" as the first clhsdb command after launching. > This will result in verboseExceptions being true in > CommandProcessor.java, so full exception traces will appear in the > output. This will make debugging future SA test failures a lot easier. > > Add an extra check for any DebuggerException. This is mainly for > detecting that the attached failed. This previously was going > un-noticed, and instead the test would later fail because it noticed > some other issue, like missing output, which isn't very informative. > > Add checks for other unexpected SA exceptions that are caught and > printed by CommandProcessor. These will always have an "Error: " > prefix, making them easy to detect. > > Problem list ClhsdbScanOops.java. With the new error checking, it will > now always fail on windows due to JDK-8230731 and on macos and linux > due to JDK-8235220. These failures are not "new" per se, but are just > now being properly detected. > > thanks, > > Chris From serguei.spitsyn at oracle.com Tue Dec 3 21:16:38 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 3 Dec 2019 13:16:38 -0800 Subject: RFR(S): 8234277:ClhsdbLauncher should enable verbose exceptions and do a better job of detecting SA failures In-Reply-To: References: Message-ID: <7b22e878-241e-00b7-6434-ed2987497f2f@oracle.com> Hi Chris, It looks good. I'm in favor to always run tests in verbose mode. It is not a good idea in general to optimize on it. Thanks, Serguei On 12/3/19 12:45 PM, Chris Plummer wrote: > Hello, > > Please review the following: > > https://bugs.openjdk.java.net/browse/JDK-8234277 > http://cr.openjdk.java.net/~cjplummer/8234277/webrev.00/ > > No longer redirect stderr for the jhsdb/clhsdb process. It results in > not seeing attach failures in the output, so OutputAnalyer can't check > for them. > > Execute "verbose true" as the first clhsdb command after launching. 
> This will result in verboseExceptions being true in > CommandProcessor.java, so full exception traces will appear in the > output. This will make debugging future SA test failures a lot easier. > > Add an extra check for any DebuggerException. This is mainly for > detecting that the attached failed. This previously was going > un-noticed, and instead the test would later fail because it noticed > some other issue, like missing output, which isn't very informative. > > Add checks for other unexpected SA exceptions that are caught and > printed by CommandProcessor. These will always have an "Error: " > prefix, making them easy to detect. > > Problem list ClhsdbScanOops.java. With the new error checking, it will > now always fail on windows due to JDK-8230731 and on macos and linux > due to JDK-8235220. These failures are not "new" per se, but are just > now being properly detected. > > thanks, > > Chris From bob.vandette at oracle.com Tue Dec 3 21:30:17 2019 From: bob.vandette at oracle.com (Bob Vandette) Date: Tue, 3 Dec 2019 16:30:17 -0500 Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> Message-ID: <48599FD5-583F-4CB7-80CA-09E91AA24D4A@oracle.com> Daniil, Looks good to me. If there are any management jtreg tests, I?d run these since your changes to OperatingSystemMXBean will alter the behavior of these methods even for Linux hosts since cgroups is typically enabled causing the container detection to report containerized. It?s too bad getCpuLoad can?t detect that the cpuset is identical to the hosts in order to allow you to fallback to getSystemCpuLoad0. Bob. > On Dec 3, 2019, at 2:42 PM, Daniil Titov wrote: > > Please review the change that makes OperatingSystemMXBean methods return container specific information > rather than the host based data. 
> > The webrev [1] is based on the code Andrew and Severin initially provided with some additional changes and combined > with the spec update David made [3]. > > The webrev corrects the implementation for the free/total swap methods as Bob noted to subtract the total > and free memory from the returned values. > > It also corrects getCpuLoad() implementation, as Bob advised, to cover the case when CPU quotas are not active. > > The webrev also takes into account the case when java.security.AccessControlException exception is thrown > during the initialization of the container subsystem ( e.g. when java.policy doesn?t grant "read" access to > "/proc/self/mountinfo" file). > > CSR for the spec changes [3] is approved. > > Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed . > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ > [2] Jira issue :https://bugs.openjdk.java.net/browse/JDK-8226575 > [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 > > Thank you, > -Daniil > > From mandy.chung at oracle.com Wed Dec 4 00:10:13 2019 From: mandy.chung at oracle.com (Mandy Chung) Date: Tue, 3 Dec 2019 16:10:13 -0800 Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> Message-ID: On 12/3/19 11:42 AM, Daniil Titov wrote: > Please review the change that makes OperatingSystemMXBean methods return container specific informationrather than the host based data. > > The webrev also takes into account the case when java.security.AccessControlException exception is thrown > during the initialization of the container subsystem ( e.g. when java.policy doesn?t grant "read" access to "/proc/self/mountinfo" file). 
Instead of failing to access /proc/self/mountinfo, I expect this to wrap the call with doPrivileged so that it can report the metrics independent of the security policy. The jdk default security policy should grant proper permission to do so. > > CSR for the spec changes [3] is approved. > > Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed. > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 > [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 > > src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java: this should wrap the security-sensitive operations with doPrivileged. jdk.management is trusted and it has all permissions. src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c: Formatting nit: lines 346-355: JDK native source uses a 4-space indentation convention. A space is missing between "if" and "(". src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java: 59 if (limit >= 0 && memLimit >= 0) { 60 return limit - memLimit; 61 } Under what circumstances is limit or memLimit < 0? It falls back to returning the system's total swap space size - this is not really what it should report. Is it worth specifying this case? Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad. getFreeSwapSpaceSize retries a few times. What is special about this method but not others like getFreeMemorySize? src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java: There is no strong need to make the deprecated methods as default methods. If they were default methods, they would only need to be implemented once as opposed to in all OS-specific implementations. CheckOperatingSystemMXBean.java: System.out.println(String.format(...)) can simply be replaced with System.out.format.
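The last review comment can be sketched as follows. This is only an illustration: the class name FormatDemo, the helper capture() and the sample values are hypothetical and are not taken from the webrev. One detail worth keeping in mind is that println appends a line terminator, so the format string needs a trailing %n to produce the same output.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.function.Consumer;

// Hypothetical demo class, not part of the webrev under review.
final class FormatDemo {

    // Style the review comment objects to: format first, then println.
    static void withPrintln(PrintStream out, String name, long value) {
        out.println(String.format("%s: %d", name, value));
    }

    // Suggested style: format directly; %n supplies the line terminator
    // that println would otherwise have added.
    static void withFormat(PrintStream out, String name, long value) {
        out.format("%s: %d%n", name, value);
    }

    // Helper that captures whatever the given body writes to a PrintStream.
    static String capture(Consumer<PrintStream> body) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buffer, true);
        body.accept(out);
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        withPrintln(System.out, "Total memory", 1024L);
        withFormat(System.out, "Total memory", 1024L);
    }
}
```

Both methods should write the same bytes, so the change is purely a simplification of the test code.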
Mandy From suenaga at oss.nttdata.com Wed Dec 4 00:54:41 2019 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Wed, 4 Dec 2019 09:54:41 +0900 Subject: PING: RFR: 8234624: jstack mixed mode should refer DWARF In-Reply-To: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com> References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com> Message-ID: PING: Could you review it? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/ This bug is targeted to JDK 14. Thanks, Yasumasa On 2019/11/28 21:39, Yasumasa Suenaga wrote: > Hi, > > I refactored LinuxAMD64CFrame.java. It works fine in serviceability/sa tests and > all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923). > Could you review the new webrev? > > http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/ > > The diff from the previous webrev is here: > http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b > > > Thanks, > > Yasumasa > > > On 2019/11/25 14:08, Yasumasa Suenaga wrote: >> Hi all, >> >> Please review this change: >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/ >> >> >> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64 >> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame >> for stack unwinding. >> >> As JDK-8022183 said, omit-frame-pointer is enabled by default since GCC 4.6, so system >> libraries (e.g. libc) might be compiled with this feature. >> >> However `jhsdb jstack --mixed` does not do so, it uses the base pointer register (RBP). >> So some stack frames might be missing. >> >> I guess JDK-8219201 is caused by the same issue.
>> >> >> Thanks, >> >> Yasumasa >> >> >> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf From daniil.x.titov at oracle.com Wed Dec 4 02:00:28 2019 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Tue, 03 Dec 2019 18:00:28 -0800 Subject: 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: <48599FD5-583F-4CB7-80CA-09E91AA24D4A@oracle.com> References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <48599FD5-583F-4CB7-80CA-09E91AA24D4A@oracle.com> Message-ID: Hi Bob, >> It's too bad getCpuLoad can't detect that the cpuset is identical to the host's in order to allow you to fall back to >> getSystemCpuLoad0. I think we can detect that the cpuset is identical to the host's one by comparing the length of the array that containerMetrics.getEffectiveCpuSetCpus() returns to the number of CPUs configured on the host, as returned by sysconf(_SC_NPROCESSORS_CONF). The latter could be retrieved by adding a new native method, getConfiguredCpuCount0, to OperatingSystemImpl. If they match then we just fall back to getSystemCpuLoad0(). I did some testing on a Linux host and inside a Docker container with different '--cpuset-cpus' settings and it seems to work as expected. JNIEXPORT jint JNICALL Java_com_sun_management_internal_OperatingSystemImpl_getConfiguredCpuCount0 (JNIEnv *env, jobject mbean) { if (perfInit() == 0) { return counters.nProcs; } else { return -1; } } If there is no objection I will include this change in the new webrev. Thank you, Daniil On 12/3/19, 1:30 PM, "Bob Vandette" wrote: Daniil, Looks good to me. If there are any management jtreg tests, I'd run these since your changes to OperatingSystemMXBean will alter the behavior of these methods even for Linux hosts since cgroups is typically enabled causing the container detection to report containerized. It's too bad getCpuLoad can't detect that the cpuset is identical to the host's in order to allow you to fall back to getSystemCpuLoad0. Bob.
> On Dec 3, 2019, at 2:42 PM, Daniil Titov wrote: > > Please review the change that makes OperatingSystemMXBean methods return container specific information > rather than the host based data. > > The webrev [1] is based on the code Andrew and Severin initially provided with some additional changes and combined > with the spec update David made [3]. > > The webrev corrects the implementation for the free/total swap methods as Bob noted to subtract the total > and free memory from the returned values. > > It also corrects getCpuLoad() implementation, as Bob advised, to cover the case when CPU quotas are not active. > > The webrev also takes into account the case when java.security.AccessControlException exception is thrown > during the initialization of the container subsystem ( e.g. when java.policy doesn?t grant "read" access to > "/proc/self/mountinfo" file). > > CSR for the spec changes [3] is approved. > > Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed . > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ > [2] Jira issue :https://bugs.openjdk.java.net/browse/JDK-8226575 > [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 > > Thank you, > -Daniil > > From daniil.x.titov at oracle.com Wed Dec 4 03:34:23 2019 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Tue, 03 Dec 2019 19:34:23 -0800 Subject: 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> Message-ID: Hi Mandy, Thank you for your comments, please find my answers below. >> src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java >> this should wrap the security-sensitive operations with doPrivileged. jdk.management is trusted and it has all permissions. I will include this change in the next webrev, thank you. 
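For readers unfamiliar with the doPrivileged pattern discussed here, a minimal sketch follows. It is an assumption-laden illustration, not the code from Metrics.java: the class MountInfoReader and its readLines method are hypothetical names. Only java.security.AccessController.doPrivileged and java.nio.file.Files.readAllLines are real JDK APIs; note that AccessController has been deprecated for removal in later JDK releases, so this reflects the JDK 14 timeframe of the thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.AccessController;
import java.security.PrivilegedAction;
import java.util.List;

// Hypothetical helper, not the actual Metrics.java implementation.
final class MountInfoReader {

    // Reads all lines of the given file inside a privileged block, so that
    // the read is checked against this code's own permissions rather than
    // those of every caller on the stack.
    @SuppressWarnings("removal")
    static List<String> readLines(Path file) {
        PrivilegedAction<List<String>> action = () -> {
            try {
                return Files.readAllLines(file);
            } catch (IOException e) {
                // PrivilegedAction cannot throw checked exceptions.
                throw new UncheckedIOException(e);
            }
        };
        return AccessController.doPrivileged(action);
    }

    public static void main(String[] args) {
        Path mountInfo = Paths.get("/proc/self/mountinfo");
        if (Files.isReadable(mountInfo)) {
            System.out.println(readLines(mountInfo).size() + " mount entries");
        } else {
            System.out.println(mountInfo + " is not readable on this system");
        }
    }
}
```

With this shape, a restrictive java.policy applied to application code would no longer prevent the trusted module from initializing the container metrics.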
>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c >> Formatting nit: line 346-355: JDK native source uses 4-space identation convention. A space is missing between "if" and "(". I will correct this, thanks. >>Under what circumstance that limit or memLimit is < 0? The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container) or if a docker container was started without specifying a memory limit ( without '--memory=' Docker option) . In latter there is no limit on how much memory the container can use and it can use as much memory as the host's OS allows. >> Is it worth specifying this case? I believe yes, since it covers the cases when JVM runs on a Linux host or a docker container was started without memory limitation. >> It fallbacks to return the system's total swap space size - this is not really what it should report. For the case when JVM runs on a Linux host it is exactly what we want. The only problematic case is if JVM runs inside a docker container without a memory limit set. However, I am not sure how we could differentiate these 2 cases. >> Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad. For getTotalMemorySize I think we are good here. If limit is not set then all memory the host's OS have is available. For getFreeMemorySize the problematic case is if is the memory limit is set but the memory usage for some reason is not available (containerMetrics.getMemoryUsage() returns 0). Probably in this case we should just return -1 as currently getFreePhysicalMemorySize0() does if it cannot retrieve a valid result. For getCpuLoad() the problematic case if CPU quotas are active but CpuPeriod, CpuNumPeriods , or getCpuUsage are unavailable or if a valid CPU load for some CPU was not retrieved, or if all retrieved CPU load values happen to be zeros. 
Probably we should just return -1 in these cases rather than falling back to getSystemCpuLoad0() >>src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java >> There is no strong need to make the deprecated methods as default methods. If they were default methods, they only need to be implemented once as opposed to in all OS-specific implementations. I could make these methods defaults if you feel it is a better approach here. >>CheckOperatingSystemMXBean.java >> System.out.println(String.format(...)) can simply be replaced with System.out.format. I will include this change in the next webrev, thank you! Best regards, Daniil From: Mandy Chung Date: Tuesday, December 3, 2019 at 4:10 PM To: Daniil Titov Cc: OpenJDK Serviceability , "jmx-dev at openjdk.java.net" , Bob Vandette Subject: Re: RFR: 8226575: OperatingSystemMXBean should be made container aware On 12/3/19 11:42 AM, Daniil Titov wrote: Please review the change that makes OperatingSystemMXBean methods return container specific information rather than the host based data. The webrev also takes into account the case when java.security.AccessControlException exception is thrown during the initialization of the container subsystem (e.g. when java.policy doesn't grant "read" access to "/proc/self/mountinfo" file). Instead of failing to access /proc/self/mountinfo, I expect this to wrap the call with doPrivileged so that it can report the metrics independent of the security policy. The jdk default security policy should grant proper permission to do so. CSR for the spec changes [3] is approved. Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed. [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java:
this should wrap the security-sensitive operations with doPrivileged. jdk.management is trusted and it has all permissions. src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c: Formatting nit: lines 346-355: JDK native source uses a 4-space indentation convention. A space is missing between "if" and "(". src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java: 59 if (limit >= 0 && memLimit >= 0) { 60 return limit - memLimit; 61 } Under what circumstances is limit or memLimit < 0? It falls back to returning the system's total swap space size - this is not really what it should report. Is it worth specifying this case? Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad. getFreeSwapSpaceSize retries a few times. What is special about this method but not others like getFreeMemorySize? src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java: There is no strong need to make the deprecated methods as default methods. If they were default methods, they would only need to be implemented once as opposed to in all OS-specific implementations. CheckOperatingSystemMXBean.java: System.out.println(String.format(...)) can simply be replaced with System.out.format. Mandy From chris.plummer at oracle.com Wed Dec 4 04:24:40 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 3 Dec 2019 20:24:40 -0800 Subject: RFR(XS): 8235221: Fix ProblemList.txt for sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java In-Reply-To: References: Message-ID: Thanks Igor!
Chris On 12/3/19 1:00 PM, Igor Ignatyev wrote: > LGTM, > > -- Igor > >> On Dec 3, 2019, at 12:56 PM, Chris Plummer wrote: >> >> Hello, >> >> Please review the following: >> >> https://bugs.openjdk.java.net/browse/JDK-8235221 >> >> diff --git a/test/jdk/ProblemList.txt b/test/jdk/ProblemList.txt >> --- a/test/jdk/ProblemList.txt >> +++ b/test/jdk/ProblemList.txt >> @@ -914,8 +914,7 @@ >> >> sun/tools/jhsdb/BasicLauncherTest.java 8193639,8211767 solaris-all,linux-ppc64,linux-ppc64le >> sun/tools/jhsdb/HeapDumpTest.java 8193639 solaris-all >> -sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8230731,8001227 windows-all >> -sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8231635,8231634 generic-all >> +sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8231634,8230731,8001227 generic-all,windows-all >> >> >> Listing the same test on multiple lines just result in the last entry being used, so merge into one line. Also JDK-8231635 has been fixed. >> >> thanks, >> >> Chris >> From chris.plummer at oracle.com Wed Dec 4 04:25:03 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 3 Dec 2019 20:25:03 -0800 Subject: RFR(S): 8234277:ClhsdbLauncher should enable verbose exceptions and do a better job of detecting SA failures In-Reply-To: <8a972120-8ba3-a35e-b73f-e3d5faf68ce6@oracle.com> References: <8a972120-8ba3-a35e-b73f-e3d5faf68ce6@oracle.com> Message-ID: Thanks Serguei! Chris On 12/3/19 1:10 PM, serguei.spitsyn at oracle.com wrote: > Hi Chris, > > It looks good. > > Thanks, > Serguei > > On 12/3/19 12:45 PM, Chris Plummer wrote: >> Hello, >> >> Please review the following: >> >> https://bugs.openjdk.java.net/browse/JDK-8234277 >> http://cr.openjdk.java.net/~cjplummer/8234277/webrev.00/ >> >> No longer redirect stderr for the jhsdb/clhsdb process. It results in >> not seeing attach failures in the output, so OutputAnalyer can't >> check for them. >> >> Execute "verbose true" as the first clhsdb command after launching. 
>> This will result in verboseExceptions being true in >> CommandProcessor.java, so full exception traces will appear in the >> output. This will make debugging future SA test failures a lot easier. >> >> Add an extra check for any DebuggerException. This is mainly for >> detecting that the attached failed. This previously was going >> un-noticed, and instead the test would later fail because it noticed >> some other issue, like missing output, which isn't very informative. >> >> Add checks for other unexpected SA exceptions that are caught and >> printed by CommandProcessor. These will always have an "Error: " >> prefix, making them easy to detect. >> >> Problem list ClhsdbScanOops.java. With the new error checking, it >> will now always fail on windows due to JDK-8230731 and on macos and >> linux due to JDK-8235220. These failures are not "new" per se, but >> are just now being properly detected. >> >> thanks, >> >> Chris > From david.holmes at oracle.com Wed Dec 4 04:39:20 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 4 Dec 2019 14:39:20 +1000 Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value" In-Reply-To: References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com> <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com> Message-ID: On 3/12/2019 11:35 pm, coleen.phillimore at oracle.com wrote: > > > On 12/3/19 8:31 AM, David Holmes wrote: >> On 3/12/2019 11:08 pm, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 12/2/19 11:52 PM, David Holmes wrote: >>>> Hi Coleen, >>>> >>>> On 3/12/2019 12:43 am, coleen.phillimore at oracle.com wrote: >>>>> >>>>> >>>>> On 11/26/19 7:03 PM, David Holmes wrote: >>>>>> (adding runtime as well) >>>>>> >>>>>> Hi Coleen, >>>>>> >>>>>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote: >>>>>>> Summary: Add local deferred event list to thread to post events >>>>>>> outside CodeCache_lock. 
>>>>>>> >>>>>>> This patch builds on the patch for JDK-8173361. With this patch, >>>>>>> I made the JvmtiDeferredEventQueue an instance class (not >>>>>>> AllStatic) and have one per thread. The CodeBlob event that used >>>>>>> to drop the CodeCache_lock and raced with the sweeper thread, >>>>>>> adds the events it wants to post to its thread local list, and >>>>>>> processes it outside the lock. The list is walked in GC and by >>>>>>> the sweeper to keep the nmethods from being unloaded and zombied, >>>>>>> respectively. >>>>>> >>>>>> Sorry I don't understand why we would want/need a deferred event >>>>>> queue for every JavaThread? Isn't this only relevant for >>>>>> non-JavaThreads that need to have the ServiceThread process the >>>>>> deferred event? >>>>> >>>>> I thought I'd written this in the bug but I had only discussed this >>>>> with Erik. I've added a comment to the bug to explain why I added >>>>> the per-JavaThread queue. In order to process these events after >>>>> the CodeCache_lock is dropped, I have to queue them somewhere safe. >>>>> The ServiceThread queue is safe, *but* the ServiceThread can't keep >>>>> up with the events, especially from this test case. So the test >>>>> case gets a native OOM. >>>>> >>>>> So I've added the safe queue as a field to each JavaThread because >>>>> multiple JavaThreads could be posting these events at the same >>>>> time, and there didn't seem to be a better safe place to cache >>>>> them, without adding another layer of queuing code. >>>> >>>> I think I'm getting the picture now. At the time the events are >>>> generated we can't post them directly because the current thread is >>>> inside compiler code. Hence the events must be deferred. Using the >>>> ServiceThread to handle the deferred events is one way to deal with >>>> this - but it can't keep up in this scenario.
So instead we store >>>> the events in the current thread and when the current thread returns >>>> to code where it is safe to post the events, it does so itself. Is >>>> that generally correct? >>> >>> Yes. >>>> >>>> I admit I'm not keen on adding this additional field per-thread just >>>> for a temporary usage. Some kind of stack allocated helper would be >>>> preferable, but would need to be passed through the call chain so >>>> that the events could be added to it. >>> >>> Right, and the GC and nmethods_do has to find it somehow. It wasn't >>> my first choice of where to put it also because there are too many >>> things in JavaThread. Might be time for a future cleanup of Thread. >> >> I see. >> >>>> >>>> Also I'm not clear why we aggressively delete the _jvmti_event_queue >>>> after posting the events. I'd be worried about the overhead we are >>>> introducing for creating and deleting this queue. When the >>>> JvmtiDeferredEventQueue data structure was intended only for use by >>>> the ServiceThread its dynamic node allocation may have made more >>>> sense. But now that seems like a liability to me - if >>>> JvmtiDeferredEvents could be linked directly we wouldn't need >>>> dynamic nodes, nor dynamic per-thread queues (just a per-thread >>>> pointer). >>> >>> I'm not following. The queue is for multiple events that might be >>> posted while in the CodeCache_lock, so they need to be in order and >>> linked together. While we post them and take them off, if the >>> callback safepoints (maybe calls back into the JVM), we don't want to >>> have GC or nmethods_do walk the one that's been posted already. So a >>> queue seems to make sense. >> >> Yes but you can make a queue just by having each event have a _next >> pointer, rather than dynamically creating nodes to hold the event. >> Each event is its own queue node implicitly.
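The idea of an event acting as its own queue node (an intrusive linked queue) can be sketched as below. This is a conceptual illustration only, written in Java rather than HotSpot's C++; the names DeferredEvent and IntrusiveEventQueue are hypothetical and do not correspond to the JvmtiDeferredEvent or JvmtiDeferredEventQueue classes in the webrev.

```java
// Hypothetical event type: carries its own link field, so no separate
// node object has to be allocated when it is queued.
final class DeferredEvent {
    final String description;
    DeferredEvent next; // intrusive link to the following event, or null

    DeferredEvent(String description) {
        this.description = description;
    }
}

// Hypothetical FIFO queue that links events directly through their
// own 'next' fields.
final class IntrusiveEventQueue {
    private DeferredEvent head;
    private DeferredEvent tail;

    // Appends the event at the tail, preserving posting order.
    void enqueue(DeferredEvent event) {
        event.next = null;
        if (tail == null) {
            head = event;
        } else {
            tail.next = event;
        }
        tail = event;
    }

    // Removes and returns the oldest event, or null if the queue is empty.
    // Once the last event is taken, head is null again, so an "is there
    // anything queued" test is a plain null check on the head pointer.
    DeferredEvent dequeue() {
        DeferredEvent event = head;
        if (event == null) {
            return null;
        }
        head = event.next;
        if (head == null) {
            tail = null;
        }
        event.next = null;
        return event;
    }

    boolean isEmpty() {
        return head == null;
    }

    public static void main(String[] args) {
        IntrusiveEventQueue queue = new IntrusiveEventQueue();
        queue.enqueue(new DeferredEvent("compiled method load A"));
        queue.enqueue(new DeferredEvent("compiled method load B"));
        for (DeferredEvent e = queue.dequeue(); e != null; e = queue.dequeue()) {
            System.out.println("posting " + e.description);
        }
    }
}
```

The trade-off is that each event can sit in at most one queue at a time, which appears to fit the usage described in this thread.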
>> >>> One thing that I experimented with was to have the ServiceThread take >>> ownership of the queue in its local thread queue and post them all, >>> which could be a future enhancement. It didn't help my OOM situation. >> >> Your OOM situation seems to be a basic case of overwhelming the >> ServiceThread. A single ServiceThread will always have a limit on how >> many events it can handle. Maybe this test is being too unrealistic in >> its expectations of the current design? > > I think the JVMTI API where you can generate a COMPILED_METHOD_LOAD for > all the events in the queue is going to be overwhelming unless it waits > for the events to be posted. Taking things off the service thread would seem to be a good thing then :) >> >>> Deleting the queue after all the events are posted allows >>> JavaThread::oops_do and nmethods_do only a null check to deal with >>> this jvmti wart. >> >> If the nodes are not dynamically allocated you don't need to delete >> you just set the queue-head pointer to NULL - actually it will already >> be NULL once the last event has been processed. > > I could revisit the data structure as a future RFE. The goal was to > reuse code that's already there, and I don't think there's a significant > difference in performance. I did some measurement of the stress case > and the times were equivalent, actually better in the new code. Okay. Thanks, David > > Thanks, > Coleen >> >> David >> ----- >> >>> Thanks, >>> Coleen >>>> >>>> Just some thoughts. >>>> >>>> Thanks, >>>> David >>>> >>>>> I did write comments to this effect here: >>>>> >>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html >>>>> >>>>> >>>>> Thanks, >>>>> Coleen >>>>> >>>>>> >>>>>> David >>>>>> >>>>>>> Also, the jmethod_id field in nmethod was only used as a boolean >>>>>>> so don't create a jmethod_id until needed for >>>>>>> post_compiled_method_unload.
>>>>>>> >>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that >>>>>>> crashed in the original bug report. >>>>>>> >>>>>>> open webrev at >>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160 >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>> >>> > From david.holmes at oracle.com Wed Dec 4 04:49:56 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 4 Dec 2019 14:49:56 +1000 Subject: RFR (XS) 8235273: nmethodLocker not needed for COMPILED_METHOD_UNLOAD events In-Reply-To: <3630a545-2b2c-e17c-bbe5-98ae8508ae38@oracle.com> References: <3630a545-2b2c-e17c-bbe5-98ae8508ae38@oracle.com> Message-ID: <8530df0a-c67c-3159-4b96-8bde4e925434@oracle.com> Hi Coleen, That all seems fine to me. Thanks, David On 4/12/2019 4:21 am, coleen.phillimore at oracle.com wrote: > Summary: remove unnecessary nmethodLocker > > See bug for more details.? Tested with tier2-8. > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8235273.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8235273 > > (Note, this has a trivial merge with the change for JDK-8212160). > > Thanks, > Coleen From daniil.x.titov at oracle.com Wed Dec 4 05:37:10 2019 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Tue, 03 Dec 2019 21:37:10 -0800 Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: <0A51834F-622B-42DD-A0DA-AFAD59B23D29@oracle.com> References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <48599FD5-583F-4CB7-80CA-09E91AA24D4A@oracle.com> <0A51834F-622B-42DD-A0DA-AFAD59B23D29@oracle.com> Message-ID: <1EB171CE-2582-4A42-A7F6-3D37C33DFBDD@oracle.com> Resending with the corrected title, "RFR" was somehow stripped from it that breaks the sorting by subject... Hi Bob, >> It?s too bad getCpuLoad can?t detect that the cpuset is identical to the hosts in order to allow you to fallback to >> getSystemCpuLoad0. 
I think we can detect that the cpuset is identical to the host's one by comparing the length of the array containerMetrics.getEffectiveCpuSetCpus() returns to the number of the CPUs configured on the host and returned by sysconf(_SC_NPROCESSORS_CONF) . The latter could be retrieved by adding new native method to OperatingSystemImpl getConfiguredCpuCount0. If they match then we just fallback to getSystemCpuLoad0(). I did some testing on Linux host and inside Docker container with different '--cpuset-cpus' settings and it seems to work as expected. JNIEXPORT jint JNICALL Java_com_sun_management_internal_OperatingSystemImpl_getConfiguredCpuCount0 (JNIEnv *env, jobject mbean) { if(perfInit() == 0) { return counters.nProcs; } else { return -1; } } If there is no objection I will include this change in the new webrev. Thank you, Daniil ?On 12/3/19, 1:30 PM, "Bob Vandette" wrote: Daniil, Looks good to me. If there are any management jtreg tests, I?d run these since your changes to OperatingSystemMXBean will alter the behavior of these methods even for Linux hosts since cgroups is typically enabled causing the container detection to report containerized. It?s too bad getCpuLoad can?t detect that the cpuset is identical to the hosts in order to allow you to fallback to getSystemCpuLoad0. Bob. > On Dec 3, 2019, at 2:42 PM, Daniil Titov wrote: > > Please review the change that makes OperatingSystemMXBean methods return container specific information > rather than the host based data. > > The webrev [1] is based on the code Andrew and Severin initially provided with some additional changes and combined > with the spec update David made [3]. > > The webrev corrects the implementation for the free/total swap methods as Bob noted to subtract the total > and free memory from the returned values. > > It also corrects getCpuLoad() implementation, as Bob advised, to cover the case when CPU quotas are not active. 
> > The webrev also takes into account the case when java.security.AccessControlException exception is thrown > during the initialization of the container subsystem ( e.g. when java.policy doesn?t grant "read" access to > "/proc/self/mountinfo" file). > > CSR for the spec changes [3] is approved. > > Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed . > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ > [2] Jira issue :https://bugs.openjdk.java.net/browse/JDK-8226575 > [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 > > Thank you, > -Daniil > > From daniil.x.titov at oracle.com Wed Dec 4 05:40:32 2019 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Tue, 03 Dec 2019 21:40:32 -0800 Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> Message-ID: <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> Resending with the corrected subject. "RFR" was somehow stripped from it and that breaks the sorting by subject... ? Hi Mandy, Thank you for your comments, please find my answers below. >> src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java >> this should wrap the security-sensitive operations with doPrivileged. jdk.management is trusted and it has all permissions. I will include this change in the next webrev, thank you. >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c >> Formatting nit: line 346-355: JDK native source uses 4-space identation convention. A space is missing between "if" and "(". I will correct this, thanks. >>Under what circumstance that limit or memLimit is < 0? 
The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container) or if a docker container was started without specifying a memory limit ( without '--memory=' Docker option) . In latter there is no limit on how much memory the container can use and it can use as much memory as the host's OS allows. >> Is it worth specifying this case? I believe yes, since it covers the cases when JVM runs on a Linux host or a docker container was started without memory limitation. >> It fallbacks to return the system's total swap space size - this is not really what it should report. For the case when JVM runs on a Linux host it is exactly what we want. The only problematic case is if JVM runs inside a docker container without a memory limit set. However, I am not sure how we could differentiate these 2 cases. >> Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad. For getTotalMemorySize I think we are good here. If limit is not set then all memory the host's OS have is available. For getFreeMemorySize the problematic case is if is the memory limit is set but the memory usage for some reason is not available (containerMetrics.getMemoryUsage() returns 0). Probably in this case we should just return -1 as currently getFreePhysicalMemorySize0() does if it cannot retrieve a valid result. For getCpuLoad() the problematic case if CPU quotas are active but CpuPeriod, CpuNumPeriods , or getCpuUsage are unavailable or if a valid CPU load for some CPU was not retrieved, or if all retrieved CPU load values happen to be zeros. Probably we should just return -1 in these cases rather then falling back to getSystemCpuLoad0() >>src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java >> There is no strong need to make the deprecated methods as default methods. If they were default methods, they only need to be implemented once as opposed to in all OS-specific implementations. 
I could make these methods defaults if you feel it is a better approach here. >>CheckOperatingSystemMXBean.java >> System.out.println(String.format(...)) can simply be replaced with System.out.format. I will include this change in the next webrev, thank you! Best regards, Daniil From: Mandy Chung Date: Tuesday, December 3, 2019 at 4:10 PM To: Daniil Titov Cc: OpenJDK Serviceability , "jmx-dev at openjdk.java.net" , Bob Vandette Subject: Re: RFR: 8226575: OperatingSystemMXBean should be made container aware On 12/3/19 11:42 AM, Daniil Titov wrote: Please review the change that makes OperatingSystemMXBean methods return container specific informationrather than the host based data. The webrev also takes into account the case when java.security.AccessControlException exception is thrown during the initialization of the container subsystem ( e.g. when java.policy doesn?t grant "read" access to "/proc/self/mountinfo" file). Instead of failing to access /proc/self/mountinfo, I expect this to wrap the call with doPrivileged so that it can report the metrics independent of the security policy. The jdk default security policy should grant proper permission to do so. CSR for the spec changes [3] is approved. Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed . [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ [2] Jira issue :https://bugs.openjdk.java.net/browse/JDK-8226575 [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java this should wrap the security-sensitive operations with doPrivileged. jdk.management is trusted and it has all permissions. src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c Formatting nit: line 346-355: JDK native source uses 4-space identation convention. A space is missing between "if" and "(". 
src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java 59 if (limit >= 0 && memLimit >= 0) { 60 return limit - memLimit; 61 } Under what circumstances is limit or memLimit < 0? It falls back to returning the system's total swap space size - this is not really what it should report. Is it worth specifying this case? Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad. getFreeSwapSpaceSize retries a few times. What is special about this method compared with others like getFreeMemorySize? src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java There is no strong need to make the deprecated methods default methods. If they were default methods, they would only need to be implemented once, as opposed to in all OS-specific implementations. CheckOperatingSystemMXBean.java System.out.println(String.format(...)) can simply be replaced with System.out.format. Mandy From daniil.x.titov at oracle.com Wed Dec 4 06:36:14 2019 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Tue, 03 Dec 2019 22:36:14 -0800 Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> Message-ID: Hi Mandy, I think in my previous reply I failed to answer one of the questions from your email. >> getFreeSwapSpaceSize retries a few times. What is special about this method compared with others like getFreeMemorySize? What is specific to the getFreeSwapSpaceSize method is that the MemoryAndSwapUsage and MemoryUsage metrics it reads are related (MemoryAndSwapUsage includes MemoryUsage) and they are not constant. Since these metrics are not read atomically, their values could change between the two reads. In contrast, some other metrics, such as MemoryLimit, are constant.
They are set when the container starts and are expected to keep the same value for as long as the JVM runs. The other methods don't use more than one such non-constant metric, so the only place where this potential issue with non-atomic reads could happen is the getFreeSwapSpaceSize method. Best regards, Daniil On 12/3/19, 7:34 PM, "Daniil Titov" wrote: Hi Mandy, Thank you for your comments, please find my answers below. >> src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java >> this should wrap the security-sensitive operations with doPrivileged. jdk.management is trusted and it has all permissions. I will include this change in the next webrev, thank you. >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c >> Formatting nit: lines 346-355: JDK native source uses a 4-space indentation convention. A space is missing between "if" and "(". I will correct this, thanks. >> Under what circumstances is limit or memLimit < 0? The memory limit metric is not available if the JVM runs on a Linux host (not in a Docker container) or if a Docker container was started without specifying a memory limit (without the '--memory=' Docker option). In the latter case there is no limit on how much memory the container can use, and it can use as much memory as the host's OS allows. >> Is it worth specifying this case? I believe yes, since it covers the cases when the JVM runs on a Linux host or a Docker container was started without a memory limit. >> It falls back to returning the system's total swap space size - this is not really what it should report. For the case when the JVM runs on a Linux host it is exactly what we want. The only problematic case is if the JVM runs inside a Docker container without a memory limit set. However, I am not sure how we could differentiate these two cases. >> Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad. For getTotalMemorySize I think we are good here.
If the limit is not set then all the memory the host's OS has is available. For getFreeMemorySize the problematic case is if the memory limit is set but the memory usage for some reason is not available (containerMetrics.getMemoryUsage() returns 0). Probably in this case we should just return -1, as getFreePhysicalMemorySize0() currently does if it cannot retrieve a valid result. For getCpuLoad() the problematic case is if CPU quotas are active but CpuPeriod, CpuNumPeriods, or getCpuUsage are unavailable, or if a valid CPU load for some CPU was not retrieved, or if all retrieved CPU load values happen to be zeros. Probably we should just return -1 in these cases rather than falling back to getSystemCpuLoad0(). >> src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java >> There is no strong need to make the deprecated methods default methods. If they were default methods, they would only need to be implemented once, as opposed to in all OS-specific implementations. I could make these methods default methods if you feel it is a better approach here. >> CheckOperatingSystemMXBean.java >> System.out.println(String.format(...)) can simply be replaced with System.out.format. I will include this change in the next webrev, thank you! Best regards, Daniil From: Mandy Chung Date: Tuesday, December 3, 2019 at 4:10 PM To: Daniil Titov Cc: OpenJDK Serviceability , "jmx-dev at openjdk.java.net" , Bob Vandette Subject: Re: RFR: 8226575: OperatingSystemMXBean should be made container aware On 12/3/19 11:42 AM, Daniil Titov wrote: Please review the change that makes OperatingSystemMXBean methods return container specific information rather than the host based data. The webrev also takes into account the case when java.security.AccessControlException is thrown during the initialization of the container subsystem (e.g. when java.policy doesn't grant "read" access to the "/proc/self/mountinfo" file).
Instead of failing to access /proc/self/mountinfo, I expect this to wrap the call with doPrivileged so that it can report the metrics independently of the security policy. The JDK default security policy should grant proper permission to do so. CSR for the spec changes [3] is approved. Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed. [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java this should wrap the security-sensitive operations with doPrivileged. jdk.management is trusted and it has all permissions. src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c Formatting nit: lines 346-355: JDK native source uses a 4-space indentation convention. A space is missing between "if" and "(". src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java 59 if (limit >= 0 && memLimit >= 0) { 60 return limit - memLimit; 61 } Under what circumstances is limit or memLimit < 0? It falls back to returning the system's total swap space size - this is not really what it should report. Is it worth specifying this case? Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad. getFreeSwapSpaceSize retries a few times. What is special about this method compared with others like getFreeMemorySize? src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java There is no strong need to make the deprecated methods default methods. If they were default methods, they would only need to be implemented once, as opposed to in all OS-specific implementations. CheckOperatingSystemMXBean.java System.out.println(String.format(...)) can simply be replaced with System.out.format.
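[Editor's illustration] The retry that Daniil describes for getFreeSwapSpaceSize, where two related, non-constant usage metrics are read non-atomically, might be sketched roughly as follows. This is only a sketch under stated assumptions, not the code from the webrev: the class name SwapRetrySketch, the method freeSwap, the MAX_ATTEMPTS bound, and the use of LongSupplier to stand in for the two container usage metrics are all hypothetical. Returning -1 when no consistent pair of readings is seen is also an assumption of this sketch; the thread itself discusses falling back to the host value.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch: retry when two related metrics are read non-atomically.
class SwapRetrySketch {
    // Hypothetical retry bound; the thread only says "a few times".
    static final int MAX_ATTEMPTS = 3;

    /**
     * Computes free swap as (swap limit) - (swap usage), where
     * swap limit = memAndSwapLimit - memLimit and
     * swap usage = memAndSwapUsage - memUsage.
     * Returns -1 if the limits are unavailable or if no consistent pair of
     * usage readings was observed (an assumption of this sketch).
     */
    static long freeSwap(long memAndSwapLimit, long memLimit,
                         LongSupplier memAndSwapUsage, LongSupplier memUsage) {
        if (memAndSwapLimit < 0 || memLimit < 0) {
            return -1; // limits are not set, nothing container-specific to report
        }
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            long both = memAndSwapUsage.getAsLong();
            long mem = memUsage.getAsLong();
            // The two reads are not atomic, so memory usage may have grown
            // between them and the difference can come out negative.
            long swapUsage = both - mem;
            if (swapUsage >= 0) {
                return (memAndSwapLimit - memLimit) - swapUsage;
            }
            // Inconsistent pair of readings: try again.
        }
        return -1;
    }

    public static void main(String[] args) {
        // Simulated readings: the first pair is inconsistent, the second is not.
        long[] bothReadings = {100, 160};
        long[] memReadings = {120, 130};
        int[] i = {0};
        int[] j = {0};
        long free = freeSwap(500, 300,
                () -> bothReadings[i[0]++],
                () -> memReadings[j[0]++]);
        System.out.println(free); // prints 170
    }
}
```

With the simulated readings in main, the first pair is rejected because 100 - 120 is negative; the second pair gives a swap usage of 30, so the result is (500 - 300) - 30 = 170. Metrics that stay constant for the life of the container, such as the limits, need no such loop.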
Mandy From serguei.spitsyn at oracle.com Wed Dec 4 10:14:26 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 4 Dec 2019 02:14:26 -0800 Subject: RFR (XS) 8235273: nmethodLocker not needed for COMPILED_METHOD_UNLOAD events In-Reply-To: <8530df0a-c67c-3159-4b96-8bde4e925434@oracle.com> References: <3630a545-2b2c-e17c-bbe5-98ae8508ae38@oracle.com> <8530df0a-c67c-3159-4b96-8bde4e925434@oracle.com> Message-ID: <7cf9f5a3-f8c5-1981-b7b9-b83b5cdf7fe9@oracle.com> Hi Coleen, +1 Thanks, Serguei On 12/3/19 20:49, David Holmes wrote: > Hi Coleen, > > That all seems fine to me. > > Thanks, > David > > On 4/12/2019 4:21 am, coleen.phillimore at oracle.com wrote: >> Summary: remove unnecessary nmethodLocker >> >> See bug for more details. Tested with tier2-8. >> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8235273.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8235273 >> >> (Note, this has a trivial merge with the change for JDK-8212160).
>> >> Thanks, >> Coleen From coleen.phillimore at oracle.com Wed Dec 4 12:21:49 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 4 Dec 2019 07:21:49 -0500 Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value" In-Reply-To: References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com> <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com> Message-ID: <00a78ad6-97ff-0465-a13c-0c51fff4764a@oracle.com> On 12/3/19 11:39 PM, David Holmes wrote: > > > On 3/12/2019 11:35 pm, coleen.phillimore at oracle.com wrote: >> >> >> On 12/3/19 8:31 AM, David Holmes wrote: >>> On 3/12/2019 11:08 pm, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> On 12/2/19 11:52 PM, David Holmes wrote: >>>>> Hi Coleen, >>>>> >>>>> On 3/12/2019 12:43 am, coleen.phillimore at oracle.com wrote: >>>>>> >>>>>> >>>>>> On 11/26/19 7:03 PM, David Holmes wrote: >>>>>>> (adding runtime as well) >>>>>>> >>>>>>> Hi Coleen, >>>>>>> >>>>>>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote: >>>>>>>> Summary: Add local deferred event list to thread to post events >>>>>>>> outside CodeCache_lock. >>>>>>>> >>>>>>>> This patch builds on the patch for JDK-8173361. With this >>>>>>>> patch, I made the JvmtiDeferredEventQueue an instance class >>>>>>>> (not AllStatic) and have one per thread. The CodeBlob event >>>>>>>> that used to drop the CodeCache_lock and raced with the sweeper >>>>>>>> thread, adds the events it wants to post to its thread local >>>>>>>> list, and processes it outside the lock. The list is walked in >>>>>>>> GC and by the sweeper to keep the nmethods from being unloaded >>>>>>>> and zombied, respectively. >>>>>>> >>>>>>> Sorry I don't understand why we would want/need a deferred event >>>>>>> queue for every JavaThread? Isn't this only relevant for >>>>>>> non-JavaThreads that need to have the ServiceThread process the >>>>>>> deferred event?
>>>>>> >>>>>> I thought I'd written this in the bug but I had only discussed >>>>>> this with Erik. I've added a comment to the bug to explain why I >>>>>> added the per-JavaThread queue. In order to process these events >>>>>> after the CodeCache_lock is dropped, I have to queue them >>>>>> somewhere safe. The ServiceThread queue is safe, *but* the >>>>>> ServiceThread can't keep up with the events, especially from this >>>>>> test case. So the test case gets a native OOM. >>>>>> >>>>>> So I've added the safe queue as a field to each JavaThread >>>>>> because multiple JavaThreads could be posting these events at the >>>>>> same time, and there didn't seem to be a better safe place to >>>>>> cache them, without adding another layer of queuing code. >>>>> >>>>> I think I'm getting the picture now. At the time the events are >>>>> generated we can't post them directly because the current thread >>>>> is inside compiler code. Hence the events must be deferred. Using >>>>> the ServiceThread to handle the deferred events is one way to deal >>>>> with this - but it can't keep up in this scenario. So instead we >>>>> store the events in the current thread and when the current thread >>>>> returns to code where it is safe to post the events, it does so >>>>> itself. Is that generally correct? >>>> >>>> Yes. >>>>> >>>>> I admit I'm not keen on adding this additional field per-thread >>>>> just for a temporary usage. Some kind of stack allocated helper >>>>> would be preferable, but would need to be passed through the call >>>>> chain so that the events could be added to it. >>>> >>>> Right, and the GC and nmethods_do have to find it somehow. It wasn't >>>> my first choice of where to put it also because there are too many >>>> things in JavaThread. Might be time for a future cleanup of Thread. >>> >>> I see. >>> >>>>> >>>>> Also I'm not clear why we aggressively delete the >>>>> _jvmti_event_queue after posting the events.
I'd be worried about >>>>> the overhead we are introducing for creating and deleting this >>>>> queue. When the JvmtiDeferredEventQueue data structure was >>>>> intended only for use by the ServiceThread its dynamic node >>>>> allocation may have made more sense. But now that seems like a >>>>> liability to me - if JvmtiDeferredEvents could be linked directly >>>>> we wouldn't need dynamic nodes, nor dynamic per-thread queues >>>>> (just a per-thread pointer). >>>> >>>> I'm not following. The queue is for multiple events that might be >>>> posted while in the CodeCache_lock, so they need to be in order and >>>> linked together. While we post them and take them off, if the >>>> callback safepoints (maybe calls back into the JVM), we don't want >>>> to have GC or nmethods_do walk the one that's been posted already. >>>> So a queue seems to make sense. >>> >>> Yes but you can make a queue just by having each event have a _next >>> pointer, rather than dynamically creating nodes to hold the event. >>> Each event is its own queue node implicitly. >>> >>>> One thing that I experimented with was to have the ServiceThread >>>> take ownership of the queue in its local thread queue and post >>>> them all, which could be a future enhancement. It didn't help my >>>> OOM situation. >>> >>> Your OOM situation seems to be a basic case of overwhelming the >>> ServiceThread. A single ServiceThread will always have a limit on >>> how many events it can handle. Maybe this test is being too >>> unrealistic in its expectations of the current design? >> >> I think the JVMTI API where you can generate a COMPILED_METHOD_LOAD >> for all the events in the queue is going to be overwhelming unless it >> waits for the events to be posted. > > Taking things off the service thread would seem to be a good thing > then :) > >>> >>>> Deleting the queue after all the events are posted allows >>>> JavaThread::oops_do and nmethods_do only a null check to deal with >>>> this jvmti wart.
>>> >>> If the nodes are not dynamically allocated you don't need to delete; >>> you just set the queue-head pointer to NULL - actually it will >>> already be NULL once the last event has been processed. >> >> I could revisit the data structure as a future RFE. The goal was to >> reuse code that's already there, and I don't think there's a >> significant difference in performance. I did some measurement of the >> stress case and the times were equivalent, actually better in the new >> code. > > Okay. Is this a code review then? I think Serguei promised to review the code too. thanks, Coleen > > Thanks, > David > >> >> Thanks, >> Coleen >>> >>> David >>> ----- >>> >>>> Thanks, >>>> Coleen >>>>> >>>>> Just some thoughts. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> I did write comments to this effect here: >>>>>> >>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>>> >>>>>>> >>>>>>> David >>>>>>> >>>>>>>> Also, the jmethod_id field in nmethod was only used as a >>>>>>>> boolean so don't create a jmethod_id until needed for >>>>>>>> post_compiled_method_unload. >>>>>>>> >>>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that >>>>>>>> crashed in the original bug report.
>>>>>>>> >>>>>>>> open webrev at >>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev >>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160 >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>> >>>> >> From coleen.phillimore at oracle.com Wed Dec 4 12:24:36 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 4 Dec 2019 07:24:36 -0500 Subject: RFR (XS) 8235273: nmethodLocker not needed for COMPILED_METHOD_UNLOAD events In-Reply-To: <7cf9f5a3-f8c5-1981-b7b9-b83b5cdf7fe9@oracle.com> References: <3630a545-2b2c-e17c-bbe5-98ae8508ae38@oracle.com> <8530df0a-c67c-3159-4b96-8bde4e925434@oracle.com> <7cf9f5a3-f8c5-1981-b7b9-b83b5cdf7fe9@oracle.com> Message-ID: <332991c8-acdb-9f7d-dc58-7f5e95b28770@oracle.com> Thanks David and Serguei! Coleen On 12/4/19 5:14 AM, serguei.spitsyn at oracle.com wrote: > Hi Coleen, > > +1 > > Thanks, > Serguei > > > On 12/3/19 20:49, David Holmes wrote: >> Hi Coleen, >> >> That all seems fine to me. >> >> Thanks, >> David >> >> On 4/12/2019 4:21 am, coleen.phillimore at oracle.com wrote: >>> Summary: remove unnecessary nmethodLocker >>> >>> See bug for more details. Tested with tier2-8. >>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/2019/8235273.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8235273 >>> >>> (Note, this has a trivial merge with the change for JDK-8212160). >>> >>> Thanks, >>> Coleen > From coleen.phillimore at oracle.com Wed Dec 4 13:00:53 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 4 Dec 2019 08:00:53 -0500 Subject: RFR (T) 8234355: Buffer overflow in jcmd GC.class_stats due to too many classes Message-ID: <082ecca7-39ff-e027-2090-c3331650cd07@oracle.com> Summary: Remove use of GC.class_stats in testing and failure analysis (plan to deprecate) See bug for more details.
open webrev at http://cr.openjdk.java.net/~coleenp/2019/8234355.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8234355 Ran tier8 overnight. Thanks, Coleen From david.holmes at oracle.com Wed Dec 4 13:06:16 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 4 Dec 2019 23:06:16 +1000 Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value" In-Reply-To: <00a78ad6-97ff-0465-a13c-0c51fff4764a@oracle.com> References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com> <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com> <00a78ad6-97ff-0465-a13c-0c51fff4764a@oracle.com> Message-ID: <7a6f73fd-ff7f-974d-d213-dc7ead799ead@oracle.com> On 4/12/2019 10:21 pm, coleen.phillimore at oracle.com wrote: > > > On 12/3/19 11:39 PM, David Holmes wrote: >> >> >> On 3/12/2019 11:35 pm, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 12/3/19 8:31 AM, David Holmes wrote: >>>> On 3/12/2019 11:08 pm, coleen.phillimore at oracle.com wrote: >>>>> >>>>> >>>>> On 12/2/19 11:52 PM, David Holmes wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> On 3/12/2019 12:43 am, coleen.phillimore at oracle.com wrote: >>>>>>> >>>>>>> >>>>>>> On 11/26/19 7:03 PM, David Holmes wrote: >>>>>>>> (adding runtime as well) >>>>>>>> >>>>>>>> Hi Coleen, >>>>>>>> >>>>>>>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote: >>>>>>>>> Summary: Add local deferred event list to thread to post events >>>>>>>>> outside CodeCache_lock. >>>>>>>>> >>>>>>>>> This patch builds on the patch for JDK-8173361. With this >>>>>>>>> patch, I made the JvmtiDeferredEventQueue an instance class >>>>>>>>> (not AllStatic) and have one per thread. The CodeBlob event >>>>>>>>> that used to drop the CodeCache_lock and raced with the sweeper >>>>>>>>> thread, adds the events it wants to post to its thread local >>>>>>>>> list, and processes it outside the lock.
The list is walked in >>>>>>>>> GC and by the sweeper to keep the nmethods from being unloaded >>>>>>>>> and zombied, respectively. >>>>>>>> >>>>>>>> Sorry I don't understand why we would want/need a deferred event >>>>>>>> queue for every JavaThread? Isn't this only relevant for >>>>>>>> non-JavaThreads that need to have the ServiceThread process the >>>>>>>> deferred event? >>>>>>> >>>>>>> I thought I'd written this in the bug but I had only discussed >>>>>>> this with Erik. I've added a comment to the bug to explain why I >>>>>>> added the per-JavaThread queue. In order to process these events >>>>>>> after the CodeCache_lock is dropped, I have to queue them >>>>>>> somewhere safe. The ServiceThread queue is safe, *but* the >>>>>>> ServiceThread can't keep up with the events, especially from this >>>>>>> test case. So the test case gets a native OOM. >>>>>>> >>>>>>> So I've added the safe queue as a field to each JavaThread >>>>>>> because multiple JavaThreads could be posting these events at the >>>>>>> same time, and there didn't seem to be a better safe place to >>>>>>> cache them, without adding another layer of queuing code. >>>>>> >>>>>> I think I'm getting the picture now. At the time the events are >>>>>> generated we can't post them directly because the current thread >>>>>> is inside compiler code. Hence the events must be deferred. Using >>>>>> the ServiceThread to handle the deferred events is one way to deal >>>>>> with this - but it can't keep up in this scenario. So instead we >>>>>> store the events in the current thread and when the current thread >>>>>> returns to code where it is safe to post the events, it does so >>>>>> itself. Is that generally correct? >>>>> >>>>> Yes. >>>>>> >>>>>> I admit I'm not keen on adding this additional field per-thread >>>>>> just for a temporary usage.
Some kind of stack allocated helper >>>>>> would be preferable, but would need to be passed through the call >>>>>> chain so that the events could be added to it. >>>>> >>>>> Right, and the GC and nmethods_do have to find it somehow. It wasn't >>>>> my first choice of where to put it also because there are too many >>>>> things in JavaThread. Might be time for a future cleanup of Thread. >>>> >>>> I see. >>>> >>>>>> >>>>>> Also I'm not clear why we aggressively delete the >>>>>> _jvmti_event_queue after posting the events. I'd be worried about >>>>>> the overhead we are introducing for creating and deleting this >>>>>> queue. When the JvmtiDeferredEventQueue data structure was >>>>>> intended only for use by the ServiceThread its dynamic node >>>>>> allocation may have made more sense. But now that seems like a >>>>>> liability to me - if JvmtiDeferredEvents could be linked directly >>>>>> we wouldn't need dynamic nodes, nor dynamic per-thread queues >>>>>> (just a per-thread pointer). >>>>> >>>>> I'm not following. The queue is for multiple events that might be >>>>> posted while in the CodeCache_lock, so they need to be in order and >>>>> linked together. While we post them and take them off, if the >>>>> callback safepoints (maybe calls back into the JVM), we don't want >>>>> to have GC or nmethods_do walk the one that's been posted already. >>>>> So a queue seems to make sense. >>>> >>>> Yes but you can make a queue just by having each event have a _next >>>> pointer, rather than dynamically creating nodes to hold the event. >>>> Each event is its own queue node implicitly. >>>> >>>>> One thing that I experimented with was to have the ServiceThread >>>>> take ownership of the queue in its local thread queue and post >>>>> them all, which could be a future enhancement. It didn't help my >>>>> OOM situation. >>>> >>>> Your OOM situation seems to be a basic case of overwhelming the >>>> ServiceThread.
A single ServiceThread will always have a limit on >>>> how many events it can handle. Maybe this test is being too >>>> unrealistic in its expectations of the current design? >>> >>> I think the JVMTI API where you can generate a COMPILED_METHOD_LOAD >>> for all the events in the queue is going to be overwhelming unless it >>> waits for the events to be posted. >> >> Taking things off the service thread would seem to be a good thing >> then :) >> >>>> >>>>> Deleting the queue after all the events are posted allows >>>>> JavaThread::oops_do and nmethods_do only a null check to deal with >>>>> this jvmti wart. >>>> >>>> If the nodes are not dynamically allocated you don't need to delete; >>>> you just set the queue-head pointer to NULL - actually it will >>>> already be NULL once the last event has been processed. >>> >>> I could revisit the data structure as a future RFE. The goal was to >>> reuse code that's already there, and I don't think there's a >>> significant difference in performance. I did some measurement of the >>> stress case and the times were equivalent, actually better in the new >>> code. >> >> Okay. > > Is this a code review then? I think Serguei promised to review the code > too. Yes this is a review. Thanks, David > thanks, > Coleen >> >> Thanks, >> David >> >>> >>> Thanks, >>> Coleen >>>> >>>> David >>>> ----- >>>> >>>>> Thanks, >>>>> Coleen >>>>>> >>>>>> Just some thoughts. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> I did write comments to this effect here: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>>> >>>>>>>> >>>>>>>> David >>>>>>>> >>>>>>>>> Also, the jmethod_id field in nmethod was only used as a >>>>>>>>> boolean so don't create a jmethod_id until needed for >>>>>>>>> post_compiled_method_unload.
>>>>>>>>> >>>>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that >>>>>>>>> crashed in the original bug report. >>>>>>>>> >>>>>>>>> open webrev at >>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev >>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160 >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Coleen >>>>>>> >>>>> >>> > From bob.vandette at oracle.com Wed Dec 4 13:32:05 2019 From: bob.vandette at oracle.com (Bob Vandette) Date: Wed, 4 Dec 2019 08:32:05 -0500 Subject: 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <48599FD5-583F-4CB7-80CA-09E91AA24D4A@oracle.com> Message-ID: <133A9A62-7D4E-4810-A9E4-B51A6D222585@oracle.com> > On Dec 3, 2019, at 9:00 PM, Daniil Titov wrote: > > Hi Bob, > >>> It's too bad getCpuLoad can't detect that the cpuset is identical to the host's in order to allow you to fall back to >>> getSystemCpuLoad0. > > I think we can detect that the cpuset is identical to the host's one by comparing the length of the array containerMetrics.getEffectiveCpuSetCpus() returns > to the number of the CPUs configured on the host and returned by sysconf(_SC_NPROCESSORS_CONF). The latter could be retrieved by adding a new native > method, getConfiguredCpuCount0, to OperatingSystemImpl. If they match then we just fall back to getSystemCpuLoad0(). I did some testing on a Linux host and > inside a Docker container with different '--cpuset-cpus' settings and it seems to work as expected. > > JNIEXPORT jint JNICALL > Java_com_sun_management_internal_OperatingSystemImpl_getConfiguredCpuCount0 > (JNIEnv *env, jobject mbean) > { > if(perfInit() == 0) { > return counters.nProcs; > } else { > return -1; > } > } > > > If there is no objection I will include this change in the new webrev. I don't think this approach will work.
Both the returned array and sysconf(_SC_NPROCESSORS_CONF) report the container's cpuset value, so they will be equal, causing you to always fall back. You can try to use containerMetrics.getPerCpuUsage() instead of containerMetrics.getEffectiveCpuSetCpus(). The length of the array returned is the number of host CPUs. Maybe Severin can confirm if this is true in cgroupv2 as well. Bob. > > Thank you, > Daniil > > On 12/3/19, 1:30 PM, "Bob Vandette" wrote: > > Daniil, > > Looks good to me. > > If there are any management jtreg tests, I'd run these since your changes to OperatingSystemMXBean will > alter the behavior of these methods even for Linux hosts since cgroups is typically enabled causing the > container detection to report containerized. > > It's too bad getCpuLoad can't detect that the cpuset is identical to the host's in order to allow you to fall back to > getSystemCpuLoad0. > > > Bob. > > >> On Dec 3, 2019, at 2:42 PM, Daniil Titov wrote: >> >> Please review the change that makes OperatingSystemMXBean methods return container specific information >> rather than the host based data. >> >> The webrev [1] is based on the code Andrew and Severin initially provided with some additional changes and combined >> with the spec update David made [3]. >> >> The webrev corrects the implementation for the free/total swap methods as Bob noted to subtract the total >> and free memory from the returned values. >> >> It also corrects getCpuLoad() implementation, as Bob advised, to cover the case when CPU quotas are not active. >> >> The webrev also takes into account the case when java.security.AccessControlException is thrown >> during the initialization of the container subsystem (e.g. when java.policy doesn't grant "read" access to the >> "/proc/self/mountinfo" file). >> >> CSR for the spec changes [3] is approved. >> >> Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed.
>> >> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ >> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 >> [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 >> >> Thank you, >> -Daniil >> >> > > > > From coleen.phillimore at oracle.com Wed Dec 4 13:45:12 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 4 Dec 2019 08:45:12 -0500 Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value" In-Reply-To: <7a6f73fd-ff7f-974d-d213-dc7ead799ead@oracle.com> References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com> <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com> <00a78ad6-97ff-0465-a13c-0c51fff4764a@oracle.com> <7a6f73fd-ff7f-974d-d213-dc7ead799ead@oracle.com> Message-ID: <2c8b8751-bea1-bfd6-10fa-9170adbb2dd3@oracle.com> Thanks, David! Coleen On 12/4/19 8:06 AM, David Holmes wrote: > > On 4/12/2019 10:21 pm, coleen.phillimore at oracle.com wrote: >> >> >> On 12/3/19 11:39 PM, David Holmes wrote: >>> >>> >>> On 3/12/2019 11:35 pm, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> On 12/3/19 8:31 AM, David Holmes wrote: >>>>> On 3/12/2019 11:08 pm, coleen.phillimore at oracle.com wrote: >>>>>> >>>>>> >>>>>> On 12/2/19 11:52 PM, David Holmes wrote: >>>>>>> Hi Coleen, >>>>>>> >>>>>>> On 3/12/2019 12:43 am, coleen.phillimore at oracle.com wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 11/26/19 7:03 PM, David Holmes wrote: >>>>>>>>> (adding runtime as well) >>>>>>>>> >>>>>>>>> Hi Coleen, >>>>>>>>> >>>>>>>>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote: >>>>>>>>>> Summary: Add local deferred event list to thread to post >>>>>>>>>> events outside CodeCache_lock. >>>>>>>>>> >>>>>>>>>> This patch builds on the patch for JDK-8173361. With this >>>>>>>>>> patch, I made the JvmtiDeferredEventQueue an instance class >>>>>>>>>> (not AllStatic) and have one per thread.
The CodeBlob event >>>>>>>>>> that used to drop the CodeCache_lock and raced with the >>>>>>>>>> sweeper thread, adds the events it wants to post to its >>>>>>>>>> thread local list, and processes it outside the lock. The >>>>>>>>>> list is walked in GC and by the sweeper to keep the nmethods >>>>>>>>>> from being unloaded and zombied, respectively. >>>>>>>>> >>>>>>>>> Sorry I don't understand why we would want/need a deferred >>>>>>>>> event queue for every JavaThread? Isn't this only relevant for >>>>>>>>> non-JavaThreads that need to have the ServiceThread process >>>>>>>>> the deferred event? >>>>>>>> >>>>>>>> I thought I'd written this in the bug but I had only discussed >>>>>>>> this with Erik. I've added a comment to the bug to explain why >>>>>>>> I added the per-JavaThread queue. In order to process these >>>>>>>> events after the CodeCache_lock is dropped, I have to queue >>>>>>>> them somewhere safe. The ServiceThread queue is safe, *but* the >>>>>>>> ServiceThread can't keep up with the events, especially from >>>>>>>> this test case. So the test case gets a native OOM. >>>>>>>> >>>>>>>> So I've added the safe queue as a field to each JavaThread >>>>>>>> because multiple JavaThreads could be posting these events at >>>>>>>> the same time, and there didn't seem to be a better safe place >>>>>>>> to cache them, without adding another layer of queuing code. >>>>>>> >>>>>>> I think I'm getting the picture now. At the time the events are >>>>>>> generated we can't post them directly because the current thread >>>>>>> is inside compiler code. Hence the events must be deferred. >>>>>>> Using the ServiceThread to handle the deferred events is one way >>>>>>> to deal with this - but it can't keep up in this scenario. So >>>>>>> instead we store the events in the current thread and when the >>>>>>> current thread returns to code where it is safe to post the >>>>>>> events, it does so itself. Is that generally correct? >>>>>> >>>>>> Yes.
>>>>>>> >>>>>>> I admit I'm not keen on adding this additional field per-thread >>>>>>> just for a temporary usage. Some kind of stack allocated helper >>>>>>> would be preferable, but would need to be passed through the >>>>>>> call chain so that the events could be added to it. >>>>>> >>>>>> Right, and the GC and nmethods_do has to find it somehow. It >>>>>> wasn't my first choice of where to put it also because there is >>>>>> too many things in JavaThread. Might be time for a future cleanup >>>>>> of Thread. >>>>> >>>>> I see. >>>>> >>>>>>> >>>>>>> Also I'm not clear why we aggressively delete the >>>>>>> _jvmti_event_queue after posting the events. I'd be worried >>>>>>> about the overhead we are introducing for creating and deleting >>>>>>> this queue. When the JvmtiDeferredEventQueue data structure was >>>>>>> intended only for use by the ServiceThread its dynamic node >>>>>>> allocation may have made more sense. But now that seems like a >>>>>>> liability to me - if JvmtiDeferredEvents could be linked >>>>>>> directly we wouldn't need dynamic nodes, nor dynamic per-thread >>>>>>> queues (just a per-thread pointer). >>>>>> >>>>>> I'm not following.? The queue is for multiple events that might >>>>>> be posted while in the CodeCache_lock, so they need to be in >>>>>> order and linked together.? While we post them and take them off, >>>>>> if the callback safepoints (maybe calls back into the JVM), we >>>>>> don't want to have GC or nmethods_do walk the one that's been >>>>>> posted already. So a queue seems to make sense. >>>>> >>>>> Yes but you can make a queue just by having each event have a >>>>> _next pointer, rather than dynamically creating nodes to hold the >>>>> event. Each event is its own queue node implicitly. >>>>> >>>>>> One thing that I experimented with was to have the ServiceThread >>>>>> take ownership of the queue in it's local thread queue and post >>>>>> them all, which could be a future enhancement.? 
It didn't help my >>>>>> OOM situation. >>>>> >>>>> Your OOM situation seems to be a basic case of overwhelming the >>>>> ServiceThread. A single serviceThread will always have a limit on >>>>> how many events it can handle. Maybe this test is being too >>>>> unrealistic in its expectations of the current design? >>>> >>>> I think the JVMTI API where you can generate an >>>> COMPILED_METHOD_LOAD for all the events in the queue is going to be >>>> overwhelming unless it waits for the events to be posted. >>> >>> Taking things off the service thread would seem to be a good thing >>> then :) >>> >>>>> >>>>>> Deleting the queue after all the events are posted allows >>>>>> JavaThread::oops_do and nmethods_do only a null check to deal >>>>>> with this jvmti wart. >>>>> >>>>> If the nodes are not dynamically allocated you don't need to >>>>> delete you just set the queue-head pointer to NULL - actually it >>>>> will already be NULL once the last event has been processed. >>>> >>>> I could revisit the data structure as a future RFE.? The goal was >>>> to reuse code that's already there, and I don't think there's a >>>> significant difference in performance.? I did some measurement of >>>> the stress case and the times were equivalent, actually better in >>>> the new code. >>> >>> Okay. >> >> Is this a code review then?? I think Serguei promised to review the >> code too. > > Yes this is a review. > > Thanks, > David > >> thanks, >> Coleen >>> >>> Thanks, >>> David >>> >>>> >>>> Thanks, >>>> Coleen >>>>> >>>>> David >>>>> ----- >>>>> >>>>>> Thanks, >>>>>> Coleen >>>>>>> >>>>>>> Just some thoughts. 
>>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> I did write comments to this effect here: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>>>> >>>>>>>>> >>>>>>>>> David >>>>>>>>> >>>>>>>>>> Also, the jmethod_id field in nmethod was only used as a >>>>>>>>>> boolean so don't create a jmethod_id until needed for >>>>>>>>>> post_compiled_method_unload. >>>>>>>>>> >>>>>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that >>>>>>>>>> crashed in the original bug report. >>>>>>>>>> >>>>>>>>>> open webrev at >>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev >>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160 >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Coleen >>>>>>>> >>>>>> >>>> >> From bob.vandette at oracle.com Wed Dec 4 14:13:51 2019 From: bob.vandette at oracle.com (Bob Vandette) Date: Wed, 4 Dec 2019 09:13:51 -0500 Subject: jmx-dev 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: <133A9A62-7D4E-4810-A9E4-B51A6D222585@oracle.com> References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <48599FD5-583F-4CB7-80CA-09E91AA24D4A@oracle.com> <133A9A62-7D4E-4810-A9E4-B51A6D222585@oracle.com> Message-ID: > On Dec 4, 2019, at 8:32 AM, Bob Vandette wrote: > > >> On Dec 3, 2019, at 9:00 PM, Daniil Titov wrote: >> >> Hi Bob, >> >>>> It?s too bad getCpuLoad can?t detect that the cpuset is identical to the hosts in order to allow you to fallback to >>>> getSystemCpuLoad0. >> >> I think we can detect that the cpuset is identical to the host's one by comparing the length of the array containerMetrics.getEffectiveCpuSetCpus() returns >> to the number of the CPUs configured on the host and returned by sysconf(_SC_NPROCESSORS_CONF) . The latter could be retrieved by adding new native >> method to OperatingSystemImpl getConfiguredCpuCount0. 
If they match then we just fallback to getSystemCpuLoad0(). I did some testing on Linux host and
>> inside Docker container with different '--cpuset-cpus' settings and it seems to work as expected.
>>
>> JNIEXPORT jint JNICALL
>> Java_com_sun_management_internal_OperatingSystemImpl_getConfiguredCpuCount0
>> (JNIEnv *env, jobject mbean)
>> {
>>     if(perfInit() == 0) {
>>         return counters.nProcs;
>>     } else {
>>         return -1;
>>     }
>> }
>>
>>
>> If there is no objection I will include this change in the new webrev.
>
> I don't think this approach will work. Both the array returned and the sysconf(_SC_NPROCESSORS_CONF)
> report the containers cpuset value so they will be equal causing you to always fallback.
>
> You can try to use containerMetrics.getPerCpuUsage() instead of containerMetrics.getEffectiveCpuSetCpus().
> The length of the array returned is the number of host cpus. Maybe Severin can confirm if this true in cgroupv2 as
> well.

I just checked the webrev for the cgroupv2 implementation and getPerCpuUsage is not supported.

I still think it's worth implementing this optimization but it won't be used on cgroupv2 since the array length (0) won't be equal to _SC_NPROCESSORS_CONF.

Here's the cgroupv2 implementation of this method.

    @Override
    public long[] getPerCpuUsage() {
        // Not supported
        return new long[0];
    }

Bob.

>
> Bob.
>
>
>>
>> Thank you,
>> Daniil
>>
>> On 12/3/19, 1:30 PM, "Bob Vandette" wrote:
>>
>> Daniil,
>>
>> Looks good to me.
>>
>> If there are any management jtreg tests, I'd run these since your changes to OperatingSystemMXBean will
>> alter the behavior of these methods even for Linux hosts since cgroups is typically enabled causing the
>> container detection to report containerized.
>>
>> It's too bad getCpuLoad can't detect that the cpuset is identical to the hosts in order to allow you to fallback to
>> getSystemCpuLoad0.
>>
>>
>> Bob.
>> >> >>> On Dec 3, 2019, at 2:42 PM, Daniil Titov wrote: >>> >>> Please review the change that makes OperatingSystemMXBean methods return container specific information >>> rather than the host based data. >>> >>> The webrev [1] is based on the code Andrew and Severin initially provided with some additional changes and combined >>> with the spec update David made [3]. >>> >>> The webrev corrects the implementation for the free/total swap methods as Bob noted to subtract the total >>> and free memory from the returned values. >>> >>> It also corrects getCpuLoad() implementation, as Bob advised, to cover the case when CPU quotas are not active. >>> >>> The webrev also takes into account the case when java.security.AccessControlException exception is thrown >>> during the initialization of the container subsystem ( e.g. when java.policy doesn?t grant "read" access to >>> "/proc/self/mountinfo" file). >>> >>> CSR for the spec changes [3] is approved. >>> >>> Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed . >>> >>> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ >>> [2] Jira issue :https://bugs.openjdk.java.net/browse/JDK-8226575 >>> [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 >>> >>> Thank you, >>> -Daniil >>> >>> >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.daugherty at oracle.com Wed Dec 4 15:46:49 2019 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty)
Date: Wed, 4 Dec 2019 10:46:49 -0500
Subject: RFR (T) 8234355: Buffer overflow in jcmd GC.class_stats due to too many classes
In-Reply-To: <082ecca7-39ff-e027-2090-c3331650cd07@oracle.com>
References: <082ecca7-39ff-e027-2090-c3331650cd07@oracle.com>
Message-ID: <1a6df0bc-25fe-3895-2547-8a858fee18e4@oracle.com>

On 12/4/19 8:00 AM, coleen.phillimore at oracle.com wrote:
> Summary: Remove use of GC.class_stats in testing and failure analysis
> (plan to deprecate)
>
> See bug for more details.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8234355.01/webrev

test/failure_handler/src/share/conf/common.properties
    No comments.

Thumbs up. I agree that this change is trivial.

Dan

> bug link https://bugs.openjdk.java.net/browse/JDK-8234355
>
> Ran tier8 overnight.
>
> Thanks,
> Coleen

From coleen.phillimore at oracle.com Wed Dec 4 16:05:14 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 4 Dec 2019 11:05:14 -0500
Subject: RFR (T) 8234355: Buffer overflow in jcmd GC.class_stats due to too many classes
In-Reply-To: <1a6df0bc-25fe-3895-2547-8a858fee18e4@oracle.com>
References: <082ecca7-39ff-e027-2090-c3331650cd07@oracle.com> <1a6df0bc-25fe-3895-2547-8a858fee18e4@oracle.com>
Message-ID: <4ca6c9ce-9a3b-02ed-f2a9-bb37fcc562ee@oracle.com>

Thanks, Dan!
Coleen

On 12/4/19 10:46 AM, Daniel D. Daugherty wrote:
> On 12/4/19 8:00 AM, coleen.phillimore at oracle.com wrote:
>> Summary: Remove use of GC.class_stats in testing and failure analysis
>> (plan to deprecate)
>>
>> See bug for more details.
>>
>> open webrev at
>> http://cr.openjdk.java.net/~coleenp/2019/8234355.01/webrev
>
> test/failure_handler/src/share/conf/common.properties
>     No comments.
>
> Thumbs up. I agree that this change is trivial.
>
> Dan
>
>
>> bug link https://bugs.openjdk.java.net/browse/JDK-8234355
>>
>> Ran tier8 overnight.
>> >> Thanks, >> Coleen > From sgehwolf at redhat.com Wed Dec 4 18:22:30 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Wed, 04 Dec 2019 19:22:30 +0100 Subject: 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: <133A9A62-7D4E-4810-A9E4-B51A6D222585@oracle.com> References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <48599FD5-583F-4CB7-80CA-09E91AA24D4A@oracle.com> <133A9A62-7D4E-4810-A9E4-B51A6D222585@oracle.com> Message-ID: <2a47a93f7ee2f479a73b1e7a3dd829a244358095.camel@redhat.com> On Wed, 2019-12-04 at 08:32 -0500, Bob Vandette wrote: > You can try to use containerMetrics.getPerCpuUsage() instead of containerMetrics.getEffectiveCpuSetCpus(). > The length of the array returned is the number of host cpus. Maybe Severin can confirm if this true in cgroupv2 as > well. If I'm not mistaken getPerCpuUsage() is not supported in cgroupv2. Thanks, Severin From serguei.spitsyn at oracle.com Wed Dec 4 19:17:09 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 4 Dec 2019 11:17:09 -0800 Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value" In-Reply-To: <00a78ad6-97ff-0465-a13c-0c51fff4764a@oracle.com> References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com> <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com> <00a78ad6-97ff-0465-a13c-0c51fff4764a@oracle.com> Message-ID: On 12/4/19 04:21, coleen.phillimore at oracle.com wrote: > > > On 12/3/19 11:39 PM, David Holmes wrote: >> >> >> On 3/12/2019 11:35 pm, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 12/3/19 8:31 AM, David Holmes wrote: >>>> On 3/12/2019 11:08 pm, coleen.phillimore at oracle.com wrote: >>>>> >>>>> >>>>> On 12/2/19 11:52 PM, David Holmes wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> On 3/12/2019 12:43 am, coleen.phillimore at oracle.com wrote: >>>>>>> >>>>>>> >>>>>>> On 11/26/19 7:03 PM, David Holmes wrote: >>>>>>>> (adding runtime as 
well) >>>>>>>> >>>>>>>> Hi Coleen, >>>>>>>> >>>>>>>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote: >>>>>>>>> Summary: Add local deferred event list to thread to post >>>>>>>>> events outside CodeCache_lock. >>>>>>>>> >>>>>>>>> This patch builds on the patch for JDK-8173361. With this >>>>>>>>> patch, I made the JvmtiDeferredEventQueue an instance class >>>>>>>>> (not AllStatic) and have one per thread. The CodeBlob event >>>>>>>>> that used to drop the CodeCache_lock and raced with the >>>>>>>>> sweeper thread, adds the events it wants to post to its thread >>>>>>>>> local list, and processes it outside the lock.? The list is >>>>>>>>> walked in GC and by the sweeper to keep the nmethods from >>>>>>>>> being unloaded and zombied, respectively. >>>>>>>> >>>>>>>> Sorry I don't understand why we would want/need a deferred >>>>>>>> event queue for every JavaThread? Isn't this only relevant for >>>>>>>> non-JavaThreads that need to have the ServiceThread process the >>>>>>>> deferred event? >>>>>>> >>>>>>> I thought I'd written this in the bug but I had only discussed >>>>>>> this with Erik.? I've added a comment to the bug to explain why >>>>>>> I added the per-JavaThread queue. In order to process these >>>>>>> events after the CodeCache_lock is dropped, I have to queue them >>>>>>> somewhere safe. The ServiceThread queue is safe, *but* the >>>>>>> ServiceThread can't keep up with the events, especially from >>>>>>> this test case.? So the test case gets a native OOM. >>>>>>> >>>>>>> So I've added the safe queue as a field to each JavaThread >>>>>>> because multiple JavaThreads could be posting these events at >>>>>>> the same time, and there didn't seem to be a better safe place >>>>>>> to cache them, without adding another layer of queuing code. >>>>>> >>>>>> I think I'm getting the picture now. At the time the events are >>>>>> generated we can't post them directly because the current thread >>>>>> is inside compiler code. 
Hence the events must be deferred. Using >>>>>> the ServiceThread to handle the deferred events is one way to >>>>>> deal with this - but it can't keep up in this scenario. So >>>>>> instead we store the events in the current thread and when the >>>>>> current thread returns to code where it is safe to post the >>>>>> events, it does so itself. Is that generally correct? >>>>> >>>>> Yes. >>>>>> >>>>>> I admit I'm not keen on adding this additional field per-thread >>>>>> just for a temporary usage. Some kind of stack allocated helper >>>>>> would be preferable, but would need to be passed through the call >>>>>> chain so that the events could be added to it. >>>>> >>>>> Right, and the GC and nmethods_do has to find it somehow. It >>>>> wasn't my first choice of where to put it also because there is >>>>> too many things in JavaThread.? Might be time for a future cleanup >>>>> of Thread. >>>> >>>> I see. >>>> >>>>>> >>>>>> Also I'm not clear why we aggressively delete the >>>>>> _jvmti_event_queue after posting the events. I'd be worried about >>>>>> the overhead we are introducing for creating and deleting this >>>>>> queue. When the JvmtiDeferredEventQueue data structure was >>>>>> intended only for use by the ServiceThread its dynamic node >>>>>> allocation may have made more sense. But now that seems like a >>>>>> liability to me - if JvmtiDeferredEvents could be linked directly >>>>>> we wouldn't need dynamic nodes, nor dynamic per-thread queues >>>>>> (just a per-thread pointer). >>>>> >>>>> I'm not following.? The queue is for multiple events that might be >>>>> posted while in the CodeCache_lock, so they need to be in order >>>>> and linked together.? While we post them and take them off, if the >>>>> callback safepoints (maybe calls back into the JVM), we don't want >>>>> to have GC or nmethods_do walk the one that's been posted already. >>>>> So a queue seems to make sense. 
>>>> >>>> Yes but you can make a queue just by having each event have a _next >>>> pointer, rather than dynamically creating nodes to hold the event. >>>> Each event is its own queue node implicitly. >>>> >>>>> One thing that I experimented with was to have the ServiceThread >>>>> take ownership of the queue in it's local thread queue and post >>>>> them all, which could be a future enhancement.? It didn't help my >>>>> OOM situation. >>>> >>>> Your OOM situation seems to be a basic case of overwhelming the >>>> ServiceThread. A single serviceThread will always have a limit on >>>> how many events it can handle. Maybe this test is being too >>>> unrealistic in its expectations of the current design? >>> >>> I think the JVMTI API where you can generate an COMPILED_METHOD_LOAD >>> for all the events in the queue is going to be overwhelming unless >>> it waits for the events to be posted. >> >> Taking things off the service thread would seem to be a good thing >> then :) >> >>>> >>>>> Deleting the queue after all the events are posted allows >>>>> JavaThread::oops_do and nmethods_do only a null check to deal with >>>>> this jvmti wart. >>>> >>>> If the nodes are not dynamically allocated you don't need to delete >>>> you just set the queue-head pointer to NULL - actually it will >>>> already be NULL once the last event has been processed. >>> >>> I could revisit the data structure as a future RFE.? The goal was to >>> reuse code that's already there, and I don't think there's a >>> significant difference in performance.? I did some measurement of >>> the stress case and the times were equivalent, actually better in >>> the new code. >> >> Okay. > > Is this a code review then?? I think Serguei promised to review the > code too. Yes, I'm close to send my review soon. Sorry for the latency. Thanks, Serguei > > thanks, > Coleen >> >> Thanks, >> David >> >>> >>> Thanks, >>> Coleen >>>> >>>> David >>>> ----- >>>> >>>>> Thanks, >>>>> Coleen >>>>>> >>>>>> Just some thoughts. 
>>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> I did write comments to this effect here: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>>> >>>>>>>> >>>>>>>> David >>>>>>>> >>>>>>>>> Also, the jmethod_id field in nmethod was only used as a >>>>>>>>> boolean so don't create a jmethod_id until needed for >>>>>>>>> post_compiled_method_unload. >>>>>>>>> >>>>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that >>>>>>>>> crashed in the original bug report. >>>>>>>>> >>>>>>>>> open webrev at >>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev >>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160 >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Coleen >>>>>>> >>>>> >>> > From igor.ignatyev at oracle.com Wed Dec 4 19:52:27 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 4 Dec 2019 11:52:27 -0800 Subject: RFR(T) : 8235353 : clean up hotspot problem lists Message-ID: http://cr.openjdk.java.net/~iignatyev//8235353/webrev.00 > 9 lines changed: 0 ins; 0 del; 9 mod; Hi all, could you please review this small and trivial cleanup which returns serviceablility/sa tests back to execution on linux-ppc64. the tests were problem listed due to 8211767[1], which is closed as a dup of resolved 8228649[2]. [1] https://bugs.openjdk.java.net/browse/JDK-8211767 [2] https://bugs.openjdk.java.net/browse/JDK-8228649 Thanks, -- Igor From vladimir.kozlov at oracle.com Wed Dec 4 20:01:21 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 4 Dec 2019 12:01:21 -0800 Subject: RFR(T) : 8235353 : clean up hotspot problem lists In-Reply-To: References: Message-ID: I am fine with changes but we need to ask PPC64 supporter to verify that tests passed now. 
Thanks, Vladimir K On 12/4/19 11:52 AM, Igor Ignatyev wrote: > http://cr.openjdk.java.net/~iignatyev//8235353/webrev.00 >> 9 lines changed: 0 ins; 0 del; 9 mod; > > Hi all, > > could you please review this small and trivial cleanup which returns serviceablility/sa tests back to execution on linux-ppc64. the tests were problem listed due to 8211767[1], which is closed as a dup of resolved 8228649[2]. > > [1] https://bugs.openjdk.java.net/browse/JDK-8211767 > [2] https://bugs.openjdk.java.net/browse/JDK-8228649 > > Thanks, > -- Igor > From serguei.spitsyn at oracle.com Wed Dec 4 22:15:04 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 4 Dec 2019 14:15:04 -0800 Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value" In-Reply-To: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> Message-ID: <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com> An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Wed Dec 4 22:16:28 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 4 Dec 2019 14:16:28 -0800 Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value" In-Reply-To: <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com> References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com> Message-ID: An HTML attachment was scrubbed... 
URL:

From coleen.phillimore at oracle.com Wed Dec 4 23:06:44 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 4 Dec 2019 18:06:44 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value"
In-Reply-To: <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
Message-ID: <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>

Hi Serguei,

On 12/4/19 5:15 PM, serguei.spitsyn at oracle.com wrote:
> Hi Collen,

(no problem)

>
> It looks good in general.
> Thank you a lot for sorting this out!
>
> Just a couple of comments.
>
>
> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/runtime/thread.hpp.frames.html
> 1993 protected:
> 1994 // Jvmti Events that cannot be posted in their current context.
> 1995 // ServiceThread uses this to collect deferred events from NonJava threads
> 1996 // that cannot post events.
> 1997 JvmtiDeferredEventQueue* _jvmti_event_queue;
>
> As David I also have a concern about footprint of having the
> _jvmti_event_queue field in the Thread class.
> I'm thinking if it'd be better to move this field into the
> JvmtiThreadState class.
> Please, see jvmti_thread_state() and
> JvmtiThreadState::state_for(JavaThread *thread).

The reason I have it directly in JavaThread is so that the GC oops_do and nmethods_do code can find it easily.  I like your idea of hiding it in jvmti but this doesn't seem good to have this code know about jvmtiThreadState, which seems to be a queue of Jvmti states.  I also don't want to have jvmtiThreadState to have to add an oops_do() or nmethods_do() either.
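
The arrangement described above can be sketched roughly as follows. This is a minimal, self-contained illustration under simplifying assumptions and is not the code in the webrev: `CompiledMethod`, `DeferredEvent`, `DeferredEventQueue` and `ThreadLike` are hypothetical stand-ins for nmethod, JvmtiDeferredEvent, JvmtiDeferredEventQueue and JavaThread, and the visitor callback stands in for whatever GC and the sweeper would do with each nmethod.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <vector>

// Hypothetical stand-ins for illustration only; none of these are HotSpot types.
struct CompiledMethod {   // plays the role of an nmethod
  int id;
  bool kept_alive;        // set when a GC/sweeper-style walk visits it
};

struct DeferredEvent {    // plays the role of a JvmtiDeferredEvent
  CompiledMethod* method;
};

class DeferredEventQueue {
  std::deque<DeferredEvent> _events;  // FIFO, so posting order matches enqueue order
 public:
  void enqueue(DeferredEvent e) { _events.push_back(e); }
  bool has_events() const { return !_events.empty(); }
  DeferredEvent dequeue() {
    DeferredEvent e = _events.front();
    _events.pop_front();
    return e;
  }
  // Visit every method still referenced by a pending (not yet posted) event.
  template <typename Visitor>
  void methods_do(Visitor visit) {
    for (DeferredEvent& e : _events) visit(e.method);
  }
};

struct ThreadLike {
  std::unique_ptr<DeferredEventQueue> event_queue;  // null unless events are pending

  // Called while the (imaginary) code cache lock is held: only record the event.
  void enqueue_deferred(DeferredEvent e) {
    if (!event_queue) event_queue.reset(new DeferredEventQueue());
    event_queue->enqueue(e);
  }

  // What a GC or sweeper walk would call; costs only a null check when idle.
  template <typename Visitor>
  void methods_do(Visitor visit) {
    if (event_queue) event_queue->methods_do(visit);
  }

  // Called after the lock is dropped. Each event is taken off the queue before
  // its callback runs, so a walk triggered by the callback no longer sees it.
  template <typename Poster>
  void post_deferred(Poster post) {
    if (!event_queue) return;
    while (event_queue->has_events()) {
      DeferredEvent e = event_queue->dequeue();
      post(e);
    }
    event_queue.reset();  // back to null so later walks are a plain null check
  }
};
```

In this sketch the owning thread is the only one that enqueues and posts, so no locking is shown; the real code also has to coordinate with safepoints, which is outside what the sketch models.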
>
>
> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html
> 973 void JvmtiDeferredEvent::post(JvmtiEnv* env) {
> 974   assert(_type == TYPE_COMPILED_METHOD_LOAD, "only user of this method");
> 975   nmethod* nm = _event_data.compiled_method_load;
> 976   JvmtiExport::post_compiled_method_load(env, nm);
> 977 }
>
> The JvmtiDeferredEvent::post name looks too generic as it posts
> compiled load events only.
> Do you consider this function extended in the future to support more
> event types?
>

I don't envision an extension for this function but I do for JvmtiDeferredEventQueue::post().  I have a small enhancement that would handoff the entire queue to the ServiceThread and have it call post() to post all the events rather than one at a time.

So I'll rename this one post_compiled_method_load_event() and leave the other post() as is for now.

open webrev at http://cr.openjdk.java.net/~coleenp/2019/8212160.02.incr/webrev

Thanks,
Coleen

>
> Thanks,
> Serguei
>
>
> On 11/26/19 06:22, coleen.phillimore at oracle.com wrote:
>> Summary: Add local deferred event list to thread to post events
>> outside CodeCache_lock.
>>
>> This patch builds on the patch for JDK-8173361.  With this patch, I
>> made the JvmtiDeferredEventQueue an instance class (not AllStatic)
>> and have one per thread.  The CodeBlob event that used to drop the
>> CodeCache_lock and raced with the sweeper thread, adds the events it
>> wants to post to its thread local list, and processes it outside the
>> lock.  The list is walked in GC and by the sweeper to keep the
>> nmethods from being unloaded and zombied, respectively.
>>
>> Also, the jmethod_id field in nmethod was only used as a boolean so
>> don't create a jmethod_id until needed for post_compiled_method_unload.
>>
>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed in
>> the original bug report.
>>
>> open webrev at
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>
>> Thanks,
>> Coleen
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From serguei.spitsyn at oracle.com Wed Dec 4 23:27:51 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 4 Dec 2019 15:27:51 -0800
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value"
In-Reply-To: <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com> <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
Message-ID: <7d51c3d1-a963-48b3-961f-d119ea9058d1@oracle.com>

An HTML attachment was scrubbed...
URL:

From mandy.chung at oracle.com Thu Dec 5 00:09:45 2019
From: mandy.chung at oracle.com (Mandy Chung)
Date: Wed, 4 Dec 2019 16:09:45 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
Message-ID:

On 12/3/19 9:40 PM, Daniil Titov wrote:
>
>>> Under what circumstance that limit or memLimit is < 0?
> The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container) or if a docker container was started without
> specifying a memory limit ( without '--memory=' Docker option) . In latter there is no limit on how much memory the container can use and
> it can use as much memory as the host's OS allows.
>

OK.  Please add a comment to the code.

It may be worth considering adding Metrics::getSwapLimit and Metrics::getSwapUsage and moving the computation to the implementation of Metrics.  Bob may have an opinion.
Also it seems correct for the memory related methods to check if (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).  BTW what does it mean if limit == 0?

>>> Is it worth specifying this case?
> I believe yes, since it covers the cases when JVM runs on a Linux host or a docker container was started without memory limitation.
>

I was wondering if the javadoc should specify that.

>>> It fallbacks to return the system's total swap space size - this is not really what it should report.
> For the case when JVM runs on a Linux host it is exactly what we want. The only problematic case is if JVM runs inside a docker container without a memory limit set.
> However, I am not sure how we could differentiate these 2 cases.

As this is the case when the limit is not set in the container, it returns the system metrics which sounds appropriate.

>
>>> Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad.
> For getTotalMemorySize I think we are good here. If limit is not set then all memory the host's OS have is available.
> For getFreeMemorySize the problematic case is if the memory limit is set but the memory usage for some reason is not available (containerMetrics.getMemoryUsage() returns 0).

Will zero memory usage happen?

> Probably in this case we should just return -1 as currently getFreePhysicalMemorySize0() does if it cannot retrieve a valid result.
>
> For getCpuLoad() the problematic case is if CPU quotas are active but CpuPeriod, CpuNumPeriods, or getCpuUsage are unavailable or if a valid CPU load for some CPU was
> not retrieved, or if all retrieved CPU load values happen to be zeros. Probably we should just return -1 in these cases rather than falling back to getSystemCpuLoad0()
>

returning -1 sounds right.

>>> src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java
>>> There is no strong need to make the deprecated methods as default methods.
If they were default methods, they only need to be implemented once as opposed to in all OS-specific implementations. > I could make these methods defaults if you feel it is a better approach here. > > It's not strictly needed but I can go either way. >>> CheckOperatingSystemMXBean.java >>> System.out.println(String.format(...)) can simply be replaced with System.out.format. > I will include this change in the next webrev, thank you! > > thanks Mandy From daniel.daugherty at oracle.com Thu Dec 5 00:40:05 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 4 Dec 2019 19:40:05 -0500 Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value" In-Reply-To: <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com> References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com> <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com> Message-ID: <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com> Generally speaking, JVM/TI related things should be in JvmtiThreadState instead of directly in the Thread class. That way the extra space is only consumed when JVM/TI is in use and only when a Thread does something that requires a JvmtiThreadState to be created. Please reconsider moving _jvmti_event_queue. Dan On 12/4/19 6:06 PM, coleen.phillimore at oracle.com wrote: > > Hi Serguei, > > On 12/4/19 5:15 PM, serguei.spitsyn at oracle.com wrote: >> Hi Collen, (no problem) >> >> It looks good in general. >> Thank you a lot for sorting this out! >> >> Just a couple of comments. >> >> >> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/runtime/thread.hpp.frames.html >> 1993 protected: >> 1994 // Jvmti Events that cannot be posted in their current context. >> 1995 // ServiceThread uses this to collect deferred events from >> NonJava threads >> 1996 // that cannot post events. 
>> 1997 JvmtiDeferredEventQueue* _jvmti_event_queue; >> >> As David I also have a concern about footprint of having the >> _jvmti_event_queue field in the Thread class. >> I'm thinking if it'd be better to move this field into the >> JvmtiThreadState class. >> Please, see jvmti_thread_state() and >> JvmtiThreadState::state_for(JavaThread *thread). > > The reason I have it directly in JavaThread is so that the GC oops_do > and nmethods_do code can find it easily.? I like your idea of hiding > it in jvmti but this doesn't seem good to have this code know about > jvmtiThreadState, which seems to be a queue of Jvmti states.? I also > don't want to have jvmtiThreadState to have to add an oops_do() or > nmethods_do() either. > >> >> >> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html >> 973 void JvmtiDeferredEvent::post(JvmtiEnv* env) { >> 974 assert(_type == TYPE_COMPILED_METHOD_LOAD, "only user of this >> method"); >> 975 nmethod* nm = _event_data.compiled_method_load; >> 976 JvmtiExport::post_compiled_method_load(env, nm); >> 977 } >> >> The JvmtiDeferredEvent::post name looks too generic as it posts >> compiled load events only. >> Do you consider this function extended in the future to support more >> event types? >> > > I don't envision an extension for this function but I do for > JvmtiDeferredEventQueue::post().? I have a small enhancement that > would handoff the entire queue to the ServiceThread and have it call > post() to post all the events rather than one at a time. > > So I'll rename this one post_compiled_method_load_event() and leave > the other post() as is for now. > > open webrev at > http://cr.openjdk.java.net/~coleenp/2019/8212160.02.incr/webrev > > Thanks, > Coleen > > >> >> Thanks, >> Serguei >> >> >> On 11/26/19 06:22, coleen.phillimore at oracle.com wrote: >>> Summary: Add local deferred event list to thread to post events >>> outside CodeCache_lock. 
>>> >>> This patch builds on the patch for JDK-8173361.? With this patch, I >>> made the JvmtiDeferredEventQueue an instance class (not AllStatic) >>> and have one per thread.? The CodeBlob event that used to drop the >>> CodeCache_lock and raced with the sweeper thread, adds the events it >>> wants to post to its thread local list, and processes it outside the >>> lock.? The list is walked in GC and by the sweeper to keep the >>> nmethods from being unloaded and zombied, respectively. >>> >>> Also, the jmethod_id field in nmethod was only used as a boolean so >>> don't create a jmethod_id until needed for post_compiled_method_unload. >>> >>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed >>> in the original bug report. >>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160 >>> >>> Thanks, >>> Coleen >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Thu Dec 5 01:39:14 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 4 Dec 2019 17:39:14 -0800 Subject: RFR(S): 8234277:ClhsdbLauncher should enable verbose exceptions and do a better job of detecting SA failures In-Reply-To: <8a972120-8ba3-a35e-b73f-e3d5faf68ce6@oracle.com> References: <8a972120-8ba3-a35e-b73f-e3d5faf68ce6@oracle.com> Message-ID: Can I get one more review please? thanks, Chris On 12/3/19 1:10 PM, serguei.spitsyn at oracle.com wrote: > Hi Chris, > > It looks good. > > Thanks, > Serguei > > On 12/3/19 12:45 PM, Chris Plummer wrote: >> Hello, >> >> Please review the following: >> >> https://bugs.openjdk.java.net/browse/JDK-8234277 >> http://cr.openjdk.java.net/~cjplummer/8234277/webrev.00/ >> >> No longer redirect stderr for the jhsdb/clhsdb process. It results in >> not seeing attach failures in the output, so OutputAnalyer can't >> check for them. 
>> >> Execute "verbose true" as the first clhsdb command after launching. >> This will result in verboseExceptions being true in >> CommandProcessor.java, so full exception traces will appear in the >> output. This will make debugging future SA test failures a lot easier. >> >> Add an extra check for any DebuggerException. This is mainly for >> detecting that the attached failed. This previously was going >> un-noticed, and instead the test would later fail because it noticed >> some other issue, like missing output, which isn't very informative. >> >> Add checks for other unexpected SA exceptions that are caught and >> printed by CommandProcessor. These will always have an "Error: " >> prefix, making them easy to detect. >> >> Problem list ClhsdbScanOops.java. With the new error checking, it >> will now always fail on windows due to JDK-8230731 and on macos and >> linux due to JDK-8235220. These failures are not "new" per se, but >> are just now being properly detected. >> >> thanks, >> >> Chris > From igor.ignatyev at oracle.com Thu Dec 5 02:08:23 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 4 Dec 2019 18:08:23 -0800 Subject: RFR(T) : 8235353 : clean up hotspot problem lists In-Reply-To: References: Message-ID: <9405A3B3-8045-4863-9BD4-FD293E2C62F9@oracle.com> Martin, Goetz. could you please check that these 9 tests still pass on PPC? -- Igor > On Dec 4, 2019, at 12:01 PM, Vladimir Kozlov wrote: > > I am fine with changes but we need to ask PPC64 supporter to verify that tests passed now. > > Thanks, > Vladimir K > > On 12/4/19 11:52 AM, Igor Ignatyev wrote: >> http://cr.openjdk.java.net/~iignatyev//8235353/webrev.00 >>> 9 lines changed: 0 ins; 0 del; 9 mod; >> Hi all, >> could you please review this small and trivial cleanup which returns serviceablility/sa tests back to execution on linux-ppc64. the tests were problem listed due to 8211767[1], which is closed as a dup of resolved 8228649[2]. 
>> [1] https://bugs.openjdk.java.net/browse/JDK-8211767 >> [2] https://bugs.openjdk.java.net/browse/JDK-8228649 >> Thanks, >> -- Igor From suenaga at oss.nttdata.com Thu Dec 5 02:15:46 2019 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Thu, 5 Dec 2019 11:15:46 +0900 Subject: RFR(S): 8234277:ClhsdbLauncher should enable verbose exceptions and do a better job of detecting SA failures In-Reply-To: References: <8a972120-8ba3-a35e-b73f-e3d5faf68ce6@oracle.com> Message-ID: <862c4eab-86cf-5578-dcf8-a6e4b6a995b6@oss.nttdata.com> Looks good. Yasumasa On 2019/12/05 10:39, Chris Plummer wrote: > Can I get one more review please? > > thanks, > > Chris > > On 12/3/19 1:10 PM, serguei.spitsyn at oracle.com wrote: >> Hi Chris, >> >> It looks good. >> >> Thanks, >> Serguei >> >> On 12/3/19 12:45 PM, Chris Plummer wrote: >>> Hello, >>> >>> Please review the following: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8234277 >>> http://cr.openjdk.java.net/~cjplummer/8234277/webrev.00/ >>> >>> No longer redirect stderr for the jhsdb/clhsdb process. It results in not seeing attach failures in the output, so OutputAnalyer can't check for them. >>> >>> Execute "verbose true" as the first clhsdb command after launching. This will result in verboseExceptions being true in CommandProcessor.java, so full exception traces will appear in the output. This will make debugging future SA test failures a lot easier. >>> >>> Add an extra check for any DebuggerException. This is mainly for detecting that the attached failed. This previously was going un-noticed, and instead the test would later fail because it noticed some other issue, like missing output, which isn't very informative. >>> >>> Add checks for other unexpected SA exceptions that are caught and printed by CommandProcessor. These will always have an "Error: " prefix, making them easy to detect. >>> >>> Problem list ClhsdbScanOops.java. 
With the new error checking, it will now always fail on windows due to JDK-8230731 and on macos and linux due to JDK-8235220. These failures are not "new" per se, but are just now being properly detected. >>> >>> thanks, >>> >>> Chris >> > From chris.plummer at oracle.com Thu Dec 5 03:10:27 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 4 Dec 2019 19:10:27 -0800 Subject: RFR(S): 8234277:ClhsdbLauncher should enable verbose exceptions and do a better job of detecting SA failures In-Reply-To: <862c4eab-86cf-5578-dcf8-a6e4b6a995b6@oss.nttdata.com> References: <8a972120-8ba3-a35e-b73f-e3d5faf68ce6@oracle.com> <862c4eab-86cf-5578-dcf8-a6e4b6a995b6@oss.nttdata.com> Message-ID: <045a11a4-c9d4-e55d-3102-781ed3523e81@oracle.com> Thanks! On 12/4/19 6:15 PM, Yasumasa Suenaga wrote: > Looks good. > > Yasumasa > > On 2019/12/05 10:39, Chris Plummer wrote: >> Can I get one more review please? >> >> thanks, >> >> Chris >> >> On 12/3/19 1:10 PM, serguei.spitsyn at oracle.com wrote: >>> Hi Chris, >>> >>> It looks good. >>> >>> Thanks, >>> Serguei >>> >>> On 12/3/19 12:45 PM, Chris Plummer wrote: >>>> Hello, >>>> >>>> Please review the following: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8234277 >>>> http://cr.openjdk.java.net/~cjplummer/8234277/webrev.00/ >>>> >>>> No longer redirect stderr for the jhsdb/clhsdb process. It results >>>> in not seeing attach failures in the output, so OutputAnalyer can't >>>> check for them. >>>> >>>> Execute "verbose true" as the first clhsdb command after launching. >>>> This will result in verboseExceptions being true in >>>> CommandProcessor.java, so full exception traces will appear in the >>>> output. This will make debugging future SA test failures a lot easier. >>>> >>>> Add an extra check for any DebuggerException. This is mainly for >>>> detecting that the attached failed. 
This previously was going >>>> un-noticed, and instead the test would later fail because it >>>> noticed some other issue, like missing output, which isn't very >>>> informative. >>>> >>>> Add checks for other unexpected SA exceptions that are caught and >>>> printed by CommandProcessor. These will always have an "Error: " >>>> prefix, making them easy to detect. >>>> >>>> Problem list ClhsdbScanOops.java. With the new error checking, it >>>> will now always fail on windows due to JDK-8230731 and on macos and >>>> linux due to JDK-8235220. These failures are not "new" per se, but >>>> are just now being properly detected. >>>> >>>> thanks, >>>> >>>> Chris >>> >> From christoph.langer at sap.com Thu Dec 5 11:24:14 2019 From: christoph.langer at sap.com (Langer, Christoph) Date: Thu, 5 Dec 2019 11:24:14 +0000 Subject: RFR(T) : 8235353 : clean up hotspot problem lists In-Reply-To: <9405A3B3-8045-4863-9BD4-FD293E2C62F9@oracle.com> References: <9405A3B3-8045-4863-9BD4-FD293E2C62F9@oracle.com> Message-ID: Hi Igor, I have added your update to our test system. I'll let you know the results by tomorrow. Best regards Christoph > -----Original Message----- > From: serviceability-dev On > Behalf Of Igor Ignatyev > Sent: Donnerstag, 5. Dezember 2019 03:08 > To: Doerr, Martin ; Lindenmaier, Goetz > > Cc: serviceability-dev ; Vladimir > Kozlov ; hotspot-dev Source Developers > > Subject: Re: RFR(T) : 8235353 : clean up hotspot problem lists > > Martin, Goetz. > > could you please check that these 9 tests still pass on PPC? > > -- Igor > > > On Dec 4, 2019, at 12:01 PM, Vladimir Kozlov > wrote: > > > > I am fine with changes but we need to ask PPC64 supporter to verify that > tests passed now. 
> >
> > Thanks,
> > Vladimir K
> >
> > On 12/4/19 11:52 AM, Igor Ignatyev wrote:
> >> http://cr.openjdk.java.net/~iignatyev//8235353/webrev.00
> >>> 9 lines changed: 0 ins; 0 del; 9 mod;
> >> Hi all,
> >> could you please review this small and trivial cleanup which returns
> serviceablility/sa tests back to execution on linux-ppc64. the tests were
> problem listed due to 8211767[1], which is closed as a dup of resolved
> 8228649[2].
> >> [1] https://bugs.openjdk.java.net/browse/JDK-8211767
> >> [2] https://bugs.openjdk.java.net/browse/JDK-8228649
> >> Thanks,
> >> -- Igor

From coleen.phillimore at oracle.com Thu Dec 5 12:08:04 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 5 Dec 2019 07:08:04 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value"
In-Reply-To: <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com> <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com> <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com>
Message-ID: <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com>

Thanks Dan. I moved the field. For some reason I thought that class did more/different things than hold per-thread information.

I've retested this version with tiers 2-6.

incr webrev at http://cr.openjdk.java.net/~coleenp/2019/8212160.03.incr/webrev
full webrev at http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev

Thanks to Serguei for offline discussion.

Coleen

On 12/4/19 7:40 PM, Daniel D. Daugherty wrote:
> Generally speaking, JVM/TI related things should be in JvmtiThreadState
> instead of directly in the Thread class. That way the extra space is only
> consumed when JVM/TI is in use and only when a Thread does something that
> requires a JvmtiThreadState to be created.
>
> Please reconsider moving _jvmti_event_queue.
> > Dan > > > On 12/4/19 6:06 PM, coleen.phillimore at oracle.com wrote: >> >> Hi Serguei, >> >> On 12/4/19 5:15 PM, serguei.spitsyn at oracle.com wrote: >>> Hi Collen, (no problem) >>> >>> It looks good in general. >>> Thank you a lot for sorting this out! >>> >>> Just a couple of comments. >>> >>> >>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/runtime/thread.hpp.frames.html >>> 1993 protected: >>> 1994 // Jvmti Events that cannot be posted in their current context. >>> 1995 // ServiceThread uses this to collect deferred events from >>> NonJava threads >>> 1996 // that cannot post events. >>> 1997 JvmtiDeferredEventQueue* _jvmti_event_queue; >>> >>> As David I also have a concern about footprint of having the >>> _jvmti_event_queue field in the Thread class. >>> I'm thinking if it'd be better to move this field into the >>> JvmtiThreadState class. >>> Please, see jvmti_thread_state() and >>> JvmtiThreadState::state_for(JavaThread *thread). >> >> The reason I have it directly in JavaThread is so that the GC oops_do >> and nmethods_do code can find it easily.? I like your idea of hiding >> it in jvmti but this doesn't seem good to have this code know about >> jvmtiThreadState, which seems to be a queue of Jvmti states.? I also >> don't want to have jvmtiThreadState to have to add an oops_do() or >> nmethods_do() either. >> >>> >>> >>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html >>> 973 void JvmtiDeferredEvent::post(JvmtiEnv* env) { >>> 974 assert(_type == TYPE_COMPILED_METHOD_LOAD, "only user of this >>> method"); >>> 975 nmethod* nm = _event_data.compiled_method_load; >>> 976 JvmtiExport::post_compiled_method_load(env, nm); >>> 977 } >>> >>> The JvmtiDeferredEvent::post name looks too generic as it posts >>> compiled load events only. >>> Do you consider this function extended in the future to support more >>> event types? 
>>> >> >> I don't envision an extension for this function but I do for >> JvmtiDeferredEventQueue::post().? I have a small enhancement that >> would handoff the entire queue to the ServiceThread and have it call >> post() to post all the events rather than one at a time. >> >> So I'll rename this one post_compiled_method_load_event() and leave >> the other post() as is for now. >> >> open webrev at >> http://cr.openjdk.java.net/~coleenp/2019/8212160.02.incr/webrev >> >> Thanks, >> Coleen >> >> >>> >>> Thanks, >>> Serguei >>> >>> >>> On 11/26/19 06:22, coleen.phillimore at oracle.com wrote: >>>> Summary: Add local deferred event list to thread to post events >>>> outside CodeCache_lock. >>>> >>>> This patch builds on the patch for JDK-8173361.? With this patch, I >>>> made the JvmtiDeferredEventQueue an instance class (not AllStatic) >>>> and have one per thread.? The CodeBlob event that used to drop the >>>> CodeCache_lock and raced with the sweeper thread, adds the events >>>> it wants to post to its thread local list, and processes it outside >>>> the lock.? The list is walked in GC and by the sweeper to keep the >>>> nmethods from being unloaded and zombied, respectively. >>>> >>>> Also, the jmethod_id field in nmethod was only used as a boolean so >>>> don't create a jmethod_id until needed for >>>> post_compiled_method_unload. >>>> >>>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed >>>> in the original bug report. >>>> >>>> open webrev at >>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160 >>>> >>>> Thanks, >>>> Coleen >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From david.holmes at oracle.com Thu Dec 5 13:05:00 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 5 Dec 2019 23:05:00 +1000 Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value" In-Reply-To: <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com> References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com> <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com> <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com> <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com> Message-ID: Hi Coleen, On 5/12/2019 10:08 pm, coleen.phillimore at oracle.com wrote: > > Thanks Dan.? I moved the field.? For some reason I thought that class > did more/different things than hold per-thread information. > > I've retested this version with tiers 2-6. > > incr webrev at > http://cr.openjdk.java.net/~coleenp/2019/8212160.03.incr/webrev That relocation looks good to me! One minor nit: src/hotspot/share/code/nmethod.hpp + void post_compiled_method_load_event(JvmtiThreadState* thread = NULL); parameter should be state not thread. Thanks, David ----- > full? webrev at http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev > > Thanks to Serguei for offline discussion. > > Coleen > > On 12/4/19 7:40 PM, Daniel D. Daugherty wrote: >> Generally speaking, JVM/TI related things should be in JvmtiThreadState >> instead of directly in the Thread class. That way the extra space is only >> consumed when JVM/TI is in use and only when a Thread does something that >> requires a JvmtiThreadState to be created. >> >> Please reconsider moving _jvmti_event_queue. >> >> Dan >> >> >> On 12/4/19 6:06 PM, coleen.phillimore at oracle.com wrote: >>> >>> Hi Serguei, >>> >>> On 12/4/19 5:15 PM, serguei.spitsyn at oracle.com wrote: >>>> Hi Collen, (no problem) >>>> >>>> It looks good in general. >>>> Thank you a lot for sorting this out! >>>> >>>> Just a couple of comments. 
>>>> >>>> >>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/runtime/thread.hpp.frames.html >>>> 1993 protected: >>>> 1994 // Jvmti Events that cannot be posted in their current context. >>>> 1995 // ServiceThread uses this to collect deferred events from >>>> NonJava threads >>>> 1996 // that cannot post events. >>>> 1997 JvmtiDeferredEventQueue* _jvmti_event_queue; >>>> >>>> As David I also have a concern about footprint of having the >>>> _jvmti_event_queue field in the Thread class. >>>> I'm thinking if it'd be better to move this field into the >>>> JvmtiThreadState class. >>>> Please, see jvmti_thread_state() and >>>> JvmtiThreadState::state_for(JavaThread *thread). >>> >>> The reason I have it directly in JavaThread is so that the GC oops_do >>> and nmethods_do code can find it easily.? I like your idea of hiding >>> it in jvmti but this doesn't seem good to have this code know about >>> jvmtiThreadState, which seems to be a queue of Jvmti states.? I also >>> don't want to have jvmtiThreadState to have to add an oops_do() or >>> nmethods_do() either. >>> >>>> >>>> >>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html >>>> 973 void JvmtiDeferredEvent::post(JvmtiEnv* env) { >>>> 974 assert(_type == TYPE_COMPILED_METHOD_LOAD, "only user of this >>>> method"); >>>> 975 nmethod* nm = _event_data.compiled_method_load; >>>> 976 JvmtiExport::post_compiled_method_load(env, nm); >>>> 977 } >>>> >>>> The JvmtiDeferredEvent::post name looks too generic as it posts >>>> compiled load events only. >>>> Do you consider this function extended in the future to support more >>>> event types? >>>> >>> >>> I don't envision an extension for this function but I do for >>> JvmtiDeferredEventQueue::post().? I have a small enhancement that >>> would handoff the entire queue to the ServiceThread and have it call >>> post() to post all the events rather than one at a time. 
>>> >>> So I'll rename this one post_compiled_method_load_event() and leave >>> the other post() as is for now. >>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/2019/8212160.02.incr/webrev >>> >>> Thanks, >>> Coleen >>> >>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 11/26/19 06:22, coleen.phillimore at oracle.com wrote: >>>>> Summary: Add local deferred event list to thread to post events >>>>> outside CodeCache_lock. >>>>> >>>>> This patch builds on the patch for JDK-8173361.? With this patch, I >>>>> made the JvmtiDeferredEventQueue an instance class (not AllStatic) >>>>> and have one per thread.? The CodeBlob event that used to drop the >>>>> CodeCache_lock and raced with the sweeper thread, adds the events >>>>> it wants to post to its thread local list, and processes it outside >>>>> the lock.? The list is walked in GC and by the sweeper to keep the >>>>> nmethods from being unloaded and zombied, respectively. >>>>> >>>>> Also, the jmethod_id field in nmethod was only used as a boolean so >>>>> don't create a jmethod_id until needed for >>>>> post_compiled_method_unload. >>>>> >>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed >>>>> in the original bug report. 
>>>>>
>>>>> open webrev at
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>
>>>
>>
>

From coleen.phillimore at oracle.com Thu Dec 5 13:10:32 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 5 Dec 2019 08:10:32 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value"
In-Reply-To:
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com> <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com> <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com> <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com>
Message-ID: <07b00937-b8a3-1c9c-0b7e-229ad0091ab1@oracle.com>

On 12/5/19 8:05 AM, David Holmes wrote:
> Hi Coleen,
>
> On 5/12/2019 10:08 pm, coleen.phillimore at oracle.com wrote:
>>
>> Thanks Dan. I moved the field. For some reason I thought that class
>> did more/different things than hold per-thread information.
>>
>> I've retested this version with tiers 2-6.
>>
>> incr webrev at
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03.incr/webrev
>
> That relocation looks good to me!
>
> One minor nit:
>
> src/hotspot/share/code/nmethod.hpp
>
> +  void post_compiled_method_load_event(JvmtiThreadState* thread = NULL);
>
> parameter should be state not thread.

Ok yes, I'll fix it. Thanks for the code review.

Coleen

>
> Thanks,
> David
> -----
>
>
>> full webrev at
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev
>>
>> Thanks to Serguei for offline discussion.
>>
>> Coleen
>>
>> On 12/4/19 7:40 PM, Daniel D. Daugherty wrote:
>>> Generally speaking, JVM/TI related things should be in JvmtiThreadState
>>> instead of directly in the Thread class. That way the extra space is
>>> only
>>> consumed when JVM/TI is in use and only when a Thread does something
>>> that
>>> requires a JvmtiThreadState to be created.
>>> >>> Please reconsider moving _jvmti_event_queue. >>> >>> Dan >>> >>> >>> On 12/4/19 6:06 PM, coleen.phillimore at oracle.com wrote: >>>> >>>> Hi Serguei, >>>> >>>> On 12/4/19 5:15 PM, serguei.spitsyn at oracle.com wrote: >>>>> Hi Collen, (no problem) >>>>> >>>>> It looks good in general. >>>>> Thank you a lot for sorting this out! >>>>> >>>>> Just a couple of comments. >>>>> >>>>> >>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/runtime/thread.hpp.frames.html >>>>> >>>>> 1993 protected: >>>>> 1994 // Jvmti Events that cannot be posted in their current context. >>>>> 1995 // ServiceThread uses this to collect deferred events from >>>>> NonJava threads >>>>> 1996 // that cannot post events. >>>>> 1997 JvmtiDeferredEventQueue* _jvmti_event_queue; >>>>> >>>>> As David I also have a concern about footprint of having the >>>>> _jvmti_event_queue field in the Thread class. >>>>> I'm thinking if it'd be better to move this field into the >>>>> JvmtiThreadState class. >>>>> Please, see jvmti_thread_state() and >>>>> JvmtiThreadState::state_for(JavaThread *thread). >>>> >>>> The reason I have it directly in JavaThread is so that the GC >>>> oops_do and nmethods_do code can find it easily.? I like your idea >>>> of hiding it in jvmti but this doesn't seem good to have this code >>>> know about jvmtiThreadState, which seems to be a queue of Jvmti >>>> states.? I also don't want to have jvmtiThreadState to have to add >>>> an oops_do() or nmethods_do() either. 
>>>> >>>>> >>>>> >>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html >>>>> >>>>> 973 void JvmtiDeferredEvent::post(JvmtiEnv* env) { >>>>> 974 assert(_type == TYPE_COMPILED_METHOD_LOAD, "only user of this >>>>> method"); >>>>> 975 nmethod* nm = _event_data.compiled_method_load; >>>>> 976 JvmtiExport::post_compiled_method_load(env, nm); >>>>> 977 } >>>>> >>>>> The JvmtiDeferredEvent::post name looks too generic as it posts >>>>> compiled load events only. >>>>> Do you consider this function extended in the future to support >>>>> more event types? >>>>> >>>> >>>> I don't envision an extension for this function but I do for >>>> JvmtiDeferredEventQueue::post().? I have a small enhancement that >>>> would handoff the entire queue to the ServiceThread and have it >>>> call post() to post all the events rather than one at a time. >>>> >>>> So I'll rename this one post_compiled_method_load_event() and leave >>>> the other post() as is for now. >>>> >>>> open webrev at >>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.02.incr/webrev >>>> >>>> Thanks, >>>> Coleen >>>> >>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 11/26/19 06:22, coleen.phillimore at oracle.com wrote: >>>>>> Summary: Add local deferred event list to thread to post events >>>>>> outside CodeCache_lock. >>>>>> >>>>>> This patch builds on the patch for JDK-8173361.? With this patch, >>>>>> I made the JvmtiDeferredEventQueue an instance class (not >>>>>> AllStatic) and have one per thread. The CodeBlob event that used >>>>>> to drop the CodeCache_lock and raced with the sweeper thread, >>>>>> adds the events it wants to post to its thread local list, and >>>>>> processes it outside the lock.? The list is walked in GC and by >>>>>> the sweeper to keep the nmethods from being unloaded and zombied, >>>>>> respectively. 
>>>>>>
>>>>>> Also, the jmethod_id field in nmethod was only used as a boolean
>>>>>> so don't create a jmethod_id until needed for
>>>>>> post_compiled_method_unload.
>>>>>>
>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that
>>>>>> crashed in the original bug report.
>>>>>>
>>>>>> open webrev at
>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>
>>>>
>>>
>>

From thomas.stuefe at gmail.com Thu Dec 5 13:32:26 2019
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Thu, 5 Dec 2019 14:32:26 +0100
Subject: RFR (M) 8234510: Remove file seeking requirement for writing a heap dump
In-Reply-To:
References:
Message-ID:

Hi Ralf,

Not a complete review yet. But this looks good. The seeking before seemed awkward.

Some remarks:

In DumpWriter, _current_entry_left and _entry_ended seem only to be needed for asserting. Please enclose their definition in DEBUG_ONLY, and initialize them in the ctor.

--

I like that DumperSupport::dump_object_array() does not write stuff anymore. That felt surprising. Still feels awkward that it warns about large arrays, seems more of a thing the caller should do. And that we have to pass in header size for it to do that. Not a big deal though.

--

(not your patch): since DumperSupport::dump_class_and_array_classes(Klass*) should assert that Klass* is an InstanceKlass; or, even better, use InstanceKlass* as parameter.

--

This was a bit of a brain teaser. More comments would be helpful. You wrote a good abstract in the JBS issue, could you please copy the proposed implementation as a comment into the class declaration of DumpWriter.

--

DumpWriter::start_dump_entry(): It took me a while to understand how the segment size is updated if the entry is huge, since by the time we finish the entry the segment header will already be flushed out. The answer is, I think, that this is not needed since we only write one record so the initial size we wrote into the segment header is still valid.

Proposed comment change:

-// Will be fixed up later if we can add more entries.
+// Seed segment size with size of its first record. Should we add more records later, we will update the segment size (see finish_dump_segment())

--

That's all I have for now. If there are still Reviewers missing after my vacation, I'll take another look.

Cheers, Thomas

On Mon, Nov 25, 2019 at 3:41 PM Schmelter, Ralf wrote:
> Hello,
>
> this change removes the need to use seek on the hprof file when creating a
> heap dump, thus making it possible to stream the dump. This enables us to
> dump to a socket or directly gzip the dump.
>
> Instead of fixing the heap dump segments size on the written file, the
> size of the heap dump segments is either fixed up in the buffer instead or,
> for entries to big to fit into the buffer fully, the entry get its own
> segment with no need to fix up the segment size later.
>
> To do this, we now need to know how large an heap dump segment entry is
> when starting to write the entry. This is either trivial (for the roots) or
> already known (for the instance and array dump entries). Just the class
> entry needed a little more code to track the size.
>
> The change results in more heap dump segments in the written heap dump.
> But since the overhead per segment is 9 bytes, even for the smallest used
> buffer (64K) the overhead is less than 0.02%. Additionally the heap dump
> now expects to be able to allocate at least 64k for the buffer. The old
> code tried to run even with a buffer of 1 byte or no buffer at all.
>
> Bugreport: https://bugs.openjdk.java.net/browse/JDK-8234510
> Webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8234510/webrev.0/
>
> Best regards,
> Ralf
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From harold.seigel at oracle.com Thu Dec 5 14:28:04 2019
From: harold.seigel at oracle.com (Harold Seigel)
Date: Thu, 5 Dec 2019 09:28:04 -0500
Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for Record attribute
Message-ID:

Hi,

Please review this trivial change to add documentation about the Record attribute to the JDWP, JDI, and Instrumentation specs.

The changed .html pages (best viewed as 'raw') are included in the webrev but will not be pushed.

Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html

JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360

The fix was regression tested by running Mach5 tiers 1 and 2 tests and builds on Linux-x64, Solaris, Windows, and Mac OS X.

Thanks, Harold

From lois.foltan at oracle.com Thu Dec 5 14:59:22 2019
From: lois.foltan at oracle.com (Lois Foltan)
Date: Thu, 5 Dec 2019 09:59:22 -0500
Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for Record attribute
In-Reply-To:
References:
Message-ID: <7da43abb-cf32-a4f6-bdbe-57b525cf13f9@oracle.com>

On 12/5/2019 9:28 AM, Harold Seigel wrote:
> Hi,
>
> Please review this trivial change to add documentation about the
> Record attribute to the JDWP, JDI, and Instrumentation specs.
>
> The changed .html pages (best viewed as 'raw') are included in the
> webrev but will not be pushed.
>
> Open Webrev:
> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html
>
> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360
>
> The fix was regression tested by running Mach5 tiers 1 and 2 tests and
> builds on Linux-x64, Solaris, Windows, and Mac OS X.
>
> Thanks, Harold
>
Looks good & trivial. VirtualMachine.java needs a copyright update.
Lois From harold.seigel at oracle.com Thu Dec 5 15:13:15 2019 From: harold.seigel at oracle.com (Harold Seigel) Date: Thu, 5 Dec 2019 10:13:15 -0500 Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for Record attribute In-Reply-To: <7da43abb-cf32-a4f6-bdbe-57b525cf13f9@oracle.com> References: <7da43abb-cf32-a4f6-bdbe-57b525cf13f9@oracle.com> Message-ID: <6dd5a7ed-258c-0ccc-f438-af8c66e20eb1@oracle.com> Thanks Lois! I'll fix the copyright before pushing. Harold On 12/5/2019 9:59 AM, Lois Foltan wrote: > On 12/5/2019 9:28 AM, Harold Seigel wrote: >> Hi, >> >> Please review this trivial change to add documentation about the >> Record attribute to the JDWP, JDI, and Instrumentation specs. >> >> The changed .html pages (best viewed as 'raw') are included in the >> webrev but will not be pushed. >> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360 >> >> The fix was regression tested by running Mach5 tiers 1 and 2 tests >> and builds on Linux-x64, Solaris, Windows, and Mac OS X. >> >> Thanks, Harold >> > Looks good & trivial.? VirtualMachine.java needs a copyright update. > Lois From serguei.spitsyn at oracle.com Thu Dec 5 16:00:01 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 5 Dec 2019 08:00:01 -0800 Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value" In-Reply-To: <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com> References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com> <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com> <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com> <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com> Message-ID: <2385071e-6fd6-3da4-9ab3-93e950f69f77@oracle.com> An HTML attachment was scrubbed... 
URL:

From daniel.daugherty at oracle.com Thu Dec 5 16:15:54 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 5 Dec 2019 11:15:54 -0500
Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for Record attribute
In-Reply-To:
References:
Message-ID: <1a6af5af-e64f-39d5-9629-3ae80eec64a8@oracle.com>

Do you plan to make JVM/TI spec changes also?

Dan


On 12/5/19 9:28 AM, Harold Seigel wrote:
> Hi,
>
> Please review this trivial change to add documentation about the
> Record attribute to the JDWP, JDI, and Instrumentation specs.
>
> The changed .html pages (best viewed as 'raw') are included in the
> webrev but will not be pushed.
>
> Open Webrev:
> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html
>
> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360
>
> The fix was regression tested by running Mach5 tiers 1 and 2 tests and
> builds on Linux-x64, Solaris, Windows, and Mac OS X.
>
> Thanks, Harold
>
>

From coleen.phillimore at oracle.com Thu Dec 5 18:24:04 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 5 Dec 2019 13:24:04 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value"
In-Reply-To: <2385071e-6fd6-3da4-9ab3-93e950f69f77@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
 <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
 <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com>
 <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com>
 <2385071e-6fd6-3da4-9ab3-93e950f69f77@oracle.com>
Message-ID: <55f95141-42e1-2ee9-6f0c-fafcafc35356@oracle.com>

On 12/5/19 11:00 AM, serguei.spitsyn at oracle.com wrote:
> Hi Collen,
>
> Thank you for making this update!
> It looks good to me.
>
> One nit:
>
> http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev/test/hotspot/jtreg/serviceability/jvmti/CompiledMethodLoad/libCompiledZombie.cpp.html
>
>   46 // Continuously generate CompiledMethodLoad events for all currently compiled methods
>   47 void JNICALL GenerateEventsThread(jvmtiEnv* jvmti, JNIEnv* jni, void* arg) {
>   48     jvmti->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_COMPILED_METHOD_LOAD, NULL);
>   49     int count = 0;
>   50
>   51     while (true) {
>   52         events = 0;
>   53         jvmti->GenerateEvents(JVMTI_EVENT_COMPILED_METHOD_LOAD);
>   54         if (events != 0 && ++count == 200) {
>   55             printf("Generated %d events\n", events);
>   56             count = 0;
>   57         }
>   58     }
>   59 }
>
>   The above can be simplified a little bit:
>           if (events % 200 == 199) {
>               printf("Generated %d events\n", events);
>           }
>
>   Then this line is not needed too:
>     49     int count = 0;
>

Ok, I'll make that change before I push.

Thanks for the review and your help!
Coleen

>
> Thanks,
> Serguei
>
>
> On 12/5/19 04:08, coleen.phillimore at oracle.com wrote:
>>
>> Thanks Dan.  I moved the field.  For some reason I thought that class
>> did more/different things than hold per-thread information.
>>
>> I've retested this version with tiers 2-6.
>>
>> incr webrev at
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03.incr/webrev
>> full  webrev at
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev
>>
>> Thanks to Serguei for offline discussion.
>>
>> Coleen
>>
>> On 12/4/19 7:40 PM, Daniel D. Daugherty wrote:
>>> Generally speaking, JVM/TI related things should be in JvmtiThreadState
>>> instead of directly in the Thread class. That way the extra space is only
>>> consumed when JVM/TI is in use and only when a Thread does something that
>>> requires a JvmtiThreadState to be created.
>>>
>>> Please reconsider moving _jvmti_event_queue.
>>>
>>> Dan
>>>
>>>
>>> On 12/4/19 6:06 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Hi Serguei,
>>>>
>>>> On 12/4/19 5:15 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Collen, (no problem)
>>>>>
>>>>> It looks good in general.
>>>>> Thank you a lot for sorting this out!
>>>>>
>>>>> Just a couple of comments.
>>>>>
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/runtime/thread.hpp.frames.html
>>>>> 1993 protected:
>>>>> 1994 // Jvmti Events that cannot be posted in their current context.
>>>>> 1995 // ServiceThread uses this to collect deferred events from NonJava threads
>>>>> 1996 // that cannot post events.
>>>>> 1997 JvmtiDeferredEventQueue* _jvmti_event_queue;
>>>>>
>>>>> As David I also have a concern about footprint of having the
>>>>> _jvmti_event_queue field in the Thread class.
>>>>> I'm thinking if it'd be better to move this field into the
>>>>> JvmtiThreadState class.
>>>>> Please, see jvmti_thread_state() and
>>>>> JvmtiThreadState::state_for(JavaThread *thread).
>>>>
>>>> The reason I have it directly in JavaThread is so that the GC
>>>> oops_do and nmethods_do code can find it easily.  I like your idea
>>>> of hiding it in jvmti but this doesn't seem good to have this code
>>>> know about jvmtiThreadState, which seems to be a queue of Jvmti
>>>> states.  I also don't want to have jvmtiThreadState to have to add
>>>> an oops_do() or nmethods_do() either.
>>>>
>>>>>
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html
>>>>> 973 void JvmtiDeferredEvent::post(JvmtiEnv* env) {
>>>>> 974 assert(_type == TYPE_COMPILED_METHOD_LOAD, "only user of this method");
>>>>> 975 nmethod* nm = _event_data.compiled_method_load;
>>>>> 976 JvmtiExport::post_compiled_method_load(env, nm);
>>>>> 977 }
>>>>>
>>>>> The JvmtiDeferredEvent::post name looks too generic as it posts
>>>>> compiled load events only.
>>>>> Do you consider this function extended in the future to support
>>>>> more event types?
>>>>>
>>>>
>>>> I don't envision an extension for this function but I do for
>>>> JvmtiDeferredEventQueue::post().  I have a small enhancement that
>>>> would handoff the entire queue to the ServiceThread and have it
>>>> call post() to post all the events rather than one at a time.
>>>>
>>>> So I'll rename this one post_compiled_method_load_event() and leave
>>>> the other post() as is for now.
>>>>
>>>> open webrev at
>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.02.incr/webrev
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 11/26/19 06:22, coleen.phillimore at oracle.com wrote:
>>>>>> Summary: Add local deferred event list to thread to post events
>>>>>> outside CodeCache_lock.
>>>>>>
>>>>>> This patch builds on the patch for JDK-8173361.  With this patch,
>>>>>> I made the JvmtiDeferredEventQueue an instance class (not
>>>>>> AllStatic) and have one per thread. The CodeBlob event that used
>>>>>> to drop the CodeCache_lock and raced with the sweeper thread,
>>>>>> adds the events it wants to post to its thread local list, and
>>>>>> processes it outside the lock.  The list is walked in GC and by
>>>>>> the sweeper to keep the nmethods from being unloaded and zombied,
>>>>>> respectively.
>>>>>>
>>>>>> Also, the jmethod_id field in nmethod was only used as a boolean
>>>>>> so don't create a jmethod_id until needed for
>>>>>> post_compiled_method_unload.
>>>>>>
>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that
>>>>>> crashed in the original bug report.
>>>>>>
>>>>>> open webrev at
>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From harold.seigel at oracle.com Thu Dec 5 18:30:24 2019
From: harold.seigel at oracle.com (Harold Seigel)
Date: Thu, 5 Dec 2019 13:30:24 -0500
Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for Record attribute
In-Reply-To: <1a6af5af-e64f-39d5-9629-3ae80eec64a8@oracle.com>
References: <1a6af5af-e64f-39d5-9629-3ae80eec64a8@oracle.com>
Message-ID:

The JVM/TI change for Record attribute was in the big Records push.  I
missed the other three until Serguei pointed it out.

Thanks, Harold

On 12/5/2019 11:15 AM, Daniel D. Daugherty wrote:
> Do you plan to make JVM/TI spec changes also?
>
> Dan
>
>
> On 12/5/19 9:28 AM, Harold Seigel wrote:
>> Hi,
>>
>> Please review this trivial change to add documentation about the
>> Record attribute to the JDWP, JDI, and Instrumentation specs.
>>
>> The changed .html pages (best viewed as 'raw') are included in the
>> webrev but will not be pushed.
>>
>> Open Webrev:
>> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360
>>
>> The fix was regression tested by running Mach5 tiers 1 and 2 tests
>> and builds on Linux-x64, Solaris, Windows, and Mac OS X.
>>
>> Thanks, Harold
>>
>>
>

From daniel.daugherty at oracle.com Thu Dec 5 18:34:02 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 5 Dec 2019 13:34:02 -0500
Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for Record attribute
In-Reply-To:
References: <1a6af5af-e64f-39d5-9629-3ae80eec64a8@oracle.com>
Message-ID: <410d99e1-a893-1b7d-8aa6-c93e47f12118@oracle.com>

Thanks for clarifying.

Dan


On 12/5/19 1:30 PM, Harold Seigel wrote:
> The JVM/TI change for Record attribute was in the big Records push.  I
> missed the other three until Serguei pointed it out.
>
> Thanks, Harold
>
> On 12/5/2019 11:15 AM, Daniel D. Daugherty wrote:
>> Do you plan to make JVM/TI spec changes also?
>>
>> Dan
>>
>>
>> On 12/5/19 9:28 AM, Harold Seigel wrote:
>>> Hi,
>>>
>>> Please review this trivial change to add documentation about the
>>> Record attribute to the JDWP, JDI, and Instrumentation specs.
>>>
>>> The changed .html pages (best viewed as 'raw') are included in the
>>> webrev but will not be pushed.
>>>
>>> Open Webrev:
>>> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html
>>>
>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360
>>>
>>> The fix was regression tested by running Mach5 tiers 1 and 2 tests
>>> and builds on Linux-x64, Solaris, Windows, and Mac OS X.
>>>
>>> Thanks, Harold
>>>
>>>
>>

From coleen.phillimore at oracle.com Thu Dec 5 18:36:39 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 5 Dec 2019 13:36:39 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value"
In-Reply-To: <2385071e-6fd6-3da4-9ab3-93e950f69f77@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
 <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
 <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com>
 <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com>
 <2385071e-6fd6-3da4-9ab3-93e950f69f77@oracle.com>
Message-ID:

On 12/5/19 11:00 AM, serguei.spitsyn at oracle.com wrote:
> Hi Collen,
>
> Thank you for making this update!
> It looks good to me.
>
> One nit:
>
> http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev/test/hotspot/jtreg/serviceability/jvmti/CompiledMethodLoad/libCompiledZombie.cpp.html
>
>   46 // Continuously generate CompiledMethodLoad events for all currently compiled methods
>   47 void JNICALL GenerateEventsThread(jvmtiEnv* jvmti, JNIEnv* jni, void* arg) {
>   48     jvmti->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_COMPILED_METHOD_LOAD, NULL);
>   49     int count = 0;
>   50
>   51     while (true) {
>   52         events = 0;
>   53         jvmti->GenerateEvents(JVMTI_EVENT_COMPILED_METHOD_LOAD);
>   54         if (events != 0 && ++count == 200) {
>   55             printf("Generated %d events\n", events);
>   56             count = 0;
>   57         }
>   58     }
>   59 }
>
>   The above can be simplified a little bit:
>           if (events % 200 == 199) {
>               printf("Generated %d events\n", events);
>           }
>
>   Then this line is not needed too:
>     49     int count = 0;
>

I answered this too fast.  There are two conditions where I want this
to not print.  First is where events == 0 and the other for every 200
events that are non-zero.

I could use if (events != 0 && count++ % 200), but I thought what I
had makes more sense and I don't have to worry about when ++ happens.

Thanks,
Coleen

>
> Thanks,
> Serguei
>
>
> On 12/5/19 04:08, coleen.phillimore at oracle.com wrote:
>>
>> Thanks Dan.  I moved the field.  For some reason I thought that class
>> did more/different things than hold per-thread information.
>>
>> I've retested this version with tiers 2-6.
>>
>> incr webrev at
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03.incr/webrev
>> full  webrev at
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev
>>
>> Thanks to Serguei for offline discussion.
>>
>> Coleen
>>
>> On 12/4/19 7:40 PM, Daniel D. Daugherty wrote:
>>> Generally speaking, JVM/TI related things should be in JvmtiThreadState
>>> instead of directly in the Thread class. That way the extra space is only
>>> consumed when JVM/TI is in use and only when a Thread does something that
>>> requires a JvmtiThreadState to be created.
>>>
>>> Please reconsider moving _jvmti_event_queue.
>>>
>>> Dan
>>>
>>>
>>> On 12/4/19 6:06 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Hi Serguei,
>>>>
>>>> On 12/4/19 5:15 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Collen, (no problem)
>>>>>
>>>>> It looks good in general.
>>>>> Thank you a lot for sorting this out!
>>>>>
>>>>> Just a couple of comments.
>>>>>
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/runtime/thread.hpp.frames.html
>>>>> 1993 protected:
>>>>> 1994 // Jvmti Events that cannot be posted in their current context.
>>>>> 1995 // ServiceThread uses this to collect deferred events from NonJava threads
>>>>> 1996 // that cannot post events.
>>>>> 1997 JvmtiDeferredEventQueue* _jvmti_event_queue;
>>>>>
>>>>> As David I also have a concern about footprint of having the
>>>>> _jvmti_event_queue field in the Thread class.
>>>>> I'm thinking if it'd be better to move this field into the
>>>>> JvmtiThreadState class.
>>>>> Please, see jvmti_thread_state() and
>>>>> JvmtiThreadState::state_for(JavaThread *thread).
>>>>
>>>> The reason I have it directly in JavaThread is so that the GC
>>>> oops_do and nmethods_do code can find it easily.  I like your idea
>>>> of hiding it in jvmti but this doesn't seem good to have this code
>>>> know about jvmtiThreadState, which seems to be a queue of Jvmti
>>>> states.  I also don't want to have jvmtiThreadState to have to add
>>>> an oops_do() or nmethods_do() either.
>>>>
>>>>>
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html
>>>>> 973 void JvmtiDeferredEvent::post(JvmtiEnv* env) {
>>>>> 974 assert(_type == TYPE_COMPILED_METHOD_LOAD, "only user of this method");
>>>>> 975 nmethod* nm = _event_data.compiled_method_load;
>>>>> 976 JvmtiExport::post_compiled_method_load(env, nm);
>>>>> 977 }
>>>>>
>>>>> The JvmtiDeferredEvent::post name looks too generic as it posts
>>>>> compiled load events only.
>>>>> Do you consider this function extended in the future to support
>>>>> more event types?
>>>>>
>>>>
>>>> I don't envision an extension for this function but I do for
>>>> JvmtiDeferredEventQueue::post().  I have a small enhancement that
>>>> would handoff the entire queue to the ServiceThread and have it
>>>> call post() to post all the events rather than one at a time.
>>>>
>>>> So I'll rename this one post_compiled_method_load_event() and leave
>>>> the other post() as is for now.
>>>>
>>>> open webrev at
>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.02.incr/webrev
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 11/26/19 06:22, coleen.phillimore at oracle.com wrote:
>>>>>> Summary: Add local deferred event list to thread to post events
>>>>>> outside CodeCache_lock.
>>>>>>
>>>>>> This patch builds on the patch for JDK-8173361.  With this patch,
>>>>>> I made the JvmtiDeferredEventQueue an instance class (not
>>>>>> AllStatic) and have one per thread. The CodeBlob event that used
>>>>>> to drop the CodeCache_lock and raced with the sweeper thread,
>>>>>> adds the events it wants to post to its thread local list, and
>>>>>> processes it outside the lock.  The list is walked in GC and by
>>>>>> the sweeper to keep the nmethods from being unloaded and zombied,
>>>>>> respectively.
>>>>>>
>>>>>> Also, the jmethod_id field in nmethod was only used as a boolean
>>>>>> so don't create a jmethod_id until needed for
>>>>>> post_compiled_method_unload.
>>>>>>
>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that
>>>>>> crashed in the original bug report.
>>>>>>
>>>>>> open webrev at
>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From serguei.spitsyn at oracle.com Thu Dec 5 18:41:54 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 5 Dec 2019 10:41:54 -0800
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value"
In-Reply-To:
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
 <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
 <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com>
 <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com>
 <2385071e-6fd6-3da4-9ab3-93e950f69f77@oracle.com>
Message-ID:

An HTML attachment was scrubbed...
URL:

From coleen.phillimore at oracle.com Thu Dec 5 19:15:28 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 5 Dec 2019 14:15:28 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value"
In-Reply-To:
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
 <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
 <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com>
 <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com>
 <2385071e-6fd6-3da4-9ab3-93e950f69f77@oracle.com>
Message-ID: <66e249d2-ea32-db69-5e1f-1a31fb19ecd5@oracle.com>

On 12/5/19 1:41 PM, serguei.spitsyn at oracle.com wrote:
> On 12/5/19 10:36, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 12/5/19 11:00 AM, serguei.spitsyn at oracle.com wrote:
>>> Hi Collen,
>>>
>>> Thank you for making this update!
>>> It looks good to me.
>>>
>>> One nit:
>>>
>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev/test/hotspot/jtreg/serviceability/jvmti/CompiledMethodLoad/libCompiledZombie.cpp.html
>>>
>>>   46 // Continuously generate CompiledMethodLoad events for all currently compiled methods
>>>   47 void JNICALL GenerateEventsThread(jvmtiEnv* jvmti, JNIEnv* jni, void* arg) {
>>>   48     jvmti->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_COMPILED_METHOD_LOAD, NULL);
>>>   49     int count = 0;
>>>   50
>>>   51     while (true) {
>>>   52         events = 0;
>>>   53         jvmti->GenerateEvents(JVMTI_EVENT_COMPILED_METHOD_LOAD);
>>>   54         if (events != 0 && ++count == 200) {
>>>   55             printf("Generated %d events\n", events);
>>>   56             count = 0;
>>>   57         }
>>>   58     }
>>>   59 }
>>>
>>>   The above can be simplified a little bit:
>>>           if (events % 200 == 199) {
>>>               printf("Generated %d events\n", events);
>>>           }
>>>
>>>   Then this line is not needed too:
>>>     49     int count = 0;
>>>
>>
>> I answered this too fast.  There are two conditions where I want this
>> to not print.  First is where events == 0 and the other for every 200
>> events that are non-zero.
>>
>> I could use if (events != 0 && count++ % 200), but I thought what I
>> had makes more sense and I don't have to worry about when ++ happens.
>
> Then you could replace it with:
>   if (events % 200 == 0) {

But that would still print when events == 0, which I don't want.

If I print them all for the little test case, it's ok, but when I run
this with Swingset2, it's too much output.  I only want to see a few
lines for this:

----------System.out:(3/113)----------
Test passes if it doesn't crash while posting compiled method events.
Generated 285 events
Generated 1002 events
----------System.err:(1/15)----------

The count is the number of times through the GenerateEvents loop, which
resets events to zero each time, then prints the number of events for
every 200 times through the GenerateEvents loop.  So I need both count
and events.

Coleen
>
> But it is up to you. :)
>
> Thanks,
> Serguei
>
>>
>> Thanks,
>> Coleen
>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 12/5/19 04:08, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Thanks Dan.  I moved the field.  For some reason I thought that
>>>> class did more/different things than hold per-thread information.
>>>>
>>>> I've retested this version with tiers 2-6.
>>>>
>>>> incr webrev at
>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03.incr/webrev
>>>> full  webrev at
>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev
>>>>
>>>> Thanks to Serguei for offline discussion.
>>>>
>>>> Coleen
>>>>
>>>> On 12/4/19 7:40 PM, Daniel D. Daugherty wrote:
>>>>> Generally speaking, JVM/TI related things should be in JvmtiThreadState
>>>>> instead of directly in the Thread class. That way the extra space is only
>>>>> consumed when JVM/TI is in use and only when a Thread does something that
>>>>> requires a JvmtiThreadState to be created.
>>>>>
>>>>> Please reconsider moving _jvmti_event_queue.
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>> On 12/4/19 6:06 PM, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>> Hi Serguei,
>>>>>>
>>>>>> On 12/4/19 5:15 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Collen, (no problem)
>>>>>>>
>>>>>>> It looks good in general.
>>>>>>> Thank you a lot for sorting this out!
>>>>>>>
>>>>>>> Just a couple of comments.
>>>>>>>
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/runtime/thread.hpp.frames.html
>>>>>>> 1993 protected:
>>>>>>> 1994 // Jvmti Events that cannot be posted in their current context.
>>>>>>> 1995 // ServiceThread uses this to collect deferred events from NonJava threads
>>>>>>> 1996 // that cannot post events.
>>>>>>> 1997 JvmtiDeferredEventQueue* _jvmti_event_queue;
>>>>>>>
>>>>>>> As David I also have a concern about footprint of having the
>>>>>>> _jvmti_event_queue field in the Thread class.
>>>>>>> I'm thinking if it'd be better to move this field into the
>>>>>>> JvmtiThreadState class.
>>>>>>> Please, see jvmti_thread_state() and
>>>>>>> JvmtiThreadState::state_for(JavaThread *thread).
>>>>>>
>>>>>> The reason I have it directly in JavaThread is so that the GC
>>>>>> oops_do and nmethods_do code can find it easily.  I like your idea
>>>>>> of hiding it in jvmti but this doesn't seem good to have this
>>>>>> code know about jvmtiThreadState, which seems to be a queue of
>>>>>> Jvmti states.  I also don't want to have jvmtiThreadState to have
>>>>>> to add an oops_do() or nmethods_do() either.
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html
>>>>>>> 973 void JvmtiDeferredEvent::post(JvmtiEnv* env) {
>>>>>>> 974 assert(_type == TYPE_COMPILED_METHOD_LOAD, "only user of this method");
>>>>>>> 975 nmethod* nm = _event_data.compiled_method_load;
>>>>>>> 976 JvmtiExport::post_compiled_method_load(env, nm);
>>>>>>> 977 }
>>>>>>>
>>>>>>> The JvmtiDeferredEvent::post name looks too generic as it posts
>>>>>>> compiled load events only.
>>>>>>> Do you consider this function extended in the future to support
>>>>>>> more event types?
>>>>>>>
>>>>>>
>>>>>> I don't envision an extension for this function but I do for
>>>>>> JvmtiDeferredEventQueue::post().  I have a small enhancement that
>>>>>> would handoff the entire queue to the ServiceThread and have it
>>>>>> call post() to post all the events rather than one at a time.
>>>>>>
>>>>>> So I'll rename this one post_compiled_method_load_event() and
>>>>>> leave the other post() as is for now.
>>>>>>
>>>>>> open webrev at
>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.02.incr/webrev
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 11/26/19 06:22, coleen.phillimore at oracle.com wrote:
>>>>>>>> Summary: Add local deferred event list to thread to post events
>>>>>>>> outside CodeCache_lock.
>>>>>>>>
>>>>>>>> This patch builds on the patch for JDK-8173361.  With this
>>>>>>>> patch, I made the JvmtiDeferredEventQueue an instance class
>>>>>>>> (not AllStatic) and have one per thread.  The CodeBlob event
>>>>>>>> that used to drop the CodeCache_lock and raced with the sweeper
>>>>>>>> thread, adds the events it wants to post to its thread local
>>>>>>>> list, and processes it outside the lock.  The list is walked in
>>>>>>>> GC and by the sweeper to keep the nmethods from being unloaded
>>>>>>>> and zombied, respectively.
>>>>>>>>
>>>>>>>> Also, the jmethod_id field in nmethod was only used as a
>>>>>>>> boolean so don't create a jmethod_id until needed for
>>>>>>>> post_compiled_method_unload.
>>>>>>>>
>>>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that
>>>>>>>> crashed in the original bug report.
>>>>>>>>
>>>>>>>> open webrev at
>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From daniil.x.titov at oracle.com Thu Dec 5 19:31:02 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Thu, 05 Dec 2019 11:31:02 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To:
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
Message-ID: <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>

Hi Mandy and Bob,

Please review a new version of the webrev that addresses most of these issues:

1) The interface and spec [3] were updated to use default methods. CSR [3] was re-approved.

2) Security-sensitive operations in j.i.p.cgroupv1.Metrics and in j.i.p.cgroupv1.SubSystem
were wrapped with doPrivileged

3) getCpuLoad() method was optimized to fall back to getSystemCpuLoad0 if the cpuset is identical to the host's one.
It uses sysconf(_SC_NPROCESSORS_CONF) to retrieve the number of CPUs configured on the host. Testing with
different --cpuset-cpus settings inside a Docker container proved that it always returns the same number of host-configured
CPUs regardless of --cpuset-cpus settings while the same settings affect getEffectiveCpuSetCpus and getCpuSetCpus() metrics.

In addition, getCpuLoad() method now returns -1 in the cases when quotas are active but cpu usage and cpu period metrics are not available and
in the case when for some reason it fails to retrieve a valid CPU load for one of the CPUs while iterating over them.

>> CheckOperatingSystemMXBean.java
>> System.out.println(String.format(...)) can simply be replaced with System.out.format.

I had to leave it unchanged since replacing it with System.out.format results in test instability as it makes the trace output
occasionally interleave here (the trace message sometimes is printed inside this message) and tests cannot find the expected
pattern in the output.

>> It may be worth considering adding Metrics::getSwapLimit and
>> Metrics::getSwapUsage and move the computation to the implementation of
>> Metrics. Bob may have an opinion.

There was no new input regarding this so I decided to leave it unchanged.

>> Also it seems correct for the memory related methods to check if
>> (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).
>> BTW what does it mean if limit == 0?

Per Docker docs the minimum allowed value for memory limit (--memory option) is 4 megabytes. And if memory limit is unset the
return value is -1. Thus, in my understanding the value 0 is only possible if something went wrong while retrieving this metric.

Testing: Mach5 tier1-tier6 tests (that include open/test/hotspot/jtreg/containers/docker and :jdk_management tests) passed.
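
For illustration only, the check quoted above for the memory related methods, (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0), could be sketched roughly as follows. This is not the code from the webrev: MemoryLimitSource, totalMemorySize and the class name are hypothetical stand-ins, and the sketch assumes, as described above, that an unset limit is reported as -1.

```java
/**
 * Illustrative sketch only; this is NOT the code from the webrev.
 * All names here are hypothetical.
 */
public class ContainerAwareMemorySketch {

    /** Hypothetical stand-in for the container metrics: limit in bytes, or -1 if unset. */
    interface MemoryLimitSource {
        long getMemoryLimit();
    }

    /**
     * Returns the container memory limit when metrics exist and a limit is set,
     * otherwise falls back to the value reported for the host.
     */
    static long totalMemorySize(MemoryLimitSource containerMetrics, long hostTotalBytes) {
        if (containerMetrics != null) {
            long limit = containerMetrics.getMemoryLimit();
            if (limit >= 0) {
                return limit;
            }
        }
        // Not running in a container, or no memory limit was set: use the host value.
        return hostTotalBytes;
    }

    public static void main(String[] args) {
        long host = 16L * 1024 * 1024 * 1024;
        // No container metrics at all: host value is used.
        System.out.println(totalMemorySize(null, host));
        // Container without a memory limit (-1): host value is used.
        System.out.println(totalMemorySize(() -> -1L, host));
        // Container with a 512 MB limit: the limit is used.
        System.out.println(totalMemorySize(() -> 512L * 1024 * 1024, host));
    }
}
```

Note that a limit of exactly 0 passes the >= 0 check in this sketch; whether that value needs separate handling is the open question raised in the thread.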
[1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/
[2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
[3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428

Thank you,
Daniil

On 12/4/19, 4:09 PM, "Mandy Chung" wrote:

    On 12/3/19 9:40 PM, Daniil Titov wrote:
    > >>> Under what circumstance that limit or memLimit is < 0?
    > The memory limit metric is not available if JVM runs on Linux host (not in a docker container) or if a docker container was started without
    > specifying a memory limit (without '--memory=' Docker option). In the latter case there is no limit on how much memory the container can use and
    > it can use as much memory as the host's OS allows.
    >
    OK. Please add a comment to the code.

    It may be worth considering adding Metrics::getSwapLimit and
    Metrics::getSwapUsage and move the computation to the implementation of
    Metrics. Bob may have an opinion.

    Also it seems correct for the memory related methods to check if
    (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).
    BTW what does it mean if limit == 0?

    >>> Is it worth specifying this case?
    > I believe yes, since it covers the cases when JVM runs on a Linux host or a docker container was started without memory limitation.
    >
    I was wondering if the javadoc should specify that.

    >>> It fallbacks to return the system's total swap space size - this is not really what it should report.
    > For the case when JVM runs on a Linux host it is exactly what we want. The only problematic case is if JVM runs inside a docker container without a memory limit set.
    > However, I am not sure how we could differentiate these 2 cases.

    As this is the case when the limit is not set in the container, it
    returns the system metrics which sounds appropriate.

    > >>> Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad.
    > For getTotalMemorySize I think we are good here. If limit is not set then all memory the host's OS have is available.
    > For getFreeMemorySize the problematic case is if the memory limit is set but the memory usage for some reason is not available (containerMetrics.getMemoryUsage() returns 0).

    Will zero memory usage happen?

    > Probably in this case we should just return -1 as currently getFreePhysicalMemorySize0() does if it cannot retrieve a valid result.
    >
    > For getCpuLoad() the problematic case is if CPU quotas are active but CpuPeriod, CpuNumPeriods, or getCpuUsage are unavailable or if a valid CPU load for some CPU was
    > not retrieved, or if all retrieved CPU load values happen to be zeros. Probably we should just return -1 in these cases rather than falling back to getSystemCpuLoad0()
    >
    returning -1 sounds right.

    >>> src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java
    >>> There is no strong need to make the deprecated methods as default methods. If they were default methods, they only need to be implemented once as opposed to in all OS-specific implementations.
    > I could make these methods defaults if you feel it is a better approach here.
    >
    >
    It's not strictly needed but I can go either way.

    >>> CheckOperatingSystemMXBean.java
    >>> System.out.println(String.format(...)) can simply be replaced with System.out.format.
    > I will include this change in the next webrev, thank you!
    >
    >
    thanks
    Mandy

From alexey.menkov at oracle.com Thu Dec 5 20:29:06 2019
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Thu, 5 Dec 2019 12:29:06 -0800
Subject: RFR(XS): JDK-8235433: Problem list JdwpListenTest.java and JdwpAttachTest.java on Windows
Message-ID: <85623de1-5915-522e-6db0-671b9fce4fba@oracle.com>

Hi all,

Recently JdwpListenTest.java and JdwpAttachTest.java have started to
fail on Windows2016 for a (yet) unclear reason:
https://bugs.openjdk.java.net/browse/JDK-8234935
Until the issue is resolved we need to problem list the tests.
jira: https://bugs.openjdk.java.net/browse/JDK-8235433

the fix:

--- a/test/jdk/ProblemList.txt  Thu Dec 05 16:43:06 2019 +0000
+++ b/test/jdk/ProblemList.txt  Thu Dec 05 11:59:27 2019 -0800
@@ -904,6 +904,9 @@

 com/sun/jdi/NashornPopFrameTest.java 8225620 generic-all

+com/sun/jdi/JdwpListenTest.java 8234935 windows-all
+com/sun/jdi/JdwpAttachTest.java 8234935 windows-all
+
 ############################################################################

 # jdk_time

--alex

From bob.vandette at oracle.com Thu Dec 5 20:50:49 2019
From: bob.vandette at oracle.com (Bob Vandette)
Date: Thu, 5 Dec 2019 15:50:49 -0500
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
Message-ID:

In http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java.sdiff.html

Shouldn't you keep the IOException catch clauses in case the file is not found?

> On Dec 5, 2019, at 2:31 PM, Daniil Titov wrote:
>
> Hi Mandy and Bob,
>
> Please review a new version of the webrev that addresses the most of these issues:
>
> 1) The interface and spec [3] were updated to use default methods. CSR [3] was re-approved.
>
> 2) Security-sensitive operations in j.i.p.cgroupv1.Metrics and in j.i.p.cgroupv1.SubSystem
> were wrapped with doPrivileged
>
> 3) getCpuLoad () method was optimized to fallback to getSystemCpuLoad0 if the cpuset is identical to the host's one.
> It uses sysconf(_SC_NPROCESSORS_CONF) to retrieve the number of CPUs configured on the host.
> Testing with different --cpuset-cpus settings inside a Docker container proved that it always returns the same number of hosts configured
> CPUs regardless of --cpuset-cpus settings while the same settings affect getEffectiveCpuSetCpus and getCpuSetCpus() metrics.
>
> In addition, getCpuLoad () method now returns -1 in the cases when quotas are active but cpu usage and cpu period metrics are not available and
> in the case when for some reason it fails to retrieve a valid CPU load for one of CPUs while iteration over them

Shouldn't you do the same for getCpuLoad

 149 int[] cpuSet = containerMetrics.getEffectiveCpuSetCpus();
 150 if (cpuSet != null && cpuSet.length > 0) {

If cpuSet.length == 0?

>
>>> CheckOperatingSystemMXBean.java
>>> System.out.println(String.format(...)) can simply be replaced with System.out.format.
> I had to leave it unchanged since replacing it with System.out.format results in the tests instability as it makes the trace output
> occasionally Intervene here (the trace message sometimes is printed inside this message) and tests cannot find the expected
> pattern in the output.
>
>>> It may worth considering adding Metrics::getSwapLimit and
>>> Metrics::getSwapUsage and move the computation to the implementation of
>>> Metrics. Bob may have an opinion.
>
> There was no any new input regarding this so I decided to leave it unchanged.

Sorry, I didn't respond to this. Since the calculation required for getFreeSwapSpaceSize requires retries due to the access of multiple changing values, I think it's best to leave things as they are so the caller of these methods understands the limitations of the API.

Also, the fact that swap size metrics include memory sizes is fully documented in both the cgroup and docker online documentation so it's probably best to be consistent.

>
>>> Also it seems correct for the memory related methods to check if
>>> (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).
>>> BTW what does it mean if limit == 0?
>
> Per Docker docs the minimum allowed value for memory limit (--memory option) is 4 megabytes.
> And if memory limit is unset the return value is -1. Thus, in my understanding the value 0 is only possible
> if something went wrong while retrieving this metric.

That is true but shouldn't you return -1 in that case?

I originally thought it was ok to fall back to the host data for 0 values but I think it's better to return unavailable (-1).

I think you might want to change all >= 0 to > 0 and return -1 if any of the values are 0. This would be more consistent.

You should only fall back to the original logic (host values) if container values are set to unlimited.

Bob.

>
> Testing: Mach5 tier1-tier6 tests (that include open/test/hotspot/jtreg/containers/docker and : jdk_management tests) passed.
>
> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/
> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
>
> Thank you,
> Daniil
>
> On 12/4/19, 4:09 PM, "Mandy Chung" wrote:
>
>
>
> On 12/3/19 9:40 PM, Daniil Titov wrote:
>>
>>>> Under what circumstance that limit or memLimit is < 0?
>> The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container) or if a docker container was started without
>> specifying a memory limit ( without '--memory=' Docker option) . In latter there is no limit on how much memory the container can use and
>> it can use as much memory as the host's OS allows.
>>
>
> OK. Please add a comment to the code.
>
> It may worth considering adding Metrics::getSwapLimit and
> Metrics::getSwapUsage and move the computation to the implementation of
> Metrics. Bob may have an opinion.
>
> Also it seems correct for the memory related methods to check if
> (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).
> BTW what does it mean if limit == 0?
>
>>>> Is it worth specifying this case?
>> I believe yes, since it covers the cases when JVM runs on a Linux host or a docker container was started without memory limitation. >> > > I was wondering if the javadoc should specify that. >>>> It fallbacks to return the system's total swap space size - this is not really what it should report. >> For the case when JVM runs on a Linux host it is exactly what we want. The only problematic case is if JVM runs inside a docker container without a memory limit set. >> However, I am not sure how we could differentiate these 2 cases. > > As this is the case when the limit is not set in the container, it > returns the system metrics which sounds appropriate. > >> >>>> Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad. >> For getTotalMemorySize I think we are good here. If limit is not set then all memory the host's OS have is available. >> For getFreeMemorySize the problematic case is if is the memory limit is set but the memory usage for some reason is not available (containerMetrics.getMemoryUsage() returns 0). > > Will zero memory usage happen? > >> Probably in this case we should just return -1 as currently getFreePhysicalMemorySize0() does if it cannot retrieve a valid result. >> > >> For getCpuLoad() the problematic case if CPU quotas are active but CpuPeriod, CpuNumPeriods , or getCpuUsage are unavailable or if a valid CPU load for some CPU was >> not retrieved, or if all retrieved CPU load values happen to be zeros. Probably we should just return -1 in these cases rather then falling back to getSystemCpuLoad0() >> > > returning -1 sounds right. >>>> src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java >>>> There is no strong need to make the deprecated methods as default methods. If they were default methods, they only need to be implemented once as opposed to in all OS-specific implementations. >> I could make these methods defaults if you feel it is a better approach here. 
>>
>>
>
> It's not strictly needed but I can go either way.
>
>
>>>> CheckOperatingSystemMXBean.java
>>>> System.out.println(String.format(...)) can simply be replaced with System.out.format.
>> I will include this change in the next webrev, thank you!
>>
>>
>
> thanks
> Mandy
>
>
>

From mandy.chung at oracle.com Thu Dec 5 20:59:16 2019
From: mandy.chung at oracle.com (Mandy Chung)
Date: Thu, 5 Dec 2019 12:59:16 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To:
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
Message-ID:

On 12/5/19 12:50 PM, Bob Vandette wrote:
>
>>>> It may worth considering adding Metrics::getSwapLimit and
>>>> Metrics::getSwapUsage and move the computation to the implementation of
>>>> Metrics. Bob may have an opinion.
>> There was no any new input regarding this so I decided to leave it unchanged.
> Sorry, I didn't respond to this. Since the calculation required for getFreeSwapSpaceSize requires retries
> due to the access of multiple changing values, I think it's best to leave things as they are so the caller of
> these methods understands the limitations of the API.

OK with me.

> Also, the fact that swap size metrics include memory sizes is fully documented in both the cgroup and docker
> online documentation so it's probably best to be consistent.
>
>>>> Also it seems correct for the memory related methods to check if
>>>> (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).
>>>> BTW what does it mean if limit == 0?
>> Per Docker docs the minimum allowed value for memory limit (--memory option) is 4 megabytes.
>> And if memory limit is unset the return value is -1. Thus, in my understanding the value 0 is only possible
>> if something went wrong while retrieving this metric.
> That is true but shouldn't you return -1 in that case?
>
> I originally thought it was ok to fall back to the host data for 0 values but I think it's better to return unavailable (-1)
> I think you might want to change all >= 0 to > 0 and return -1 if any of the values are 0. This would be more consistent.

+1

The javadoc should be changed and returns -1 when it's unavailable and the CSR should also be updated to reflect this. I'm sure Joe can re-approve the CSR quickly when the fix is reviewed and approved.

> You should only fall back to the original logic (host values) if container values are set to unlimited.
>

+1

Mandy

From serguei.spitsyn at oracle.com Thu Dec 5 21:24:28 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 5 Dec 2019 13:24:28 -0800
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value"
In-Reply-To: <66e249d2-ea32-db69-5e1f-1a31fb19ecd5@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com> <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com> <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com> <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com> <2385071e-6fd6-3da4-9ab3-93e950f69f77@oracle.com> <66e249d2-ea32-db69-5e1f-1a31fb19ecd5@oracle.com>
Message-ID: <5b865d09-2ec4-8378-b438-264279a9a6fd@oracle.com>

An HTML attachment was scrubbed...
URL:

From daniel.daugherty at oracle.com Thu Dec 5 21:35:42 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 5 Dec 2019 16:35:42 -0500
Subject: RFR(XS): JDK-8235433: Problem list JdwpListenTest.java and JdwpAttachTest.java on Windows
In-Reply-To: <85623de1-5915-522e-6db0-671b9fce4fba@oracle.com>
References: <85623de1-5915-522e-6db0-671b9fce4fba@oracle.com>
Message-ID: <4875b2c1-7a08-f60d-b645-c9bc766d3a78@oracle.com>

Thumbs up. This is a trivial change.

Dan

On 12/5/19 3:29 PM, Alex Menkov wrote:
> Hi all,
>
> Recently JdwpListenTest.java and JdwpAttachTest.java have started to
> fail on Windows2016 for unclear (yet) reason:
> https://bugs.openjdk.java.net/browse/JDK-8234935
> Until the issue is resolved need to problem list the tests.
>
> jira: https://bugs.openjdk.java.net/browse/JDK-8235433
>
> the fix:
>
> --- a/test/jdk/ProblemList.txt  Thu Dec 05 16:43:06 2019 +0000
> +++ b/test/jdk/ProblemList.txt  Thu Dec 05 11:59:27 2019 -0800
> @@ -904,6 +904,9 @@
>
>  com/sun/jdi/NashornPopFrameTest.java 8225620 generic-all
>
> +com/sun/jdi/JdwpListenTest.java 8234935 windows-all
> +com/sun/jdi/JdwpAttachTest.java 8234935 windows-all
> +
>
> ############################################################################
>
>
>  # jdk_time
>
> --alex

From coleen.phillimore at oracle.com Thu Dec 5 21:46:59 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 5 Dec 2019 16:46:59 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL) failed: resolving NULL _value"
In-Reply-To: <5b865d09-2ec4-8378-b438-264279a9a6fd@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com> <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com> <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com> <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com> <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com> <2385071e-6fd6-3da4-9ab3-93e950f69f77@oracle.com> <66e249d2-ea32-db69-5e1f-1a31fb19ecd5@oracle.com> <5b865d09-2ec4-8378-b438-264279a9a6fd@oracle.com>
Message-ID: <0d9c92d5-a9af-102b-9f0c-ef26d6210574@oracle.com>

Thanks Serguei!
Coleen

On 12/5/19 4:24 PM, serguei.spitsyn at oracle.com wrote:
> Got it, thanks!
> Serguei
>
>
> On 12/5/19 11:15, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 12/5/19 1:41 PM, serguei.spitsyn at oracle.com wrote:
>>> On 12/5/19 10:36, coleen.phillimore at oracle.com wrote:
>>>>
>>>>
>>>> On 12/5/19 11:00 AM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Collen,
>>>>>
>>>>> Thank you for making this update!
>>>>> It looks good to me.
>>>>>
>>>>> One nit:
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev/test/hotspot/jtreg/serviceability/jvmti/CompiledMethodLoad/libCompiledZombie.cpp.html
>>>>>
>>>>>   46 // Continuously generate CompiledMethodLoad events for all currently compiled methods
>>>>>   47 void JNICALL GenerateEventsThread(jvmtiEnv* jvmti, JNIEnv* jni, void* arg) {
>>>>>   48     jvmti->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_COMPILED_METHOD_LOAD, NULL);
>>>>>   49     int count = 0;
>>>>>   50
>>>>>   51     while (true) {
>>>>>   52         events = 0;
>>>>>   53         jvmti->GenerateEvents(JVMTI_EVENT_COMPILED_METHOD_LOAD);
>>>>>   54         if (events != 0 && ++count == 200) {
>>>>>   55             printf("Generated %d events\n", events);
>>>>>   56             count = 0;
>>>>>   57         }
>>>>>   58     }
>>>>>   59 }
>>>>>
>>>>>   The above can be simplified a little bit:
>>>>>           if (events % 200 == 199) {
>>>>>               printf("Generated %d events\n", events);
>>>>>           }
>>>>>
>>>>>   Then this line is not needed too:
>>>>>     49     int count = 0;
>>>>>
>>>>
>>>> I answered this too fast.  There are two conditions where I want
>>>> this to not print.  First is where events == 0 and the other for
>>>> every 200 events that are non-zero.
>>>>
>>>> I could use if (events != 0 && count++ % 200), but I thought what I
>>>> had makes more sense and I don't have to worry about when ++ happens.
>>>
>>> Then you could replace it with:
>>>   if (events % 200 == 0) {
>>
>> But that would still print when events == 0, which I don't want.
>> If I print them all for the little test case, it's ok, but when I run
>> this with Swingset2, it's too much output.  I only want to see a few
>> lines for this:
>>
>> ----------System.out:(3/113)----------
>> Test passes if it doesn't crash while posting compiled method events.
>> Generated 285 events
>> Generated 1002 events
>> ----------System.err:(1/15)----------
>>
>> The count is the number of times through the GenerateEvents loop,
>> which resets events to zero each time, then prints the number of
>> events for every 200 times through the GenerateEvents loop.  So I
>> need both count and events.
>>
>> Coleen
>>>
>>> But it is up to you. :)
>>>
>>> Thanks,
>>> Serguei
>>>
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 12/5/19 04:08, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>> Thanks Dan.  I moved the field.  For some reason I thought that
>>>>>> class did more/different things than hold per-thread information.
>>>>>>
>>>>>> I've retested this version with tiers 2-6.
>>>>>>
>>>>>> incr webrev at
>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03.incr/webrev
>>>>>> full webrev at
>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev
>>>>>>
>>>>>> Thanks to Serguei for offline discussion.
>>>>>>
>>>>>> Coleen
>>>>>>
>>>>>> On 12/4/19 7:40 PM, Daniel D. Daugherty wrote:
>>>>>>> Generally speaking, JVM/TI related things should be in
>>>>>>> JvmtiThreadState
>>>>>>> instead of directly in the Thread class. That way the extra
>>>>>>> space is only
>>>>>>> consumed when JVM/TI is in use and only when a Thread does
>>>>>>> something that
>>>>>>> requires a JvmtiThreadState to be created.
>>>>>>>
>>>>>>> Please reconsider moving _jvmti_event_queue.
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>> On 12/4/19 6:06 PM, coleen.phillimore at oracle.com wrote:
>>>>>>>>
>>>>>>>> Hi Serguei,
>>>>>>>>
>>>>>>>> On 12/4/19 5:15 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Collen, (no problem)
>>>>>>>>>
>>>>>>>>> It looks good in general.
>>>>>>>>> Thank you a lot for sorting this out!
>>>>>>>>>
>>>>>>>>> Just a couple of comments.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/runtime/thread.hpp.frames.html
>>>>>>>>> 1993 protected:
>>>>>>>>> 1994 // Jvmti Events that cannot be posted in their current context.
>>>>>>>>> 1995 // ServiceThread uses this to collect deferred events from NonJava threads
>>>>>>>>> 1996 // that cannot post events.
>>>>>>>>> 1997 JvmtiDeferredEventQueue* _jvmti_event_queue;
>>>>>>>>>
>>>>>>>>> As David I also have a concern about footprint of having the
>>>>>>>>> _jvmti_event_queue field in the Thread class.
>>>>>>>>> I'm thinking if it'd be better to move this field into the
>>>>>>>>> JvmtiThreadState class.
>>>>>>>>> Please, see jvmti_thread_state() and
>>>>>>>>> JvmtiThreadState::state_for(JavaThread *thread).
>>>>>>>>
>>>>>>>> The reason I have it directly in JavaThread is so that the GC
>>>>>>>> oops_do and nmethods_do code can find it easily.  I like your
>>>>>>>> idea of hiding it in jvmti but this doesn't seem good to have
>>>>>>>> this code know about jvmtiThreadState, which seems to be a
>>>>>>>> queue of Jvmti states.  I also don't want to have
>>>>>>>> jvmtiThreadState to have to add an oops_do() or nmethods_do()
>>>>>>>> either.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html
>>>>>>>>> 973 void JvmtiDeferredEvent::post(JvmtiEnv* env) {
>>>>>>>>> 974   assert(_type == TYPE_COMPILED_METHOD_LOAD, "only user of this method");
>>>>>>>>> 975   nmethod* nm = _event_data.compiled_method_load;
>>>>>>>>> 976   JvmtiExport::post_compiled_method_load(env, nm);
>>>>>>>>> 977 }
>>>>>>>>>
>>>>>>>>> The JvmtiDeferredEvent::post name looks too generic as it
>>>>>>>>> posts compiled load events only.
>>>>>>>>> Do you consider this function extended in the future to
>>>>>>>>> support more event types?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I don't envision an extension for this function but I do for
>>>>>>>> JvmtiDeferredEventQueue::post().  I have a small enhancement
>>>>>>>> that would handoff the entire queue to the ServiceThread and
>>>>>>>> have it call post() to post all the events rather than one at a
>>>>>>>> time.
>>>>>>>>
>>>>>>>> So I'll rename this one post_compiled_method_load_event() and
>>>>>>>> leave the other post() as is for now.
>>>>>>>>
>>>>>>>> open webrev at
>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.02.incr/webrev
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 11/26/19 06:22, coleen.phillimore at oracle.com wrote:
>>>>>>>>>> Summary: Add local deferred event list to thread to post
>>>>>>>>>> events outside CodeCache_lock.
>>>>>>>>>>
>>>>>>>>>> This patch builds on the patch for JDK-8173361. With this
>>>>>>>>>> patch, I made the JvmtiDeferredEventQueue an instance class
>>>>>>>>>> (not AllStatic) and have one per thread.  The CodeBlob event
>>>>>>>>>> that used to drop the CodeCache_lock and raced with the
>>>>>>>>>> sweeper thread, adds the events it wants to post to its
>>>>>>>>>> thread local list, and processes it outside the lock.
The >>>>>>>>>> list is walked in GC and by the sweeper to keep the nmethods >>>>>>>>>> from being unloaded and zombied, respectively. >>>>>>>>>> >>>>>>>>>> Also, the jmethod_id field in nmethod was only used as a >>>>>>>>>> boolean so don't create a jmethod_id until needed for >>>>>>>>>> post_compiled_method_unload. >>>>>>>>>> >>>>>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that >>>>>>>>>> crashed in the original bug report. >>>>>>>>>> >>>>>>>>>> open webrev at >>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev >>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160 >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Coleen >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.holmes at oracle.com Thu Dec 5 22:52:31 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 6 Dec 2019 08:52:31 +1000 Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for Record attribute In-Reply-To: References: Message-ID: <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com> Looks good Harold! If we get any more of these unmodifiable attributes we may have to look at a way to refer to them more abstractly and only define them in one place. Thanks, David On 6/12/2019 12:28 am, Harold Seigel wrote: > Hi, > > Please review this trivial change to add documentation about the Record > attribute to the JDWP, JDI, and Instrumentation specs. > > The changed .html pages (best viewed as 'raw') are included in the > webrev but will not be pushed. > > Open Webrev: > http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360 > > The fix was regression tested by running Mach5 tiers 1 and 2 tests and > builds on Linux-x64, Solaris, Windows, and Mac OS X. 
> > Thanks, Harold > From serguei.spitsyn at oracle.com Fri Dec 6 00:25:40 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 5 Dec 2019 16:25:40 -0800 Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for Record attribute In-Reply-To: <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com> References: <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com> Message-ID: Hi David, Agreed. I was thinking about the same. Thanks, Serguei On 12/5/19 2:52 PM, David Holmes wrote: > Looks good Harold! > > If we get any more of these unmodifiable attributes we may have to > look at a way to refer to them more abstractly and only define them in > one place. > > Thanks, > David > > On 6/12/2019 12:28 am, Harold Seigel wrote: >> Hi, >> >> Please review this trivial change to add documentation about the >> Record attribute to the JDWP, JDI, and Instrumentation specs. >> >> The changed .html pages (best viewed as 'raw') are included in the >> webrev but will not be pushed. >> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360 >> >> The fix was regression tested by running Mach5 tiers 1 and 2 tests >> and builds on Linux-x64, Solaris, Windows, and Mac OS X. >> >> Thanks, Harold >> From daniil.x.titov at oracle.com Fri Dec 6 01:03:21 2019 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Thu, 05 Dec 2019 17:03:21 -0800 Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> Message-ID: <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> Hi Mandy and Bob, Thank you for your comments. Please review a new version of the fix [1] that makes OperatingSystemImpl methods return -1 if one of the metric has value 0. 
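
The rule agreed on in this thread (a negative container value means "unlimited", so the host value applies; zero means the metric could not be read, so -1 is reported; any positive value is used as-is) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class `ContainerFallbackSketch`, its method names and the clamping of free memory at zero are hypothetical and are not taken from OperatingSystemImpl or from the webrev.

```java
// Hypothetical sketch of the fallback rule discussed above; not the webrev code.
final class ContainerFallbackSketch {
    // Value reported when a metric cannot be determined.
    static final long UNAVAILABLE = -1L;

    /**
     * @param containerLimit container memory limit: negative means no limit is set,
     *                       0 is assumed to mean the value could not be retrieved
     * @param hostTotal      total memory as reported by the host OS
     */
    static long totalMemorySize(long containerLimit, long hostTotal) {
        if (containerLimit < 0) {
            return hostTotal;       // unlimited container: fall back to the host value
        }
        if (containerLimit == 0) {
            return UNAVAILABLE;     // metric not retrieved: do not fall back to the host
        }
        return containerLimit;
    }

    /**
     * Same rule for free memory; a usage of 0 (or less) is treated as "not retrieved".
     * Clamping the result at 0 is an assumption made for this sketch.
     */
    static long freeMemorySize(long containerLimit, long containerUsage, long hostFree) {
        if (containerLimit < 0) {
            return hostFree;
        }
        if (containerLimit == 0 || containerUsage <= 0) {
            return UNAVAILABLE;
        }
        return Math.max(containerLimit - containerUsage, 0L);
    }

    public static void main(String[] args) {
        System.out.println(totalMemorySize(-1, 1000));     // no limit set
        System.out.println(totalMemorySize(0, 1000));      // limit not retrieved
        System.out.println(freeMemorySize(512, 200, 700)); // limit and usage both known
    }
}
```

The point of keeping the three cases separate is that "unlimited" and "unreadable" are no longer conflated, which is what changing `>= 0` to `> 0` is meant to achieve.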
As Mandy recommended I also updated the Javadoc for OperatingSystemMXBean indicating that methods could return -1 if the information is not available. There were no changes in CSR [3] yet, I plan to proceed with them after the fix is reviewed.

> In http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java.sdiff.html
> Shouldn't you keep the IOException catch clauses in case the file is not found?

There is no need to keep the IOException catch in these 2 places where it used to be (getLongValueMatchingLine and getLongEntry methods). As I understand, the IOException catch was required only because Files.lines() and Files.readAllLines() can throw IOException. Now these calls are performed inside AccessController.doPrivileged(PrivilegedExceptionAction) that wraps all checked exceptions in PrivilegedActionException that we are catching now instead of IOException.

Here is the sample of the stacktrace:

java.security.PrivilegedActionException: java.io.FileNotFoundException
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:558)
    at java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValueMatchingLine(SubSystem.java:113)
    at java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:390)
    at jdk.management/com.sun.management.internal.OperatingSystemImpl.getTotalMemorySize(OperatingSystemImpl.java:109)
    at CheckOperatingSystemMXBean.main(CheckOperatingSystemMXBean.java:36)
Caused by: java.io.FileNotFoundException
    at java.base/jdk.internal.platform.cgroupv1.SubSystem.lambda$getLongValueMatchingLine$1(SubSystem.java:116)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:554)

In getStringValue method the whole code block is now executed inside AccessController.doPrivileged() so we still need either catch IOException inside this code block or convert this block to PrivilegedExceptionAction and then put AccessController.doPrivileged call inside
a new try/catch block to catch PrivilegedActionException. The former approach looked preferable.

Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.

[1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.04/
[2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
[3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428

Thanks,
Daniil

On 12/5/19, 12:59 PM, "Mandy Chung" wrote:

On 12/5/19 12:50 PM, Bob Vandette wrote:
>
>>>> It may worth considering adding Metrics::getSwapLimit and
>>>> Metrics::getSwapUsage and move the computation to the implementation of
>>>> Metrics. Bob may have an opinion.
>> There was no any new input regarding this so I decided to leave it unchanged.
> Sorry, I didn't respond to this. Since the calculation required for getFreeSwapSpaceSize requires retries
> due to the access of multiple changing values, I think it's best to leave things as they are so the caller of
> these methods understands the limitations of the API.

OK with me.

> Also, the fact that swap size metrics include memory sizes is fully documented in both the cgroup and docker
> online documentation so it's probably best to be consistent.
>
>>>> Also it seems correct for the memory related methods to check if
>>>> (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).
>>>> BTW what does it mean if limit == 0?
>> Per Docker docs the minimum allowed value for memory limit (--memory option) is 4 megabytes.
>> And if memory limit is unset the return value is -1. Thus, in my understanding the value 0 is only possible
>> if something went wrong while retrieving this metric.
> That is true but shouldn't you return -1 in that case?
>
> I originally thought it was ok to fall back to the host data for 0 values but I think it's better to return unavailable (-1)
> I think you might want to change all >= 0 to > 0 and return -1 if any of the values are 0. This would be more consistent.

+1

The javadoc should be changed and returns -1 when it's unavailable and the CSR should also be updated to reflect this. I'm sure Joe can re-approve the CSR quickly when the fix is reviewed and approved.

> You should only fall back to the original logic (host values) if container values are set to unlimited.
>

+1

Mandy

From serguei.spitsyn at oracle.com Fri Dec 6 01:31:42 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 5 Dec 2019 17:31:42 -0800
Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values"
In-Reply-To: <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com>
References: <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com>
Message-ID:

Hi Chris and Alex,

(I've also included Dan, David and Dean to the mailing list)

We have to reach a consensus about this.
We have 3 options:

Option #1:
  The JIT optimization to delete a code which "looks useless"
  has to be disabled if can_pop_frame capability is enabled.
  Then this problem becomes a JIT compiler bug.

Option #2:
  Consider to relax the JVMTI PopFrame spec by changing it to something like:
  "Note however, that the original argument values are not
   preserved and can be changed by the called method;"
  Then this problem becomes a JVM TI spec bug.

Option #3:
  Consider it is Okay for compiler to eliminate useless code,
  so the argument values can be reinitialized by the PopFrame.
  Then this problem becomes just a test bug.

My preference is option #3.
The point is that if the arguments are not really used in
a method then restoring them to any values is a no-op.
It is really meaningless use case, so why should we care about it.

Thanks,
Serguei

On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote:
> Hi Alex,
>
> The fix itself looks Okay.
> Minor: replace in the comment: "compiler don't drop" => "compiler
> doesn't drop".
>
> However, we still have to reach a consensus on how we treat this issue
> (as Chris already commented).
>
> Thanks,
> Serguei
>
>
> On 11/8/19 15:22, Alex Menkov wrote:
>> Hi all,
>>
>> Please review the fix for
>> https://bugs.openjdk.java.net/browse/JDK-8215196
>> webrev:
>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/
>>
>> Currently PopFrame is disabled with JVMCI by [1], so for testing I
>> reverted [1] changes.
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025
>>
>> --alex
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From david.holmes at oracle.com Fri Dec 6 02:45:06 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 6 Dec 2019 12:45:06 +1000
Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values"
In-Reply-To:
References: <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com>
Message-ID: <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com>

Hi Serguei,

On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote:
> Hi Chris and Alex,
>
> (I've also included Dan, David and Dean to the mailing list)
>
> We have to reach a consensus about this.

This is just part of a much broader issue with JVM TI that I tried to have a discussion started based on Richard Reingruber's proposals around Escape Analysis:

http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html

Unfortunately that discussion did not get much traction.

> We have 3 options:
>
> Option #1:
>   The JIT optimization to delete a code which "looks useless"
>   has to be disabled if can_pop_frame capability is enabled.
>   Then this problem becomes a JIT compiler bug.
>
> Option #2:
>   Consider to relax the JVMTI PopFrame spec by changing it to something
> like:
>   "Note however, that the original argument values are not
preserved and can be changed by the called method;"
>   Then this problem becomes a JVM TI spec bug.
>
> Option #3:
>   Consider it is Okay for compiler to eliminate useless code,
>   so the argument values can be reinitialized by the PopFrame.
>   Then this problem becomes just a test bug.
>
>
> My preference is option #3.
> The point is that if the arguments are not really used in
> a method then restoring them to any values is a no-op.
> It is really a meaningless use case, so why should we care about it.

Thanks for setting that out clearly.

I'd like to agree that this particular case is a test bug. If we have a method:

  int incr(int val) {
    val++;
    popFrameHere();
    return val;
  }

then the change to the argument is necessary and must be preserved. In contrast:

  void incr(int val) {
    val++;
    popFrameHere();
  }

the change to the argument is meaningless and I would hope any decent JIT would simply elide it.

But we must have a consistent approach to such things. What would happen if a breakpoint were to be placed on the instruction that uselessly modified the argument - would we still see the modification or would it be elided?

And how do C1 and C2 avoid this issue? Do they simply not optimise away the useless assignment? Or do they actively disable that optimization in this context?

We need, IMO, to establish the basic philosophy of how to manage JVM TI / JIT interactions, so we know what things must remain visible and which can be optimised away.

That said, changing the test allows us to defer having to reach that consensus.

David
-----

> Thanks,
> Serguei
>
>
> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote:
>> Hi Alex,
>>
>> The fix itself looks Okay.
>> Minor: replace in the comment: "compiler don't drop" => "compiler
>> doesn't drop".
>>
>> However, we still have to reach a consensus on how we treat this issue
>> (as Chris already commented).
>> >> Thanks, >> Serguei >> >> >> On 11/8/19 15:22, Alex Menkov wrote: >>> Hi all, >>> >>> Please review the fix for >>> https://bugs.openjdk.java.net/browse/JDK-8215196 >>> webrev: >>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/ >>> >>> Currently PopFrame is disabled with JVMCI by [1], so for testing I >>> reverted [1] changes. >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025 >>> >>> --alex >> > From serguei.spitsyn at oracle.com Fri Dec 6 03:00:32 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 5 Dec 2019 19:00:32 -0800 Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values" In-Reply-To: <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com> References: <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com> <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com> Message-ID: <9efb702f-a46c-5c26-475d-34e7165fc32a@oracle.com> Hi David, Thank you for writing this down. Totally agree with you here. On 12/5/19 6:45 PM, David Holmes wrote: > Hi Serguei, > > On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote: >> Hi Chris and Alex, >> >> (I've also included Dan, David and Dean to the mailing list) >> >> We have to reach a consensus about this. > > This is just part of a much broader issue with JVM TI that I tried to > have a discussion started based on Richard Reingruber's proposals > around Escape Analysis: > > http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html > > > Unfortunately that discussion did not get much traction. I've mentioned the general discussion you started about JIT compiler optimizations in one of my previous replies to this review threads. Sorry, I was busy with other things and was not able to participate in it properly. But I'm looking forward to continue this when there is a chance. 
>> We have 3 options: >> >> Option #1: >> ?? The JIT optimization to delete a code which "looks useless" >> ?? has to be disabled if can_pop_frame capability is enabled. >> ?? Than this problem becomes a JIT compiler bug. >> >> Option #2: >> ?? Consider to relax the JVMTI PopFrame spec by changing it to >> something like: >> ?? "Note however, that the original argument values are not >> ??? preserved and can be changed by the called method;" >> ?? Than this problem becomes a JVM TI spec bug. >> >> Option #3: >> ?? Consider it is Okay for compiler to eliminate useless code, >> ?? so the argument values can be reinitialized by the PopFrame. >> ?? Than this problem becomes just a test bug. >> >> >> My preference is option #3. >> The point is that if the arguments are not really used in >> a method then restoring them to any values is a no-op. >> It is really meaningless use case, so why should we care about it. > > Thanks for setting that out clearly. > > I'd like to agree this is particular case is a test bug. If we have a > method: > > int incr(int val) { > ? val++; > ? popFrameHere(); > ? return val; > } > > then the change to the argument is necessary and must be preserved. In > contrast: > > void incr(int val) { > ? val++; > ? popFrameHere(); > } > > the change to the argument is meaningless and I would hope any decent > JIT would simply elide it. > > But we must have a consistent approach to such things. What would > happen if a breakpoint were to be placed on the instruction that > uselessly modified the argument - would we still see the modification > or would it be elided? > > And how do C1 and C2 avoid this issue? Do they simply not optimise > away the useless assignment? Or do they actively disable that > optimization in this context? > > We need, IMO, to establish the basic philosophy of how to manage JVM > TI / JIT interactions, so we know what things must remain visible and > which can be optimised away. It is painful that we have not established it yet. 
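The live versus dead argument update discussed above can be written as a small runnable Java sketch. The class and method names are hypothetical, and the comments only mark where a PopFrame request would arrive; nothing here calls JVM TI or says how C1, C2 or Graal actually treat these methods.

```java
public class PopFrameArgsSketch {

    // The modified argument is returned, so the increment is observable
    // and has to be preserved by any compiler.
    static int incrLive(int val) {
        val++;
        // a PopFrame request would arrive here (hypothetical marker only)
        return val;
    }

    // The modified argument is never read again. The increment is a dead
    // store, which a JIT compiler is free to drop in ordinary execution.
    static void incrDead(int val) {
        val++;
        // a PopFrame request would arrive here (hypothetical marker only)
    }

    public static void main(String[] args) {
        int x = 41;
        System.out.println("incrLive returned " + incrLive(x));
        incrDead(x);
        // Java passes int by value, so the caller's variable is unchanged.
        System.out.println("caller still sees " + x);
    }
}
```

Whether the argument slot of incrDead must still show the incremented value after a PopFrame is exactly the open question in this thread; the sketch only shows why the two cases differ for ordinary callers.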
> > That said, changing the test allows us to defer having to reach that > consensus. Right. Thanks, Serguei > David > ----- > >> Thanks, >> Serguei >> >> >> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote: >>> Hi Alex, >>> >>> The fix itself looks Okay. >>> Minor: replace in the comment: "compiler don't drop" => "compiler >>> doesn't drop". >>> >>> However, we still have to reach a consensus on how we treat this >>> issue (as Chris already commented). >>> >>> Thanks, >>> Serguei >>> >>> >>> On 11/8/19 15:22, Alex Menkov wrote: >>>> Hi all, >>>> >>>> Please review the fix for >>>> https://bugs.openjdk.java.net/browse/JDK-8215196 >>>> webrev: >>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/ >>>> >>>> Currently PopFrame is disabled with JVMCI by [1], so for testing I >>>> reverted [1] changes. >>>> >>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025 >>>> >>>> --alex >>> >> From david.holmes at oracle.com Fri Dec 6 07:49:34 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 6 Dec 2019 17:49:34 +1000 Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> Message-ID: Hi Daniil, I'm not familiar with all the details of the various API's involved here so just a few general comments in places. I do have one major issue flagged below. --- src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c ! static int initialized=1; Am I reading this right that the code currently fails to actually do the initialization because of this ??? 
Style nit: if(perfInit()

space after "if"

---

src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java

The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (presumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. Surely there must always be some information available from the operating environment? I see from the impl file:

// the host data, value 0 indicates that something went wrong while the metric was read and
// in this case we return "information unavailable" code -1.

I don't agree with this. If the container metrics are messed up somehow we should either fall back to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.

---

src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java
src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java

Can you please rename the legacy methods so that, for example, getTotalMemorySize() calls getTotalMemorySize0() rather than getTotalPhysicalMemorySize0(). That way we relegate the legacy names to the interface only.

---

test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java

System.out.println(String.format(...)

Why not simply

System.out.printf(..)

?

---

Thanks,
David
-----

On 6/12/2019 11:03 am, Daniil Titov wrote:
> Hi Mandy and Bob,
>
> Thank you for your comments. Please review a new version of the fix [1] that makes
> OperatingSystemImpl methods return -1 if one of the metrics has value 0.
>
> As Mandy recommended I also updated the Javadoc for OperatingSystemMXBean
> indicating that methods could return -1 if the information is not available.
> There were no changes in CSR [3] yet, I plan to proceed with them after the fix is > reviewed. > >> In http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java.sdiff.html >> Shouldn?t you keep the IOException catch clauses in case the file is not found? > > There is no need in keeping IOException catch in these 2 places where it used to be (getLongValueMatchingLine and getLongEntry methods). > As I understand IOException catch was required only because File.lines() and File. readAllLines() can throw IOException. > Now these calls are performed inside AccessController.doPrivileged(PrivilegedExceptionAction) that wraps > all checked exceptions in PrivilegedActionException that we are catching now instead of IOException. > > Here is the sampe of the stacktrace: > java.security.PrivilegedActionException: java.io.FileNotFoundException > at java.base/java.security.AccessController.doPrivileged(AccessController.java:558) > at java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValueMatchingLine(SubSystem.java:113) > at java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:390) > at jdk.management/com.sun.management.internal.OperatingSystemImpl.getTotalMemorySize(OperatingSystemImpl.java:109) > at CheckOperatingSystemMXBean.main(CheckOperatingSystemMXBean.java:36) > Caused by: java.io.FileNotFoundException > at java.base/jdk.internal.platform.cgroupv1.SubSystem.lambda$getLongValueMatchingLine$1(SubSystem.java:116) > at java.base/java.security.AccessController.doPrivileged(AccessController.java:554) > > > In getStringValue method the whole code block is now executed inside AccessController.doPrivileged() so we still need either catch > IOException inside this code block or convert this block to PrivilegedExceptionAction and then put AccessController.doPrivileged > call inside new try/catch Block to catch PrivilegedExceptionAction. The former approach looked more preferable. 
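The exception wrapping described above can be reproduced with a minimal Java sketch. It assumes only the standard java.security.AccessController API; the class name, the readLimit method and its boolean switch are hypothetical and do not come from SubSystem.java. AccessController is deprecated for removal in recent JDKs, hence the suppressed warning.

```java
import java.io.FileNotFoundException;
import java.security.AccessController;
import java.security.PrivilegedActionException;
import java.security.PrivilegedExceptionAction;

public class PrivilegedReadSketch {

    // Hypothetical reader: returns 42 on success, or -1 when the action
    // fails with a checked exception.
    @SuppressWarnings("removal")
    static long readLimit(boolean fileMissing) {
        try {
            return AccessController.doPrivileged((PrivilegedExceptionAction<Long>) () -> {
                if (fileMissing) {
                    // Stands in for a missing cgroup file.
                    throw new FileNotFoundException("hypothetical cgroup file");
                }
                return 42L;
            });
        } catch (PrivilegedActionException e) {
            // The checked exception arrives wrapped; the original
            // FileNotFoundException is available as e.getCause().
            return -1L;
        }
    }

    public static void main(String[] args) {
        System.out.println(readLimit(false));
        System.out.println(readLimit(true));
    }
}
```

The caller catches PrivilegedActionException rather than IOException, which matches the stack trace quoted above where FileNotFoundException appears as the cause.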
> > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.04/ > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 > > Thanks, > Daniil > > ?On 12/5/19, 12:59 PM, "Mandy Chung" wrote: > > > > On 12/5/19 12:50 PM, Bob Vandette wrote: > > > >>>> It may worth considering adding Metrics::getSwapLimit and > >>>> Metrics::getSwapUsage and move the computation to the implementation of > >>>> Metrics. Bob may have an opinion. > >> There was no any new input regarding this so I decided to leave it unchanged. > > Sorry, I didn?t respond to this. Since the calculation required for getFreeSwapSpaceSize requires retries > > due to the access of multiple changing values, I think it?s best to leave things as they are so the caller of > > these methods understands the limitations of the API. > > OK with me. > > Also, the fact that swap size metrics include memory sizes is fully documented in both the cgroup and docker > > online documentation so it?s probably best to be consistent. > > > >>>> Also it seems correct for the memory related methods to check if > >>>> (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0). > >>>> BTW what does it mean if limit == 0? > >> Per Docker docs the minimum allowed value for memory limit (--memory option) is 4 megabytes. > >> And if memory limit is unset the return value is -1. Thus, in my understanding the value 0 is only possible > >> if something went wrong while retrieving this metric. > > That is true but shouldn?t you return -1 in that case? > > > > I originally thought it was ok to fall back to the host data for 0 values but I think its better to return unavailable (-1) > > I think you might want to change all >= 0 to > 0 and return -1 if any of the values are 0. This would be more consistent. 
> > +1 > > The javadoc should be changed and returns -1 when it's unavailable and > the CSR should also be updated to reflect this. I'm sure Joe can > re-approve the CSR quickly when the fix is reviewed and approved. > > > You should only fall back to the original logic (host values) if container values are set to unlimited. > > > +1 > > Mandy > > > From martin.doerr at sap.com Fri Dec 6 10:51:11 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 6 Dec 2019 10:51:11 +0000 Subject: RFR(T) : 8235353 : clean up hotspot problem lists In-Reply-To: References: <9405A3B3-8045-4863-9BD4-FD293E2C62F9@oracle.com> Message-ID: Hi Igor and Vladimir, the tests have passed on PPC64. Change is good. Thanks for checking with us. Best regards, Martin > -----Original Message----- > From: Langer, Christoph > Sent: Donnerstag, 5. Dezember 2019 12:24 > To: Igor Ignatyev ; Doerr, Martin > ; Lindenmaier, Goetz > > Cc: serviceability-dev ; Vladimir > Kozlov ; hotspot-dev Source Developers > > Subject: RE: RFR(T) : 8235353 : clean up hotspot problem lists > > Hi Igor, > > I have added your update to our test system. I'll let you know the results by > tomorrow. > > Best regards > Christoph > > > -----Original Message----- > > From: serviceability-dev > On > > Behalf Of Igor Ignatyev > > Sent: Donnerstag, 5. Dezember 2019 03:08 > > To: Doerr, Martin ; Lindenmaier, Goetz > > > > Cc: serviceability-dev ; Vladimir > > Kozlov ; hotspot-dev Source Developers > > > > Subject: Re: RFR(T) : 8235353 : clean up hotspot problem lists > > > > Martin, Goetz. > > > > could you please check that these 9 tests still pass on PPC? > > > > -- Igor > > > > > On Dec 4, 2019, at 12:01 PM, Vladimir Kozlov > > > wrote: > > > > > > I am fine with changes but we need to ask PPC64 supporter to verify that > > tests passed now. 
> > > > > > Thanks, > > > Vladimir K > > > > > > On 12/4/19 11:52 AM, Igor Ignatyev wrote: > > >> http://cr.openjdk.java.net/~iignatyev//8235353/webrev.00 > > >>> 9 lines changed: 0 ins; 0 del; 9 mod; > > >> Hi all, > > >> could you please review this small and trivial cleanup which returns > > serviceablility/sa tests back to execution on linux-ppc64. the tests were > > problem listed due to 8211767[1], which is closed as a dup of resolved > > 8228649[2]. > > >> [1] https://bugs.openjdk.java.net/browse/JDK-8211767 > > >> [2] https://bugs.openjdk.java.net/browse/JDK-8228649 > > >> Thanks, > > >> -- Igor From harold.seigel at oracle.com Fri Dec 6 13:14:47 2019 From: harold.seigel at oracle.com (Harold Seigel) Date: Fri, 6 Dec 2019 08:14:47 -0500 Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for Record attribute In-Reply-To: <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com> References: <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com> Message-ID: <3f304c0e-00fd-988d-521a-1b17104bb6f4@oracle.com> Thanks David! Harold On 12/5/2019 5:52 PM, David Holmes wrote: > Looks good Harold! > > If we get any more of these unmodifiable attributes we may have to > look at a way to refer to them more abstractly and only define them in > one place. > > Thanks, > David > > On 6/12/2019 12:28 am, Harold Seigel wrote: >> Hi, >> >> Please review this trivial change to add documentation about the >> Record attribute to the JDWP, JDI, and Instrumentation specs. >> >> The changed .html pages (best viewed as 'raw') are included in the >> webrev but will not be pushed. >> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360 >> >> The fix was regression tested by running Mach5 tiers 1 and 2 tests >> and builds on Linux-x64, Solaris, Windows, and Mac OS X. 
>> >> Thanks, Harold >> From harold.seigel at oracle.com Fri Dec 6 13:16:18 2019 From: harold.seigel at oracle.com (Harold Seigel) Date: Fri, 6 Dec 2019 08:16:18 -0500 Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for Record attribute In-Reply-To: References: <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com> Message-ID: <30ebf715-d189-3ff6-a024-3529b2caf626@oracle.com> There will be another unmodifiable attribute with sealed types called PermittedSubtypes. Harold On 12/5/2019 7:25 PM, serguei.spitsyn at oracle.com wrote: > Hi David, > > Agreed. I was thinking about the same. > > Thanks, > Serguei > > On 12/5/19 2:52 PM, David Holmes wrote: >> Looks good Harold! >> >> If we get any more of these unmodifiable attributes we may have to >> look at a way to refer to them more abstractly and only define them >> in one place. >> >> Thanks, >> David >> >> On 6/12/2019 12:28 am, Harold Seigel wrote: >>> Hi, >>> >>> Please review this trivial change to add documentation about the >>> Record attribute to the JDWP, JDI, and Instrumentation specs. >>> >>> The changed .html pages (best viewed as 'raw') are included in the >>> webrev but will not be pushed. >>> >>> Open Webrev: >>> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html >>> >>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360 >>> >>> The fix was regression tested by running Mach5 tiers 1 and 2 tests >>> and builds on Linux-x64, Solaris, Windows, and Mac OS X. 
>>> >>> Thanks, Harold >>> > From bob.vandette at oracle.com Fri Dec 6 13:59:10 2019 From: bob.vandette at oracle.com (Bob Vandette) Date: Fri, 6 Dec 2019 08:59:10 -0500 Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> Message-ID: > On Dec 6, 2019, at 2:49 AM, David Holmes wrote: > > Hi Daniil, > > I'm not familiar with all the details of the various API's involved here so just a few general comments in places. I do have one major issue flagged below. > > --- > > src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c > > ! static int initialized=1; > > Am I reading this right that the code currently fails to actually do the initialization because of this ??? > > Style nit: if(perfInit() > > space after "if" > > --- > > src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java > > The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. Surely there must always be some information available from the operating environment? I see from the impl file: > > // the host data, value 0 indicates that something went wrong while the metric was read and > // in this case we return "information unavailable" code -1. > > I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO. 
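The two behaviours under discussion for a metric value of 0 can be compared in a short Java sketch. It assumes, as stated earlier in the thread, that a negative limit means "unlimited" and that 0 means the metric could not be read; the class and method names are hypothetical and are not the OperatingSystemImpl code under review.

```java
public class ContainerFallbackSketch {

    // Variant from webrev.04 as described in the thread: a value of 0 is
    // reported as "information unavailable" (-1), and only an unlimited
    // (negative) container value falls back to the host.
    static long totalMemoryReturningUnavailable(long containerLimit, long hostTotal) {
        if (containerLimit > 0) {
            return containerLimit;
        }
        if (containerLimit == 0) {
            return -1;
        }
        return hostTotal;
    }

    // Variant preferred on compatibility grounds: anything that is not a
    // positive container limit falls back to the host value, so callers
    // never see -1.
    static long totalMemoryWithHostFallback(long containerLimit, long hostTotal) {
        return containerLimit > 0 ? containerLimit : hostTotal;
    }

    public static void main(String[] args) {
        long host = 16L * 1024 * 1024 * 1024;
        System.out.println(totalMemoryReturningUnavailable(0, host));
        System.out.println(totalMemoryWithHostFallback(0, host));
    }
}
```

The second variant keeps the old guarantee that these methods return a usable value, at the cost of hiding a misread container metric behind the host figure.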
I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods.
Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
limits.

Bob.

>
> ---
>
> src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java
> src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java
>
> Can you please rename the legacy methods so that, for example, getTotalMemorySize() calls getTotalMemorySize0() rather than getTotalPhysicalMemorySize0(). That way we relegate the legacy names to the interface only.
>
> ---
>
> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>
> System.out.println(String.format(...)
>
> Why not simply
>
> System.out.printf(..)
>
> ?
>
> ---
>
> Thanks,
> David
> -----
>
> On 6/12/2019 11:03 am, Daniil Titov wrote:
>> Hi Mandy and Bob,
>> Thank you for your comments. Please review a new version of the fix [1] that makes
>> OperatingSystemImpl methods return -1 if one of the metrics has value 0.
>> As Mandy recommended I also updated the Javadoc for OperatingSystemMXBean
>> indicating that methods could return -1 if the information is not available.
>> There were no changes in CSR [3] yet, I plan to proceed with them after the fix is
>> reviewed.
>>> In http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java.sdiff.html
>>> Shouldn't you keep the IOException catch clauses in case the file is not found?
>> There is no need to keep the IOException catch in the 2 places where it used to be (getLongValueMatchingLine and getLongEntry methods).
>> As I understand it, the IOException catch was required only because Files.lines() and Files.readAllLines() can throw IOException.
>> Now these calls are performed inside AccessController.doPrivileged(PrivilegedExceptionAction) that wraps >> all checked exceptions in PrivilegedActionException that we are catching now instead of IOException. >> Here is the sampe of the stacktrace: >> java.security.PrivilegedActionException: java.io.FileNotFoundException >> at java.base/java.security.AccessController.doPrivileged(AccessController.java:558) >> at java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValueMatchingLine(SubSystem.java:113) >> at java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:390) >> at jdk.management/com.sun.management.internal.OperatingSystemImpl.getTotalMemorySize(OperatingSystemImpl.java:109) >> at CheckOperatingSystemMXBean.main(CheckOperatingSystemMXBean.java:36) >> Caused by: java.io.FileNotFoundException >> at java.base/jdk.internal.platform.cgroupv1.SubSystem.lambda$getLongValueMatchingLine$1(SubSystem.java:116) >> at java.base/java.security.AccessController.doPrivileged(AccessController.java:554) >> In getStringValue method the whole code block is now executed inside AccessController.doPrivileged() so we still need either catch >> IOException inside this code block or convert this block to PrivilegedExceptionAction and then put AccessController.doPrivileged >> call inside new try/catch Block to catch PrivilegedExceptionAction. The former approach looked more preferable. >> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. 
>> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.04/ >> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 >> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 >> Thanks, >> Daniil >> ?On 12/5/19, 12:59 PM, "Mandy Chung" wrote: >> On 12/5/19 12:50 PM, Bob Vandette wrote: >> > >> >>>> It may worth considering adding Metrics::getSwapLimit and >> >>>> Metrics::getSwapUsage and move the computation to the implementation of >> >>>> Metrics. Bob may have an opinion. >> >> There was no any new input regarding this so I decided to leave it unchanged. >> > Sorry, I didn?t respond to this. Since the calculation required for getFreeSwapSpaceSize requires retries >> > due to the access of multiple changing values, I think it?s best to leave things as they are so the caller of >> > these methods understands the limitations of the API. >> OK with me. >> > Also, the fact that swap size metrics include memory sizes is fully documented in both the cgroup and docker >> > online documentation so it?s probably best to be consistent. >> > >> >>>> Also it seems correct for the memory related methods to check if >> >>>> (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0). >> >>>> BTW what does it mean if limit == 0? >> >> Per Docker docs the minimum allowed value for memory limit (--memory option) is 4 megabytes. >> >> And if memory limit is unset the return value is -1. Thus, in my understanding the value 0 is only possible >> >> if something went wrong while retrieving this metric. >> > That is true but shouldn?t you return -1 in that case? >> > >> > I originally thought it was ok to fall back to the host data for 0 values but I think its better to return unavailable (-1) >> > I think you might want to change all >= 0 to > 0 and return -1 if any of the values are 0. This would be more consistent. >> +1 >> The javadoc should be changed and returns -1 when it's unavailable and >> the CSR should also be updated to reflect this. 
I'm sure Joe can >> re-approve the CSR quickly when the fix is reviewed and approved. >> > You should only fall back to the original logic (host values) if container values are set to unlimited. >> > >> +1 >> Mandy >> From larry.cable at oracle.com Fri Dec 6 17:09:42 2019 From: larry.cable at oracle.com (Laurence Cable) Date: Fri, 6 Dec 2019 09:09:42 -0800 Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> Message-ID: <6fbd0b6f-90bf-457c-88d8-754ce9ef45ec@oracle.com> +1 On 12/6/19 5:59 AM, Bob Vandette wrote: >> On Dec 6, 2019, at 2:49 AM, David Holmes wrote: >> >> Hi Daniil, >> >> I'm not familiar with all the details of the various API's involved here so just a few general comments in places. I do have one major issue flagged below. >> >> --- >> >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c >> >> ! static int initialized=1; >> >> Am I reading this right that the code currently fails to actually do the initialization because of this ??? >> >> Style nit: if(perfInit() >> >> space after "if" >> >> --- >> >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java >> >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. Surely there must always be some information available from the operating environment? 
I see from the impl file: >> >> // the host data, value 0 indicates that something went wrong while the metric was read and >> // in this case we return "information unavailable" code -1. >> >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO. > I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods. > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others > are, I?d suggest we keep Daniil?s original logic to fall back to the host value since a disabled subsystem is equivalent to no > limits. > > Bob. > >> --- >> >> src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java >> src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java >> >> Can you please rename the legacy methods so that, for example, getTotalMemorySize() calls getTotalMemorySize0() rather than getTotalPhysicalMemorySize0(). That way we relegate the legacy names to the interface only. >> >> --- >> >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >> >> System.out.println(String.format(...) >> >> Why not simply >> >> System.out.printf(..) >> >> ? >> >> --- >> >> Thanks, >> David >> ----- >> >> On 6/12/2019 11:03 am, Daniil Titov wrote: >>> Hi Mandy and Bob, >>> Thank you for your comments. Please review a new version of the fix [1] that makes >>> OperatingSystemImpl methods return -1 if one of the metric has value 0. >>> As Mandy recommended I also updated the Javadoc for OperatingSystemMXBean >>> indicating that methods could return -1 if the information is not available. >>> There were no changes in CSR [3] yet, I plan to proceed with them after the fix is >>> reviewed. 
>>>> In http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java.sdiff.html >>>> Shouldn?t you keep the IOException catch clauses in case the file is not found? >>> There is no need in keeping IOException catch in these 2 places where it used to be (getLongValueMatchingLine and getLongEntry methods). >>> As I understand IOException catch was required only because File.lines() and File. readAllLines() can throw IOException. >>> Now these calls are performed inside AccessController.doPrivileged(PrivilegedExceptionAction) that wraps >>> all checked exceptions in PrivilegedActionException that we are catching now instead of IOException. >>> Here is the sampe of the stacktrace: >>> java.security.PrivilegedActionException: java.io.FileNotFoundException >>> at java.base/java.security.AccessController.doPrivileged(AccessController.java:558) >>> at java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValueMatchingLine(SubSystem.java:113) >>> at java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:390) >>> at jdk.management/com.sun.management.internal.OperatingSystemImpl.getTotalMemorySize(OperatingSystemImpl.java:109) >>> at CheckOperatingSystemMXBean.main(CheckOperatingSystemMXBean.java:36) >>> Caused by: java.io.FileNotFoundException >>> at java.base/jdk.internal.platform.cgroupv1.SubSystem.lambda$getLongValueMatchingLine$1(SubSystem.java:116) >>> at java.base/java.security.AccessController.doPrivileged(AccessController.java:554) >>> In getStringValue method the whole code block is now executed inside AccessController.doPrivileged() so we still need either catch >>> IOException inside this code block or convert this block to PrivilegedExceptionAction and then put AccessController.doPrivileged >>> call inside new try/catch Block to catch PrivilegedExceptionAction. The former approach looked more preferable. 
>>> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. >>> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.04/ >>> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 >>> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 >>> Thanks, >>> Daniil >>> On 12/5/19, 12:59 PM, "Mandy Chung" wrote: >>> On 12/5/19 12:50 PM, Bob Vandette wrote: >>> > >>> >>>> It may be worth considering adding Metrics::getSwapLimit and >>> >>>> Metrics::getSwapUsage and move the computation to the implementation of >>> >>>> Metrics. Bob may have an opinion. >>> >> There was no new input regarding this so I decided to leave it unchanged. >>> > Sorry, I didn't respond to this. Since the calculation required for getFreeSwapSpaceSize requires retries >>> > due to the access of multiple changing values, I think it's best to leave things as they are so the caller of >>> > these methods understands the limitations of the API. >>> OK with me. >>> > Also, the fact that swap size metrics include memory sizes is fully documented in both the cgroup and docker >>> > online documentation so it's probably best to be consistent. >>> > >>> >>>> Also it seems correct for the memory related methods to check if >>> >>>> (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0). >>> >>>> BTW what does it mean if limit == 0? >>> >> Per Docker docs the minimum allowed value for memory limit (--memory option) is 4 megabytes. >>> >> And if memory limit is unset the return value is -1. Thus, in my understanding the value 0 is only possible >>> >> if something went wrong while retrieving this metric. >>> > That is true but shouldn't you return -1 in that case? >>> > >>> > I originally thought it was ok to fall back to the host data for 0 values but I think it's better to return unavailable (-1) >>> > I think you might want to change all >= 0 to > 0 and return -1 if any of the values are 0.
This would be more consistent. >>> +1 >>> The javadoc should be changed and returns -1 when it's unavailable and >>> the CSR should also be updated to reflect this. I'm sure Joe can >>> re-approve the CSR quickly when the fix is reviewed and approved. >>> > You should only fall back to the original logic (host values) if container values are set to unlimited. >>> > >>> +1 >>> Mandy >>> From igor.ignatyev at oracle.com Fri Dec 6 17:18:16 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Fri, 6 Dec 2019 09:18:16 -0800 Subject: RFR(T) : 8235353 : clean up hotspot problem lists In-Reply-To: References: <9405A3B3-8045-4863-9BD4-FD293E2C62F9@oracle.com> Message-ID: Martin, Christoph, thanks for verifying this. pushed. -- Igor > On Dec 6, 2019, at 2:51 AM, Doerr, Martin wrote: > > Hi Igor and Vladimir, > > the tests have passed on PPC64. Change is good. Thanks for checking with us. > > Best regards, > Martin > > >> -----Original Message----- >> From: Langer, Christoph >> Sent: Donnerstag, 5. Dezember 2019 12:24 >> To: Igor Ignatyev ; Doerr, Martin >> ; Lindenmaier, Goetz >> >> Cc: serviceability-dev ; Vladimir >> Kozlov ; hotspot-dev Source Developers >> >> Subject: RE: RFR(T) : 8235353 : clean up hotspot problem lists >> >> Hi Igor, >> >> I have added your update to our test system. I'll let you know the results by >> tomorrow. >> >> Best regards >> Christoph >> >>> -----Original Message----- >>> From: serviceability-dev >> On >>> Behalf Of Igor Ignatyev >>> Sent: Donnerstag, 5. Dezember 2019 03:08 >>> To: Doerr, Martin ; Lindenmaier, Goetz >>> >>> Cc: serviceability-dev ; Vladimir >>> Kozlov ; hotspot-dev Source Developers >>> >>> Subject: Re: RFR(T) : 8235353 : clean up hotspot problem lists >>> >>> Martin, Goetz. >>> >>> could you please check that these 9 tests still pass on PPC? 
>>> >>> -- Igor >>> >>>> On Dec 4, 2019, at 12:01 PM, Vladimir Kozlov >> >>> wrote: >>>> >>>> I am fine with changes but we need to ask PPC64 supporter to verify that >>> tests passed now. >>>> >>>> Thanks, >>>> Vladimir K >>>> >>>> On 12/4/19 11:52 AM, Igor Ignatyev wrote: >>>>> http://cr.openjdk.java.net/~iignatyev//8235353/webrev.00 >>>>>> 9 lines changed: 0 ins; 0 del; 9 mod; >>>>> Hi all, >>>>> could you please review this small and trivial cleanup which returns >>> serviceablility/sa tests back to execution on linux-ppc64. the tests were >>> problem listed due to 8211767[1], which is closed as a dup of resolved >>> 8228649[2]. >>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8211767 >>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8228649 >>>>> Thanks, >>>>> -- Igor > From serguei.spitsyn at oracle.com Fri Dec 6 18:21:54 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 6 Dec 2019 10:21:54 -0800 Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for Record attribute In-Reply-To: <30ebf715-d189-3ff6-a024-3529b2caf626@oracle.com> References: <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com> <30ebf715-d189-3ff6-a024-3529b2caf626@oracle.com> Message-ID: Hi Harold, Okay, thanks! Thanks, Serguei On 12/6/19 05:16, Harold Seigel wrote: > There will be another unmodifiable attribute with sealed types called > PermittedSubtypes. > > Harold > > On 12/5/2019 7:25 PM, serguei.spitsyn at oracle.com wrote: >> Hi David, >> >> Agreed. I was thinking about the same. >> >> Thanks, >> Serguei >> >> On 12/5/19 2:52 PM, David Holmes wrote: >>> Looks good Harold! >>> >>> If we get any more of these unmodifiable attributes we may have to >>> look at a way to refer to them more abstractly and only define them >>> in one place. 
>>> >>> Thanks, >>> David >>> >>> On 6/12/2019 12:28 am, Harold Seigel wrote: >>>> Hi, >>>> >>>> Please review this trivial change to add documentation about the >>>> Record attribute to the JDWP, JDI, and Instrumentation specs. >>>> >>>> The changed .html pages (best viewed as 'raw') are included in the >>>> webrev but will not be pushed. >>>> >>>> Open Webrev: >>>> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html >>>> >>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360 >>>> >>>> The fix was regression tested by running Mach5 tiers 1 and 2 tests >>>> and builds on Linux-x64, Solaris, Windows, and Mac OS X. >>>> >>>> Thanks, Harold >>>> >> From serguei.spitsyn at oracle.com Fri Dec 6 18:27:28 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 6 Dec 2019 10:27:28 -0800 Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for Record attribute In-Reply-To: References: <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com> <30ebf715-d189-3ff6-a024-3529b2caf626@oracle.com> Message-ID: <4755bf60-4701-d1d2-9218-e72cd58ed9a6@oracle.com> Forgot to ask. Is this new attribute for 14? Will it also come from Amber? Thanks, Serguei On 12/6/19 10:21, serguei.spitsyn at oracle.com wrote: > Hi Harold, > > Okay, thanks! > > Thanks, > Serguei > > > On 12/6/19 05:16, Harold Seigel wrote: >> There will be another unmodifiable attribute with sealed types called >> PermittedSubtypes. >> >> Harold >> >> On 12/5/2019 7:25 PM, serguei.spitsyn at oracle.com wrote: >>> Hi David, >>> >>> Agreed. I was thinking about the same. >>> >>> Thanks, >>> Serguei >>> >>> On 12/5/19 2:52 PM, David Holmes wrote: >>>> Looks good Harold! >>>> >>>> If we get any more of these unmodifiable attributes we may have to >>>> look at a way to refer to them more abstractly and only define them >>>> in one place. 
>>>> >>>> Thanks, >>>> David >>>> >>>> On 6/12/2019 12:28 am, Harold Seigel wrote: >>>>> Hi, >>>>> >>>>> Please review this trivial change to add documentation about the >>>>> Record attribute to the JDWP, JDI, and Instrumentation specs. >>>>> >>>>> The changed .html pages (best viewed as 'raw') are included in the >>>>> webrev but will not be pushed. >>>>> >>>>> Open Webrev: >>>>> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html >>>>> >>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360 >>>>> >>>>> The fix was regression tested by running Mach5 tiers 1 and 2 tests >>>>> and builds on Linux-x64, Solaris, Windows, and Mac OS X. >>>>> >>>>> Thanks, Harold >>>>> >>> > From harold.seigel at oracle.com Fri Dec 6 18:29:12 2019 From: harold.seigel at oracle.com (Harold Seigel) Date: Fri, 6 Dec 2019 13:29:12 -0500 Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for Record attribute In-Reply-To: <4755bf60-4701-d1d2-9218-e72cd58ed9a6@oracle.com> References: <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com> <30ebf715-d189-3ff6-a024-3529b2caf626@oracle.com> <4755bf60-4701-d1d2-9218-e72cd58ed9a6@oracle.com> Message-ID: <58015dfe-a587-c1ab-c5bc-453ad2cd13c6@oracle.com> Hi Serguei, >> Is this new attribute for 14? No.? 15, maybe? >>Will it also come from Amber? Yes. Harold On 12/6/2019 1:27 PM, serguei.spitsyn at oracle.com wrote: > Forgot to ask. > Is this new attribute for 14? > Will it also come from Amber? > > Thanks, > Serguei > > > On 12/6/19 10:21, serguei.spitsyn at oracle.com wrote: >> Hi Harold, >> >> Okay, thanks! >> >> Thanks, >> Serguei >> >> >> On 12/6/19 05:16, Harold Seigel wrote: >>> There will be another unmodifiable attribute with sealed types >>> called PermittedSubtypes. >>> >>> Harold >>> >>> On 12/5/2019 7:25 PM, serguei.spitsyn at oracle.com wrote: >>>> Hi David, >>>> >>>> Agreed. I was thinking about the same. 
>>>> >>>> Thanks, >>>> Serguei >>>> >>>> On 12/5/19 2:52 PM, David Holmes wrote: >>>>> Looks good Harold! >>>>> >>>>> If we get any more of these unmodifiable attributes we may have to >>>>> look at a way to refer to them more abstractly and only define >>>>> them in one place. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> On 6/12/2019 12:28 am, Harold Seigel wrote: >>>>>> Hi, >>>>>> >>>>>> Please review this trivial change to add documentation about the >>>>>> Record attribute to the JDWP, JDI, and Instrumentation specs. >>>>>> >>>>>> The changed .html pages (best viewed as 'raw') are included in >>>>>> the webrev but will not be pushed. >>>>>> >>>>>> Open Webrev: >>>>>> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html >>>>>> >>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360 >>>>>> >>>>>> The fix was regression tested by running Mach5 tiers 1 and 2 >>>>>> tests and builds on Linux-x64, Solaris, Windows, and Mac OS X. >>>>>> >>>>>> Thanks, Harold >>>>>> >>>> >> > From chris.plummer at oracle.com Fri Dec 6 19:07:38 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 6 Dec 2019 11:07:38 -0800 Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values" In-Reply-To: <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com> References: <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com> <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com> Message-ID: <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com> On 12/5/19 6:45 PM, David Holmes wrote: > Hi Serguei, > > On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote: >> Hi Chris and Alex, >> >> (I've also included Dan, David and Dean to the mailing list) >> >> We have to reach a consensus about this. 
> > This is just part of a much broader issue with JVM TI that I tried to > have a discussion started based on Richard Reingruber's proposals > around Escape Analysis: > > http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html > > > Unfortunately that discussion did not get much traction. Hmm. I have the emails that precede yours above, but not that one. Not sure what happened. Just read through it and it did give me one thought. Consider a model where the program is designed to drive behavior of the agent, triggering the agent to do certain things by having the program do certain things. Normally an agent monitors the application, but in this case the application is purposefully controlling actions performed by the agent. If code is elided from the program, then the agent no longer performs as expected. It's a kind of backwards jvmti programming model, and you may ask why would anyone do this. I'm not sure if there's a good reason for it, but should it be expected to work given how the spec is written? > >> We have 3 options: >> >> Option #1: >> The JIT optimization to delete code which "looks useless" >> has to be disabled if can_pop_frame capability is enabled. >> Then this problem becomes a JIT compiler bug. >> >> Option #2: >> Consider to relax the JVMTI PopFrame spec by changing it to >> something like: >> "Note however, that the original argument values are not >> preserved and can be changed by the called method;" >> Then this problem becomes a JVM TI spec bug. >> >> Option #3: >> Consider it is Okay for compiler to eliminate useless code, >> so the argument values can be reinitialized by the PopFrame. >> Then this problem becomes just a test bug. >> >> >> My preference is option #3. >> The point is that if the arguments are not really used in >> a method then restoring them to any values is a no-op. >> It is really a meaningless use case, so why should we care about it.
Is "restoring" the proper term here? I thought they were just left on the stack and reused on the subsequent invoke. In fact I figured the reason for the language in the spec in the first place is to alleviate JVMTI from having to restore them to their original values, which is probably not even possible. > > Thanks for setting that out clearly. > > I'd like to agree this particular case is a test bug. If we have a > method: > > int incr(int val) { > val++; > popFrameHere(); > return val; > } > > then the change to the argument is necessary and must be preserved. In > contrast: > > void incr(int val) { > val++; > popFrameHere(); > } > > the change to the argument is meaningless and I would hope any decent > JIT would simply elide it. So, this goes back to my example above where the program is trying to elicit behavior from the agent. It's not meaningless in that case, but that doesn't mean I think we need to support it. > > But we must have a consistent approach to such things. What would > happen if a breakpoint were to be placed on the instruction that > uselessly modified the argument - would we still see the modification > or would it be elided? Breakpoints force interpreted mode for the method, although I suppose that's a hotspot implementation detail and not something a VM would be required to do. A VM that allows breakpoints in compiled methods has the potential to miss the breakpoint if code is elided. Also, what if you put a breakpoint in a method, the call to it is elided. You would never hit the breakpoint. That could cause some serious head scratching for a debugger user if they know the code doing the method call is "executed". > > And how do C1 and C2 avoid this issue? Do they simply not optimise > away the useless assignment? Or do they actively disable that > optimization in this context?
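The distinction between the two incr variants quoted above can be sketched as a small self-contained class. Everything here is hypothetical: popFrameHere() is an empty placeholder for the point at which an agent would call PopFrame, and the third variant is only one possible way for a test to keep the modified argument live (it is not necessarily what the webrev under review does).

```java
// Hypothetical sketch of live versus dead argument modifications around a PopFrame point.
final class PopFrameArgSketch {
    // An observable sink; writing to it gives the JIT a reason to keep the value.
    static volatile int sink;

    // Placeholder for the point where an agent would pop the frame.
    static void popFrameHere() {
    }

    // The incremented argument is returned, so the change is needed by the program.
    static int incrReturned(int val) {
        val++;
        popFrameHere();
        return val;
    }

    // The incremented argument is never read again, so a JIT compiler may elide the store.
    static void incrDead(int val) {
        val++;
        popFrameHere();
    }

    // One way a test might keep the store live: use the value after the pop point.
    static void incrKept(int val) {
        val++;
        popFrameHere();
        sink = val;
    }
}
```

In the first and third variants the updated value is still needed after popFrameHere(), so a compiler has to be able to produce it at that point; in the second variant nothing depends on it.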
> > We need, IMO, to establish the basic philosophy of how to manage JVM > TI / JIT interactions, so we know what things must remain visible and > which can be optimised away. > > That said, changing the test allows us to defer having to reach that > consensus. Agreed. I think it's ok to work around the test issue as long as we keep this overall issue on the radar. Do we have a bug field for that? thanks, Chris > > David > ----- > >> Thanks, >> Serguei >> >> >> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote: >>> Hi Alex, >>> >>> The fix itself looks Okay. >>> Minor: replace in the comment: "compiler don't drop" => "compiler >>> doesn't drop". >>> >>> However, we still have to reach a consensus on how we treat this >>> issue (as Chris already commented). >>> >>> Thanks, >>> Serguei >>> >>> >>> On 11/8/19 15:22, Alex Menkov wrote: >>>> Hi all, >>>> >>>> Please review the fix for >>>> https://bugs.openjdk.java.net/browse/JDK-8215196 >>>> webrev: >>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/ >>>> >>>> Currently PopFrame is disabled with JVMCI by [1], so for testing I >>>> reverted [1] changes. >>>> >>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025 >>>> >>>> --alex >>> >> From serguei.spitsyn at oracle.com Fri Dec 6 20:33:17 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 6 Dec 2019 12:33:17 -0800 Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for Record attribute In-Reply-To: <58015dfe-a587-c1ab-c5bc-453ad2cd13c6@oracle.com> References: <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com> <30ebf715-d189-3ff6-a024-3529b2caf626@oracle.com> <4755bf60-4701-d1d2-9218-e72cd58ed9a6@oracle.com> <58015dfe-a587-c1ab-c5bc-453ad2cd13c6@oracle.com> Message-ID: Thanks, Harold. Serguei On 12/6/19 10:29, Harold Seigel wrote: > Hi Serguei, > > >> Is this new attribute for 14? > > No.? 15, maybe? > > >>Will it also come from Amber? > > Yes. 
> > Harold > > On 12/6/2019 1:27 PM, serguei.spitsyn at oracle.com wrote: >> Forgot to ask. >> Is this new attribute for 14? >> Will it also come from Amber? >> >> Thanks, >> Serguei >> >> >> On 12/6/19 10:21, serguei.spitsyn at oracle.com wrote: >>> Hi Harold, >>> >>> Okay, thanks! >>> >>> Thanks, >>> Serguei >>> >>> >>> On 12/6/19 05:16, Harold Seigel wrote: >>>> There will be another unmodifiable attribute with sealed types >>>> called PermittedSubtypes. >>>> >>>> Harold >>>> >>>> On 12/5/2019 7:25 PM, serguei.spitsyn at oracle.com wrote: >>>>> Hi David, >>>>> >>>>> Agreed. I was thinking about the same. >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> On 12/5/19 2:52 PM, David Holmes wrote: >>>>>> Looks good Harold! >>>>>> >>>>>> If we get any more of these unmodifiable attributes we may have >>>>>> to look at a way to refer to them more abstractly and only define >>>>>> them in one place. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> On 6/12/2019 12:28 am, Harold Seigel wrote: >>>>>>> Hi, >>>>>>> >>>>>>> Please review this trivial change to add documentation about the >>>>>>> Record attribute to the JDWP, JDI, and Instrumentation specs. >>>>>>> >>>>>>> The changed .html pages (best viewed as 'raw') are included in >>>>>>> the webrev but will not be pushed. >>>>>>> >>>>>>> Open Webrev: >>>>>>> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html >>>>>>> >>>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360 >>>>>>> >>>>>>> The fix was regression tested by running Mach5 tiers 1 and 2 >>>>>>> tests and builds on Linux-x64, Solaris, Windows, and Mac OS X. 
>>>>>>> >>>>>>> Thanks, Harold >>>>>>> >>>>> >>> >> From serguei.spitsyn at oracle.com Fri Dec 6 21:18:30 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 6 Dec 2019 13:18:30 -0800 Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values" In-Reply-To: <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com> References: <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com> <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com> <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com> Message-ID: <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com> On 12/6/19 11:07, Chris Plummer wrote: > On 12/5/19 6:45 PM, David Holmes wrote: >> Hi Serguei, >> >> On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote: >>> Hi Chris and Alex, >>> >>> (I've also included Dan, David and Dean to the mailing list) >>> >>> We have to reach a consensus about this. >> >> This is just part of a much broader issue with JVM TI that I tried to >> have a discussion started based on Richard Reingruber's proposals >> around Escape Analysis: >> >> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html >> >> >> Unfortunately that discussion did not get much traction. > Hmm. I have the emails that precede yours above, but not that one. Not > sure how what happened. Just read through it and it did give me one > thought. > Consider a model where the program is designed drive behavior of the > agent, triggering the agent to do certain things by having the program > do certain things. Normally an agent monitors the application, but in > this case the application is purposefully controlling actions > performed by the agent. If code is elided from the program, then the > agent no longer performs as expected. It's a kind of backwards jvmti > programming model, and you may ask why would anyone do this. 
I'm not > sure if there's a good reason for it, but should it be expected to > work given how the spec is written? My interpretation is that the current JVM TI PopFrame behavior does not break this model. The spec says: "any changes to the arguments, which occurred in the called method, remain;" As the code was eliminated by the compiler, no changes to this argument occurred. So, the PopFrame behavior follows the spec. So, I think, the option #2 is not right. But it depends on our basic philosophy. If the developer wants to control the agent then the program has to be designed to do something meaningful that is not going to be optimized out by the JIT compiler. >> >>> We have 3 options: >>> >>> Option #1: >>> The JIT optimization to delete code which "looks useless" >>> has to be disabled if can_pop_frame capability is enabled. >>> Then this problem becomes a JIT compiler bug. >>> >>> Option #2: >>> Consider to relax the JVMTI PopFrame spec by changing it to >>> something like: >>> "Note however, that the original argument values are not >>> preserved and can be changed by the called method;" >>> Then this problem becomes a JVM TI spec bug. >>> >>> Option #3: >>> Consider it is Okay for compiler to eliminate useless code, >>> so the argument values can be reinitialized by the PopFrame. >>> Then this problem becomes just a test bug. >>> >>> >>> My preference is option #3. >>> The point is that if the arguments are not really used in >>> a method then restoring them to any values is a no-op. >>> It is really a meaningless use case, so why should we care about it. > Is "restoring" the proper term here? I thought they were just left on > the stack and reused on the subsequent invoke. Agreed. The term "restoring" is not accurate here. > In fact I figured the reason for the language in the spec in the first > place is to alleviate JVMTI from having to restore them to their > original values, which is probably not even possible. Right.
>> >> Thanks for setting that out clearly. >> >> I'd like to agree this particular case is a test bug. If we have a >> method: >> >> int incr(int val) { >> val++; >> popFrameHere(); >> return val; >> } >> >> then the change to the argument is necessary and must be preserved. >> In contrast: >> >> void incr(int val) { >> val++; >> popFrameHere(); >> } >> >> the change to the argument is meaningless and I would hope any decent >> JIT would simply elide it. > So, this goes back to my example above where the program is trying to > elicit behavior from the agent. It's not meaningless in that case, but > that doesn't mean I think we need to support it. Even with this model it is possible and better to do something meaningful to control the agent. This model is a very rare use case. It is hard to justify a need to support it. :) >> >> But we must have a consistent approach to such things. What would >> happen if a breakpoint were to be placed on the instruction that >> uselessly modified the argument - would we still see the modification >> or would it be elided? > Breakpoints force interpreted mode for the method, although I suppose > that's a hotspot implementation detail and not something a VM would be > required to do. A VM that allows breakpoints in compiled methods has > the potential to miss the breakpoint if code is elided. > > Also, what if you put a breakpoint in a method, the call to it is > elided. You would never hit the breakpoint. That could cause some > serious head scratching for a debugger user if they know the code > doing the method call is "executed". If the method is not actually being called then missing breakpoints there gives a clue what is going on. Otherwise, it will cause some serious head scratching for a debugger user. In general, my preference would be to debug actual behavior. It is not good that we have no support for breakpoints in compiled methods. >> >> And how do C1 and C2 avoid this issue?
Do they simply not optimise >> away the useless assignment? Or do they actively disable that >> optimization in this context? >> >> We need, IMO, to establish the basic philosophy of how to manage JVM >> TI / JIT interactions, so we know what things must remain visible and >> which can be optimised away. >> >> That said, changing the test allows us to defer having to reach that >> consensus. > Agreed. I think it's ok to work around the test issue as long as we > keep this overall issue on the radar. Do we have a bug field for that? I thought, it is a little bit early to file a bug for it. Also, probably, it can be an umbrella enhancement or task. Thanks, Serguei > > thanks, > > Chris >> >> David >> ----- >> >>> Thanks, >>> Serguei >>> >>> >>> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote: >>>> Hi Alex, >>>> >>>> The fix itself looks Okay. >>>> Minor: replace in the comment: "compiler don't drop" => "compiler >>>> doesn't drop". >>>> >>>> However, we still have to reach a consensus on how we treat this >>>> issue (as Chris already commented). >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 11/8/19 15:22, Alex Menkov wrote: >>>>> Hi all, >>>>> >>>>> Please review the fix for >>>>> https://bugs.openjdk.java.net/browse/JDK-8215196 >>>>> webrev: >>>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/ >>>>> >>>>> Currently PopFrame is disabled with JVMCI by [1], so for testing I >>>>> reverted [1] changes. 
>>>>> >>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025 >>>>> >>>>> --alex >>>> >>> > > From mandy.chung at oracle.com Fri Dec 6 21:38:22 2019 From: mandy.chung at oracle.com (Mandy Chung) Date: Fri, 6 Dec 2019 13:38:22 -0800 Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> Message-ID: <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> On 12/6/19 5:59 AM, Bob Vandette wrote: >> On Dec 6, 2019, at 2:49 AM, David Holmes wrote: >> >> >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java >> >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. I thought that the error case we are referring to is limit == 0 which indicates something unexpected goes wrong. So the compatibility concern should be low. This is very specific to Metrics implementation for cgroup v1 and let me know if I'm wrong. >> Surely there must always be some information available from the operating environment? I see from the impl file: >> >> // the host data, value 0 indicates that something went wrong while the metric was read and >> // in this case we return "information unavailable" code -1. >> >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO. > I agree with David on the compatibility concern.
I originally thought that -1 was already a specified return for all of these methods. > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no > limits. > It's important to consider carefully if the monitoring API indicates an error vs unavailable and an application should continue to run when the monitoring system fails to get the metrics. There are several choices to report "something goes wrong" scenarios (which should be unlikely to happen): 1. fall back to a random positive value (e.g. host value) 2. return a negative value 3. throw an exception #3 is not an option as the application is not expecting this. For #2, the application can filter bad values if desirable. I'm okay if you want to file a JBS issue to follow up and thoroughly look at the cases where the metrics are unavailable and the cases where obtaining them fails. >> --- >> >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >> >> System.out.println(String.format(...) >> >> Why not simply >> >> System.out.printf(..) >> >> ? or simply (as I commented [1])
System.out.format Mandy [1] https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html From chris.plummer at oracle.com Fri Dec 6 21:52:32 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 6 Dec 2019 13:52:32 -0800 Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values" In-Reply-To: <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com> References: <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com> <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com> <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com> <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com> Message-ID: <444ed938-d873-12fc-a55e-2d645a099260@oracle.com> On 12/6/19 1:18 PM, serguei.spitsyn at oracle.com wrote: > On 12/6/19 11:07, Chris Plummer wrote: >> On 12/5/19 6:45 PM, David Holmes wrote: >>> Hi Serguei, >>> >>> On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote: >>>> Hi Chris and Alex, >>>> >>>> (I've also included Dan, David and Dean to the mailing list) >>>> >>>> We have to reach a consensus about this. >>> >>> This is just part of a much broader issue with JVM TI that I tried >>> to have a discussion started based on Richard Reingruber's proposals >>> around Escape Analysis: >>> >>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html >>> >>> >>> Unfortunately that discussion did not get much traction. >> Hmm. I have the emails that precede yours above, but not that one. >> Not sure how what happened. Just read through it and it did give me >> one thought. > >> Consider a model where the program is designed drive behavior of the >> agent, triggering the agent to do certain things by having the >> program do certain things. Normally an agent monitors the >> application, but in this case the application is purposefully >> controlling actions performed by the agent. 
If code is elided from >> the program, then the agent no longer performs as expected. It's a >> kind of backwards jvmti programming model, and you may ask why would >> anyone do this. I'm not sure if there's a good reason for it, but >> should it be expected to work given how the spec is written? > > My interpretation is that the current JVM TI PopFrame behavior does > not break this model. > The spec says: "any changes to the arguments, which occurred in the > called method, remain;" > As the code was eliminated by the compiler, no changes to this > argument occurred. > So, the PopFrame behavior follows the spec. So, I think, the option #2 > is not right. But it depends on our basic philosophy. > If the developer wants to control the agent then the program has to be > designed to do something meaningful that is not going to be optimized > out by the JIT compiler. You misunderstood my point. What I'm saying is that someone might do something like assign to a local with the specific intent of having that trigger a jvmti event, with the specific intent of having the agent perform some expected action as a result. Think of it as being a trigger for the agent, not as the agent monitoring the app. For example, you could write a program + agent, and setting a specific local in the program triggers the agent to turn on a light, and setting some other local turns it off. Absurd, but possible, and maybe there are less absurd applications. Chris > >>> >>>> We have 3 options: >>>> >>>> Option #1: >>>> The JIT optimization to delete code which "looks useless" >>>> has to be disabled if can_pop_frame capability is enabled. >>>> Then this problem becomes a JIT compiler bug. >>>> >>>> Option #2: >>>> Consider to relax the JVMTI PopFrame spec by changing it to >>>> something like: >>>> "Note however, that the original argument values are not >>>> preserved and can be changed by the called method;" >>>> Then this problem becomes a JVM TI spec bug.
>>>> >>>> Option #3: >>>> Consider it is Okay for compiler to eliminate useless code, >>>> so the argument values can be reinitialized by the PopFrame. >>>> Than this problem becomes just a test bug. >>>> >>>> >>>> My preference is option #3. >>>> The point is that if the arguments are not really used in >>>> a method then restoring them to any values is a no-op. >>>> It is really meaningless use case, so why should we care about it. >> Is "restoring" the proper term here? I thought they were just left on >> the stack and reused on the subsequent invoke. > > Agreed. The term "restoring" is not accurate here. > >> In fact I figured the reason for the language in the spec in the >> first place is to alleviate JVMTI from having to restore them to >> their original values, which is probably not even possible. > > Right. > >>> >>> Thanks for setting that out clearly. >>> >>> I'd like to agree this is particular case is a test bug. If we have >>> a method: >>> >>> int incr(int val) { >>> val++; >>> popFrameHere(); >>> return val; >>> } >>> >>> then the change to the argument is necessary and must be preserved. >>> In contrast: >>> >>> void incr(int val) { >>> val++; >>> popFrameHere(); >>> } >>> >>> the change to the argument is meaningless and I would hope any >>> decent JIT would simply elide it. >> So, this goes back to my example above where the program is trying to >> elicit behavior from the agent. It's not meaningless in that case, >> but that doesn't mean I think we need to support it. > > Even with this model it is possible and better to do something > meaningful to control the agent. > This model is very rare use case. > It is hard to justify a need to support it. :) > >>> >>> But we must have a consistent approach to such things. What would >>> happen if a breakpoint were to be placed on the instruction that >>> uselessly modified the argument - would we still see the >>> modification or would it be elided?
>> Breakpoints force interpreted mode for the method, although I suppose >> that's a hotspot implementation detail and not something a VM would >> be required to do. A VM that allows breakpoints in compiled methods >> has the potential to miss the breakpoint if code is elided. >> >> Also, what if you put a breakpoint in a method, the call to it is >> elided. You would never hit the breakpoint. That could cause some >> serious head scratching for a debugger user if they know the code >> doing the method call is "executed". > > If the method is not actually being called then missing breakpoints > there gives a clue what is going on. > Otherwise, it will cause cause some serious head scratching for a > debugger user. > In general, my preference would be to debug actual behavior. > It is not good we have no support breakpoints in compiled methods. > > >>> >>> And how do C1 and C2 avoid this issue? Do they simply not optimise >>> away the useless assignment? Or do they actively disable that >>> optimization in this context? >>> >>> We need, IMO, to establish the basic philosophy of how to manage JVM >>> TI / JIT interactions, so we know what things must remain visible >>> and which can be optimised away. >>> >>> That said, changing the test allows us to defer having to reach that >>> consensus. >> Agreed. I think it's ok to work around the test issue as long as we >> keep this overall issue on the radar. Do we have a bug field for that? > > I thought, it is a little bit early to file a bug for it. > Also, probably, it can be an umbrella enhancement or task. > > Thanks, > Serguei > >> >> thanks, >> >> Chris >>> >>> David >>> ----- >>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote: >>>>> Hi Alex, >>>>> >>>>> The fix itself looks Okay. >>>>> Minor: replace in the comment: "compiler don't drop" => "compiler >>>>> doesn't drop". 
>>>>> >>>>> However, we still have to reach a consensus on how we treat this >>>>> issue (as Chris already commented). >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 11/8/19 15:22, Alex Menkov wrote: >>>>>> Hi all, >>>>>> >>>>>> Please review the fix for >>>>>> https://bugs.openjdk.java.net/browse/JDK-8215196 >>>>>> webrev: >>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/ >>>>>> >>>>>> Currently PopFrame is disabled with JVMCI by [1], so for testing >>>>>> I reverted [1] changes. >>>>>> >>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025 >>>>>> >>>>>> --alex >>>>> >>>> >> >> > From dean.long at oracle.com Fri Dec 6 22:59:05 2019 From: dean.long at oracle.com (Dean Long) Date: Fri, 6 Dec 2019 14:59:05 -0800 Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values" In-Reply-To: <444ed938-d873-12fc-a55e-2d645a099260@oracle.com> References: <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com> <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com> <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com> <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com> <444ed938-d873-12fc-a55e-2d645a099260@oracle.com> Message-ID: This might be a dumb question, but how is PopFrame used in practice? Re-invoking the method, especially with modified argument values, seems dangerous.
dl From serguei.spitsyn at oracle.com Fri Dec 6 23:26:34 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 6 Dec 2019 15:26:34 -0800 Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values" In-Reply-To: <444ed938-d873-12fc-a55e-2d645a099260@oracle.com> References: <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com> <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com> <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com> <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com> <444ed938-d873-12fc-a55e-2d645a099260@oracle.com> Message-ID: <3be1587b-c0cd-5f1e-dee2-072c341e02e6@oracle.com> On 12/6/19 13:52, Chris Plummer wrote: > On 12/6/19 1:18 PM, serguei.spitsyn at oracle.com wrote: >> On 12/6/19 11:07, Chris Plummer wrote: >>> On 12/5/19 6:45 PM, David Holmes wrote: >>>> Hi Serguei, >>>> >>>> On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote: >>>>> Hi Chris and Alex, >>>>> >>>>> (I've also included Dan, David and Dean to the mailing list) >>>>> >>>>> We have to reach a consensus about this. >>>> >>>> This is just part of a much broader issue with JVM TI that I tried >>>> to have a discussion started based on Richard Reingruber's >>>> proposals around Escape Analysis: >>>> >>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html >>>> >>>> >>>> Unfortunately that discussion did not get much traction. >>> Hmm. I have the emails that precede yours above, but not that one. >>> Not sure how what happened. Just read through it and it did give me >>> one thought. >> >>> Consider a model where the program is designed drive behavior of the >>> agent, triggering the agent to do certain things by having the >>> program do certain things. 
Normally an agent monitors the >>> application, but in this case the application is purposefully >>> controlling actions performed by the agent. If code is elided from >>> the program, then the agent no longer performs as expected. It's a >>> kind of backwards jvmti programming model, and you may ask why would >>> anyone do this. I'm not sure if there's a good reason for it, but >>> should it be expected to work given how the spec is written? >> >> My interpretation is that the current JVM TI PopFrame behavior does >> not break this model. >> The spec says: "any changes to the arguments, which occurred in the >> called method, remain;" >> As the code was eliminated by the compiler then no changes to this >> argument occurred. >> So, the PopFrame behavior follows the spec. So, I think, the option >> #2 is not right. But it depends on our basic philosophy. >> If the developer wants to control the agent then the program has to >> be designed to do something meaningful that is not going to be >> optimized out by the JIT compiler. > You misunderstood my point. What I'm saying is that someone might do > something like assign to a local with the specific intent of having > that trigger a jmvti event, with the specific intent of having the > agent perform some expected action as a result. Think of it as being a > trigger for the agent, not as the agent monitoring the app. For > example, you could right a program + agent, and setting a specific > local in the program triggers the agent to turn on a light, and > setting some other local turns it off. Absurd, but possible, and maybe > there are less absurd applications. I think, I understood your point correctly. Your point is that the code that can be eliminated (e.g. local++) is not that meaningless as it seems to be. My point is that there are still other more reliable ways to trigger the agent. So that relying on something that can be eliminated by JIT compilers is not important to support. 
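To make the "more reliable ways" point concrete, here is a minimal sketch. The class AgentTrigger and its signal field are hypothetical and do not come from the test or the webrev. The assumption is that an agent could observe the field write (for example through a JVM TI field modification watch), whereas a store to an otherwise unused local is dead code that a JIT is free to drop.

```java
// Hypothetical illustration only; not part of popframe003 or the webrev.
final class AgentTrigger {
    // A write to a volatile static field is an observable side effect,
    // so the compiler has to keep it.
    static volatile int signal;

    // Fragile trigger: 'local' is never read afterwards, so the
    // increment is a dead store that a JIT may elide.
    static void fragileTrigger(int local) {
        local++;
    }

    // More robust triggers: the field writes must actually happen.
    static void lightOn()  { signal = 1; }
    static void lightOff() { signal = 0; }

    public static void main(String[] args) {
        fragileTrigger(41);   // may compile down to nothing
        lightOn();
        System.out.println("signal = " + signal);
        lightOff();
        System.out.println("signal = " + signal);
    }
}
```

Whether an agent would watch a field, a method entry, or something else is a design choice; the only point here is that the trigger should be something the program visibly does.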
Thanks, Serguei > Chris >> >>>> >>>>> We have 3 options: >>>>> >>>>> Option #1: >>>>> ?? The JIT optimization to delete a code which "looks useless" >>>>> ?? has to be disabled if can_pop_frame capability is enabled. >>>>> ?? Than this problem becomes a JIT compiler bug. >>>>> >>>>> Option #2: >>>>> ?? Consider to relax the JVMTI PopFrame spec by changing it to >>>>> something like: >>>>> ?? "Note however, that the original argument values are not >>>>> ??? preserved and can be changed by the called method;" >>>>> ?? Than this problem becomes a JVM TI spec bug. >>>>> >>>>> Option #3: >>>>> ?? Consider it is Okay for compiler to eliminate useless code, >>>>> ?? so the argument values can be reinitialized by the PopFrame. >>>>> ?? Than this problem becomes just a test bug. >>>>> >>>>> >>>>> My preference is option #3. >>>>> The point is that if the arguments are not really used in >>>>> a method then restoring them to any values is a no-op. >>>>> It is really meaningless use case, so why should we care about it. >>> Is "restoring" the proper term here? I thought they were just left >>> on the stack and reused on the subsequent invoke. >> >> Agreed. The term "restoring" is not accurate here. >> >>> In fact I figured the reason for the language in the spec in the >>> first place is to alleviate JVMTI from having to restore them to >>> their original values, which is probably not even possible. >> >> Right. >> >>>> >>>> Thanks for setting that out clearly. >>>> >>>> I'd like to agree this is particular case is a test bug. If we have >>>> a method: >>>> >>>> int incr(int val) { >>>> ? val++; >>>> ? popFrameHere(); >>>> ? return val; >>>> } >>>> >>>> then the change to the argument is necessary and must be preserved. >>>> In contrast: >>>> >>>> void incr(int val) { >>>> ? val++; >>>> ? popFrameHere(); >>>> } >>>> >>>> the change to the argument is meaningless and I would hope any >>>> decent JIT would simply elide it. 
>>> So, this goes back to my example above where the program is trying >>> to elicit behavior from the agent. It's not meaningless in that >>> case, but that doesn't mean I think we need to support it. >> >> Even with this model it is possible and better to do something >> meaningful to control the agent. >> This model is very rare use case. >> It is hard to justify a need to support it. :) >> >>>> >>>> But we must have a consistent approach to such things. What would >>>> happen if a breakpoint were to be placed on the instruction that >>>> uselessly modified the argument - would we still see the >>>> modification or would it be elided? >>> Breakpoints force interpreted mode for the method, although I >>> suppose that's a hotspot implementation detail and not something a >>> VM would be required to do. A VM that allows breakpoints in compiled >>> methods has the potential to miss the breakpoint if code is elided. >>> >>> Also, what if you put a breakpoint in a method, the call to it is >>> elided. You would never hit the breakpoint. That could cause some >>> serious head scratching for a debugger user if they know the code >>> doing the method call is "executed". >> >> If the method is not actually being called then missing breakpoints >> there gives a clue what is going on. >> Otherwise, it will cause cause some serious head scratching for a >> debugger user. >> In general, my preference would be to debug actual behavior. >> It is not good we have no support breakpoints in compiled methods. >> >> >>>> >>>> And how do C1 and C2 avoid this issue? Do they simply not optimise >>>> away the useless assignment? Or do they actively disable that >>>> optimization in this context? >>>> >>>> We need, IMO, to establish the basic philosophy of how to manage >>>> JVM TI / JIT interactions, so we know what things must remain >>>> visible and which can be optimised away. >>>> >>>> That said, changing the test allows us to defer having to reach >>>> that consensus. >>> Agreed. 
I think it's ok to work around the test issue as long as we >>> keep this overall issue on the radar. Do we have a bug field for that? >> >> I thought, it is a little bit early to file a bug for it. >> Also, probably, it can be an umbrella enhancement or task. >> >> Thanks, >> Serguei >> >>> >>> thanks, >>> >>> Chris >>>> >>>> David >>>> ----- >>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote: >>>>>> Hi Alex, >>>>>> >>>>>> The fix itself looks Okay. >>>>>> Minor: replace in the comment: "compiler don't drop" => "compiler >>>>>> doesn't drop". >>>>>> >>>>>> However, we still have to reach a consensus on how we treat this >>>>>> issue (as Chris already commented). >>>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 11/8/19 15:22, Alex Menkov wrote: >>>>>>> Hi all, >>>>>>> >>>>>>> Please review the fix for >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8215196 >>>>>>> webrev: >>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/ >>>>>>> >>>>>>> Currently PopFrame is disabled with JVMCI by [1], so for testing >>>>>>> I reverted [1] changes. 
>>>>>>> >>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025 >>>>>>> >>>>>>> --alex >>>>>> >>>>> >>> >>> >> > > From serguei.spitsyn at oracle.com Fri Dec 6 23:39:26 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 6 Dec 2019 15:39:26 -0800 Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values" In-Reply-To: References: <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com> <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com> <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com> <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com> <444ed938-d873-12fc-a55e-2d645a099260@oracle.com> Message-ID: <5c3e704e-7b2c-f72c-1342-2af7d16af53c@oracle.com> The PopFrame together with RedefineClasses is a part of the JVM TI HotSwap feature. The use case is to hot patch the methods. If after class redefinition there are still some method frames then the PopFrame is an option to "refresh" such frames. I agree, this is unreliable and dangerous. But the whole class redefinition feature is somewhat dangerous. :) It is because the responsibility is on the agents. And there are many ways for the agents to break the methods execution semantics with redefinition. Thanks, Serguei On 12/6/19 14:59, Dean Long wrote: > This might be a dumb question, but how is PopFrame used in practice? > Re-invoking the method, especially with modified argument values seems > dangerous. > > dl From daniel.daugherty at oracle.com Sat Dec 7 01:24:11 2019 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 6 Dec 2019 20:24:11 -0500 Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values" In-Reply-To: <3be1587b-c0cd-5f1e-dee2-072c341e02e6@oracle.com> References: <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com> <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com> <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com> <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com> <444ed938-d873-12fc-a55e-2d645a099260@oracle.com> <3be1587b-c0cd-5f1e-dee2-072c341e02e6@oracle.com> Message-ID: <1994d5d6-383b-0abb-3203-ab42da9ab81c@oracle.com> On 12/6/19 6:26 PM, serguei.spitsyn at oracle.com wrote: > On 12/6/19 13:52, Chris Plummer wrote: >> On 12/6/19 1:18 PM, serguei.spitsyn at oracle.com wrote: >>> On 12/6/19 11:07, Chris Plummer wrote: >>>> On 12/5/19 6:45 PM, David Holmes wrote: >>>>> Hi Serguei, >>>>> >>>>> On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote: >>>>>> Hi Chris and Alex, >>>>>> >>>>>> (I've also included Dan, David and Dean to the mailing list) >>>>>> >>>>>> We have to reach a consensus about this. >>>>> >>>>> This is just part of a much broader issue with JVM TI that I tried >>>>> to have a discussion started based on Richard Reingruber's >>>>> proposals around Escape Analysis: >>>>> >>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html >>>>> >>>>> >>>>> Unfortunately that discussion did not get much traction. >>>> Hmm. I have the emails that precede yours above, but not that one. >>>> Not sure how what happened. Just read through it and it did give me >>>> one thought. >>> >>>> Consider a model where the program is designed drive behavior of >>>> the agent, triggering the agent to do certain things by having the >>>> program do certain things. 
Normally an agent monitors the >>>> application, but in this case the application is purposefully >>>> controlling actions performed by the agent. If code is elided from >>>> the program, then the agent no longer performs as expected. It's a >>>> kind of backwards jvmti programming model, and you may ask why >>>> would anyone do this. I'm not sure if there's a good reason for it, >>>> but should it be expected to work given how the spec is written? >>> >>> My interpretation is that the current JVM TI PopFrame behavior does >>> not break this model. >>> The spec says: "any changes to the arguments, which occurred in the >>> called method, remain;" >>> As the code was eliminated by the compiler then no changes to this >>> argument occurred. >>> So, the PopFrame behavior follows the spec. So, I think, the option >>> #2 is not right. But it depends on our basic philosophy. >>> If the developer wants to control the agent then the program has to >>> be designed to do something meaningful that is not going to be >>> optimized out by the JIT compiler. >> You misunderstood my point. What I'm saying is that someone might do >> something like assign to a local with the specific intent of having >> that trigger a jmvti event, with the specific intent of having the >> agent perform some expected action as a result. Think of it as being >> a trigger for the agent, not as the agent monitoring the app. For >> example, you could right a program + agent, and setting a specific >> local in the program triggers the agent to turn on a light, and >> setting some other local turns it off. Absurd, but possible, and >> maybe there are less absurd applications. > > I think, I understood your point correctly. > Your point is that the code that can be eliminated (e.g. local++) is > not that meaningless as it seems to be. > My point is that there are still other more reliable ways to trigger > the agent. 
> So that relying on something that can be eliminated by JIT compilers > is not important to support. You are making the assumption that the agent author understands what Java code/variables *might* be eliminated by the JIT compiler. I don't think that's a good assumption. I might have code that does a really complicated thing in a local variable that is only useful to the agent itself. The JIT will see that the local variable cannot escape the function and is not used outside the function (as far as it can see) so it will elide the local variable and the code that was used to generate the local result in the variable. If that local result happens to be some computation that the agent needed to see to do its next operation... Dan > > Thanks, > Serguei > >> Chris >>> >>>>> >>>>>> We have 3 options: >>>>>> >>>>>> Option #1: >>>>>> The JIT optimization to delete a code which "looks useless" >>>>>> has to be disabled if can_pop_frame capability is enabled. >>>>>> Than this problem becomes a JIT compiler bug. >>>>>> >>>>>> Option #2: >>>>>> Consider to relax the JVMTI PopFrame spec by changing it to >>>>>> something like: >>>>>> "Note however, that the original argument values are not >>>>>> preserved and can be changed by the called method;" >>>>>> Than this problem becomes a JVM TI spec bug. >>>>>> >>>>>> Option #3: >>>>>> Consider it is Okay for compiler to eliminate useless code, >>>>>> so the argument values can be reinitialized by the PopFrame. >>>>>> Than this problem becomes just a test bug. >>>>>> >>>>>> >>>>>> My preference is option #3. >>>>>> The point is that if the arguments are not really used in >>>>>> a method then restoring them to any values is a no-op. >>>>>> It is really meaningless use case, so why should we care about it. >>>> Is "restoring" the proper term here? I thought they were just left >>>> on the stack and reused on the subsequent invoke. >>> >>> Agreed. The term "restoring" is not accurate here.
>>> >>>> In fact I figured the reason for the language in the spec in the >>>> first place is to alleviate JVMTI from having to restore them to >>>> their original values, which is probably not even possible. >>> >>> Right. >>> >>>>> >>>>> Thanks for setting that out clearly. >>>>> >>>>> I'd like to agree this is particular case is a test bug. If we >>>>> have a method: >>>>> >>>>> int incr(int val) { >>>>> ? val++; >>>>> ? popFrameHere(); >>>>> ? return val; >>>>> } >>>>> >>>>> then the change to the argument is necessary and must be >>>>> preserved. In contrast: >>>>> >>>>> void incr(int val) { >>>>> ? val++; >>>>> ? popFrameHere(); >>>>> } >>>>> >>>>> the change to the argument is meaningless and I would hope any >>>>> decent JIT would simply elide it. >>>> So, this goes back to my example above where the program is trying >>>> to elicit behavior from the agent. It's not meaningless in that >>>> case, but that doesn't mean I think we need to support it. >>> >>> Even with this model it is possible and better to do something >>> meaningful to control the agent. >>> This model is very rare use case. >>> It is hard to justify a need to support it. :) >>> >>>>> >>>>> But we must have a consistent approach to such things. What would >>>>> happen if a breakpoint were to be placed on the instruction that >>>>> uselessly modified the argument - would we still see the >>>>> modification or would it be elided? >>>> Breakpoints force interpreted mode for the method, although I >>>> suppose that's a hotspot implementation detail and not something a >>>> VM would be required to do. A VM that allows breakpoints in >>>> compiled methods has the potential to miss the breakpoint if code >>>> is elided. >>>> >>>> Also, what if you put a breakpoint in a method, the call to it is >>>> elided. You would never hit the breakpoint. That could cause some >>>> serious head scratching for a debugger user if they know the code >>>> doing the method call is "executed". 
>>> >>> If the method is not actually being called then missing breakpoints >>> there gives a clue what is going on. >>> Otherwise, it will cause cause some serious head scratching for a >>> debugger user. >>> In general, my preference would be to debug actual behavior. >>> It is not good we have no support breakpoints in compiled methods. >>> >>> >>>>> >>>>> And how do C1 and C2 avoid this issue? Do they simply not optimise >>>>> away the useless assignment? Or do they actively disable that >>>>> optimization in this context? >>>>> >>>>> We need, IMO, to establish the basic philosophy of how to manage >>>>> JVM TI / JIT interactions, so we know what things must remain >>>>> visible and which can be optimised away. >>>>> >>>>> That said, changing the test allows us to defer having to reach >>>>> that consensus. >>>> Agreed. I think it's ok to work around the test issue as long as we >>>> keep this overall issue on the radar. Do we have a bug field for that? >>> >>> I thought, it is a little bit early to file a bug for it. >>> Also, probably, it can be an umbrella enhancement or task. >>> >>> Thanks, >>> Serguei >>> >>>> >>>> thanks, >>>> >>>> Chris >>>>> >>>>> David >>>>> ----- >>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote: >>>>>>> Hi Alex, >>>>>>> >>>>>>> The fix itself looks Okay. >>>>>>> Minor: replace in the comment: "compiler don't drop" => >>>>>>> "compiler doesn't drop". >>>>>>> >>>>>>> However, we still have to reach a consensus on how we treat this >>>>>>> issue (as Chris already commented). 
>>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> On 11/8/19 15:22, Alex Menkov wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> Please review the fix for >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8215196 >>>>>>>> webrev: >>>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/ >>>>>>>> >>>>>>>> Currently PopFrame is disabled with JVMCI by [1], so for >>>>>>>> testing I reverted [1] changes. >>>>>>>> >>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025 >>>>>>>> >>>>>>>> --alex >>>>>>> >>>>>> >>>> >>>> >>> >> >> > From daniil.x.titov at oracle.com Sat Dec 7 01:41:13 2019 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Fri, 06 Dec 2019 17:41:13 -0800 Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> Message-ID: Hi David, Mandy, and Bob, Thank you for reviewing this fix. Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04) 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded. 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read. I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances, but I agree that the changes proposed in the previous version of the webrev increase such probability. I filed the follow-up issue [4] as Mandy suggested. 3. 
The legacy methods were renamed as David suggested. > src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c > ! static int initialized=1; > > Am I reading this right that the code currently fails to actually do the > initialization because of this ??? Yes, currently the code fails to do the initialization, but this went unnoticed because get_cpuload_internal(...) was never called for a specific CPU: the first parameter "which" was always -1. > test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java > > System.out.println(String.format(...) > > Why not simply > > System.out.printf(..) As I tried to explain earlier, it would make the tests unstable. System.out.printf(...) just delegates the call to System.out.format(...), which doesn't emit the string atomically. Instead, it parses the format string into a list of FormatString objects and then iterates over the list. As a result, other traces occasionally get printed between these iterations and break the pattern the test expects to find in the output. For example, here is a sample of such output, with a trace message printed between "OperatingSystemMXBean.getFreePhysicalMemorySize:" and "1030762496".

[0.304s][trace][os,container] Memory Usage is: 42983424
OperatingSystemMXBean.getFreeMemorySize: 1030758400
[0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
[0.305s][trace][os,container] Memory Usage is: 42979328
[0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
1030762496
OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr
    at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
    at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
    at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:564)
    at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
    at java.base/java.lang.Thread.run(Thread.java:832)

Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.

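For reference, the difference between the two calls discussed above can be sketched roughly as follows. This is not the code from CheckOperatingSystemMXBean.java; the class AtomicLine and its line helper are made up for illustration, and Runtime.freeMemory() is only a stand-in for the real metric.

```java
// Hypothetical sketch; not the actual CheckOperatingSystemMXBean.java.
final class AtomicLine {
    // Build the complete line first, so the label and the value are
    // handed to System.out as one string in a single println call.
    static String line(String name, long value) {
        return String.format("%s: %d", name, value);
    }

    public static void main(String[] args) {
        long free = Runtime.getRuntime().freeMemory(); // stand-in value

        // Formatted piece by piece while writing; JVM trace logging that
        // goes to the same stream can land between label and number.
        System.out.printf("OperatingSystemMXBean.getFreeMemorySize: %d%n", free);

        // Formatted up front and written as one string.
        System.out.println(line("OperatingSystemMXBean.getFreeMemorySize", free));
    }
}
```

The second form is what keeps the "name: value" pattern intact for the OutputAnalyzer match, at the cost of a slightly more verbose call.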
[1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 [4] https://bugs.openjdk.java.net/browse/JDK-8235522 Thank you, Daniil On 12/6/19, 1:38 PM, "Mandy Chung" wrote: On 12/6/19 5:59 AM, Bob Vandette wrote: >> On Dec 6, 2019, at 2:49 AM, David Holmes wrote: >> >> >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java >> >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. I thought that the error case we are referring to is limit == 0 which indicates something unexpected goes wrong. So the compatibility concern should be low. This is very specific to Metrics implementation for cgroup v1 and let me know if I'm wrong. >> Surely there must always be some information available from the operating environment? I see from the impl file: >> >> // the host data, value 0 indicates that something went wrong while the metric was read and >> // in this case we return "information unavailable" code -1. >> >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO. > I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods.
> Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no > limits. > It's important to consider carefully whether the monitoring API indicates an error vs. unavailable, and an application should continue to run when the monitoring system fails to get the metrics. There are several choices to report "something goes wrong" scenarios (should unlikely happen???): 1. fall back to a random positive value (e.g. host value) 2. return a negative value 3. throw an exception #3 is not an option as the application is not expecting this. For #2, the application can filter bad values if desirable. I'm okay if you want to file a JBS issue to follow up and thoroughly look at the cases where the metrics are unavailable and the cases where it fails to obtain them. >> --- >> >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >> >> System.out.println(String.format(...) >> >> Why not simply >> >> System.out.printf(..) >> >> ?
or simply (as I commented [1]) System.out.format Mandy [1] https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html From serguei.spitsyn at oracle.com Sat Dec 7 02:12:07 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 6 Dec 2019 18:12:07 -0800 Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values" In-Reply-To: <1994d5d6-383b-0abb-3203-ab42da9ab81c@oracle.com> References: <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com> <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com> <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com> <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com> <444ed938-d873-12fc-a55e-2d645a099260@oracle.com> <3be1587b-c0cd-5f1e-dee2-072c341e02e6@oracle.com> <1994d5d6-383b-0abb-3203-ab42da9ab81c@oracle.com> Message-ID: <6987c267-970c-d2e1-ce70-b7f18e9ff329@oracle.com> On 12/6/19 17:24, Daniel D. Daugherty wrote: > On 12/6/19 6:26 PM, serguei.spitsyn at oracle.com wrote: >> On 12/6/19 13:52, Chris Plummer wrote: >>> On 12/6/19 1:18 PM, serguei.spitsyn at oracle.com wrote: >>>> On 12/6/19 11:07, Chris Plummer wrote: >>>>> On 12/5/19 6:45 PM, David Holmes wrote: >>>>>> Hi Serguei, >>>>>> >>>>>> On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote: >>>>>>> Hi Chris and Alex, >>>>>>> >>>>>>> (I've also included Dan, David and Dean to the mailing list) >>>>>>> >>>>>>> We have to reach a consensus about this. >>>>>> >>>>>> This is just part of a much broader issue with JVM TI that I >>>>>> tried to have a discussion started based on Richard Reingruber's >>>>>> proposals around Escape Analysis: >>>>>> >>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html >>>>>> >>>>>> >>>>>> Unfortunately that discussion did not get much traction. >>>>> Hmm. I have the emails that precede yours above, but not that one. 
>>>>> Not sure how what happened. Just read through it and it did give >>>>> me one thought. >>>> >>>>> Consider a model where the program is designed drive behavior of >>>>> the agent, triggering the agent to do certain things by having the >>>>> program do certain things. Normally an agent monitors the >>>>> application, but in this case the application is purposefully >>>>> controlling actions performed by the agent. If code is elided from >>>>> the program, then the agent no longer performs as expected. It's a >>>>> kind of backwards jvmti programming model, and you may ask why >>>>> would anyone do this. I'm not sure if there's a good reason for >>>>> it, but should it be expected to work given how the spec is written? >>>> >>>> My interpretation is that the current JVM TI PopFrame behavior does >>>> not break this model. >>>> The spec says: "any changes to the arguments, which occurred in the >>>> called method, remain;" >>>> As the code was eliminated by the compiler then no changes to this >>>> argument occurred. >>>> So, the PopFrame behavior follows the spec. So, I think, the option >>>> #2 is not right. But it depends on our basic philosophy. >>>> If the developer wants to control the agent then the program has to >>>> be designed to do something meaningful that is not going to be >>>> optimized out by the JIT compiler. >>> You misunderstood my point. What I'm saying is that someone might do >>> something like assign to a local with the specific intent of having >>> that trigger a jmvti event, with the specific intent of having the >>> agent perform some expected action as a result. Think of it as being >>> a trigger for the agent, not as the agent monitoring the app. For >>> example, you could right a program + agent, and setting a specific >>> local in the program triggers the agent to turn on a light, and >>> setting some other local turns it off. Absurd, but possible, and >>> maybe there are less absurd applications. 
>>
>> I think I understood your point correctly.
>> Your point is that the code that can be eliminated (e.g. local++) is
>> not as meaningless as it seems to be.
>> My point is that there are still other more reliable ways to trigger
>> the agent.
>> So relying on something that can be eliminated by JIT compilers
>> is not important to support.
>
> You are making the assumption that the agent author understands what
> Java code/variables *might* be eliminated by the JIT compiler. I don't
> think that's a good assumption. I might have code that does a really
> complicated thing in a local variable that is only useful to the
> agent itself. The JIT will see that the local variable cannot escape
> the function and is not used outside the function (as far as it can
> see) so it will elide the local variable and the code that was used
> to generate the local result in the variable.
>
> If that local result happens to be some computation that the agent
> needed to see to do its next operation...

Thank you for sharing your point.
I'm not insisting on my assumptions here, just not sure this is more
important than allowing optimizations.
Do you actually think this use case needs to be supported?

In general, to identify our philosophy about the interaction between JIT
compiler code elimination and JVM TI we need to make some assumptions.

Let's temporarily put JVM TI out of scope.
Are there any assumptions when JIT compilers eliminate some code?
Is it based on some vision of what code is observable?
If it was decided some code can be eliminated then is it JVM TI only
that breaks such assumptions about observability?

If so, then such optimizations can be disabled at some level.
Then we end up debugging/profiling/monitoring, and finally, observing
a slightly different application.
Are we Okay with this? Do we need any compromises here?
Maybe we need more flags to control the JIT compiler behavior.
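
To make the two cases discussed in this thread concrete, here is a minimal Java sketch. It is an illustration only, not code from the webrev or from popframe003. The class name DeadStoreSketch and both method names are hypothetical, and no JVM TI agent or PopFrame call is involved. It only shows which argument update is used afterwards and which one is dead, and therefore a candidate for elimination by a JIT compiler.

```java
// Hypothetical illustration; not part of the popframe003 test or the fix.
class DeadStoreSketch {

    // The incremented argument is returned, so the update is observable
    // to the caller and cannot be dropped by a compiler.
    static int incrUsed(int val) {
        val++;
        return val;
    }

    // The incremented argument is never read again. A JIT compiler may
    // treat the store as dead and drop it; an agent that inspects the
    // argument slot after a PopFrame might then see the original value.
    static void incrDead(int val) {
        val++;
    }

    public static void main(String[] args) {
        System.out.println(incrUsed(41)); // prints 42
        incrDead(41);                     // no effect visible from Java code
    }
}
```

Whether the second update "occurred" in the sense of the PopFrame spec text is exactly the open question; the sketch takes no position on it.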
Thanks,
Serguei

>
> Dan
>
>
>>
>> Thanks,
>> Serguei
>>
>>> Chris
>>>>
>>>>>>
>>>>>>> We have 3 options:
>>>>>>>
>>>>>>> Option #1:
>>>>>>>    The JIT optimization to delete code which "looks useless"
>>>>>>>    has to be disabled if can_pop_frame capability is enabled.
>>>>>>>    Then this problem becomes a JIT compiler bug.
>>>>>>>
>>>>>>> Option #2:
>>>>>>>    Consider relaxing the JVMTI PopFrame spec by changing it to
>>>>>>> something like:
>>>>>>>    "Note however, that the original argument values are not
>>>>>>>     preserved and can be changed by the called method;"
>>>>>>>    Then this problem becomes a JVM TI spec bug.
>>>>>>>
>>>>>>> Option #3:
>>>>>>>    Consider it Okay for the compiler to eliminate useless code,
>>>>>>>    so the argument values can be reinitialized by the PopFrame.
>>>>>>>    Then this problem becomes just a test bug.
>>>>>>>
>>>>>>>
>>>>>>> My preference is option #3.
>>>>>>> The point is that if the arguments are not really used in
>>>>>>> a method then restoring them to any values is a no-op.
>>>>>>> It is a really meaningless use case, so why should we care about it.
>>>>> Is "restoring" the proper term here? I thought they were just left
>>>>> on the stack and reused on the subsequent invoke.
>>>>
>>>> Agreed. The term "restoring" is not accurate here.
>>>>
>>>>> In fact I figured the reason for the language in the spec in the
>>>>> first place is to alleviate JVMTI from having to restore them to
>>>>> their original values, which is probably not even possible.
>>>>
>>>> Right.
>>>>
>>>>>>
>>>>>> Thanks for setting that out clearly.
>>>>>>
>>>>>> I'd like to agree this particular case is a test bug. If we
>>>>>> have a method:
>>>>>>
>>>>>> int incr(int val) {
>>>>>>   val++;
>>>>>>   popFrameHere();
>>>>>>   return val;
>>>>>> }
>>>>>>
>>>>>> then the change to the argument is necessary and must be
>>>>>> preserved. In contrast:
>>>>>>
>>>>>> void incr(int val) {
>>>>>>   val++;
>>>>>>   popFrameHere();
>>>>>> }
>>>>>>
>>>>>> the change to the argument is meaningless and I would hope any
>>>>>> decent JIT would simply elide it.
>>>>> So, this goes back to my example above where the program is trying
>>>>> to elicit behavior from the agent. It's not meaningless in that
>>>>> case, but that doesn't mean I think we need to support it.
>>>>
>>>> Even with this model it is possible and better to do something
>>>> meaningful to control the agent.
>>>> This model is a very rare use case.
>>>> It is hard to justify a need to support it. :)
>>>>
>>>>>>
>>>>>> But we must have a consistent approach to such things. What would
>>>>>> happen if a breakpoint were to be placed on the instruction that
>>>>>> uselessly modified the argument - would we still see the
>>>>>> modification or would it be elided?
>>>>> Breakpoints force interpreted mode for the method, although I
>>>>> suppose that's a hotspot implementation detail and not something a
>>>>> VM would be required to do. A VM that allows breakpoints in
>>>>> compiled methods has the potential to miss the breakpoint if code
>>>>> is elided.
>>>>>
>>>>> Also, what if you put a breakpoint in a method and the call to it is
>>>>> elided? You would never hit the breakpoint. That could cause some
>>>>> serious head scratching for a debugger user if they know the code
>>>>> doing the method call is "executed".
>>>>
>>>> If the method is not actually being called then missing breakpoints
>>>> there gives a clue what is going on.
>>>> Otherwise, it will cause some serious head scratching for a
>>>> debugger user.
>>>> In general, my preference would be to debug actual behavior.
>>>> It is not good that we have no support for breakpoints in compiled methods.
>>>>
>>>>
>>>>>>
>>>>>> And how do C1 and C2 avoid this issue? Do they simply not
>>>>>> optimise away the useless assignment? Or do they actively disable
>>>>>> that optimization in this context?
>>>>>> >>>>>> We need, IMO, to establish the basic philosophy of how to manage >>>>>> JVM TI / JIT interactions, so we know what things must remain >>>>>> visible and which can be optimised away. >>>>>> >>>>>> That said, changing the test allows us to defer having to reach >>>>>> that consensus. >>>>> Agreed. I think it's ok to work around the test issue as long as >>>>> we keep this overall issue on the radar. Do we have a bug field >>>>> for that? >>>> >>>> I thought, it is a little bit early to file a bug for it. >>>> Also, probably, it can be an umbrella enhancement or task. >>>> >>>> Thanks, >>>> Serguei >>>> >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>>> >>>>>> David >>>>>> ----- >>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote: >>>>>>>> Hi Alex, >>>>>>>> >>>>>>>> The fix itself looks Okay. >>>>>>>> Minor: replace in the comment: "compiler don't drop" => >>>>>>>> "compiler doesn't drop". >>>>>>>> >>>>>>>> However, we still have to reach a consensus on how we treat >>>>>>>> this issue (as Chris already commented). >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>> On 11/8/19 15:22, Alex Menkov wrote: >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> Please review the fix for >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8215196 >>>>>>>>> webrev: >>>>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/ >>>>>>>>> >>>>>>>>> Currently PopFrame is disabled with JVMCI by [1], so for >>>>>>>>> testing I reverted [1] changes. 
>>>>>>>>> >>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025 >>>>>>>>> >>>>>>>>> --alex >>>>>>>> >>>>>>> >>>>> >>>>> >>>> >>> >>> >> > From chris.plummer at oracle.com Sat Dec 7 05:28:42 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 6 Dec 2019 21:28:42 -0800 Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values" In-Reply-To: <3be1587b-c0cd-5f1e-dee2-072c341e02e6@oracle.com> References: <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com> <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com> <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com> <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com> <444ed938-d873-12fc-a55e-2d645a099260@oracle.com> <3be1587b-c0cd-5f1e-dee2-072c341e02e6@oracle.com> Message-ID: <84d715ed-6b54-1456-0b2b-0b7291e01698@oracle.com> On 12/6/19 3:26 PM, serguei.spitsyn at oracle.com wrote: > On 12/6/19 13:52, Chris Plummer wrote: >> On 12/6/19 1:18 PM, serguei.spitsyn at oracle.com wrote: >>> On 12/6/19 11:07, Chris Plummer wrote: >>>> On 12/5/19 6:45 PM, David Holmes wrote: >>>>> Hi Serguei, >>>>> >>>>> On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote: >>>>>> Hi Chris and Alex, >>>>>> >>>>>> (I've also included Dan, David and Dean to the mailing list) >>>>>> >>>>>> We have to reach a consensus about this. >>>>> >>>>> This is just part of a much broader issue with JVM TI that I tried >>>>> to have a discussion started based on Richard Reingruber's >>>>> proposals around Escape Analysis: >>>>> >>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html >>>>> >>>>> >>>>> Unfortunately that discussion did not get much traction. >>>> Hmm. I have the emails that precede yours above, but not that one. >>>> Not sure how what happened. Just read through it and it did give me >>>> one thought. 
>>> >>>> Consider a model where the program is designed drive behavior of >>>> the agent, triggering the agent to do certain things by having the >>>> program do certain things. Normally an agent monitors the >>>> application, but in this case the application is purposefully >>>> controlling actions performed by the agent. If code is elided from >>>> the program, then the agent no longer performs as expected. It's a >>>> kind of backwards jvmti programming model, and you may ask why >>>> would anyone do this. I'm not sure if there's a good reason for it, >>>> but should it be expected to work given how the spec is written? >>> >>> My interpretation is that the current JVM TI PopFrame behavior does >>> not break this model. >>> The spec says: "any changes to the arguments, which occurred in the >>> called method, remain;" >>> As the code was eliminated by the compiler then no changes to this >>> argument occurred. >>> So, the PopFrame behavior follows the spec. So, I think, the option >>> #2 is not right. But it depends on our basic philosophy. >>> If the developer wants to control the agent then the program has to >>> be designed to do something meaningful that is not going to be >>> optimized out by the JIT compiler. >> You misunderstood my point. What I'm saying is that someone might do >> something like assign to a local with the specific intent of having >> that trigger a jmvti event, with the specific intent of having the >> agent perform some expected action as a result. Think of it as being >> a trigger for the agent, not as the agent monitoring the app. For >> example, you could right a program + agent, and setting a specific >> local in the program triggers the agent to turn on a light, and >> setting some other local turns it off. Absurd, but possible, and >> maybe there are less absurd applications. > > I think, I understood your point correctly. > Your point is that the code that can be eliminated (e.g. 
local++) is > not that meaningless as it seems to be. > My point is that there are still other more reliable ways to trigger > the agent. > So that relying on something that can be eliminated by JIT compilers > is not important to support. > Yes, I wasn't trying to imply that it is important. However, the spec should be clear about it. Chris > Thanks, > Serguei > >> Chris >>> >>>>> >>>>>> We have 3 options: >>>>>> >>>>>> Option #1: >>>>>> ?? The JIT optimization to delete a code which "looks useless" >>>>>> ?? has to be disabled if can_pop_frame capability is enabled. >>>>>> ?? Than this problem becomes a JIT compiler bug. >>>>>> >>>>>> Option #2: >>>>>> ?? Consider to relax the JVMTI PopFrame spec by changing it to >>>>>> something like: >>>>>> ?? "Note however, that the original argument values are not >>>>>> ??? preserved and can be changed by the called method;" >>>>>> ?? Than this problem becomes a JVM TI spec bug. >>>>>> >>>>>> Option #3: >>>>>> ?? Consider it is Okay for compiler to eliminate useless code, >>>>>> ?? so the argument values can be reinitialized by the PopFrame. >>>>>> ?? Than this problem becomes just a test bug. >>>>>> >>>>>> >>>>>> My preference is option #3. >>>>>> The point is that if the arguments are not really used in >>>>>> a method then restoring them to any values is a no-op. >>>>>> It is really meaningless use case, so why should we care about it. >>>> Is "restoring" the proper term here? I thought they were just left >>>> on the stack and reused on the subsequent invoke. >>> >>> Agreed. The term "restoring" is not accurate here. >>> >>>> In fact I figured the reason for the language in the spec in the >>>> first place is to alleviate JVMTI from having to restore them to >>>> their original values, which is probably not even possible. >>> >>> Right. >>> >>>>> >>>>> Thanks for setting that out clearly. >>>>> >>>>> I'd like to agree this is particular case is a test bug. 
If we >>>>> have a method: >>>>> >>>>> int incr(int val) { >>>>> ? val++; >>>>> ? popFrameHere(); >>>>> ? return val; >>>>> } >>>>> >>>>> then the change to the argument is necessary and must be >>>>> preserved. In contrast: >>>>> >>>>> void incr(int val) { >>>>> ? val++; >>>>> ? popFrameHere(); >>>>> } >>>>> >>>>> the change to the argument is meaningless and I would hope any >>>>> decent JIT would simply elide it. >>>> So, this goes back to my example above where the program is trying >>>> to elicit behavior from the agent. It's not meaningless in that >>>> case, but that doesn't mean I think we need to support it. >>> >>> Even with this model it is possible and better to do something >>> meaningful to control the agent. >>> This model is very rare use case. >>> It is hard to justify a need to support it. :) >>> >>>>> >>>>> But we must have a consistent approach to such things. What would >>>>> happen if a breakpoint were to be placed on the instruction that >>>>> uselessly modified the argument - would we still see the >>>>> modification or would it be elided? >>>> Breakpoints force interpreted mode for the method, although I >>>> suppose that's a hotspot implementation detail and not something a >>>> VM would be required to do. A VM that allows breakpoints in >>>> compiled methods has the potential to miss the breakpoint if code >>>> is elided. >>>> >>>> Also, what if you put a breakpoint in a method, the call to it is >>>> elided. You would never hit the breakpoint. That could cause some >>>> serious head scratching for a debugger user if they know the code >>>> doing the method call is "executed". >>> >>> If the method is not actually being called then missing breakpoints >>> there gives a clue what is going on. >>> Otherwise, it will cause cause some serious head scratching for a >>> debugger user. >>> In general, my preference would be to debug actual behavior. >>> It is not good we have no support breakpoints in compiled methods. 
>>> >>> >>>>> >>>>> And how do C1 and C2 avoid this issue? Do they simply not optimise >>>>> away the useless assignment? Or do they actively disable that >>>>> optimization in this context? >>>>> >>>>> We need, IMO, to establish the basic philosophy of how to manage >>>>> JVM TI / JIT interactions, so we know what things must remain >>>>> visible and which can be optimised away. >>>>> >>>>> That said, changing the test allows us to defer having to reach >>>>> that consensus. >>>> Agreed. I think it's ok to work around the test issue as long as we >>>> keep this overall issue on the radar. Do we have a bug field for that? >>> >>> I thought, it is a little bit early to file a bug for it. >>> Also, probably, it can be an umbrella enhancement or task. >>> >>> Thanks, >>> Serguei >>> >>>> >>>> thanks, >>>> >>>> Chris >>>>> >>>>> David >>>>> ----- >>>>> >>>>>> Thanks, >>>>>> Serguei >>>>>> >>>>>> >>>>>> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote: >>>>>>> Hi Alex, >>>>>>> >>>>>>> The fix itself looks Okay. >>>>>>> Minor: replace in the comment: "compiler don't drop" => >>>>>>> "compiler doesn't drop". >>>>>>> >>>>>>> However, we still have to reach a consensus on how we treat this >>>>>>> issue (as Chris already commented). >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> On 11/8/19 15:22, Alex Menkov wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> Please review the fix for >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8215196 >>>>>>>> webrev: >>>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/ >>>>>>>> >>>>>>>> Currently PopFrame is disabled with JVMCI by [1], so for >>>>>>>> testing I reverted [1] changes. 
>>>>>>>> >>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025 >>>>>>>> >>>>>>>> --alex >>>>>>> >>>>>> >>>> >>>> >>> >> >> > From chris.plummer at oracle.com Sat Dec 7 05:31:57 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 6 Dec 2019 21:31:57 -0800 Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values" In-Reply-To: <6987c267-970c-d2e1-ce70-b7f18e9ff329@oracle.com> References: <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com> <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com> <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com> <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com> <444ed938-d873-12fc-a55e-2d645a099260@oracle.com> <3be1587b-c0cd-5f1e-dee2-072c341e02e6@oracle.com> <1994d5d6-383b-0abb-3203-ab42da9ab81c@oracle.com> <6987c267-970c-d2e1-ce70-b7f18e9ff329@oracle.com> Message-ID: On 12/6/19 6:12 PM, serguei.spitsyn at oracle.com wrote: > On 12/6/19 17:24, Daniel D. Daugherty wrote: >> On 12/6/19 6:26 PM, serguei.spitsyn at oracle.com wrote: >>> On 12/6/19 13:52, Chris Plummer wrote: >>>> On 12/6/19 1:18 PM, serguei.spitsyn at oracle.com wrote: >>>>> On 12/6/19 11:07, Chris Plummer wrote: >>>>>> On 12/5/19 6:45 PM, David Holmes wrote: >>>>>>> Hi Serguei, >>>>>>> >>>>>>> On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote: >>>>>>>> Hi Chris and Alex, >>>>>>>> >>>>>>>> (I've also included Dan, David and Dean to the mailing list) >>>>>>>> >>>>>>>> We have to reach a consensus about this. >>>>>>> >>>>>>> This is just part of a much broader issue with JVM TI that I >>>>>>> tried to have a discussion started based on Richard Reingruber's >>>>>>> proposals around Escape Analysis: >>>>>>> >>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html >>>>>>> >>>>>>> >>>>>>> Unfortunately that discussion did not get much traction. >>>>>> Hmm. 
I have the emails that precede yours above, but not that >>>>>> one. Not sure how what happened. Just read through it and it did >>>>>> give me one thought. >>>>> >>>>>> Consider a model where the program is designed drive behavior of >>>>>> the agent, triggering the agent to do certain things by having >>>>>> the program do certain things. Normally an agent monitors the >>>>>> application, but in this case the application is purposefully >>>>>> controlling actions performed by the agent. If code is elided >>>>>> from the program, then the agent no longer performs as expected. >>>>>> It's a kind of backwards jvmti programming model, and you may ask >>>>>> why would anyone do this. I'm not sure if there's a good reason >>>>>> for it, but should it be expected to work given how the spec is >>>>>> written? >>>>> >>>>> My interpretation is that the current JVM TI PopFrame behavior >>>>> does not break this model. >>>>> The spec says: "any changes to the arguments, which occurred in >>>>> the called method, remain;" >>>>> As the code was eliminated by the compiler then no changes to this >>>>> argument occurred. >>>>> So, the PopFrame behavior follows the spec. So, I think, the >>>>> option #2 is not right. But it depends on our basic philosophy. >>>>> If the developer wants to control the agent then the program has >>>>> to be designed to do something meaningful that is not going to be >>>>> optimized out by the JIT compiler. >>>> You misunderstood my point. What I'm saying is that someone might >>>> do something like assign to a local with the specific intent of >>>> having that trigger a jmvti event, with the specific intent of >>>> having the agent perform some expected action as a result. Think of >>>> it as being a trigger for the agent, not as the agent monitoring >>>> the app. For example, you could right a program + agent, and >>>> setting a specific local in the program triggers the agent to turn >>>> on a light, and setting some other local turns it off. 
Absurd, but >>>> possible, and maybe there are less absurd applications. >>> >>> I think, I understood your point correctly. >>> Your point is that the code that can be eliminated (e.g. local++) is >>> not that meaningless as it seems to be. >>> My point is that there are still other more reliable ways to trigger >>> the agent. >>> So that relying on something that can be eliminated by JIT compilers >>> is not important to support. >> >> You are making the assumption that the agent author understands what >> Java code/variables *might* be eliminated by the JIT compiler. I don't >> think that's a good assumption. I might have code that does a really >> complicated thing in a local variable that is only useful to the >> agent itself. The JIT will see that the local variable cannot escape >> the function and is not used outside the function (as far as it can >> see) so it will elide the local variable and the code that was used >> to generated the local result in the variable. >> >> If that local result happens to be some computation that the agent >> needed to see to do its next operation... > > Thank you for sharing your point. > I'm not insisting on my assumptions here, just not sure this is more > important than allowing optimizations. > Do you actually think this use case needs to be supported? > > In general, to identify our philosophy about interaction between JIT > compiler > code elimination and JVM TI we need to make some assumptions. > > > Let's temporarily put JVM TI out of scope. > Are there any assumptions when JIT compilers eliminate some code? > Is it based on some vision what code is observable? > If it was decided some code can be eliminated then is it JVM TI only > that breaks such assumptions about observability? > > If so, then such optimizations can be disabled at some level. > Then we end up debugging/profiling/monitoring, and finally, observing > a slightly different application. > Are we Okay with this? Do we need any compromises here? 
> Maybe we need more flags to control the JIT compiler behavior. > Dan is restating the point I was making, but I also agree that unless someone can show us a useful application of that kind of use of jvmti events, I don't think we need to support it. We do need to clarify it in the spec however. Chris > Thanks, > Serguei > >> >> Dan >> >> >>> >>> Thanks, >>> Serguei >>> >>>> Chris >>>>> >>>>>>> >>>>>>>> We have 3 options: >>>>>>>> >>>>>>>> Option #1: >>>>>>>> ?? The JIT optimization to delete a code which "looks useless" >>>>>>>> ?? has to be disabled if can_pop_frame capability is enabled. >>>>>>>> ?? Than this problem becomes a JIT compiler bug. >>>>>>>> >>>>>>>> Option #2: >>>>>>>> ?? Consider to relax the JVMTI PopFrame spec by changing it to >>>>>>>> something like: >>>>>>>> ?? "Note however, that the original argument values are not >>>>>>>> ??? preserved and can be changed by the called method;" >>>>>>>> ?? Than this problem becomes a JVM TI spec bug. >>>>>>>> >>>>>>>> Option #3: >>>>>>>> ?? Consider it is Okay for compiler to eliminate useless code, >>>>>>>> ?? so the argument values can be reinitialized by the PopFrame. >>>>>>>> ?? Than this problem becomes just a test bug. >>>>>>>> >>>>>>>> >>>>>>>> My preference is option #3. >>>>>>>> The point is that if the arguments are not really used in >>>>>>>> a method then restoring them to any values is a no-op. >>>>>>>> It is really meaningless use case, so why should we care about it. >>>>>> Is "restoring" the proper term here? I thought they were just >>>>>> left on the stack and reused on the subsequent invoke. >>>>> >>>>> Agreed. The term "restoring" is not accurate here. >>>>> >>>>>> In fact I figured the reason for the language in the spec in the >>>>>> first place is to alleviate JVMTI from having to restore them to >>>>>> their original values, which is probably not even possible. >>>>> >>>>> Right. >>>>> >>>>>>> >>>>>>> Thanks for setting that out clearly. 
>>>>>>> >>>>>>> I'd like to agree this is particular case is a test bug. If we >>>>>>> have a method: >>>>>>> >>>>>>> int incr(int val) { >>>>>>> ? val++; >>>>>>> ? popFrameHere(); >>>>>>> ? return val; >>>>>>> } >>>>>>> >>>>>>> then the change to the argument is necessary and must be >>>>>>> preserved. In contrast: >>>>>>> >>>>>>> void incr(int val) { >>>>>>> ? val++; >>>>>>> ? popFrameHere(); >>>>>>> } >>>>>>> >>>>>>> the change to the argument is meaningless and I would hope any >>>>>>> decent JIT would simply elide it. >>>>>> So, this goes back to my example above where the program is >>>>>> trying to elicit behavior from the agent. It's not meaningless in >>>>>> that case, but that doesn't mean I think we need to support it. >>>>> >>>>> Even with this model it is possible and better to do something >>>>> meaningful to control the agent. >>>>> This model is very rare use case. >>>>> It is hard to justify a need to support it. :) >>>>> >>>>>>> >>>>>>> But we must have a consistent approach to such things. What >>>>>>> would happen if a breakpoint were to be placed on the >>>>>>> instruction that uselessly modified the argument - would we >>>>>>> still see the modification or would it be elided? >>>>>> Breakpoints force interpreted mode for the method, although I >>>>>> suppose that's a hotspot implementation detail and not something >>>>>> a VM would be required to do. A VM that allows breakpoints in >>>>>> compiled methods has the potential to miss the breakpoint if code >>>>>> is elided. >>>>>> >>>>>> Also, what if you put a breakpoint in a method, the call to it is >>>>>> elided. You would never hit the breakpoint. That could cause some >>>>>> serious head scratching for a debugger user if they know the code >>>>>> doing the method call is "executed". >>>>> >>>>> If the method is not actually being called then missing >>>>> breakpoints there gives a clue what is going on. 
>>>>> Otherwise, it will cause cause some serious head scratching for a >>>>> debugger user. >>>>> In general, my preference would be to debug actual behavior. >>>>> It is not good we have no support breakpoints in compiled methods. >>>>> >>>>> >>>>>>> >>>>>>> And how do C1 and C2 avoid this issue? Do they simply not >>>>>>> optimise away the useless assignment? Or do they actively >>>>>>> disable that optimization in this context? >>>>>>> >>>>>>> We need, IMO, to establish the basic philosophy of how to manage >>>>>>> JVM TI / JIT interactions, so we know what things must remain >>>>>>> visible and which can be optimised away. >>>>>>> >>>>>>> That said, changing the test allows us to defer having to reach >>>>>>> that consensus. >>>>>> Agreed. I think it's ok to work around the test issue as long as >>>>>> we keep this overall issue on the radar. Do we have a bug field >>>>>> for that? >>>>> >>>>> I thought, it is a little bit early to file a bug for it. >>>>> Also, probably, it can be an umbrella enhancement or task. >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>>> >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>>> Thanks, >>>>>>>> Serguei >>>>>>>> >>>>>>>> >>>>>>>> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote: >>>>>>>>> Hi Alex, >>>>>>>>> >>>>>>>>> The fix itself looks Okay. >>>>>>>>> Minor: replace in the comment: "compiler don't drop" => >>>>>>>>> "compiler doesn't drop". >>>>>>>>> >>>>>>>>> However, we still have to reach a consensus on how we treat >>>>>>>>> this issue (as Chris already commented). 
>>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>> >>>>>>>>> On 11/8/19 15:22, Alex Menkov wrote: >>>>>>>>>> Hi all, >>>>>>>>>> >>>>>>>>>> Please review the fix for >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8215196 >>>>>>>>>> webrev: >>>>>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/ >>>>>>>>>> >>>>>>>>>> Currently PopFrame is disabled with JVMCI by [1], so for >>>>>>>>>> testing I reverted [1] changes. >>>>>>>>>> >>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025 >>>>>>>>>> >>>>>>>>>> --alex >>>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>>> >>>> >>>> >>> >> > From leonid.mesnik at oracle.com Sun Dec 8 02:17:00 2019 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Sat, 7 Dec 2019 18:17:00 -0800 Subject: RFR: 8235530: Removed duplicated threadByName methods in nsk/jdi tests Message-ID: <5ac9ca0e-d756-3340-597e-2a03e6e6fa24@oracle.com> Hi Could you please review following fix which just remove duplicated threadByName methods and JDITestRuntimeException exceptions in nsk/jdi tests. I don't see any reason to have so many copies of them. The method threadByName is added nsk.share.jdi.Debugee class as 'threadByNameOrThrow' because slightly different 'threadByName' already exist there. I filed another sub-task https://bugs.openjdk.java.net/browse/JDK-8235544 to review usage and merge these 2 methods later. This fix affects about ~4000 lines and I want to keep it as straight-forward as possible. 
webrev: http://cr.openjdk.java.net/~lmesnik/8235530/webrev.00/ bug: https://bugs.openjdk.java.net/browse/JDK-8235530 The next planned steps are in: https://bugs.openjdk.java.net/browse/JDK-8233830 Leonid From serguei.spitsyn at oracle.com Sun Dec 8 04:30:33 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Sat, 7 Dec 2019 20:30:33 -0800 Subject: RFR: 8235530: Removed duplicated threadByName methods in nsk/jdi tests In-Reply-To: <5ac9ca0e-d756-3340-597e-2a03e6e6fa24@oracle.com> References: <5ac9ca0e-d756-3340-597e-2a03e6e6fa24@oracle.com> Message-ID: Hi Leonid, The fix looks good. Thank you for taking care about it! I agree, it is an awful duplication. Thanks, Serguei On 12/7/19 18:17, Leonid Mesnik wrote: > Hi > > Could you please review following fix which just remove duplicated > threadByName methods and JDITestRuntimeException exceptions in nsk/jdi > tests. I don't see any reason to have so many copies of them. > > The method threadByName is added nsk.share.jdi.Debugee class as > 'threadByNameOrThrow' because slightly different 'threadByName' > already exist there. I filed another sub-task > https://bugs.openjdk.java.net/browse/JDK-8235544 to review usage and > merge these 2 methods later. > > This fix affects about ~4000 lines and I want to keep it as > straight-forward as possible. 
> > webrev: http://cr.openjdk.java.net/~lmesnik/8235530/webrev.00/ > > bug: https://bugs.openjdk.java.net/browse/JDK-8235530 > > The next planned steps are in: > > https://bugs.openjdk.java.net/browse/JDK-8233830 > > Leonid > From david.holmes at oracle.com Sun Dec 8 05:19:36 2019 From: david.holmes at oracle.com (David Holmes) Date: Sun, 8 Dec 2019 15:19:36 +1000 Subject: RFR: 8235530: Removed duplicated threadByName methods in nsk/jdi tests In-Reply-To: References: <5ac9ca0e-d756-3340-597e-2a03e6e6fa24@oracle.com> Message-ID: <1a9d76e5-3a73-9804-be3f-93d6864465be@oracle.com> +1 on both counts Not sure JDITestRuntimeException is really necessary/useful versus just using RuntimeException, but that's a different issue. Thanks, David On 8/12/2019 2:30 pm, serguei.spitsyn at oracle.com wrote: > Hi Leonid, > > The fix looks good. > > Thank you for taking care about it! > I agree, it is an awful duplication. > > Thanks, > Serguei > > > On 12/7/19 18:17, Leonid Mesnik wrote: >> Hi >> >> Could you please review following fix which just remove duplicated >> threadByName methods and JDITestRuntimeException exceptions in nsk/jdi >> tests. I don't see any reason to have so many copies of them. >> >> The method threadByName is added nsk.share.jdi.Debugee class as >> 'threadByNameOrThrow' because slightly different 'threadByName' >> already exist there. I filed another sub-task >> https://bugs.openjdk.java.net/browse/JDK-8235544 to review usage and >> merge these 2 methods later. >> >> This fix affects about ~4000 lines and I want to keep it as >> straight-forward as possible. 
>> >> webrev: http://cr.openjdk.java.net/~lmesnik/8235530/webrev.00/ >> >> bug: https://bugs.openjdk.java.net/browse/JDK-8235530 >> >> The next planned steps are in: >> >> https://bugs.openjdk.java.net/browse/JDK-8233830 >> >> Leonid >> > From david.holmes at oracle.com Mon Dec 9 04:49:11 2019 From: david.holmes at oracle.com (David Holmes) Date: Mon, 9 Dec 2019 14:49:11 +1000 Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> Message-ID: <0a412f63-1583-6789-9afa-cebbc968e7c4@oracle.com> Hi Daniil, On 7/12/2019 11:41 am, Daniil Titov wrote: > Hi David, Mandy, and Bob, > > Thank you for reviewing this fix. > > Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04) > 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded. Okay. > 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize > was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read. Okay. > I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances, > but I agree that the changes proposed in the previous version of the webrev increase such probability. > I filed the follow-up issue [4] as Mandy suggested. I added a comment to the bug. This is potentially a difficult problem to resolve - it all depends on the likelihood of any errors and what they really indicate. > 3. The legacy methods were renamed as David suggested. Thanks! 
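The fallback described in item 2 above (return the host's value when the container metric is unavailable or cannot be read) can be sketched roughly as follows. This is only an illustration, not the webrev code: the class name HostFallback, the helper containerOrHost, and the rule that any value <= 0 means "unavailable" are assumptions made for the example.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch of "fall back to the host value"; not the code under review.
class HostFallback {

    // Assumption: a container metric <= 0 means the value was unavailable
    // or could not be read, so the host value is reported instead.
    static long containerOrHost(long containerValue, LongSupplier hostValue) {
        return containerValue > 0 ? containerValue : hostValue.getAsLong();
    }

    public static void main(String[] args) {
        long hostTotal = 16L * 1024 * 1024 * 1024; // made-up host memory size

        // Container limit was read successfully: report the container value.
        System.out.println(containerOrHost(512L * 1024 * 1024, () -> hostTotal));

        // Container metric reported 0 (read failed): report the host value.
        System.out.println(containerOrHost(0L, () -> hostTotal));
    }
}
```

With this shape the caller never sees -1 from the container path, which is the compatibility property discussed in the thread. Whether an "unlimited" container value should also map to the host value is left open here.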
> >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c >> ! static int initialized=1; >> >> Am I reading this right that the code currently fails to actually do the >> initialization because of this ??? > > Yes, currently the code fails to do the initialization but it was unnoticed since method > get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which" > was always -1. So we never try to access the uninitialized counters.cpus array which is good but we still return garbage for counters.jvmTicks and counters.cpuTicks - surely that should have been noticeable? >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >> >> System.out.println(String.format(...) >> >> Why not simply >> >> System.out.printf(..) > > As I tried explain it earlier it would make the tests unstable. > System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically. > Instead it parses the format string into a list of FormatString objects and then iterates over the list. > As a result, the other traces occasionally got printed between these iterations and break the pattern the test is expected to find > in the output. Sorry I missed the earlier explanation. I find it somewhat surprising that format() works that way, but without unlimited buffering there will always be a need to flush the outputstream at some point. Thanks, David ----- > For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:" > and "1030762496". 
> > > [0.304s][trace][os,container] Memory Usage is: 42983424 > OperatingSystemMXBean.getFreeMemorySize: 1030758400 > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes > [0.305s][trace][os,container] Memory Usage is: 42979328 > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes > OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232 > 1030762496 > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176 > > > java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr > > at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306) > at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151) > at TestMemoryAwareness.main(TestMemoryAwareness.java:73) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:564) > at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) > at java.base/java.lang.Thread.run(Thread.java:832) > > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. 
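The workaround described above, building the whole line first and handing it to println in one call, might look like the sketch below. The class name MetricLine and the helper formatLine are made up for the example. It relies on Daniil's explanation that a pre-built String is less likely to be split by other trace output than a format(...) call that emits the line in pieces.

```java
// Hypothetical sketch; the names are illustrative, not taken from the test.
class MetricLine {

    // Build the complete output line up front so it exists as one String.
    static String formatLine(String name, long value) {
        return String.format("%s: %d", name, value);
    }

    public static void main(String[] args) {
        // One println call with the finished String, instead of printf/format.
        System.out.println(
            formatLine("OperatingSystemMXBean.getFreePhysicalMemorySize", 1030762496L));
    }
}
```

The finished line is what the test's pattern 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' expects to find on a single line.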
> > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 > [4] https://bugs.openjdk.java.net/browse/JDK-8235522 > > Thank you, > Daniil > > ?On 12/6/19, 1:38 PM, "Mandy Chung" wrote: > > > > On 12/6/19 5:59 AM, Bob Vandette wrote: > >> On Dec 6, 2019, at 2:49 AM, David Holmes wrote: > >> > >> > >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java > >> > >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. > > I thought that the error case we are referring to is limit == 0 which > indicates something unexpected goes wrong. So the compatibility concern > should be low. This is very specific to Metrics implementation for > cgroup v1 and let me know if I'm wrong. > > >> Surely there must always be some information available from the operating environment? I see from the impl file: > >> > >> // the host data, value 0 indicates that something went wrong while the metric was read and > >> // in this case we return "information unavailable" code -1. > >> > >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO. > > I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods. 
> > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others > > are, I?d suggest we keep Daniil?s original logic to fall back to the host value since a disabled subsystem is equivalent to no > > limits. > > > > It's important to consider carefully if the monitoring API indicates an > error vs unavailable and an application should continue to run when the > monitoring system fails to get the metrics. > > There are several choices to report "something goes wrong" scenarios > (should unlikely happen???): > 1. fall back to a random positive value (e.g. host value) > 2. return a negative value > 3. throw an exception > > #3 is not an option as the application is not expecting this. For #2, > the application can filter bad values if desirable. > > I'm okay if you want to file a JBS issue to follow up and thoroughly > look at the cases that the metrics are unavailable and the cases when > fails to obtain. > > >> --- > >> > >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java > >> > >> System.out.println(String.format(...) > >> > >> Why not simply > >> > >> System.out.printf(..) > >> > >> ? > > or simply (as I commented [1]) > System.out.format > > Mandy > [1] > https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html > > > > From ralf.schmelter at sap.com Mon Dec 9 09:01:08 2019 From: ralf.schmelter at sap.com (Schmelter, Ralf) Date: Mon, 9 Dec 2019 09:01:08 +0000 Subject: RFR (M) 8234510: Remove file seeking requirement for writing a heap dump In-Reply-To: References: Message-ID: Hi Thomas, thanks for the feedback. > In DumpWriter, _current_entry_left and _entry_ended seem only to be needed for > asserting. Please enclose their definition in DEBUG_ONLY, and initialize them in the ctor. Good catch. I made them debug only. 
> (not your patch): since DumperSupport::dump_class_and_array_classes(Klass*)
> should assert that Klass* is an InstanceKlass; or, even better, use InstanceKlass*
> as parameter.

A former version of the patch had a lot of the Klass* types replaced by InstanceKlass* where appropriate. I ultimately removed those changes because they did not have much to do with making the heap dump streamable. But a later patch could change this.

> DumpWriter::start_dump_entry(): It took me a while to understand how the
> segment size is updated if the entry is huge, since by the time we finish the
> entry the segment header will already be flushed out. The answer is, I think,
> that this is not needed since we only write one record so the initial size we wrote
> into the segment header is still valid.

Correct.

> Proposed comment change:
>
> -// Will be fixed up later if we can add more entries.
> +// Seed segment size with size of its first record. Should we add more records later,
> we will update the segment size (see finish_dump_segment())

I've changed the comment to make it clearer.

Best regards,
Ralf

From bob.vandette at oracle.com Mon Dec 9 15:17:33 2019
From: bob.vandette at oracle.com (Bob Vandette)
Date: Mon, 9 Dec 2019 10:17:33 -0500
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To:
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
Message-ID:

Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException so it's consistent with the other get functions?

Bob.
> On Dec 6, 2019, at 8:41 PM, Daniil Titov wrote: > > Hi David, Mandy, and Bob, > > Thank you for reviewing this fix. > > Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04) > 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded. > 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize > was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read. > I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances, > but I agree that the changes proposed in the previous version of the webrev increase such probability. > I filed the follow-up issue [4] as Mandy suggested. > 3. The legacy methods were renamed as David suggested. > > >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c >> ! static int initialized=1; >> >> Am I reading this right that the code currently fails to actually do the >> initialization because of this ??? > > Yes, currently the code fails to do the initialization but it was unnoticed since method > get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which" > was always -1. > >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >> >> System.out.println(String.format(...) >> >> Why not simply >> >> System.out.printf(..) > > As I tried explain it earlier it would make the tests unstable. > System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically. > Instead it parses the format string into a list of FormatString objects and then iterates over the list. > As a result, the other traces occasionally got printed between these iterations and break the pattern the test is expected to find > in the output. 
> > For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:" > and "1030762496". > > > [0.304s][trace][os,container] Memory Usage is: 42983424 > OperatingSystemMXBean.getFreeMemorySize: 1030758400 > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes > [0.305s][trace][os,container] Memory Usage is: 42979328 > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes > OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232 > 1030762496 > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176 > > > java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr > > at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306) > at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151) > at TestMemoryAwareness.main(TestMemoryAwareness.java:73) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:564) > at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) > at java.base/java.lang.Thread.run(Thread.java:832) > > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. 
> > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 > [4] https://bugs.openjdk.java.net/browse/JDK-8235522 > > Thank you, > Daniil > > ?On 12/6/19, 1:38 PM, "Mandy Chung" wrote: > > > > On 12/6/19 5:59 AM, Bob Vandette wrote: >>> On Dec 6, 2019, at 2:49 AM, David Holmes wrote: >>> >>> >>> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java >>> >>> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. > > I thought that the error case we are referring to is limit == 0 which > indicates something unexpected goes wrong. So the compatibility concern > should be low. This is very specific to Metrics implementation for > cgroup v1 and let me know if I'm wrong. > >>> Surely there must always be some information available from the operating environment? I see from the impl file: >>> >>> // the host data, value 0 indicates that something went wrong while the metric was read and >>> // in this case we return "information unavailable" code -1. >>> >>> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO. >> I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods. 
>> Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others >> are, I?d suggest we keep Daniil?s original logic to fall back to the host value since a disabled subsystem is equivalent to no >> limits. >> > > It's important to consider carefully if the monitoring API indicates an > error vs unavailable and an application should continue to run when the > monitoring system fails to get the metrics. > > There are several choices to report "something goes wrong" scenarios > (should unlikely happen???): > 1. fall back to a random positive value (e.g. host value) > 2. return a negative value > 3. throw an exception > > #3 is not an option as the application is not expecting this. For #2, > the application can filter bad values if desirable. > > I'm okay if you want to file a JBS issue to follow up and thoroughly > look at the cases that the metrics are unavailable and the cases when > fails to obtain. > >>> --- >>> >>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >>> >>> System.out.println(String.format(...) >>> >>> Why not simply >>> >>> System.out.printf(..) >>> >>> ? > > or simply (as I commented [1]) > System.out.format > > Mandy > [1] > https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html > > > > From mandy.chung at oracle.com Mon Dec 9 17:48:51 2019 From: mandy.chung at oracle.com (Mandy Chung) Date: Mon, 9 Dec 2019 09:48:51 -0800 Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> Message-ID: <95ab70c9-8be4-13cf-ef90-f36b2e993450@oracle.com> Files:lines requires FilePermission check.? 
So it needs to be wrapped with doPrivileged. readFilePrivileged can unwrap and rethrow the cause instead, like this:

    static Stream<String> readFilePrivileged(Path path) throws IOException {
        try {
            return AccessController.doPrivileged((PrivilegedExceptionAction<Stream<String>>) () -> Files.lines(path));
        } catch (PrivilegedActionException e) {
            Throwable x = e.getCause();
            if (x instanceof IOException)
                throw (IOException)x;
            if (x instanceof RuntimeException)
                throw (RuntimeException)x;
            if (x instanceof Error)
                throw (Error)x;
            throw new InternalError(x);
        }
    }

On 12/9/19 7:17 AM, Bob Vandette wrote:
> Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException
> so it's consistent with the other get functions?
>
> Bob.
>
>
>> On Dec 6, 2019, at 8:41 PM, Daniil Titov wrote:
>>
>> Hi David, Mandy, and Bob,
>>
>> Thank you for reviewing this fix.
>>
>> Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04)
>> 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded.
>> 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
>> was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read.
>> I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances,
>> but I agree that the changes proposed in the previous version of the webrev increase such probability.
>> I filed the follow-up issue [4] as Mandy suggested.
>> 3. The legacy methods were renamed as David suggested.
>>
>>
>>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
>>> !
static int initialized=1; >>> >>> Am I reading this right that the code currently fails to actually do the >>> initialization because of this ??? >> Yes, currently the code fails to do the initialization but it was unnoticed since method >> get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which" >> was always -1. >> >>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >>> >>> System.out.println(String.format(...) >>> >>> Why not simply >>> >>> System.out.printf(..) >> As I tried explain it earlier it would make the tests unstable. >> System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically. >> Instead it parses the format string into a list of FormatString objects and then iterates over the list. >> As a result, the other traces occasionally got printed between these iterations and break the pattern the test is expected to find >> in the output. >> >> For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:" >> and "1030762496". 
>> >> >> [0.304s][trace][os,container] Memory Usage is: 42983424 >> OperatingSystemMXBean.getFreeMemorySize: 1030758400 >> [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes >> [0.305s][trace][os,container] Memory Usage is: 42979328 >> [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes >> OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232 >> 1030762496 >> OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176 >> >> >> java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr >> >> at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306) >> at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151) >> at TestMemoryAwareness.main(TestMemoryAwareness.java:73) >> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> at java.base/java.lang.reflect.Method.invoke(Method.java:564) >> at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) >> at java.base/java.lang.Thread.run(Thread.java:832) >> >> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. 
>> >> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 >> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 >> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 >> [4] https://bugs.openjdk.java.net/browse/JDK-8235522 >> >> Thank you, >> Daniil >> >> ?On 12/6/19, 1:38 PM, "Mandy Chung" wrote: >> >> >> >> On 12/6/19 5:59 AM, Bob Vandette wrote: >>>> On Dec 6, 2019, at 2:49 AM, David Holmes wrote: >>>> >>>> >>>> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java >>>> >>>> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. >> I thought that the error case we are referring to is limit == 0 which >> indicates something unexpected goes wrong. So the compatibility concern >> should be low. This is very specific to Metrics implementation for >> cgroup v1 and let me know if I'm wrong. >> >>>> Surely there must always be some information available from the operating environment? I see from the impl file: >>>> >>>> // the host data, value 0 indicates that something went wrong while the metric was read and >>>> // in this case we return "information unavailable" code -1. >>>> >>>> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO. >>> I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods. 
>>> Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others >>> are, I?d suggest we keep Daniil?s original logic to fall back to the host value since a disabled subsystem is equivalent to no >>> limits. >>> >> It's important to consider carefully if the monitoring API indicates an >> error vs unavailable and an application should continue to run when the >> monitoring system fails to get the metrics. >> >> There are several choices to report "something goes wrong" scenarios >> (should unlikely happen???): >> 1. fall back to a random positive value (e.g. host value) >> 2. return a negative value >> 3. throw an exception >> >> #3 is not an option as the application is not expecting this. For #2, >> the application can filter bad values if desirable. >> >> I'm okay if you want to file a JBS issue to follow up and thoroughly >> look at the cases that the metrics are unavailable and the cases when >> fails to obtain. >> >>>> --- >>>> >>>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >>>> >>>> System.out.println(String.format(...) >>>> >>>> Why not simply >>>> >>>> System.out.printf(..) >>>> >>>> ? 
>> or simply (as I commented [1]) >> System.out.format >> >> Mandy >> [1] >> https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html >> >> >> >> From larry.cable at oracle.com Mon Dec 9 18:21:44 2019 From: larry.cable at oracle.com (Laurence Cable) Date: Mon, 9 Dec 2019 10:21:44 -0800 Subject: RFR: JDK-8215196: [Graal] vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with "changes for the arguments of the popped frame's method, did not remain current argument values" In-Reply-To: <6987c267-970c-d2e1-ce70-b7f18e9ff329@oracle.com> References: <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com> <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com> <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com> <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com> <444ed938-d873-12fc-a55e-2d645a099260@oracle.com> <3be1587b-c0cd-5f1e-dee2-072c341e02e6@oracle.com> <1994d5d6-383b-0abb-3203-ab42da9ab81c@oracle.com> <6987c267-970c-d2e1-ce70-b7f18e9ff329@oracle.com> Message-ID: <7f5bda1a-5741-e670-86fe-57767f0afa94@oracle.com> inline... On 12/6/19 6:12 PM, serguei.spitsyn at oracle.com wrote: > On 12/6/19 17:24, Daniel D. Daugherty wrote: >> On 12/6/19 6:26 PM, serguei.spitsyn at oracle.com wrote: >>> On 12/6/19 13:52, Chris Plummer wrote: >>>> On 12/6/19 1:18 PM, serguei.spitsyn at oracle.com wrote: >>>>> On 12/6/19 11:07, Chris Plummer wrote: >>>>>> On 12/5/19 6:45 PM, David Holmes wrote: >>>>>>> Hi Serguei, >>>>>>> >>>>>>> On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote: >>>>>>>> Hi Chris and Alex, >>>>>>>> >>>>>>>> (I've also included Dan, David and Dean to the mailing list) >>>>>>>> >>>>>>>> We have to reach a consensus about this. 
>>>>>>>
>>>>>>> This is just part of a much broader issue with JVM TI that I
>>>>>>> tried to have a discussion started based on Richard Reingruber's
>>>>>>> proposals around Escape Analysis:
>>>>>>>
>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html
>>>>>>>
>>>>>>>
>>>>>>> Unfortunately that discussion did not get much traction.
>>>>>> Hmm. I have the emails that precede yours above, but not that
>>>>>> one. Not sure how that happened. Just read through it and it did
>>>>>> give me one thought.
>>>>>
>>>>>> Consider a model where the program is designed to drive behavior of
>>>>>> the agent, triggering the agent to do certain things by having
>>>>>> the program do certain things. Normally an agent monitors the
>>>>>> application, but in this case the application is purposefully
>>>>>> controlling actions performed by the agent. If code is elided
>>>>>> from the program, then the agent no longer performs as expected.
>>>>>> It's a kind of backwards jvmti programming model, and you may ask
>>>>>> why would anyone do this. I'm not sure if there's a good reason
>>>>>> for it, but should it be expected to work given how the spec is
>>>>>> written?
>>>>>
>>>>> My interpretation is that the current JVM TI PopFrame behavior
>>>>> does not break this model.
>>>>> The spec says: "any changes to the arguments, which occurred in
>>>>> the called method, remain;"
>>>>> As the code was eliminated by the compiler then no changes to this
>>>>> argument occurred.
>>>>> So, the PopFrame behavior follows the spec. So, I think, the
>>>>> option #2 is not right. But it depends on our basic philosophy.
>>>>> If the developer wants to control the agent then the program has
>>>>> to be designed to do something meaningful that is not going to be
>>>>> optimized out by the JIT compiler.
>>>> You misunderstood my point.
What I'm saying is that someone might
>>>> do something like assign to a local with the specific intent of
>>>> having that trigger a jvmti event, with the specific intent of
>>>> having the agent perform some expected action as a result. Think of
>>>> it as being a trigger for the agent, not as the agent monitoring
>>>> the app. For example, you could write a program + agent, and
>>>> setting a specific local in the program triggers the agent to turn
>>>> on a light, and setting some other local turns it off. Absurd, but
>>>> possible, and maybe there are less absurd applications.
>>>
>>> I think, I understood your point correctly.
>>> Your point is that the code that can be eliminated (e.g. local++) is
>>> not as meaningless as it seems to be.
>>> My point is that there are still other more reliable ways to trigger
>>> the agent.
>>> So that relying on something that can be eliminated by JIT compilers
>>> is not important to support.
>>
>> You are making the assumption that the agent author understands what
>> Java code/variables *might* be eliminated by the JIT compiler. I don't
>> think that's a good assumption. I might have code that does a really
>> complicated thing in a local variable that is only useful to the
>> agent itself. The JIT will see that the local variable cannot escape
>> the function and is not used outside the function (as far as it can
>> see) so it will elide the local variable and the code that was used
>> to generate the local result in the variable.
>>
>> If that local result happens to be some computation that the agent
>> needed to see to do its next operation...
>
> Thank you for sharing your point.
> I'm not insisting on my assumptions here, just not sure this is more
> important than allowing optimizations.
> Do you actually think this use case needs to be supported?
>
> In general, to identify our philosophy about interaction between JIT compiler
> code elimination and JVM TI we need to make some assumptions.
>
>
> Let's temporarily put JVM TI out of scope.
> Are there any assumptions when JIT compilers eliminate some code?
> Is it based on some vision what code is observable?
> If it was decided some code can be eliminated then is it JVM TI only
> that breaks such assumptions about observability?
>
> If so, then such optimizations can be disabled at some level.
> Then we end up debugging/profiling/monitoring, and finally, observing
> a slightly different application.
> Are we Okay with this? Do we need any compromises here?
> Maybe we need more flags to control the JIT compiler behavior.

I would say that "in general" (not Java specific) there is an implicit assumption that compiler optimization and "debugging" are diametrically opposed to each other, and thus one cannot assume/expect that either can transparently co-exist: you either optimize or you debug, but there is a sliding scale between the two extremes, fully optimized and no optimization (and hence fully debug-able).

The question is: where does "observe-ability" (as distinct from debugging) lie on that continuum?

In most "ahead of time" language compilers, where all the code generation occurs during the compilation phase, the developer can choose to inform the compiler if their intent is to debug the resulting code (no mutating optimizations, and full metadata retention), or to optimize for "production" execution (maximal optimization, and no retention of debug metadata).

The JVM moves some of this activity to runtime, which is, I think, an implementation detail; there is still a "contract" between the activity of the "compilation" component and the "debug/observe-ability" component of the runtime environment. The debug/observe-ability component can only interact with the code that the "compiler" generates (AOT & JIT etc).
In other language toolchains, the specification of the intent to "debug" typically constrains the "compiler" from making mutating optimizations that would result in an execution behavior that is not broadly equivalent to that expressed at source level.

In short, I think we have two options: define the behavior of the execution environment as "undefined", that is, the compilers and runtime are permitted to mutate the code generated from a form equivalent to that expressed in source; or add the ability to express the intent of the code generated, such that, when that intent is to debug it, mutating optimizations are suppressed by the compiler and runtime.

- Larry

> Thanks,
> Serguei
>
>>
>> Dan
>>
>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>> Chris
>>>>>
>>>>>>>
>>>>>>>> We have 3 options:
>>>>>>>>
>>>>>>>> Option #1:
>>>>>>>>    The JIT optimization to delete code which "looks useless"
>>>>>>>>    has to be disabled if can_pop_frame capability is enabled.
>>>>>>>>    Then this problem becomes a JIT compiler bug.
>>>>>>>>
>>>>>>>> Option #2:
>>>>>>>>    Consider relaxing the JVMTI PopFrame spec by changing it to
>>>>>>>>    something like:
>>>>>>>>    "Note however, that the original argument values are not
>>>>>>>>     preserved and can be changed by the called method;"
>>>>>>>>    Then this problem becomes a JVM TI spec bug.
>>>>>>>>
>>>>>>>> Option #3:
>>>>>>>>    Consider it is Okay for compiler to eliminate useless code,
>>>>>>>>    so the argument values can be reinitialized by the PopFrame.
>>>>>>>>    Then this problem becomes just a test bug.
>>>>>>>>
>>>>>>>>
>>>>>>>> My preference is option #3.
>>>>>>>> The point is that if the arguments are not really used in
>>>>>>>> a method then restoring them to any values is a no-op.
>>>>>>>> It is really a meaningless use case, so why should we care about it.
>>>>>> Is "restoring" the proper term here? I thought they were just
>>>>>> left on the stack and reused on the subsequent invoke.
>>>>>
>>>>> Agreed.
The term "restoring" is not accurate here.
>>>>>
>>>>>> In fact I figured the reason for the language in the spec in the
>>>>>> first place is to alleviate JVMTI from having to restore them to
>>>>>> their original values, which is probably not even possible.
>>>>>
>>>>> Right.
>>>>>
>>>>>>>
>>>>>>> Thanks for setting that out clearly.
>>>>>>>
>>>>>>> I'd like to agree this particular case is a test bug. If we
>>>>>>> have a method:
>>>>>>>
>>>>>>> int incr(int val) {
>>>>>>>   val++;
>>>>>>>   popFrameHere();
>>>>>>>   return val;
>>>>>>> }
>>>>>>>
>>>>>>> then the change to the argument is necessary and must be
>>>>>>> preserved. In contrast:
>>>>>>>
>>>>>>> void incr(int val) {
>>>>>>>   val++;
>>>>>>>   popFrameHere();
>>>>>>> }
>>>>>>>
>>>>>>> the change to the argument is meaningless and I would hope any
>>>>>>> decent JIT would simply elide it.
>>>>>> So, this goes back to my example above where the program is
>>>>>> trying to elicit behavior from the agent. It's not meaningless in
>>>>>> that case, but that doesn't mean I think we need to support it.
>>>>>
>>>>> Even with this model it is possible and better to do something
>>>>> meaningful to control the agent.
>>>>> This model is a very rare use case.
>>>>> It is hard to justify a need to support it. :)
>>>>>
>>>>>>>
>>>>>>> But we must have a consistent approach to such things. What
>>>>>>> would happen if a breakpoint were to be placed on the
>>>>>>> instruction that uselessly modified the argument - would we
>>>>>>> still see the modification or would it be elided?
>>>>>> Breakpoints force interpreted mode for the method, although I
>>>>>> suppose that's a hotspot implementation detail and not something
>>>>>> a VM would be required to do. A VM that allows breakpoints in
>>>>>> compiled methods has the potential to miss the breakpoint if code
>>>>>> is elided.
>>>>>>
>>>>>> Also, what if you put a breakpoint in a method, and the call to it is
>>>>>> elided? You would never hit the breakpoint.
That could cause some
>>>>>> serious head scratching for a debugger user if they know the code
>>>>>> doing the method call is "executed".
>>>>>
>>>>> If the method is not actually being called then missing
>>>>> breakpoints there gives a clue what is going on.
>>>>> Otherwise, it will cause some serious head scratching for a
>>>>> debugger user.
>>>>> In general, my preference would be to debug actual behavior.
>>>>> It is not good we have no support for breakpoints in compiled methods.
>>>>>
>>>>>
>>>>>>>
>>>>>>> And how do C1 and C2 avoid this issue? Do they simply not
>>>>>>> optimise away the useless assignment? Or do they actively
>>>>>>> disable that optimization in this context?
>>>>>>>
>>>>>>> We need, IMO, to establish the basic philosophy of how to manage
>>>>>>> JVM TI / JIT interactions, so we know what things must remain
>>>>>>> visible and which can be optimised away.
>>>>>>>
>>>>>>> That said, changing the test allows us to defer having to reach
>>>>>>> that consensus.
>>>>>> Agreed. I think it's ok to work around the test issue as long as
>>>>>> we keep this overall issue on the radar. Do we have a bug filed
>>>>>> for that?
>>>>>
>>>>> I thought, it is a little bit early to file a bug for it.
>>>>> Also, probably, it can be an umbrella enhancement or task.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Alex,
>>>>>>>>>
>>>>>>>>> The fix itself looks Okay.
>>>>>>>>> Minor: replace in the comment: "compiler don't drop" =>
>>>>>>>>> "compiler doesn't drop".
>>>>>>>>>
>>>>>>>>> However, we still have to reach a consensus on how we treat
>>>>>>>>> this issue (as Chris already commented).
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 11/8/19 15:22, Alex Menkov wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Please review the fix for
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8215196
>>>>>>>>>> webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/
>>>>>>>>>>
>>>>>>>>>> Currently PopFrame is disabled with JVMCI by [1], so for
>>>>>>>>>> testing I reverted [1] changes.
>>>>>>>>>>
>>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025
>>>>>>>>>>
>>>>>>>>>> --alex
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>

From daniil.x.titov at oracle.com Mon Dec 9 18:51:15 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 09 Dec 2019 10:51:15 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <95ab70c9-8be4-13cf-ef90-f36b2e993450@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> <95ab70c9-8be4-13cf-ef90-f36b2e993450@oracle.com>
Message-ID:

Hi Mandy and Bob,

> Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException
> so it's consistent with the other get functions?

In this method both Files.newBufferedReader and return bufferedReader.readLine could throw IOException so for simplicity I just put the whole code block in doPrivileged.
On the other hand, I don't believe that BufferedReader.readLine() requires FilePermission checks (and tests confirmed that), so we could change this implementation to the following:

    public static String getStringValue(SubSystem subsystem, String parm) {
        if (subsystem == null) return null;

        try (BufferedReader bufferedReader =
                 AccessController.doPrivileged((PrivilegedExceptionAction<BufferedReader>) () -> {
                     return Files.newBufferedReader(Paths.get(subsystem.path(), parm));
                 })) {
            return bufferedReader.readLine();
        } catch (PrivilegedActionException | IOException e) {
            return null;
        }
    }

Could you please advise whether you are OK with it, or whether you would like to proceed with the approach Mandy suggested, to unwrap the PrivilegedActionException and throw the cause instead?

Thank you,
Daniil

On 12/9/19, 9:48 AM, "Mandy Chung" wrote:

    Files.lines requires FilePermission check. So it needs to be wrapped
    with doPrivileged.

    The readFilePrivileged can unwrap and throw the cause instead like this:

    static Stream<String> readFilePrivileged(Path path) throws IOException {
        try {
            return AccessController.doPrivileged((PrivilegedExceptionAction<Stream<String>>) () -> Files.lines(path));
        } catch (PrivilegedActionException e) {
            Throwable x = e.getCause();
            if (x instanceof IOException)
                throw (IOException)x;
            if (x instanceof RuntimeException)
                throw (RuntimeException)x;
            if (x instanceof Error)
                throw (Error)x;

            throw new InternalError(x);
        }
    }

    On 12/9/19 7:17 AM, Bob Vandette wrote:
> Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException
> so it's consistent with the other get functions?
>
> Bob.
>
>
>> On Dec 6, 2019, at 8:41 PM, Daniil Titov wrote:
>>
>> Hi David, Mandy, and Bob,
>>
>> Thank you for reviewing this fix.
>>
>> Please review a new version of the fix [1] that includes the following changes compared to the previous version of the webrev (webrev.04)
>> 1. The changes in Javadoc made in the webrev.04 compared to webrev.03 and to CSR [3] were discarded.
>> 2.
The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
>> was also reverted to the webrev.03 version, which returns the host's values if the metrics are unavailable or cannot be properly read.
>> I would like to mention that currently the native implementation of these methods de-facto may return -1 in some circumstances,
>> but I agree that the changes proposed in the previous version of the webrev increase such probability.
>> I filed the follow-up issue [4] as Mandy suggested.
>> 3. The legacy methods were renamed as David suggested.
>>
>>
>>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
>>> ! static int initialized=1;
>>>
>>> Am I reading this right that the code currently fails to actually do the
>>> initialization because of this ???
>> Yes, currently the code fails to do the initialization but it was unnoticed since method
>> get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which"
>> was always -1.
>>
>>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>>>
>>> System.out.println(String.format(...)
>>>
>>> Why not simply
>>>
>>> System.out.printf(..)
>> As I tried to explain earlier, it would make the tests unstable.
>> System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically.
>> Instead it parses the format string into a list of FormatString objects and then iterates over the list.
>> As a result, the other traces occasionally got printed between these iterations and break the pattern the test is expected to find
>> in the output.
>>
>> For example, here is a sample of such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:"
>> and "1030762496".
>> >> >> [0.304s][trace][os,container] Memory Usage is: 42983424 >> OperatingSystemMXBean.getFreeMemorySize: 1030758400 >> [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes >> [0.305s][trace][os,container] Memory Usage is: 42979328 >> [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes >> OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232 >> 1030762496 >> OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176 >> >> >> java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr >> >> at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306) >> at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151) >> at TestMemoryAwareness.main(TestMemoryAwareness.java:73) >> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> at java.base/java.lang.reflect.Method.invoke(Method.java:564) >> at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) >> at java.base/java.lang.Thread.run(Thread.java:832) >> >> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. 
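The interleaving described in this message can be made visible with a small, self-contained sketch. It is only an illustration under stated assumptions: the class FormatWriteCount and its CountingSink helper are hypothetical names invented here, not part of the test or of the JDK, and the number of underlying writes is a JDK implementation detail that may differ between releases. The sketch counts how many separate writes reach the underlying stream when a line is produced with printf versus println(String.format(...)); every extra write is a point where another writer to the same output (such as the VM's own trace logging) could slip in.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class FormatWriteCount {

    /** In-memory sink that counts how many separate writes reach it (hypothetical helper). */
    public static final class CountingSink extends ByteArrayOutputStream {
        public int writes;

        @Override
        public synchronized void write(byte[] b, int off, int len) {
            writes++;                 // each call is a separate chunk of output
            super.write(b, off, len);
        }

        @Override
        public synchronized void write(int b) {
            writes++;
            super.write(b);
        }
    }

    /** Formats straight into the stream, the way System.out.printf does. */
    public static CountingSink viaPrintf(String name, long value) {
        CountingSink sink = new CountingSink();
        PrintStream out = new PrintStream(sink, true);
        out.printf("%s: %d%n", name, value);
        out.flush();
        return sink;
    }

    /** Builds the complete line first, then hands it to println. */
    public static CountingSink viaPrintln(String name, long value) {
        CountingSink sink = new CountingSink();
        PrintStream out = new PrintStream(sink, true);
        out.println(String.format("%s: %d", name, value));
        out.flush();
        return sink;
    }

    public static void main(String[] args) {
        String name = "OperatingSystemMXBean.getFreePhysicalMemorySize";
        CountingSink a = viaPrintf(name, 1030762496L);
        CountingSink b = viaPrintln(name, 1030762496L);
        System.out.println("printf writes:  " + a.writes);
        System.out.println("println writes: " + b.writes);
        System.out.println("same text:      " + a.toString().equals(b.toString()));
    }
}
```

Both variants produce the same text; what differs is how many pieces it arrives in. Building the string first does not make the output strictly atomic either, but it narrows the window considerably, which is presumably why the test keeps println(String.format(...)).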
>>
>> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
>> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
>> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
>> [4] https://bugs.openjdk.java.net/browse/JDK-8235522
>>
>> Thank you,
>> Daniil
>>
>> On 12/6/19, 1:38 PM, "Mandy Chung" wrote:
>>
>>
>>
>> On 12/6/19 5:59 AM, Bob Vandette wrote:
>>>> On Dec 6, 2019, at 2:49 AM, David Holmes wrote:
>>>>
>>>>
>>>> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
>>>>
>>>> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (presumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
>> I thought that the error case we are referring to is limit == 0 which
>> indicates something unexpected goes wrong. So the compatibility concern
>> should be low. This is very specific to Metrics implementation for
>> cgroup v1 and let me know if I'm wrong.
>>
>>>> Surely there must always be some information available from the operating environment? I see from the impl file:
>>>>
>>>> // the host data, value 0 indicates that something went wrong while the metric was read and
>>>> // in this case we return "information unavailable" code -1.
>>>>
>>>> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
>>> I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods.
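The fall-back-to-host policy debated in this thread can be written down as a tiny sketch. This is not the code from the webrev: the class HostFallback, the method reportedValue and its parameters are hypothetical names chosen for illustration, and the treatment of negative values and of limits larger than the host value as "no limit" is an assumption. Only the handling of 0 (something went wrong while the metric was read) comes from the discussion itself.

```java
public class HostFallback {

    /**
     * Hypothetical helper, not the JDK implementation: choose the value to report.
     *
     * @param containerValue value read from the container metrics; 0 is taken to mean
     *                       "could not be read" (per the thread), a negative value
     *                       "no limit configured" (an assumption of this sketch)
     * @param hostValue      the corresponding value obtained from the host
     * @return the container value when it is usable, otherwise the host value
     */
    public static long reportedValue(long containerValue, long hostValue) {
        if (containerValue > 0 && containerValue < hostValue) {
            return containerValue;   // a usable container limit
        }
        return hostValue;            // unreadable, unlimited, or not below the host value
    }

    public static void main(String[] args) {
        long host = 16L * 1024 * 1024 * 1024;                          // pretend host total
        System.out.println(reportedValue(512L * 1024 * 1024, host));   // container limit wins
        System.out.println(reportedValue(0L, host));                   // read failure: host value
        System.out.println(reportedValue(-1L, host));                  // no limit: host value
    }
}
```

With this shape a caller never sees -1 or 0, which addresses the compatibility concern raised here; the cost is that a genuine read failure becomes indistinguishable from an unlimited container, the trade-off the follow-up issue is meant to examine.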
>>> Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
>>> are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
>>> limits.
>>>
>> It's important to consider carefully if the monitoring API indicates an
>> error vs unavailable and an application should continue to run when the
>> monitoring system fails to get the metrics.
>>
>> There are several choices to report "something goes wrong" scenarios
>> (should unlikely happen???):
>> 1. fall back to a random positive value (e.g. host value)
>> 2. return a negative value
>> 3. throw an exception
>>
>> #3 is not an option as the application is not expecting this. For #2,
>> the application can filter bad values if desirable.
>>
>> I'm okay if you want to file a JBS issue to follow up and thoroughly
>> look at the cases that the metrics are unavailable and the cases when
>> fails to obtain.
>>
>>>> ---
>>>>
>>>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>>>>
>>>> System.out.println(String.format(...)
>>>>
>>>> Why not simply
>>>>
>>>> System.out.printf(..)
>>>>
>>>> ?
>> or simply (as I commented [1]) >> System.out.format >> >> Mandy >> [1] >> https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html >> >> >> >> From daniil.x.titov at oracle.com Mon Dec 9 18:59:21 2019 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Mon, 09 Dec 2019 10:59:21 -0800 Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: <89E9C74A-7962-4408-93C5-1AA947FD973D@oracle.com> References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> <95ab70c9-8be4-13cf-ef90-f36b2e993450@oracle.com> <89E9C74A-7962-4408-93C5-1AA947FD973D@oracle.com> Message-ID: A correction... We could even further simplify it as the following: public static String getStringValue(SubSystem subsystem, String parm) { if (subsystem == null) return null; try (BufferedReader bufferedReader = AccessController.doPrivileged((PrivilegedExceptionAction) () -> Files.newBufferedReader(Paths.get(subsystem.path(), parm)))) { return bufferedReader.readLine(); } catch (PrivilegedActionException | IOException e) { return null; } } Best regards, Daniil ?On 12/9/19, 10:51 AM, "Daniil Titov" wrote: Hi Mandy and Bob, > Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException > so it?s consistent with the other get functions? In this method both Files.newBufferedReader and return bufferedReader.readLine could throw IOException so for simplicity I just put the whole code block in doPrivileged. 
On the other side I don't believe that BufferedReader.readline() requires FilePermission checks ( and tests proved that) so we could change this implementation to the following: public static String getStringValue(SubSystem subsystem, String parm) { if (subsystem == null) return null; try (BufferedReader bufferedReader = AccessController.doPrivileged((PrivilegedExceptionAction) () -> { return Files.newBufferedReader(Paths.get(subsystem.path(), parm)); })) { return bufferedReader.readLine(); } catch (PrivilegedActionException | IOException e) { return null; } } Could you please advise are you OK with it or you would like to proceed with the approach Mandy suggested to unwrap PrivilegedActionException exception and throw the cause instead? Thank you, Daniil ?On 12/9/19, 9:48 AM, "Mandy Chung" wrote: Files:lines requires FilePermission check. So it needs to be wrapped with doPrivileged. The readFilePrivileged can unwrap and throw the cause instead like this: static Stream readFilePrivileged(Path path) throws IOException { try { return AccessController.doPrivileged((PrivilegedExceptionAction>) () -> Files.lines(path)); } catch (PrivilegedActionException e) { Throwable x = e.getCause(); if (x instanceof IOException) throw (IOException)x; if (x instanceof RuntimeException) throw (RuntimeException)x; if (x instanceof Error) throw (Error)x; throw new InternalError(x); } } On 12/9/19 7:17 AM, Bob Vandette wrote: > Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException > so it?s consistent with the other get functions? > > Bob. > > >> On Dec 6, 2019, at 8:41 PM, Daniil Titov wrote: >> >> Hi David, Mandy, and Bob, >> >> Thank you for reviewing this fix. >> >> Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04) >> 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded. >> 2. 
The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize >> was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read. >> I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances, >> but I agree that the changes proposed in the previous version of the webrev increase such probability. >> I filed the follow-up issue [4] as Mandy suggested. >> 3. The legacy methods were renamed as David suggested. >> >> >>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c >>> ! static int initialized=1; >>> >>> Am I reading this right that the code currently fails to actually do the >>> initialization because of this ??? >> Yes, currently the code fails to do the initialization but it was unnoticed since method >> get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which" >> was always -1. >> >>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >>> >>> System.out.println(String.format(...) >>> >>> Why not simply >>> >>> System.out.printf(..) >> As I tried explain it earlier it would make the tests unstable. >> System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically. >> Instead it parses the format string into a list of FormatString objects and then iterates over the list. >> As a result, the other traces occasionally got printed between these iterations and break the pattern the test is expected to find >> in the output. >> >> For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:" >> and "1030762496". 
>> >> >> [0.304s][trace][os,container] Memory Usage is: 42983424 >> OperatingSystemMXBean.getFreeMemorySize: 1030758400 >> [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes >> [0.305s][trace][os,container] Memory Usage is: 42979328 >> [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes >> OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232 >> 1030762496 >> OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176 >> >> >> java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr >> >> at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306) >> at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151) >> at TestMemoryAwareness.main(TestMemoryAwareness.java:73) >> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> at java.base/java.lang.reflect.Method.invoke(Method.java:564) >> at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) >> at java.base/java.lang.Thread.run(Thread.java:832) >> >> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. 
>> >> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 >> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 >> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 >> [4] https://bugs.openjdk.java.net/browse/JDK-8235522 >> >> Thank you, >> Daniil >> >> ?On 12/6/19, 1:38 PM, "Mandy Chung" wrote: >> >> >> >> On 12/6/19 5:59 AM, Bob Vandette wrote: >>>> On Dec 6, 2019, at 2:49 AM, David Holmes wrote: >>>> >>>> >>>> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java >>>> >>>> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. >> I thought that the error case we are referring to is limit == 0 which >> indicates something unexpected goes wrong. So the compatibility concern >> should be low. This is very specific to Metrics implementation for >> cgroup v1 and let me know if I'm wrong. >> >>>> Surely there must always be some information available from the operating environment? I see from the impl file: >>>> >>>> // the host data, value 0 indicates that something went wrong while the metric was read and >>>> // in this case we return "information unavailable" code -1. >>>> >>>> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO. >>> I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods. 
>>> Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others >>> are, I?d suggest we keep Daniil?s original logic to fall back to the host value since a disabled subsystem is equivalent to no >>> limits. >>> >> It's important to consider carefully if the monitoring API indicates an >> error vs unavailable and an application should continue to run when the >> monitoring system fails to get the metrics. >> >> There are several choices to report "something goes wrong" scenarios >> (should unlikely happen???): >> 1. fall back to a random positive value (e.g. host value) >> 2. return a negative value >> 3. throw an exception >> >> #3 is not an option as the application is not expecting this. For #2, >> the application can filter bad values if desirable. >> >> I'm okay if you want to file a JBS issue to follow up and thoroughly >> look at the cases that the metrics are unavailable and the cases when >> fails to obtain. >> >>>> --- >>>> >>>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >>>> >>>> System.out.println(String.format(...) >>>> >>>> Why not simply >>>> >>>> System.out.printf(..) >>>> >>>> ? 
>> or simply (as I commented [1])
>> System.out.format
>>
>> Mandy
>> [1]
>> https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
>>
>>
>>
>>

From daniil.x.titov at oracle.com Mon Dec 9 19:31:20 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 09 Dec 2019 11:31:20 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <0a412f63-1583-6789-9afa-cebbc968e7c4@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> <0a412f63-1583-6789-9afa-cebbc968e7c4@oracle.com>
Message-ID: <447D72BA-1046-4131-9B13-C1F7244D87EA@oracle.com>

Hi David,

> So we never try to access the uninitialized counters.cpus array which is
> good but we still return garbage for counters.jvmTicks and
> counters.cpuTicks - surely that should have been noticeable?

It only affected the first time the CPU load was requested. Function get_cpuload_internal(...)
calls get_totalticks() and get_jvmticks() functions that update these counters.
But on the first call, yes, it compares the newly received counters with the garbage.

It has the code that seems to be written to somehow mitigate it and in the worst case just return
0 or 1.0. But it also could be that there is some other problem this code tries to solve so I'm not sure
we should remove these workarounds as a part of the current fix.

        // seems like we sometimes end up with less kernel ticks when
        // reading /proc/self/stat a second time, timing issue between cpus?
        if (pticks->usedKernel < tmp.usedKernel) {
            kdiff = 0;
        } else {
            kdiff = pticks->usedKernel - tmp.usedKernel;
        }
        tdiff = pticks->total - tmp.total;
        udiff = pticks->used - tmp.used;

        if (tdiff == 0) {
            user_load = 0;
        } else {
            if (tdiff < (udiff + kdiff)) {
                tdiff = udiff + kdiff;
            }
            *pkernelLoad = (kdiff / (double)tdiff);
            // BUG9044876, normalize return values to sane values
            *pkernelLoad = MAX(*pkernelLoad, 0.0);
            *pkernelLoad = MIN(*pkernelLoad, 1.0);

            user_load = (udiff / (double)tdiff);
            user_load = MAX(user_load, 0.0);
            user_load = MIN(user_load, 1.0);
        }
    }

Best regards,
Daniil

On 12/8/19, 8:49 PM, "David Holmes" wrote:

Hi Daniil,

On 7/12/2019 11:41 am, Daniil Titov wrote:
> Hi David, Mandy, and Bob,
>
> Thank you for reviewing this fix.
>
> Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04)
> 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded.

Okay.

> 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
> was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read.

Okay.

> I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances,
> but I agree that the changes proposed in the previous version of the webrev increase such probability.
> I filed the follow-up issue [4] as Mandy suggested.

I added a comment to the bug. This is potentially a difficult problem to resolve - it all depends on the likelihood of any errors and what they really indicate.

> 3. The legacy methods were renamed as David suggested.

Thanks!

>
>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
>> !
static int initialized=1; >> >> Am I reading this right that the code currently fails to actually do the >> initialization because of this ??? > > Yes, currently the code fails to do the initialization but it was unnoticed since method > get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which" > was always -1. So we never try to access the uninitialized counters.cpus array which is good but we still return garbage for counters.jvmTicks and counters.cpuTicks - surely that should have been noticeable? >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >> >> System.out.println(String.format(...) >> >> Why not simply >> >> System.out.printf(..) > > As I tried explain it earlier it would make the tests unstable. > System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically. > Instead it parses the format string into a list of FormatString objects and then iterates over the list. > As a result, the other traces occasionally got printed between these iterations and break the pattern the test is expected to find > in the output. Sorry I missed the earlier explanation. I find it somewhat surprising that format() works that way, but without unlimited buffering there will always be a need to flush the outputstream at some point. Thanks, David ----- > For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:" > and "1030762496". 
> > > [0.304s][trace][os,container] Memory Usage is: 42983424 > OperatingSystemMXBean.getFreeMemorySize: 1030758400 > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes > [0.305s][trace][os,container] Memory Usage is: 42979328 > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes > OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232 > 1030762496 > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176 > > > java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr > > at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306) > at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151) > at TestMemoryAwareness.main(TestMemoryAwareness.java:73) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:564) > at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) > at java.base/java.lang.Thread.run(Thread.java:832) > > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. 
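To illustrate the point about output interleaving, here is a minimal Java sketch of the pattern the test relies on: the whole line is built with String.format first and then handed to println in a single call, rather than letting printf/format emit the pieces one by one. The class and helper names (AtomicLineDemo, formatLine, log) are hypothetical and not taken from the webrev; the sample value is copied from the trace above.

```java
public class AtomicLineDemo {

    // Hypothetical helper: builds the complete line up front, so the
    // stream never sees a half-formatted message.
    static String formatLine(String format, Object... args) {
        return String.format(format, args);
    }

    // Hypothetical helper: one println call per fully formatted line.
    static void log(String format, Object... args) {
        System.out.println(formatLine(format, args));
    }

    public static void main(String[] args) {
        // Sample value taken from the trace output quoted in the thread.
        log("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", 1030762496L);
    }
}
```

Whether a single println reaches the underlying stream as one write is an implementation detail of PrintStream, so this should be read as narrowing the window for interleaving with the JVM's own trace output rather than as a strict atomicity guarantee.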
> > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 > [4] https://bugs.openjdk.java.net/browse/JDK-8235522 > > Thank you, > Daniil > > ?On 12/6/19, 1:38 PM, "Mandy Chung" wrote: > > > > On 12/6/19 5:59 AM, Bob Vandette wrote: > >> On Dec 6, 2019, at 2:49 AM, David Holmes wrote: > >> > >> > >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java > >> > >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. > > I thought that the error case we are referring to is limit == 0 which > indicates something unexpected goes wrong. So the compatibility concern > should be low. This is very specific to Metrics implementation for > cgroup v1 and let me know if I'm wrong. > > >> Surely there must always be some information available from the operating environment? I see from the impl file: > >> > >> // the host data, value 0 indicates that something went wrong while the metric was read and > >> // in this case we return "information unavailable" code -1. > >> > >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO. > > I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods. 
> > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others > > are, I?d suggest we keep Daniil?s original logic to fall back to the host value since a disabled subsystem is equivalent to no > > limits. > > > > It's important to consider carefully if the monitoring API indicates an > error vs unavailable and an application should continue to run when the > monitoring system fails to get the metrics. > > There are several choices to report "something goes wrong" scenarios > (should unlikely happen???): > 1. fall back to a random positive value (e.g. host value) > 2. return a negative value > 3. throw an exception > > #3 is not an option as the application is not expecting this. For #2, > the application can filter bad values if desirable. > > I'm okay if you want to file a JBS issue to follow up and thoroughly > look at the cases that the metrics are unavailable and the cases when > fails to obtain. > > >> --- > >> > >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java > >> > >> System.out.println(String.format(...) > >> > >> Why not simply > >> > >> System.out.printf(..) > >> > >> ? > > or simply (as I commented [1]) > System.out.format > > Mandy > [1] > https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html > > > > From leonid.mesnik at oracle.com Mon Dec 9 20:56:34 2019 From: leonid.mesnik at oracle.com (Leonid Mesnik) Date: Mon, 9 Dec 2019 12:56:34 -0800 Subject: RFR: 8235530: Removed duplicated threadByName methods in nsk/jdi tests In-Reply-To: <1a9d76e5-3a73-9804-be3f-93d6864465be@oracle.com> References: <5ac9ca0e-d756-3340-597e-2a03e6e6fa24@oracle.com> <1a9d76e5-3a73-9804-be3f-93d6864465be@oracle.com> Message-ID: <2218fd4b-5070-aa83-2584-82f26187cd23@oracle.com> David, Serguei Thank you for review. 
I added comment about JDITestRuntimeException in https://bugs.openjdk.java.net/browse/JDK-8235544 Leonid On 12/7/19 9:19 PM, David Holmes wrote: > +1 on both counts > > Not sure JDITestRuntimeException is really necessary/useful versus > just using RuntimeException, but that's a different issue. > > Thanks, > David > > On 8/12/2019 2:30 pm, serguei.spitsyn at oracle.com wrote: >> Hi Leonid, >> >> The fix looks good. >> >> Thank you for taking care about it! >> I agree, it is an awful duplication. >> >> Thanks, >> Serguei >> >> >> On 12/7/19 18:17, Leonid Mesnik wrote: >>> Hi >>> >>> Could you please review following fix which just remove duplicated >>> threadByName methods and JDITestRuntimeException exceptions in >>> nsk/jdi tests. I don't see any reason to have so many copies of them. >>> >>> The method threadByName is added nsk.share.jdi.Debugee class as >>> 'threadByNameOrThrow' because slightly different 'threadByName' >>> already exist there. I filed another sub-task >>> https://bugs.openjdk.java.net/browse/JDK-8235544 to review usage and >>> merge these 2 methods later. >>> >>> This fix affects about ~4000 lines and I want to keep it as >>> straight-forward as possible. >>> >>> webrev: http://cr.openjdk.java.net/~lmesnik/8235530/webrev.00/ >>> >>> bug: https://bugs.openjdk.java.net/browse/JDK-8235530 >>> >>> The next planned steps are in: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8233830 >>> >>> Leonid >>> >> From coleen.phillimore at oracle.com Mon Dec 9 21:04:33 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 9 Dec 2019 16:04:33 -0500 Subject: RFR: 8235530: Removed duplicated threadByName methods in nsk/jdi tests In-Reply-To: <2218fd4b-5070-aa83-2584-82f26187cd23@oracle.com> References: <5ac9ca0e-d756-3340-597e-2a03e6e6fa24@oracle.com> <1a9d76e5-3a73-9804-be3f-93d6864465be@oracle.com> <2218fd4b-5070-aa83-2584-82f26187cd23@oracle.com> Message-ID: Very nice! 
4153 lines changed: 21 ins; 3841 del; 291 mod; 99816 unchg Coleen On 12/9/19 3:56 PM, Leonid Mesnik wrote: > David, Serguei > > Thank you for review. I added comment about JDITestRuntimeException in > https://bugs.openjdk.java.net/browse/JDK-8235544 > > Leonid > > On 12/7/19 9:19 PM, David Holmes wrote: >> +1 on both counts >> >> Not sure JDITestRuntimeException is really necessary/useful versus >> just using RuntimeException, but that's a different issue. >> >> Thanks, >> David >> >> On 8/12/2019 2:30 pm, serguei.spitsyn at oracle.com wrote: >>> Hi Leonid, >>> >>> The fix looks good. >>> >>> Thank you for taking care about it! >>> I agree, it is an awful duplication. >>> >>> Thanks, >>> Serguei >>> >>> >>> On 12/7/19 18:17, Leonid Mesnik wrote: >>>> Hi >>>> >>>> Could you please review following fix which just remove duplicated >>>> threadByName methods and JDITestRuntimeException exceptions in >>>> nsk/jdi tests. I don't see any reason to have so many copies of them. >>>> >>>> The method threadByName is added nsk.share.jdi.Debugee class as >>>> 'threadByNameOrThrow' because slightly different 'threadByName' >>>> already exist there. I filed another sub-task >>>> https://bugs.openjdk.java.net/browse/JDK-8235544 to review usage >>>> and merge these 2 methods later. >>>> >>>> This fix affects about ~4000 lines and I want to keep it as >>>> straight-forward as possible. >>>> >>>> webrev: http://cr.openjdk.java.net/~lmesnik/8235530/webrev.00/ >>>> >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8235530 >>>> >>>> The next planned steps are in: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8233830 >>>> >>>> Leonid >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mandy.chung at oracle.com Mon Dec 9 21:54:58 2019 From: mandy.chung at oracle.com (Mandy Chung) Date: Mon, 9 Dec 2019 13:54:58 -0800 Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware In-Reply-To: References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> <95ab70c9-8be4-13cf-ef90-f36b2e993450@oracle.com> Message-ID: <2b476e9f-0266-eba7-3629-114296a55f29@oracle.com> On 12/9/19 10:51 AM, Daniil Titov wrote: > Hi Mandy and Bob, > >> Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException >> so it?s consistent with the other get functions? > In this method both Files.newBufferedReader and return bufferedReader.readLine could throw IOException so for simplicity I just put > the whole code block in doPrivileged. On the other side I don't believe that BufferedReader.readline() requires FilePermission checks ( and tests proved that) > so we could change this implementation to the following: > > public static String getStringValue(SubSystem subsystem, String parm) { > if (subsystem == null) return null; > > try (BufferedReader bufferedReader = > AccessController.doPrivileged((PrivilegedExceptionAction) () -> { > return Files.newBufferedReader(Paths.get(subsystem.path(), parm)); > })) { > return bufferedReader.readLine(); > } catch (PrivilegedActionException | IOException e) { > return null; > } > } > > Could you please advise are you OK with it or you would like to proceed with the approach Mandy suggested to unwrap > PrivilegedActionException exception and throw the cause instead? 
>

I think it's simpler to read and understand if the doPrivileged call is moved out as a separate method that will throw IOException as the expected functionality as suggested above.

For SubSystem::getStringValue, one suggestion would be:

diff --git a/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java b/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java
--- a/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java
+++ b/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java
@@ -29,7 +29,11 @@
 import java.io.IOException;
 import java.math.BigInteger;
 import java.nio.file.Files;
+import java.nio.file.Path;
 import java.nio.file.Paths;
+import java.security.AccessController;
+import java.security.PrivilegedActionException;
+import java.security.PrivilegedExceptionAction;
 import java.util.ArrayList;
 import java.util.List;
 import java.util.Optional;
@@ -90,9 +94,8 @@
     public static String getStringValue(SubSystem subsystem, String parm) {
         if (subsystem == null) return null;

-        try(BufferedReader bufferedReader = Files.newBufferedReader(Paths.get(subsystem.path(), parm))) {
-            String line = bufferedReader.readLine();
-            return line;
+        try {
+            return subsystem.readStringValue(parm);
         }
         catch (IOException e) {
             return null;
@@ -100,6 +103,24 @@
     }

+    private String readStringValue(String param) throws IOException {
+        PrivilegedExceptionAction<BufferedReader> pea = () -> Files.newBufferedReader(Paths.get(path(), param));
+        try (BufferedReader bufferedReader = AccessController.doPrivileged(pea)) {
+            String line = bufferedReader.readLine();
+            return line;
+        } catch (PrivilegedActionException e) {
+            Throwable x = e.getCause();
+            if (x instanceof IOException)
+                throw (IOException)x;
+            if (x instanceof RuntimeException)
+                throw (RuntimeException)x;
+            if (x instanceof Error)
+                throw (Error)x;
+
+            throw new InternalError(x);
+        }
+    }
+

From daniil.x.titov at oracle.com Mon Dec 9 23:47:02 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 09 Dec 2019 15:47:02 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <2b476e9f-0266-eba7-3629-114296a55f29@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> <95ab70c9-8be4-13cf-ef90-f36b2e993450@oracle.com> <2b476e9f-0266-eba7-3629-114296a55f29@oracle.com>
Message-ID: <48F75B58-0529-4B43-A355-A12EB0D09598@oracle.com>

Hi Mandy and Bob,

Please review a new version of the webrev [1] that moves doPrivileged calls in
jdk.internal.platform.cgroupv1.SubSystem to separate methods that throw
IOException, as Mandy suggested.

Mach5 tests are still running.

[1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.06/
[2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
[3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428

Thank you,
Daniil

On 12/9/19, 1:55 PM, "Mandy Chung" wrote:

On 12/9/19 10:51 AM, Daniil Titov wrote:
> Hi Mandy and Bob,
>
>> Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException
>> so it's consistent with the other get functions?
> In this method both Files.newBufferedReader and return bufferedReader.readLine could throw IOException so for simplicity I just put
> the whole code block in doPrivileged.
On the other side I don't believe that BufferedReader.readline() requires FilePermission checks ( and tests proved that) > so we could change this implementation to the following: > > public static String getStringValue(SubSystem subsystem, String parm) { > if (subsystem == null) return null; > > try (BufferedReader bufferedReader = > AccessController.doPrivileged((PrivilegedExceptionAction) () -> { > return Files.newBufferedReader(Paths.get(subsystem.path(), parm)); > })) { > return bufferedReader.readLine(); > } catch (PrivilegedActionException | IOException e) { > return null; > } > } > > Could you please advise are you OK with it or you would like to proceed with the approach Mandy suggested to unwrap > PrivilegedActionException exception and throw the cause instead? > I think it's simpler to read and understand if the doPrivileged call is moved out as a separate method that will throw IOException as the expected functionality as suggested above. For SubSystem::getStringValue, one suggestion would be: diff --git a/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java b/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java --- a/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java +++ b/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java @@ -29,7 +29,11 @@ import java.io.IOException; import java.math.BigInteger; import java.nio.file.Files; +import java.nio.file.Path; import java.nio.file.Paths; +import java.security.AccessController; +import java.security.PrivilegedActionException; +import java.security.PrivilegedExceptionAction; import java.util.ArrayList; import java.util.List; import java.util.Optional; @@ -90,9 +94,8 @@ public static String getStringValue(SubSystem subsystem, String parm) { if (subsystem == null) return null; - try(BufferedReader bufferedReader = Files.newBufferedReader(Paths.get(subsystem.path(), parm))) { - String line = bufferedReader.readLine(); - return 
line;
+        try {
+            return subsystem.readStringValue(parm);
         }
         catch (IOException e) {
             return null;
@@ -100,6 +103,24 @@
     }

+    private String readStringValue(String param) throws IOException {
+        PrivilegedExceptionAction<BufferedReader> pea = () -> Files.newBufferedReader(Paths.get(path(), param));
+        try (BufferedReader bufferedReader = AccessController.doPrivileged(pea)) {
+            String line = bufferedReader.readLine();
+            return line;
+        } catch (PrivilegedActionException e) {
+            Throwable x = e.getCause();
+            if (x instanceof IOException)
+                throw (IOException)x;
+            if (x instanceof RuntimeException)
+                throw (RuntimeException)x;
+            if (x instanceof Error)
+                throw (Error)x;
+
+            throw new InternalError(x);
+        }
+    }
+

From serguei.spitsyn at oracle.com Tue Dec 10 02:02:21 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 9 Dec 2019 18:02:21 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To:
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
Message-ID: <072d6861-1374-8190-135d-e30ece2ee380@oracle.com>

Hi Daniil,

It is not a full review, just some minor comments.
In fact, I do not see real problems yet.

http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html

  55     public long getTotalSwapSpaceSize() {
  56         if (containerMetrics != null) {
  57             long limit = containerMetrics.getMemoryAndSwapLimit();
  58             // The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container)
  59             // or if a docker container was started without specifying a memory limit ( without '--memory='
  60             // Docker option). In latter case there is no limit on how much memory the container can use and
  61             // it can use as much memory as the host's OS allows.
  62             long memLimit = containerMetrics.getMemoryLimit();
  63             if (limit >= 0 && memLimit >= 0) {
  64                 return limit - memLimit;
  65             }
  66         }
  67         return getTotalSwapSpaceSize0();
  68     }

  Unneeded space after brackets '('.
  Do we need to check if the (limit - memLimit) value is negative?
  The same question is for getFreeSwapSpaceSize():
    memSwapLimit - memLimit - (memSwapUsage - memUsage)
  and getFreeMemorySize():
    101 return limit - usage;

  81                         // If this happens just retry the loop for a few iterations

  A period is missing at the end of the comment.

http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html

  34 System.out.println(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors()));
  35 System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors()));
  36 System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize()));
  37 System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize()));
  38 System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize()));
  39 System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize()));
  40 System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize()));
  41 System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize()));
  42 System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad()));
  43 System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad()));

  To make the above lines a little bit shorter I'd suggest defining a log() method like this:

     private static void log(String msg) { System.out.println(msg); }

  34 log(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors()));
  35 log(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors()));
  36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize()));
  37 log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize()));
  38 log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize()));
  39 log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize()));
  40 log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize()));
  41 log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize()));
  42 log(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad()));
  43 log(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad()));

Thanks,
Serguei

On 12/6/19 17:41, Daniil Titov wrote:
> Hi David, Mandy, and Bob,
>
> Thank you for reviewing this fix.
>
> Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04)
> 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded.
> 2.
The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize > was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read. > I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances, > but I agree that the changes proposed in the previous version of the webrev increase such probability. > I filed the follow-up issue [4] as Mandy suggested. > 3. The legacy methods were renamed as David suggested. > > >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c >> ! static int initialized=1; >> >> Am I reading this right that the code currently fails to actually do the >> initialization because of this ??? > Yes, currently the code fails to do the initialization but it was unnoticed since method > get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which" > was always -1. > >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >> >> System.out.println(String.format(...) >> >> Why not simply >> >> System.out.printf(..) > As I tried explain it earlier it would make the tests unstable. > System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically. > Instead it parses the format string into a list of FormatString objects and then iterates over the list. > As a result, the other traces occasionally got printed between these iterations and break the pattern the test is expected to find > in the output. > > For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:" > and "1030762496". 
> > > [0.304s][trace][os,container] Memory Usage is: 42983424 > OperatingSystemMXBean.getFreeMemorySize: 1030758400 > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes > [0.305s][trace][os,container] Memory Usage is: 42979328 > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes > OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232 > 1030762496 > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176 > > > java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr > > at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306) > at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151) > at TestMemoryAwareness.main(TestMemoryAwareness.java:73) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:564) > at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) > at java.base/java.lang.Thread.run(Thread.java:832) > > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. 
> > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 > [4] https://bugs.openjdk.java.net/browse/JDK-8235522 > > Thank you, > Daniil > > ?On 12/6/19, 1:38 PM, "Mandy Chung" wrote: > > > > On 12/6/19 5:59 AM, Bob Vandette wrote: > >> On Dec 6, 2019, at 2:49 AM, David Holmes wrote: > >> > >> > >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java > >> > >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. > > I thought that the error case we are referring to is limit == 0 which > indicates something unexpected goes wrong. So the compatibility concern > should be low. This is very specific to Metrics implementation for > cgroup v1 and let me know if I'm wrong. > > >> Surely there must always be some information available from the operating environment? I see from the impl file: > >> > >> // the host data, value 0 indicates that something went wrong while the metric was read and > >> // in this case we return "information unavailable" code -1. > >> > >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO. > > I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods. 
> > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others > > are, I?d suggest we keep Daniil?s original logic to fall back to the host value since a disabled subsystem is equivalent to no > > limits. > > > > It's important to consider carefully if the monitoring API indicates an > error vs unavailable and an application should continue to run when the > monitoring system fails to get the metrics. > > There are several choices to report "something goes wrong" scenarios > (should unlikely happen???): > 1. fall back to a random positive value (e.g. host value) > 2. return a negative value > 3. throw an exception > > #3 is not an option as the application is not expecting this. For #2, > the application can filter bad values if desirable. > > I'm okay if you want to file a JBS issue to follow up and thoroughly > look at the cases that the metrics are unavailable and the cases when > fails to obtain. > > >> --- > >> > >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java > >> > >> System.out.println(String.format(...) > >> > >> Why not simply > >> > >> System.out.printf(..) > >> > >> ? 
> > or simply (as I commented [1]) > System.out.format > > Mandy > [1] > https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html > > > >

From mandy.chung at oracle.com Tue Dec 10 06:11:59 2019
From: mandy.chung at oracle.com (Mandy Chung)
Date: Mon, 9 Dec 2019 22:11:59 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <48F75B58-0529-4B43-A355-A12EB0D09598@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> <95ab70c9-8be4-13cf-ef90-f36b2e993450@oracle.com> <2b476e9f-0266-eba7-3629-114296a55f29@oracle.com> <48F75B58-0529-4B43-A355-A12EB0D09598@oracle.com>
Message-ID: <1dc773d4-14e6-9209-2a4e-190699d88a47@oracle.com>

On 12/9/19 3:47 PM, Daniil Titov wrote:
> Hi Mandy and Bob,
>
> Please review a new version of the webrev [1] that moves doPrivileged calls in
> jdk.internal.platform.cgroupv1.SubSystem to separate methods that throw
> IOException, as Mandy suggested.
>
> Mach5 tests are still running.
>
>
> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.06/

I reviewed Metrics and Subsystem in this version.

I think it's simpler to have unwrapIOExceptionAndRethrow handling the
InternalError case.

+ List<String> lines = subsystem.readMatchingLines(param);
+ for (String line : lines) {
      if (line.startsWith(match)) {
          retval = conversion.apply(line);
          break;
      }
  }

This can simply call Metrics::readFilePrivileged and process on the Stream.

    return Metrics::readFilePrivileged(Paths.get(subsystem.path(), param))
                  .filter(line -> line.startsWith(match))
                  .map(conversion::apply)
                  .findFirst().orElseGet(() -> retval);

I don't need to see a new webrev.

Mandy

From david.holmes at oracle.com Tue Dec 10 10:11:23 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 10 Dec 2019 20:11:23 +1000
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <447D72BA-1046-4131-9B13-C1F7244D87EA@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> <0a412f63-1583-6789-9afa-cebbc968e7c4@oracle.com> <447D72BA-1046-4131-9B13-C1F7244D87EA@oracle.com>
Message-ID: <1996939d-54af-c34c-c796-cab3ec92b445@oracle.com>

On 10/12/2019 5:31 am, Daniil Titov wrote:
> Hi David,
>
>> So we never try to access the uninitialized counters.cpus array which is
>> good but we still return garbage for counters.jvmTicks and
>> counters.cpuTicks - surely that should have been noticeable?
>
> It only affected the first time the CPU load was requested. Function get_cpuload_internal(...)
> calls get_totalticks() and get_jvmticks() functions that update these counters.
> But on the first call, yes, it compares the newly received counters with the garbage.
>
>
> It has the code that seems to be written to somehow mitigate it and in worse case just return
> 0 or 1.0. But it also could be that there is some other problem this code tries to solve so I'm not sure
> we should remove these workarounds as a part of the current fix.

Please file a follow up RFE to look into this.

Thanks,
David

> 274 // seems like we sometimes end up with less kernel ticks when
> 275 // reading /proc/self/stat a second time, timing issue between cpus?
> 276 if (pticks->usedKernel < tmp.usedKernel) { > 277 kdiff = 0; > 278 } else { > 279 kdiff = pticks->usedKernel - tmp.usedKernel; > 280 } > 281 tdiff = pticks->total - tmp.total; > 282 udiff = pticks->used - tmp.used; > 283 > 284 if (tdiff == 0) { > 285 user_load = 0; > 286 } else { > 287 if (tdiff < (udiff + kdiff)) { > 288 tdiff = udiff + kdiff; > 289 } > 290 *pkernelLoad = (kdiff / (double)tdiff); > 291 // BUG9044876, normalize return values to sane values > 292 *pkernelLoad = MAX(*pkernelLoad, 0.0); > 293 *pkernelLoad = MIN(*pkernelLoad, 1.0); > 294 > 295 user_load = (udiff / (double)tdiff); > 296 user_load = MAX(user_load, 0.0); > 297 user_load = MIN(user_load, 1.0); > 298 } > 299 } > > Best regards, > Daniil > > ?On 12/8/19, 8:49 PM, "David Holmes" wrote: > > Hi Daniil, > > On 7/12/2019 11:41 am, Daniil Titov wrote: > > Hi David, Mandy, and Bob, > > > > Thank you for reviewing this fix. > > > > Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04) > > 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded. > > Okay. > > > 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize > > was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read. > > Okay. > > > I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances, > > but I agree that the changes proposed in the previous version of the webrev increase such probability. > > I filed the follow-up issue [4] as Mandy suggested. > > I added a comment to the bug. This is potentially a difficult problem to > resolve - it all depends on the likelihood of any errors and what they > really indicate. > > > 3. The legacy methods were renamed as David suggested. > > Thanks! 
> > > > >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c > >> ! static int initialized=1; > >> > >> Am I reading this right that the code currently fails to actually do the > >> initialization because of this ??? > > > > Yes, currently the code fails to do the initialization but it was unnoticed since method > > get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which" > > was always -1. > > So we never try to access the uninitialized counters.cpus array which is > good but we still return garbage for counters.jvmTicks and > counters.cpuTicks - surely that should have been noticeable? > > >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java > >> > >> System.out.println(String.format(...) > >> > >> Why not simply > >> > >> System.out.printf(..) > > > > As I tried explain it earlier it would make the tests unstable. > > System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically. > > Instead it parses the format string into a list of FormatString objects and then iterates over the list. > > As a result, the other traces occasionally got printed between these iterations and break the pattern the test is expected to find > > in the output. > > Sorry I missed the earlier explanation. I find it somewhat surprising > that format() works that way, but without unlimited buffering there will > always be a need to flush the outputstream at some point. > > Thanks, > David > ----- > > > For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:" > > and "1030762496". 
> > > > > > [0.304s][trace][os,container] Memory Usage is: 42983424 > > OperatingSystemMXBean.getFreeMemorySize: 1030758400 > > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes > > [0.305s][trace][os,container] Memory Usage is: 42979328 > > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes > > OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232 > > 1030762496 > > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176 > > > > > > java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr > > > > at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306) > > at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151) > > at TestMemoryAwareness.main(TestMemoryAwareness.java:73) > > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > > at java.base/java.lang.reflect.Method.invoke(Method.java:564) > > at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) > > at java.base/java.lang.Thread.run(Thread.java:832) > > > > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. 
> > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 > > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 > > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 > > [4] https://bugs.openjdk.java.net/browse/JDK-8235522 > > > > Thank you, > > Daniil > > > > ?On 12/6/19, 1:38 PM, "Mandy Chung" wrote: > > > > > > > > On 12/6/19 5:59 AM, Bob Vandette wrote: > > >> On Dec 6, 2019, at 2:49 AM, David Holmes wrote: > > >> > > >> > > >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java > > >> > > >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. > > > > I thought that the error case we are referring to is limit == 0 which > > indicates something unexpected goes wrong. So the compatibility concern > > should be low. This is very specific to Metrics implementation for > > cgroup v1 and let me know if I'm wrong. > > > > >> Surely there must always be some information available from the operating environment? I see from the impl file: > > >> > > >> // the host data, value 0 indicates that something went wrong while the metric was read and > > >> // in this case we return "information unavailable" code -1. > > >> > > >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO. > > > I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods. 
> > > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others > > > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no > > > limits. > > > > > > > It's important to consider carefully if the monitoring API indicates an > > error vs unavailable and an application should continue to run when the > > monitoring system fails to get the metrics. > > > > There are several choices to report "something goes wrong" scenarios > > (should unlikely happen???): > > 1. fall back to a random positive value (e.g. host value) > > 2. return a negative value > > 3. throw an exception > > > > #3 is not an option as the application is not expecting this. For #2, > > the application can filter bad values if desirable. > > > > I'm okay if you want to file a JBS issue to follow up and thoroughly > > look at the cases that the metrics are unavailable and the cases when > > fails to obtain. > > > > >> --- > > >> > > >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java > > >> > > >> System.out.println(String.format(...) > > >> > > >> Why not simply > > >> > > >> System.out.printf(..) > > >> > > >> ?
> > > > or simply (as I commented [1]) > > System.out.format > > > > Mandy > > [1] > > https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html > > > > > > > > > > >

From daniil.x.titov at oracle.com Tue Dec 10 17:49:51 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 10 Dec 2019 09:49:51 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <1996939d-54af-c34c-c796-cab3ec92b445@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> <0a412f63-1583-6789-9afa-cebbc968e7c4@oracle.com> <447D72BA-1046-4131-9B13-C1F7244D87EA@oracle.com> <1996939d-54af-c34c-c796-cab3ec92b445@oracle.com>
Message-ID: <92CAC22E-53FE-422B-B5E0-EA6C959352FC@oracle.com>

Hi David,

> Please file a follow up RFE to look into this.

I created an issue to follow this up [1]

[1] https://bugs.openjdk.java.net/browse/JDK-8235681

Thank you,
Daniil

On 12/10/19, 2:11 AM, "David Holmes" wrote:

On 10/12/2019 5:31 am, Daniil Titov wrote:
> Hi David,
>
>> So we never try to access the uninitialized counters.cpus array which is
>> good but we still return garbage for counters.jvmTicks and
>> counters.cpuTicks - surely that should have been noticeable?
>
> It only affected the first time the CPU load was requested. Function get_cpuload_internal(...)
> calls get_totalticks() and get_jvmticks() functions that update these counters.
> But on the first call, yes, it compares the newly received counters with the garbage.
>
>
> It has the code that seems to be written to somehow mitigate it and in worse case just return
> 0 or 1.0.
But it also could be that there is some other problem this code tries to solve so I'm not sure > we should remove these workarounds as a part of the current fix. Please file a follow up RFE to look into this. Thanks, David > 274 // seems like we sometimes end up with less kernel ticks when > 275 // reading /proc/self/stat a second time, timing issue between cpus? > 276 if (pticks->usedKernel < tmp.usedKernel) { > 277 kdiff = 0; > 278 } else { > 279 kdiff = pticks->usedKernel - tmp.usedKernel; > 280 } > 281 tdiff = pticks->total - tmp.total; > 282 udiff = pticks->used - tmp.used; > 283 > 284 if (tdiff == 0) { > 285 user_load = 0; > 286 } else { > 287 if (tdiff < (udiff + kdiff)) { > 288 tdiff = udiff + kdiff; > 289 } > 290 *pkernelLoad = (kdiff / (double)tdiff); > 291 // BUG9044876, normalize return values to sane values > 292 *pkernelLoad = MAX(*pkernelLoad, 0.0); > 293 *pkernelLoad = MIN(*pkernelLoad, 1.0); > 294 > 295 user_load = (udiff / (double)tdiff); > 296 user_load = MAX(user_load, 0.0); > 297 user_load = MIN(user_load, 1.0); > 298 } > 299 } > > Best regards, > Daniil > > ?On 12/8/19, 8:49 PM, "David Holmes" wrote: > > Hi Daniil, > > On 7/12/2019 11:41 am, Daniil Titov wrote: > > Hi David, Mandy, and Bob, > > > > Thank you for reviewing this fix. > > > > Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04) > > 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded. > > Okay. > > > 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize > > was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read. > > Okay. 
> > > I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances, > > but I agree that the changes proposed in the previous version of the webrev increase such probability. > > I filed the follow-up issue [4] as Mandy suggested. > > I added a comment to the bug. This is potentially a difficult problem to > resolve - it all depends on the likelihood of any errors and what they > really indicate. > > > 3. The legacy methods were renamed as David suggested. > > Thanks! > > > > >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c > >> ! static int initialized=1; > >> > >> Am I reading this right that the code currently fails to actually do the > >> initialization because of this ??? > > > > Yes, currently the code fails to do the initialization but it was unnoticed since method > > get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which" > > was always -1. > > So we never try to access the uninitialized counters.cpus array which is > good but we still return garbage for counters.jvmTicks and > counters.cpuTicks - surely that should have been noticeable? > > >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java > >> > >> System.out.println(String.format(...) > >> > >> Why not simply > >> > >> System.out.printf(..) > > > > As I tried explain it earlier it would make the tests unstable. > > System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically. > > Instead it parses the format string into a list of FormatString objects and then iterates over the list. > > As a result, the other traces occasionally got printed between these iterations and break the pattern the test is expected to find > > in the output. > > Sorry I missed the earlier explanation. 
I find it somewhat surprising > that format() works that way, but without unlimited buffering there will > always be a need to flush the outputstream at some point. > > Thanks, > David > ----- > > > For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:" > > and "1030762496". > > > > > > [0.304s][trace][os,container] Memory Usage is: 42983424 > > OperatingSystemMXBean.getFreeMemorySize: 1030758400 > > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes > > [0.305s][trace][os,container] Memory Usage is: 42979328 > > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes > > OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232 > > 1030762496 > > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176 > > > > > > java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr > > > > at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306) > > at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151) > > at TestMemoryAwareness.main(TestMemoryAwareness.java:73) > > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > > at java.base/java.lang.reflect.Method.invoke(Method.java:564) > > at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) > > at java.base/java.lang.Thread.run(Thread.java:832) > > > > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. 
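A minimal sketch of the workaround discussed above, under stated assumptions: the complete line is built with String.format first and then handed to the stream in a single println call, instead of letting format()/printf() write the parsed pieces one after another. The names AtomicLineSketch and logLine are hypothetical and do not come from the webrev. This only narrows the window in which JVM trace output can land in the middle of a line; it is not a guarantee of atomic output.

```java
import java.io.PrintStream;

class AtomicLineSketch {
    // Hypothetical helper: format the whole message up front, then emit it
    // with one println call so the label and the value travel together.
    static void logLine(PrintStream out, String fmt, Object... args) {
        String line = String.format(fmt, args);
        out.println(line);
    }

    public static void main(String[] args) {
        // Illustrative value taken from the sample trace in the thread.
        logLine(System.out,
                "OperatingSystemMXBean.getFreePhysicalMemorySize: %d",
                1030762496L);
    }
}
```

The test in the thread matches on a label followed directly by digits, so keeping both in one string before any I/O happens is what the println(String.format(...)) form is meant to achieve.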
> > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 > > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 > > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 > > [4] https://bugs.openjdk.java.net/browse/JDK-8235522 > > > > Thank you, > > Daniil > > > > ?On 12/6/19, 1:38 PM, "Mandy Chung" wrote: > > > > > > > > On 12/6/19 5:59 AM, Bob Vandette wrote: > > >> On Dec 6, 2019, at 2:49 AM, David Holmes wrote: > > >> > > >> > > >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java > > >> > > >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. > > > > I thought that the error case we are referring to is limit == 0 which > > indicates something unexpected goes wrong. So the compatibility concern > > should be low. This is very specific to Metrics implementation for > > cgroup v1 and let me know if I'm wrong. > > > > >> Surely there must always be some information available from the operating environment? I see from the impl file: > > >> > > >> // the host data, value 0 indicates that something went wrong while the metric was read and > > >> // in this case we return "information unavailable" code -1. > > >> > > >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO. > > > I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods. 
> > > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others > > > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no > > > limits. > > > > > > > It's important to consider carefully if the monitoring API indicates an > > error vs unavailable and an application should continue to run when the > > monitoring system fails to get the metrics. > > > > There are several choices to report "something goes wrong" scenarios > > (should unlikely happen???): > > 1. fall back to a random positive value (e.g. host value) > > 2. return a negative value > > 3. throw an exception > > > > #3 is not an option as the application is not expecting this. For #2, > > the application can filter bad values if desirable. > > > > I'm okay if you want to file a JBS issue to follow up and thoroughly > > look at the cases that the metrics are unavailable and the cases when > > fails to obtain. > > > > >> --- > > >> > > >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java > > >> > > >> System.out.println(String.format(...) > > >> > > >> Why not simply > > >> > > >> System.out.printf(..) > > >> > > >> ?
> > > > or simply (as I commented [1]) > > System.out.format > > > > Mandy > > [1] > > https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html > > > > > > > > > > >

From daniil.x.titov at oracle.com Tue Dec 10 18:29:56 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 10 Dec 2019 10:29:56 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <072d6861-1374-8190-135d-e30ece2ee380@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> <072d6861-1374-8190-135d-e30ece2ee380@oracle.com>
Message-ID: <7720D36B-0505-49E3-8424-76ACEACAF0AB@oracle.com>

Hi Serguei,

> Do we need to check if the (limit - memLimit) value is negative?
> The same question is for getFreeSwapSpaceSize():
> memSwapLimit - memLimit - (memSwapUsage - memUsage)
>
> and getFreeMemorySize():
> 101 return limit - usage;

I don't think we need such a check here. If it happens, it means a serious system malfunction, and the negative value this method returns would indicate that (currently the native methods already return -1 if something went wrong). But we could revise it in the follow-up issue I created for that [1].

[1] https://bugs.openjdk.java.net/browse/JDK-8235522

Thank you,
Daniil

On 12/9/19, 6:02 PM, "serguei.spitsyn at oracle.com" wrote:

Hi Daniil,

It is not a full review, just some minor comments. In fact, I do not see real problems yet.
http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html

  55     public long getTotalSwapSpaceSize() {
  56         if (containerMetrics != null) {
  57             long limit = containerMetrics.getMemoryAndSwapLimit();
  58             // The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container)
  59             // or if a docker container was started without specifying a memory limit ( without '--memory='
  60             // Docker option). In latter case there is no limit on how much memory the container can use and
  61             // it can use as much memory as the host's OS allows.
  62             long memLimit = containerMetrics.getMemoryLimit();
  63             if (limit >= 0 && memLimit >= 0) {
  64                 return limit - memLimit;
  65             }
  66         }
  67         return getTotalSwapSpaceSize0();
  68     }

Unneeded space after brackets '('.

Do we need to check if the (limit - memLimit) value is negative?
The same question is for getFreeSwapSpaceSize():
    memSwapLimit - memLimit - (memSwapUsage - memUsage)

and getFreeMemorySize():
 101         return limit - usage;

  81             // If this happens just retry the loop for a few iterations

Dot is missed at the end of comment.
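To make the question about a negative (limit - memLimit) concrete, here is a hedged sketch of what such a guard could look like. It is not the code in webrev.05: the names SwapSizeSketch, totalSwap and the hostSwapTotal supplier are hypothetical stand-ins for the containerMetrics calls and getTotalSwapSpaceSize0(), and the extra non-negative check is only the variant the question raises.

```java
import java.util.function.LongSupplier;

class SwapSizeSketch {
    // Returns the container swap size when both limits are available and
    // consistent; otherwise falls back to the host value.
    static long totalSwap(long memAndSwapLimit, long memLimit,
                          LongSupplier hostSwapTotal) {
        if (memAndSwapLimit >= 0 && memLimit >= 0) {
            long swap = memAndSwapLimit - memLimit;
            // Assumed extra guard: treat a negative difference as
            // "metrics inconsistent" instead of returning it.
            if (swap >= 0) {
                return swap;
            }
        }
        return hostSwapTotal.getAsLong();
    }

    public static void main(String[] args) {
        // Illustrative host value taken from the sample trace in the thread.
        LongSupplier host = () -> 499_122_176L;
        System.out.println(totalSwap(2_000L, 1_500L, host)); // limits usable
        System.out.println(totalSwap(-1L, 1_500L, host));    // limit unavailable
        System.out.println(totalSwap(1_000L, 1_500L, host)); // inconsistent limits
    }
}
```

Whether falling back to the host value or surfacing the negative result is the better behaviour is exactly the open point that the thread defers to JDK-8235522.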
http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html

  34     System.out.println(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors()));
  35     System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors()));
  36     System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize()));
  37     System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize()));
  38     System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize()));
  39     System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize()));
  40     System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize()));
  41     System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize()));
  42     System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad()));
  43     System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad()));

To make the above lines a little bit shorter I'd suggest to define a log() method like this:

    private static void log(String msg) {
        System.out.println(msg);
    }

  34     log(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors()));
  35     log(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors()));
  36     log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize()));
  37     log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize()));
  38     log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize()));
  39     log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize()));
  40     log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize()));
  41     log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize()));
  42     log(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad()));
  43     log(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad()));

Thanks,
Serguei

On 12/6/19 17:41, Daniil Titov wrote:
> Hi David, Mandy, and Bob,
>
> Thank you for reviewing this fix.
>
> Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04)
> 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded.
> 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
> was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read.
> I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances,
> but I agree that the changes proposed in the previous version of the webrev increase such probability.
> I filed the follow-up issue [4] as Mandy suggested.
> 3. The legacy methods were renamed as David suggested.
>
>
>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
>> ! static int initialized=1;
>>
>> Am I reading this right that the code currently fails to actually do the
>> initialization because of this ???
> Yes, currently the code fails to do the initialization but it was unnoticed since method
> get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which"
> was always -1.
> >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >> >> System.out.println(String.format(...) >> >> Why not simply >> >> System.out.printf(..) > As I tried explain it earlier it would make the tests unstable. > System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically. > Instead it parses the format string into a list of FormatString objects and then iterates over the list. > As a result, the other traces occasionally got printed between these iterations and break the pattern the test is expected to find > in the output. > > For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:" > and "1030762496". > > > [0.304s][trace][os,container] Memory Usage is: 42983424 > OperatingSystemMXBean.getFreeMemorySize: 1030758400 > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes > [0.305s][trace][os,container] Memory Usage is: 42979328 > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes > OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232 > 1030762496 > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176 > > > java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr > > at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306) > at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151) > at TestMemoryAwareness.main(TestMemoryAwareness.java:73) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 
at java.base/java.lang.reflect.Method.invoke(Method.java:564) > at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) > at java.base/java.lang.Thread.run(Thread.java:832) > > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 > [4] https://bugs.openjdk.java.net/browse/JDK-8235522 > > Thank you, > Daniil > > ?On 12/6/19, 1:38 PM, "Mandy Chung" wrote: > > > > On 12/6/19 5:59 AM, Bob Vandette wrote: > >> On Dec 6, 2019, at 2:49 AM, David Holmes wrote: > >> > >> > >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java > >> > >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. > > I thought that the error case we are referring to is limit == 0 which > indicates something unexpected goes wrong. So the compatibility concern > should be low. This is very specific to Metrics implementation for > cgroup v1 and let me know if I'm wrong. > > >> Surely there must always be some information available from the operating environment? I see from the impl file: > >> > >> // the host data, value 0 indicates that something went wrong while the metric was read and > >> // in this case we return "information unavailable" code -1. > >> > >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. 
> >> Returning -1 is not an option here IMO.
> > I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods.
> > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
> > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
> > limits.
> >
>
> It's important to consider carefully whether the monitoring API indicates an
> error vs unavailable, and whether an application should continue to run when the
> monitoring system fails to get the metrics.
>
> There are several choices to report "something goes wrong" scenarios
> (should unlikely happen???):
> 1. fall back to a random positive value (e.g. host value)
> 2. return a negative value
> 3. throw an exception
>
> #3 is not an option as the application is not expecting this. For #2,
> the application can filter bad values if desirable.
>
> I'm okay if you want to file a JBS issue to follow up and thoroughly
> look at the cases where the metrics are unavailable and the cases where
> they cannot be obtained.
>
> >> ---
> >>
> >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
> >>
> >> System.out.println(String.format(...)
> >>
> >> Why not simply
> >>
> >> System.out.printf(..)
> >>
> >> ?
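The difference between the two calls in this exchange can be sketched in a few lines of Java. This is only an illustrative sketch, not code from the webrev: the class name AtomicLineDemo and the helper formatLine are hypothetical. It builds the complete line with String.format first and then passes the finished string to println, so the metric name and its value reach the stream together instead of as separate fragments, which is what printf/format may produce. It does not by itself rule out every form of interleaving with JVM-side trace output.

```java
import java.util.Locale;

public class AtomicLineDemo {
    // Hypothetical helper: assemble the whole line before anything is written.
    // Locale.ROOT keeps the digits independent of the default locale.
    public static String formatLine(String metric, long value) {
        return String.format(Locale.ROOT, "OperatingSystemMXBean.%s: %d", metric, value);
    }

    public static void main(String[] args) {
        // println receives one finished string, so the name and the value
        // are handed to the stream in a single piece.
        System.out.println(formatLine("getFreePhysicalMemorySize", 1030762496L));

        // For comparison: printf passes the pieces of the format string to the
        // stream one after another, which leaves room for other output in between.
        System.out.printf("OperatingSystemMXBean.%s: %d%n", "getFreePhysicalMemorySize", 1030762496L);
    }
}
```

With the first form, the pattern the test looks for ('OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+') is matched against a line that was complete before it was printed.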
>
> or simply (as I commented [1])
> System.out.format
>
> Mandy
> [1]
> https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
>
>
>

From richard.reingruber at sap.com Tue Dec 10 21:45:28 2019
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Tue, 10 Dec 2019 21:45:28 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
Message-ID:

Hi,

I would like to get reviews please for
http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/

Corresponding RFE:
https://bugs.openjdk.java.net/browse/JDK-8227745

Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]

Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues (thanks!). In addition, the change has been tested at SAP since I posted the first RFR some months ago.

The intention of this enhancement is to benefit performance-wise from escape analysis even if JVMTI agents request capabilities that allow them to access local variable values. E.g. if you start up with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape analysis is disabled right from the beginning, well before a debugger attaches -- if ever one should do so. With the enhancement, escape analysis will remain enabled until and after a debugger attaches. EA based optimizations are reverted just before an agent acquires the reference to an object. In the JBS item you'll find more details.

Thanks,
Richard.

[1] Experimental fix for JDK-8214584 based on JDK-8227745
http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch

From chris.plummer at oracle.com Wed Dec 11 02:52:19 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 10 Dec 2019 18:52:19 -0800
Subject: Removal of SA javascript support
Message-ID:

Hi,

I'd like to propose the removal of SA javascript support.
Few people even realize this support exists, and hopefully even fewer are using it since I'd like to remove it. Since I'm new to this myself, let me first explain what I know about its existence, and then explain why I want to remove it.

If you run "jhsdb clhsdb", there are jsload and jseval commands. Don't look for them in anything post JDK 8. I'll explain why later. jsload is used to load a javascript file. In that file you can register new clhsdb commands that are written in javascript. You can also evaluate javascript using the jseval command. Some of this is explained in [1], which is the only place I can find any reference to this support. It does not appear to be officially supported, nor is there any Oracle-provided documentation.

There also appear to be a few clhsdb commands that are written in javascript. Doing a grep for "registerCommand" in sa.js shows the following:

  registerCommand("class", "class name", "jclass");
  registerCommand("classes", "classes", "jclasses");
  registerCommand("dumpclass", "dumpclass { address | name } [ directory ]", "dclass");
  registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap");
  registerCommand("mem", "mem address [ length ]", "printMem");
  registerCommand("sysprops", "sysprops", "sysProps");
  registerCommand("whatis", "whatis address", "printWhatis");

Once again, don't go looking for these in anything newer than JDK8. You won't find them. Again, the only documentation I can find is [1].

The other use of Javascript is the SOQL command (Simple Object Query Language), a tool used to query the heap, and also the JSDB command. The only SOQL documentation I could find is the blog reference [2]. I could not find JSDB documentation, but I believe it is javascript support for looking at hotspot. So once again, neither of these seem to be officially supported or documented.

The real purpose of the email is to propose removal of this support. Here are the reasons:

(1) It's broken, and has been since 9. See [3].
This is why you don't see the javascript related commands in clhsdb. Javascript fails to initialize, so none of the javascript related commands are registered.
(2) Nashorn is deprecated and will be removed eventually.
(3) We have very little understanding of the javascript support.
(4) No resources to work on it (unless there is a community volunteer).
(5) Very questionable value (lack of users). The fact that this support has been broken since JDK 9 and no bug was filed until I did so this week is a good indication of that. Another is that there are no other SA Javascript related bugs filed. Lastly, the lack of any official documentation and only minimal mention of it on the web is another good indication of its (lack of) value.

Also, regarding the 7 commands listed above that would be lost (but don't work now anyway), if they are really wanted, they could be implemented in java instead of javascript.

I'd like to remove javascript support in two steps. The first is to simply disable the clhsdb code that tries to initialize the javascript support. I'd like to do this in 14 (actually as soon as possible). I'd like to actually do this now even if we decide to keep javascript support and eventually fix it because it will get rid of the warning you see whenever you attach from clhsdb:

    Warning! JS Engine can't start, some commands will not be available.

This warning will become more of an issue for the clhsdb tests after I push [4] because then you will also see the full stack trace for the underlying exception that caused the Javascript to fail to start. Besides being unnecessary noise in passing test cases, it can also be misleading in any test that fails because the exception will be unrelated to the failure. This is actually what got me going down this path of what the javascript support is all about.

The next step would be to strip out all Javascript related code, including the SOQL and JSDB tools. This would be done in 15.

Please let me know what you think.

thanks,

Chris

[1] https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
[2] http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
[3] https://bugs.openjdk.java.net/browse/JDK-8235594
[4] https://bugs.openjdk.java.net/browse/JDK-8234277

From rednaxelafx at gmail.com Wed Dec 11 03:26:54 2019
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Tue, 10 Dec 2019 19:26:54 -0800
Subject: Removal of SA javascript support
In-Reply-To:
References:
Message-ID:

Hi Chris,

Thanks for the proposal. I used to be one of the few heavy users of jsload / jseval in CLHSDB back in the JDK6 to JDK8 era. The way I used it was to quickly prototype new functionality in JS and later bake it into Java code, and also for exploring heap dumps beneath the existing commands available in CLHSDB (i.e. the underlying SA API is far more powerful than the set of commands exposed in HSDB).

I even collected my own library of SA-based JS functions for easy navigation of Java heap dumps, e.g. this objtree command:
https://gist.github.com/rednaxelafx/1393698#file-objtree-js

I'm sad to see it go but given its current state I'd +1 on the proposal to remove it now.

Best regards,
Kris

On Tue, Dec 10, 2019 at 6:52 PM Chris Plummer wrote:
> Hi,
>
> I'd like to propose the removal of SA javascript support. Few people even
> realize this support exists, and hopefully even fewer are using it since
> I'd like to remove it. Since I'm new to this myself, let me first
> explain what I know about its existence, and then explain why I want to
> remove it.
>
> If you run "jhsdb clhsdb", there are jsload and jseval commands. Don't
> look for them in anything post JDK 8. I'll explain why later. jsload is
> used to load a javascript file. In that file you can register new clhsdb
> commands that are written in javascript. You can also evaluate
> javascript using the jseval command.
>
> thanks,
>
> Chris
>
> [1] https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
> [2] http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
> [3] https://bugs.openjdk.java.net/browse/JDK-8235594
> [4] https://bugs.openjdk.java.net/browse/JDK-8234277

From sundararajan.athijegannathan at oracle.com Wed Dec 11 03:38:16 2019
From: sundararajan.athijegannathan at oracle.com (sundararajan.athijegannathan at oracle.com)
Date: Wed, 11 Dec 2019 09:08:16 +0530
Subject: Removal of SA javascript support
In-Reply-To:
References:
Message-ID:

Hi Kris,

Glad to hear that someone used JS interface of SA :) Quick prototyping + debugger interactive scripting were the goals of JS interface! As you mentioned, given the current state of SA JS interface, it has to be removed :(

Thanks

-Sundar

On 11/12/19 8:56 am, Krystal Mok wrote:
> Hi Chris,
>
> Thanks for the proposal. I used to be one of the few heavy users of
> jsload / jseval in CLHSDB back in the JDK6 to JDK8 era. The way I used
> to use it is to quickly prototype new functionality in JS and later
> bake it into Java code, and also for exploring heap dumps beneath the
> existing commands available in CLHSDB (i.e. the underlying SA API is
> far more powerful than the set of commands exposed in HSDB).
>
> I even collected my own library of SA-based JS functions for easy
> navigation of Java heap dumps. e.g. this objtree command:
> https://gist.github.com/rednaxelafx/1393698#file-objtree-js
>
> I'm sad to see it go but given its current state I'd +1 on the
> proposal to remove it now.
>
> Best regards,
> Kris
>
> On Tue, Dec 10, 2019 at 6:52 PM Chris Plummer wrote:
>
>     Hi,
>
>     I'd like to propose the removal of SA javascript support. Few people
>     even realize this support exists, and hopefully even fewer are using
>     it since I'd like to remove it.
>
>     The next step would be to strip out all Javascript related code,
>     including the SOQL and JSDB tools. This would be done in 15.
>
>     Please let me know what you think.
>
>     thanks,
>
>     Chris
>
>     [1] https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
>     [2] http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
>     [3] https://bugs.openjdk.java.net/browse/JDK-8235594
>     [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>

From rednaxelafx at gmail.com Wed Dec 11 03:49:37 2019
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Tue, 10 Dec 2019 19:49:37 -0800
Subject: Removal of SA javascript support
In-Reply-To:
References:
Message-ID:

Thank you very much for the work on the JS support in SA, Sundar! I really loved it and depended on it. That's why, back in the day when the JS support was broken from time to time, I'd get affected and annoyed and send fixes for it...

A suggestion: I still see a lot of value in having a proper REPL beyond CLHSDB for the same purposes as the JS support. Would it be possible for the serviceability team or someone from the community to invest in integrating JShell into the SA world?

Thanks,
Kris

On Tue, Dec 10, 2019 at 7:38 PM wrote:
> Hi Kris,
>
> Glad to hear that someone used JS interface of SA :) Quick prototyping +
> debugger interactive scripting were the goals of JS interface! As you
> mentioned, given the current state of SA JS interface, it has to be removed
> :(
>
> Thanks
>
> -Sundar
> On 11/12/19 8:56 am, Krystal Mok wrote:
>
> Hi Chris,
>
> Thanks for the proposal. I used to be one of the few heavy users of jsload
> / jseval in CLHSDB back in the JDK6 to JDK8 era. The way I used to use it
> is to quickly prototype new functionality in JS and later bake it into Java
> code, and also for exploring heap dumps beneath the existing commands
> available in CLHSDB (i.e.
> the underlying SA API is far more powerful than
> the set of commands exposed in HSDB).
>
> I even collected my own library of SA-based JS functions for easy
> navigation of Java heap dumps. e.g. this objtree command:
> https://gist.github.com/rednaxelafx/1393698#file-objtree-js
>
> I'm sad to see it go but given its current state I'd +1 on the proposal to
> remove it now.
>
> Best regards,
> Kris
>
> On Tue, Dec 10, 2019 at 6:52 PM Chris Plummer wrote:
>
>> Hi,
>>
>> I'd like to propose the removal of SA javascript support. Few people even
>> realize this support exists, and hopefully even fewer are using it since
>> I'd like to remove it. Since I'm new to this myself, let me first
>> explain what I know about its existence, and then explain why I want to
>> remove it.
>>
>> If you run "jhsdb clhsdb", there are jsload and jseval commands. Don't
>> look for them in anything post JDK 8. I'll explain why later. jsload is
>> used to load a javascript file. In that file you can register new clhsdb
>> commands that are written in javascript. You can also evaluate
>> javascript using the jseval command. Some of this is explained in [1],
>> which is the only place I can find any reference to this support. It
>> does not appear to be officially supported, nor is there any Oracle-provided
>> documentation.
>>
>> There also appear to be a few clhsdb commands that are written in
>> javascript. Doing a grep for "registerCommand" in sa.js shows the
>> following:
>>
>> registerCommand("class", "class name", "jclass");
>> registerCommand("classes", "classes", "jclasses");
>> registerCommand("dumpclass", "dumpclass { address | name } [ directory ]", "dclass");
>> registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap");
>> registerCommand("mem", "mem address [ length ]", "printMem");
>> registerCommand("sysprops", "sysprops", "sysProps");
>> registerCommand("whatis", "whatis address", "printWhatis");
>>
>> Once again, don't go looking for these in anything newer than JDK8.
>> I'd like to actually do this now even if we decide to keep javascript support and
>> eventually fix it because it will get rid of the warning you see
>> whenever you attach from clhsdb:
>>
>>     Warning! JS Engine can't start, some commands will not be available.
>>
>> This warning will become more of an issue for the clhsdb tests after I
>> push [4] because then you will also see the full stacktrace for the
>> underlying exception that caused the Javascript to fail to start.
>> Besides being unnecessary noise in passing test cases, it can also be
>> misleading in any test that fails because the exception will be
>> unrelated to the failure. This is actually what got me going down this
>> path of what the javascript support is all about.
>>
>> The next step would be to strip out all Javascript related code,
>> including the SOQL and JSDB tools. This would be done in 15.
>>
>> Please let me know what you think.
>>
>> thanks,
>>
>> Chris
>>
>> [1] https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
>> [2] http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
>> [3] https://bugs.openjdk.java.net/browse/JDK-8235594
>> [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>>

From suenaga at oss.nttdata.com Wed Dec 11 05:33:41 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Wed, 11 Dec 2019 14:33:41 +0900
Subject: Removal of SA javascript support
In-Reply-To:
References:
Message-ID: <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>

Hi Chris,

It's a sad proposal, but I agree with you. Maintaining the JS support in SA has been difficult since Jigsaw.
However, I would like SA to provide a pluggable feature.
I use a custom script to list compiled code in the CodeCache.

I guess other troubleshooters will also want a similar feature (like the one jsload offered) in the future if they encounter a JVM crash.


Thanks,

Yasumasa


On 2019/12/11 11:52, Chris Plummer wrote:
> Hi,
>
> I'd like to propose the removal of SA javascript support. Few people even realize this support exists, and hopefully even fewer are using it since I'd like to remove it. Since I'm new to this myself, let me first explain what I know about its existence, and then explain why I want to remove it.
>
> If you run "jhsdb clhsdb", there are jsload and jseval commands. Don't look for them in anything post JDK 8. I'll explain why later. jsload is used to load a javascript file. In that file you can register new clhsdb commands that are written in javascript. You can also evaluate javascript using the jseval command. Some of this is explained in [1], which is the only place I can find any reference to this support. It does not appear to be officially supported, nor is there any Oracle-provided documentation.
>
> There also appear to be a few clhsdb commands that are written in javascript. Doing a grep for "registerCommand" in sa.js shows the following:
>
>   registerCommand("class", "class name", "jclass");
>   registerCommand("classes", "classes", "jclasses");
>   registerCommand("dumpclass", "dumpclass { address | name } [ directory ]", "dclass");
>   registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap");
>   registerCommand("mem", "mem address [ length ]", "printMem");
>   registerCommand("sysprops", "sysprops", "sysProps");
>   registerCommand("whatis", "whatis address", "printWhatis");
>
> Once again, don't go looking for these in anything newer than JDK8. You won't find them. Again, the only documentation I can find is [1].
>
> The other use of Javascript is the SOQL command (Simple Object Query Language), a tool used to query the heap, and also the JSDB command. The only SOQL documentation I could find is the blog reference [2]. I could not find JSDB documentation, but I believe it is javascript support for looking at hotspot.
> This is actually what got me going down this path of what the javascript support is all about.
>
> The next step would be to strip out all Javascript related code, including the SOQL and JSDB tools. This would be done in 15.
>
> Please let me know what you think.
>
> thanks,
>
> Chris
>
> [1] https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
> [2] http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
> [3] https://bugs.openjdk.java.net/browse/JDK-8235594
> [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>

From rednaxelafx at gmail.com Wed Dec 11 05:39:27 2019
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Tue, 10 Dec 2019 21:39:27 -0800
Subject: Removal of SA javascript support
In-Reply-To: <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
References: <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
Message-ID:

Hi Yasumasa,

That's a very nice idea. Basically, what you're asking for is to expose the Command interface [1] so that plugins can implement it and get dynamically loaded / registered into CLHSDB / HSDB, right?

[1]: http://hg.openjdk.java.net/jdk/jdk/file/c71ec1f09f21/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/CommandProcessor.java#l246

- Kris

On Tue, Dec 10, 2019 at 9:33 PM Yasumasa Suenaga wrote:
> Hi Chris,
>
> It's a sad proposal, but I agree with you. To maintain SA in JS is
> difficult since Jigsaw.
> However I want SA to implement pluggable feature.
> I use custom script to list compiled codes in CodeCache.
>
> I guess other troubleshooters also want similar feature (via jsload) in
> future if they encounter JVM crash.
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2019/12/11 11:52, Chris Plummer wrote:
> > Hi,
> >
> > I'd like to propose the removal of SA javascript support. Few people even realize this support exists, and hopefully even fewer are using it since I'd like to remove it.
> > Since I'm new to this myself, let me first explain what I know about its existence, and then explain why I want to remove it.
> >
> > If you run "jhsdb clhsdb", there are jsload and jseval commands. Don't look for them in anything post JDK 8. I'll explain why later. jsload is used to load a javascript file. In that file you can register new clhsdb commands that are written in javascript. You can also evaluate javascript using the jseval command. Some of this is explained in [1], which is the only place I can find any reference to this support. It does not appear to be officially supported, nor is there any Oracle-provided documentation.
> >
> > There also appear to be a few clhsdb commands that are written in javascript. Doing a grep for "registerCommand" in sa.js shows the following:
> >
> >   registerCommand("class", "class name", "jclass");
> >   registerCommand("classes", "classes", "jclasses");
> >   registerCommand("dumpclass", "dumpclass { address | name } [ directory ]", "dclass");
> >   registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap");
> >   registerCommand("mem", "mem address [ length ]", "printMem");
> >   registerCommand("sysprops", "sysprops", "sysProps");
> >   registerCommand("whatis", "whatis address", "printWhatis");
> >
> > Once again, don't go looking for these in anything newer than JDK 8. You won't find them. Again, the only documentation I can find is [1].
> >
> > The other use of Javascript is the SOQL command (Simple Object Query Language), a tool used to query the heap, and also the JSDB command. The only SOQL documentation I could find is the blog reference [2]. I could not find HSDB documentation, but I believe it is a javascript support for looking at hotspot. So once again, neither of these seems to be officially supported or documented.
> >
> > The real purpose of the email is to propose removal of this support. Here are the reasons:
> >
> > (1) It's broken, and has been since 9. See [3].
> > This is why you don't see the javascript related commands in clhsdb. Javascript fails to initialize, so none of the javascript related commands are registered.
> > (2) Nashorn is deprecated and will be removed eventually.
> > (3) We have very little understanding of the javascript support.
> > (4) No resources to work on it (unless there is a community volunteer).
> > (5) Very questionable value (lack of users). The fact this support has been broken since JDK 9 and no bug was filed until I did so this week is a good indication of that. Another is that there are no other SA Javascript related bugs filed. Lastly, the lack of any official documentation and only minimal mention of it on the web is another good indication of its (lack of) value.
> >
> > Also, regarding the 7 commands listed above that would be lost (but don't work now anyway), if they are really wanted, they could be implemented in Java instead of javascript.
> >
> > I'd like to remove javascript support in two steps. The first is simply to disable the clhsdb code that tries to initialize the javascript support. I'd like to do this in 14 (actually as soon as possible). I'd like to actually do this now even if we decide to keep javascript support and eventually fix it because it will get rid of the warning you see whenever you attach from clhsdb:
> >
> >     Warning! JS Engine can't start, some commands will not be available.
> >
> > This warning will become more of an issue for the clhsdb tests after I push [4] because then you will also see the full stacktrace for the underlying exception that caused the Javascript to fail to start. Besides being unnecessary noise in passing test cases, it can also be misleading in any test that fails because the exception will be unrelated to the failure. This is actually what got me going down this path of what the javascript support is all about.
> >
> > The next step would be to strip out all Javascript related code, including the SOQL and JSDB tools. This would be done in 15.
> >
> > Please let me know what you think.
> >
> > thanks,
> >
> > Chris
> >
> > [1] https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
> > [2] http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
> > [3] https://bugs.openjdk.java.net/browse/JDK-8235594
> > [4] https://bugs.openjdk.java.net/browse/JDK-8234277
> >

From suenaga at oss.nttdata.com Wed Dec 11 05:56:46 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Wed, 11 Dec 2019 14:56:46 +0900
Subject: Removal of SA javascript support
In-Reply-To:
References: <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
Message-ID: <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com>

On 2019/12/11 14:39, Krystal Mok wrote:
> Hi Yasumasa,
>
> That's a very nice idea. Basically what you're asking for is exposing the Command interface [1] so that plugins can implement it and get dynamically loaded / registered into CLHSDB / HSDB, right?

Yes, but we also need a proxy API to access internal SA objects, e.g. CodeCache, JavaThread, TypeDataBase, etc.

Yasumasa

> [1]: http://hg.openjdk.java.net/jdk/jdk/file/c71ec1f09f21/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/CommandProcessor.java#l246
>
> - Kris
>
> On Tue, Dec 10, 2019 at 9:33 PM Yasumasa Suenaga wrote:
>
>     Hi Chris,
>
>     It's a sad proposal, but I agree with you. To maintain SA in JS is difficult since Jigsaw.
>     However I want SA to implement pluggable feature.
>     I use custom script to list compiled codes in CodeCache.
>
>     I guess other troubleshooters also want similar feature (via jsload) in future if they encounter JVM crash.
>
>     Thanks,
>
>     Yasumasa
>
>     On 2019/12/11 11:52, Chris Plummer wrote:
>      > Hi,
>      >
>      > I'd like to propose the removal of SA javascript support.
>      >
>      > The next step would be to strip out all Javascript related code, including the SOQL and JSDB tools. This would be done in 15.
>      >
>      > Please let me know what you think.
>      >
>      > thanks,
>      >
>      > Chris
>      >
>      > [1] https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
>      > [2] http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
>      > [3] https://bugs.openjdk.java.net/browse/JDK-8235594
>      > [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>      >

From chris.plummer at oracle.com Wed Dec 11 06:00:05 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 10 Dec 2019 22:00:05 -0800
Subject: Removal of SA javascript support
In-Reply-To:
References: <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
Message-ID: <3a11d62c-60fa-e0ac-f4f2-475445729cb3@oracle.com>

An HTML attachment was scrubbed...
URL:

From chris.plummer at oracle.com Wed Dec 11 06:00:54 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 10 Dec 2019 22:00:54 -0800
Subject: Removal of SA javascript support
In-Reply-To: <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com>
References: <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com> <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com>
Message-ID: <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com>

On 12/10/19 9:56 PM, Yasumasa Suenaga wrote:
> On 2019/12/11 14:39, Krystal Mok wrote:
>> Hi Yasumasa,
>>
>> That's a very nice idea. Basically what you're asking for is exposing the Command interface [1] so that plugins can implement it and get dynamically loaded / registered into CLHSDB / HSDB, right?
>
> Yes, but we also need proxy API to access internal SA objects e.g. CodeCache, JavaThread, TypeDataBase, etc...
>
Yes, or export them. I should have read this email before posting my previous one.
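
To make the plugin idea in this exchange concrete, here is a minimal Java sketch of what "implement a command and get it registered" could look like. It is only an illustration under stated assumptions: `PluginCommand` and `PluginCommandRegistry` are hypothetical names that do not exist in SA, and the Command class referenced at [1] may be shaped quite differently. Discovery through `java.util.ServiceLoader` is one possible loading mechanism, not something the thread settled on.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.ServiceLoader;

/** Hypothetical plugin contract (an assumption, not the SA Command class). */
interface PluginCommand {
    String name();                  // word typed at the prompt
    String usage();                 // one-line help text
    String execute(String[] args);  // returns the text to print
}

/** Hypothetical registry that a CLHSDB-like prompt could consult. */
final class PluginCommandRegistry {
    private final Map<String, PluginCommand> commands = new LinkedHashMap<>();

    /** Registers one command; rejects a second command with the same name. */
    void register(PluginCommand command) {
        if (commands.putIfAbsent(command.name(), command) != null) {
            throw new IllegalArgumentException("duplicate command: " + command.name());
        }
    }

    /** Registers every provider that ServiceLoader can find on the class path. */
    void registerDiscovered() {
        for (PluginCommand command : ServiceLoader.load(PluginCommand.class)) {
            register(command);
        }
    }

    /** Splits the input line on whitespace and runs the matching command. */
    String dispatch(String line) {
        String[] tokens = line.trim().split("\\s+");
        PluginCommand command = commands.get(tokens[0]);
        if (command == null) {
            return "Unrecognized command: " + tokens[0];
        }
        return command.execute(Arrays.copyOfRange(tokens, 1, tokens.length));
    }
}
```

A plugin would implement `PluginCommand`, and the tool would call `dispatch` with each line the user types. The harder part raised above, giving plugins access to objects such as CodeCache or JavaThread, is deliberately left out of this sketch.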
Chris
>
> Yasumasa
>
>
>> [1]: http://hg.openjdk.java.net/jdk/jdk/file/c71ec1f09f21/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/CommandProcessor.java#l246
>>
>> - Kris
>>
>> On Tue, Dec 10, 2019 at 9:33 PM Yasumasa Suenaga wrote:
>>
>>     Hi Chris,
>>
>>     It's a sad proposal, but I agree with you. To maintain SA in JS is difficult since Jigsaw.
>>     However I want SA to implement pluggable feature.
>>     I use custom script to list compiled codes in CodeCache.
>>
>>     I guess other troubleshooters also want similar feature (via jsload) in future if they encounter JVM crash.
>>
>>     Thanks,
>>
>>     Yasumasa
>>
>>     On 2019/12/11 11:52, Chris Plummer wrote:
>>      > Hi,
>>      >
>>      > I'd like to propose the removal of SA javascript support. Few people even realize this support exists, and hopefully even fewer are using it since I'd like to remove it. Since I'm new to this myself, let me first explain what I know about its existence, and then explain why I want to remove it.
>>      >
>>      > If you run "jhsdb clhsdb", there are jsload and jseval commands. Don't look for them in anything post JDK 8. I'll explain why later. jsload is used to load a javascript file. In that file you can register new clhsdb commands that are written in javascript. You can also evaluate javascript using the jseval command. Some of this is explained in [1], which is the only place I can find any reference to this support. It does not appear to be officially supported, nor is there any Oracle-provided documentation.
>>      >
>>      > There also appear to be a few clhsdb commands that are written in javascript. Doing a grep for "registerCommand" in sa.js shows the following:
>>      >
>>      >   registerCommand("class", "class name", "jclass");
>>      >   registerCommand("classes", "classes", "jclasses");

From suenaga at oss.nttdata.com Wed Dec 11 06:27:40 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Wed, 11 Dec 2019 15:27:40 +0900
Subject: Removal of SA javascript support
In-Reply-To: <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com>
References: <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com> <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com> <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com>
Message-ID:

Hi,

IMHO we need to export all packages in SA if we do not provide a new API for SA.
Until JDK 8 (before Jigsaw), sa.js in jdk.hotspot.agent could access all SA classes, so we could write whatever functions we needed.

OTOH we cannot know which classes SA users need. Every package in the jdk.hotspot.agent module provides features, and they depend on other packages. For example, sun.jvm.hotspot.oops.Oop requires sun.jvm.hotspot.types, which in turn requires sun.jvm.hotspot.debugger.
It is difficult to track these dependencies and to export a minimal set.
(I worked on this in JDK-8157947, but I gave up...)

Thus I guess it is a big challenge to export SA classes without refactoring.
If we provide a new API for SA plugins, I guess we will need to do some refactoring.


Yasumasa


On 2019/12/11 15:00, Chris Plummer wrote:
> On 12/10/19 9:56 PM, Yasumasa Suenaga wrote:
>> On 2019/12/11 14:39, Krystal Mok wrote:
>>> Hi Yasumasa,
>>>
>>> That's a very nice idea. Basically what you're asking for is exposing the Command interface [1] so that plugins can implement it and get dynamically loaded / registered into CLHSDB / HSDB, right?
>>
>> Yes, but we also need proxy API to access internal SA objects e.g. CodeCache, JavaThread, TypeDataBase, etc...
>>
> Yes, or export them. I should have read this email before posting my previous one.
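
The export problem described in this message can be inspected with the standard module reflection API. The sketch below is an assumption-laden illustration rather than anything from the thread: `ExportProbe` and `unexportedPackages` are made-up names, and whether `jdk.hotspot.agent` is present in the boot layer depends on how the JVM was launched (it may have to be added with `--add-modules jdk.hotspot.agent`), so the method returns an empty set when the module is not found.

```java
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: lists packages a module keeps to itself. */
final class ExportProbe {

    /**
     * Returns the packages of the named boot-layer module that are not
     * exported unconditionally, i.e. the ones ordinary class-path code
     * cannot compile against. Returns an empty set if the module is not
     * resolved in the boot layer.
     */
    static Set<String> unexportedPackages(String moduleName) {
        Set<String> result = new TreeSet<>();
        Optional<Module> found = ModuleLayer.boot().findModule(moduleName);
        if (!found.isPresent()) {
            return result;
        }
        Module module = found.get();
        for (String pkg : module.getPackages()) {
            if (!module.isExported(pkg)) {   // true only for unqualified exports
                result.add(pkg);
            }
        }
        return result;
    }
}
```

Calling `ExportProbe.unexportedPackages("jdk.hotspot.agent")` on a JVM where that module is resolved would list the packages that a plugin author cannot reach today without extra `--add-exports` options, which gives a rough sense of how much would need to be exported or wrapped.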

From david.holmes at oracle.com Wed Dec 11 07:02:31 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 11 Dec 2019 17:02:31 +1000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
In-Reply-To:
References:
Message-ID:

Hi Richard,

On 11/12/2019 7:45 am, Reingruber, Richard wrote:
> Hi,
>
> I would like to get reviews please for
>
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>
> Corresponding RFE:
> https://bugs.openjdk.java.net/browse/JDK-8227745
>
> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>
> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues (thanks!). In addition the change is being tested at SAP since I posted the first RFR some months ago.
>
> The intention of this enhancement is to benefit performance wise from escape analysis even if JVMTI agents request capabilities that allow them to access local variable values. E.g. if you start-up with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape analysis is disabled right from the beginning, well before a debugger attaches -- if ever one should do so. With the enhancement, escape analysis will remain enabled until and after a debugger attaches. EA based optimizations are reverted just before an agent acquires the reference to an object. In the JBS item you'll find more details.

Most of the details here are in areas I can't comment on in detail, but I did take an initial general look at things.

The only thing that jumped out at me is that I think the DeoptimizeObjectsALotThread should be a hidden thread.

+  bool is_hidden_from_external_view() const { return true; }

Also I don't see any testing of the DeoptimizeObjectsALotThread. Without active testing this will just bit-rot.
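
As background for the escape-analysis behaviour described in the quoted RFR text, here is a small Java sketch of the kind of code that optimization targets. The class and method names are invented for illustration, and whether the JIT actually removes the allocation depends on the compiler, flags, and warm-up; the sketch only shows an object that never leaves its method. A debugger reading the local `p` through JVMTI would need a real object, which is the case the quoted description says is handled by reverting the optimization.

```java
/** Hypothetical example of a non-escaping allocation. */
final class EscapeAnalysisDemo {

    /** Small value holder that is only ever used as a local. */
    private static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    /**
     * Sums x*x + y*y over points (i, i + 1) for i in [0, n).
     * Each Point is created, read, and dropped inside the loop body, so it
     * does not escape and is a candidate for scalar replacement.
     */
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1);
            total += (long) p.x * p.x + (long) p.y * p.y;
        }
        return total;
    }
}
```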

Also on the tests I don't understand your @requires clause:

  @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & (vm.opt.TieredCompilation != true))

This seems to require that TieredCompilation is disabled, but tiered is our normal mode of operation. ??

Thanks,
David

> Thanks,
> Richard.
>
> [1] Experimental fix for JDK-8214584 based on JDK-8227745
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>

From dms at samersoff.net Wed Dec 11 07:44:32 2019
From: dms at samersoff.net (Dmitry Samersoff)
Date: Wed, 11 Dec 2019 10:44:32 +0300
Subject: Removal of SA javascript support
In-Reply-To:
References:
Message-ID: <030fbae7-f75a-1714-a98c-7139e817967f@samersoff.net>

Hello Chris,

I support you in this decision.

PS: For people who want SA scripting: one thing I experimented with a long time ago was exporting some SA capabilities to Jython. This might be the way to go.

-Dmitry

On 11.12.19 05:52, Chris Plummer wrote:
> Hi,
>
> I'd like to propose the removal of SA javascript support. Few people even realize this support exists, and hopefully even fewer are using it since I'd like to remove it. Since I'm new to this myself, let me first explain what I know about its existence, and then explain why I want to remove it.
>
> If you run "jhsdb clhsdb", there are jsload and jseval commands. Don't look for them in anything post JDK 8. I'll explain why later. jsload is used to load a javascript file. In that file you can register new clhsdb commands that are written in javascript. You can also evaluate javascript using the jseval command. Some of this is explained in [1], which is the only place I can find any reference to this support. It does not appear to be officially supported, nor is there any Oracle-provided documentation.
>
> There also appear to be a few clhsdb commands that are written in javascript.
>
> thanks,
>
> Chris
>
> [1] https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
> [2] http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
> [3] https://bugs.openjdk.java.net/browse/JDK-8235594
> [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>

From sundararajan.athijegannathan at oracle.com Wed Dec 11 12:47:15 2019
From: sundararajan.athijegannathan at oracle.com (sundararajan.athijegannathan at oracle.com)
Date: Wed, 11 Dec 2019 18:17:15 +0530
Subject: Removal of SA javascript support
In-Reply-To:
References: <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com> <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com> <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com>
Message-ID: <0944c2b1-a0b4-2631-24de-3789e6aeec7b@oracle.com>

Effectively you're asking for SA as an API. I don't think that is a good idea. That implies supporting hotspot data structures as a Java *API*. That will be a maintainability nightmare: we have to keep tracking hotspot data structures in SA code. That itself is problematic. An API would be a next-level nightmare.

-Sundar

On 11/12/19 11:57 am, Yasumasa Suenaga wrote:
> Hi,
>
> IMHO we need to export all packages in SA if we do not provide a new API for SA.
> Until JDK 8 (before Jigsaw), sa.js in jdk.hotspot.agent could access all SA classes, so we could write whatever functions we needed.
>
> OTOH we cannot know which classes SA users need. Every package in the jdk.hotspot.agent module provides features, and they depend on other packages. For example, sun.jvm.hotspot.oops.Oop requires sun.jvm.hotspot.types, which in turn requires sun.jvm.hotspot.debugger.
> It is difficult to track these dependencies and to export a minimal set.
> (I worked on this in JDK-8157947, but I gave up...)
>
> Thus I guess it is a big challenge to export SA classes without refactoring.
> If we provide a new API for SA plugins, I guess we will need to do some refactoring.
>
>
> Yasumasa
>
> > > Yasumasa > > > On 2019/12/11 15:00, Chris Plummer wrote: >> On 12/10/19 9:56 PM, Yasumasa Suenaga wrote: >>> On 2019/12/11 14:39, Krystal Mok wrote: >>>> Hi?Yasumasa, >>>> >>>> That's a very nice idea. Basically what you're asking for is >>>> exposing the Command interface [1] so that plugins can implement it >>>> and get dynamically loaded / registered into CLHSDB / HSDB, right? >>> >>> Yes, but we also need proxy API to access internal SA objects e.g. >>> CodeCache, JavaThread, TypeDataBase, etc... >>> >> Yes, or export them. I should have read this email before posting my >> previous one. >> >> Chris >>> >>> Yasumasa >>> >>> >>>> [1]: >>>> http://hg.openjdk.java.net/jdk/jdk/file/c71ec1f09f21/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/CommandProcessor.java#l246 >>>> >>>> - Kris >>>> >>>> On Tue, Dec 10, 2019 at 9:33 PM Yasumasa Suenaga >>>> > wrote: >>>> >>>> ??? Hi Chris, >>>> >>>> ??? It's a sad proposal, but I agree with you. To maintain SA in JS >>>> is difficult since Jigsaw. >>>> ??? However I want SA to implement pluggable feature. >>>> ??? I use custom script to list compiled codes in CodeCache. >>>> >>>> ??? I guess other troubleshooters also want similar feature (via >>>> jsload) in future if they encounter JVM crash. >>>> >>>> >>>> ??? Thanks, >>>> >>>> ??? Yasumasa >>>> >>>> >>>> ??? On 2019/12/11 11:52, Chris Plummer wrote: >>>> ???? > Hi, >>>> ???? > >>>> ???? > I like to propose the removal of SA javascript support. Few >>>> people even realize this support exists, and hopefully even fewer >>>> are using it since I'd like to remove it. Since I'm new to this >>>> myself, let me first explain what I know about it's existence, and >>>> then explain why I want to remove it. >>>> ???? > >>>> ???? > If you run "jhsdb clhsdb", there are jsload and jseval >>>> commands. Don't look for them in anything post JDK 8. I'll explain >>>> why later. jsload is used to load a javascript file. 
In that file >>>> you can register new clhsdb commands that are written in >>>> javascript. You can also evaluate javascript using the jseval >>>> command. Some of this is explained in [1], which is the only place >>>> I can find any reference to this support. It does not appear to be >>>> officially supported, nor is there any oracle provided documentation. >>>> ???? > >>>> ???? > There also appear to be a few clhsdb commands that are >>>> written in javascript. Doing a grep for "registerCommand" in sa.js >>>> shows the following: >>>> ???? > >>>> ???? >? ?registerCommand("class", "class name", "jclass"); >>>> ???? >? ?registerCommand("classes", "classes", "jclasses"); >>>> ???? >? ?registerCommand("dumpclass", "dumpclass { address | name } >>>> [ directory ]", "dclass"); >>>> ???? >? ?registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap"); >>>> ???? >? ?registerCommand("mem", "mem address [ length ]", "printMem"); >>>> ???? >? ?registerCommand("sysprops", "sysprops", "sysProps"); >>>> ???? >? ?registerCommand("whatis", "whatis address", "printWhatis"); >>>> ???? > >>>> ???? > Once again, don't go looking for these in anything newer >>>> than JDK8. You won't find them. Again the only documentation I can >>>> fine is [1]. >>>> ???? > >>>> ???? > The other use of Javascript is the SOQL command (Simple >>>> Object Query Language), a tool used to query the heap, and also the >>>> JSDB command. The only SOQL documentation I could find is the blog >>>> reference [2]. I could not find HSDB documentation, but I believe >>>> is is a javascript support for looking at hotspot. So once again, >>>> neither of these seem to be officially supported or documented. >>>> ???? > >>>> ???? > The real purpose of the email is to propose removal of this >>>> support. Here are the reasons: >>>> ???? > >>>> ???? > (1) It's broken, and has been since 9. See [3]. This is why >>>> you don't see the javascript related commands in clhsdb. 
Javascript >>>> fails to initialize, so none of the javascript related commands are >>>> registered. >>>> ???? > (2) Nashorn is deprecated and will be removed eventually. >>>> ???? > (3) We have very little understanding of the javascript >>>> support. >>>> ???? > (4) No resources to work on it (unless there is a community >>>> volunteer). >>>> ???? > (5) Very questionable value (lack of users). The fact this >>>> support has been broken since JDK 9 and no bug was filed until I >>>> did so this week is a good indication of that. Another is that >>>> there are no other SA Javascript related bugs filed. Lastly, the >>>> lack of any official documentation and only minimal mention of it >>>> on the web is another good indication of it's (lack of) value. >>>> ???? > >>>> ???? > Also, regarding the 7 commands listed above that would be >>>> lost (but currently don't work now anyway), if they are really >>>> wanted, they could be implemented in java instead of javascript. >>>> ???? > >>>> ???? > I'd like to remove javascript support in two steps. The >>>> first is simply disable the clhsdb code that tries to initialize >>>> the javascript support. I'd like to do this in 14 (actually as soon >>>> as possible). I'd like to actually do this now even if we decide to >>>> keep javascript support and eventually fix it because it will get >>>> rid of the warning you see whenever you attach from clhsdb: >>>> ???? > >>>> ???? >? ???? Warning! JS Engine can't start, some commands will not >>>> be available. >>>> ???? > >>>> ???? > This warning will become more of an issue for the clhsdb >>>> tests after I push [4] because then you will also see the full >>>> stacktrace for the underlying exception that caused the Javascript >>>> to fail to start. Besides being unnecessary noise in passing test >>>> cases, it can also be misleading in any test that fails because the >>>> exception will be unrelated to the failure. 
This is actually what got me going down this path of what the javascript support is all about.
>>>>      >
>>>>      > The next step would be to strip out all Javascript related code, including the SOQL and JSDB tools. This would be done in 15.
>>>>      >
>>>>      > Please let me know what you think.
>>>>      >
>>>>      > thanks,
>>>>      >
>>>>      > Chris
>>>>      >
>>>>      > [1] https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
>>>>      > [2] http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
>>>>      > [3] https://bugs.openjdk.java.net/browse/JDK-8235594
>>>>      > [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>>>>      >
>>>>
>>
>>

From sundararajan.athijegannathan at oracle.com Wed Dec 11 12:49:21 2019
From: sundararajan.athijegannathan at oracle.com (sundararajan.athijegannathan at oracle.com)
Date: Wed, 11 Dec 2019 18:19:21 +0530
Subject: Removal of SA javascript support
In-Reply-To: <030fbae7-f75a-1714-a98c-7139e817967f@samersoff.net>
References: <030fbae7-f75a-1714-a98c-7139e817967f@samersoff.net>
Message-ID: <0ca8ae29-0645-e00b-5896-9248f71645fe@oracle.com>

Replacing one scripting language with another (jython) does not solve anything. You'd still face the same issue - accessing module-private internals of the SA module from scripts. Besides, you'd have a new problem in addition: how to bundle jython? We've been using a bundled scripting engine (nashorn) so far.

-Sundar

On 11/12/19 1:14 pm, Dmitry Samersoff wrote:
> Hello Chris,
>
> I'm supporting you with this decision.
>
> PS: For people who want SA scripting -
>
> One thing I experimented with a long time ago -
> has been exporting of some SA capabilities to jython.
> This might be the way to go.
>
> -Dmitry
>
>
>
> On 11.12.19 05:52, Chris Plummer wrote:
>> Hi,
>>
>> I like to propose the removal of SA javascript support.
Few people even >> realize this support exists, and hopefully even fewer are using it since >> I'd like to remove it. Since I'm new to this myself, let me first >> explain what I know about it's existence, and then explain why I want to >> remove it. >> >> If you run "jhsdb clhsdb", there are jsload and jseval commands. Don't >> look for them in anything post JDK 8. I'll explain why later. jsload is >> used to load a javascript file. In that file you can register new clhsdb >> commands that are written in javascript. You can also evaluate >> javascript using the jseval command. Some of this is explained in [1], >> which is the only place I can find any reference to this support. It >> does not appear to be officially supported, nor is there any oracle >> provided documentation. >> >> There also appear to be a few clhsdb commands that are written in >> javascript. Doing a grep for "registerCommand" in sa.js shows the >> following: >> >> ?registerCommand("class", "class name", "jclass"); >> ?registerCommand("classes", "classes", "jclasses"); >> ?registerCommand("dumpclass", "dumpclass { address | name } [ directory >> ]", "dclass"); >> ?registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap"); >> ?registerCommand("mem", "mem address [ length ]", "printMem"); >> ?registerCommand("sysprops", "sysprops", "sysProps"); >> ?registerCommand("whatis", "whatis address", "printWhatis"); >> >> Once again, don't go looking for these in anything newer than JDK8. You >> won't find them. Again the only documentation I can fine is [1]. >> >> The other use of Javascript is the SOQL command (Simple Object Query >> Language), a tool used to query the heap, and also the JSDB command. >> The only SOQL documentation I could find is the blog reference [2]. I >> could not find HSDB documentation, but I believe is is a javascript >> support for looking at hotspot. So once again, neither of these seem to >> be officially supported or documented. 
>> >> The real purpose of the email is to propose removal of this support. >> Here are the reasons: >> >> (1) It's broken, and has been since 9. See [3]. This is why you don't >> see the javascript related commands in clhsdb. Javascript fails to >> initialize, so none of the javascript related commands are registered. >> (2) Nashorn is deprecated and will be removed eventually. >> (3) We have very little understanding of the javascript support. >> (4) No resources to work on it (unless there is a community volunteer). >> (5) Very questionable value (lack of users). The fact this support has >> been broken since JDK 9 and no bug was filed until I did so this week is >> a good indication of that. Another is that there are no other SA >> Javascript related bugs filed. Lastly, the lack of any official >> documentation and only minimal mention of it on the web is another good >> indication of it's (lack of) value. >> >> Also, regarding the 7 commands listed above that would be lost (but >> currently don't work now anyway), if they are really wanted, they could >> be implemented in java instead of javascript. >> >> I'd like to remove javascript support in two steps. The first is simply >> disable the clhsdb code that tries to initialize the javascript support. >> I'd like to do this in 14 (actually as soon as possible). I'd like to >> actually do this now even if we decide to keep javascript support and >> eventually fix it because it will get rid of the warning you see >> whenever you attach from clhsdb: >> >> ???? Warning! JS Engine can't start, some commands will not be available. >> >> This warning will become more of an issue for the clhsdb tests after I >> push [4] because then you will also see the full stacktrace for the >> underlying exception that caused the Javascript to fail to start. >> Besides being unnecessary noise in passing test cases, it can also be >> misleading in any test that fails because the exception will be >> unrelated to the failure. 
This is actually what got me going down this path of what the javascript support is all about.
>>
>> The next step would be to strip out all Javascript related code, including the SOQL and JSDB tools. This would be done in 15.
>>
>> Please let me know what you think.
>>
>> thanks,
>>
>> Chris
>>
>> [1] https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
>> [2] http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
>> [3] https://bugs.openjdk.java.net/browse/JDK-8235594
>> [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>>

From dms at samersoff.net Wed Dec 11 15:03:10 2019
From: dms at samersoff.net (Dmitry Samersoff)
Date: Wed, 11 Dec 2019 18:03:10 +0300
Subject: Removal of SA javascript support
In-Reply-To: <0944c2b1-a0b4-2631-24de-3789e6aeec7b@oracle.com>
References: <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com> <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com> <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com> <0944c2b1-a0b4-2631-24de-3789e6aeec7b@oracle.com>
Message-ID: <6faf0cb5-7b4a-5e35-ed7c-90b817235031@samersoff.net>

Sundar,

Supporting hotspot data structures in SA is already a maintenance nightmare ;)

So we could consider providing a high-level API, like find_class_by_name, to script writers. It would allow anybody interested in quick prototyping to write their own program on top of SA in any language they want.

-Dmitry

On 11.12.19 15:47, sundararajan.athijegannathan at oracle.com wrote:
> Effectively you're asking for SA as an API. I don't think that is a good idea. That implies supporting hotspot data structures as a Java *API*.
> That will be a maintainability nightmare - we have to keep tracking hotspot data structures in SA code. That itself is problematic. An API would be a next-level nightmare.
>
> -Sundar
>
> On 11/12/19 11:57 am, Yasumasa Suenaga wrote:
>> Hi,
>>
>> IMHO we need to export all packages in SA if we do not provide new API
>> for SA.
>> sa.js in jdk.hotspot.agent could access all SA classes until JDK 8 >> (before Jigsaw), so we could make various functions if we need. >> >> OTOH we cannot know what classes are needed by the SA users. All >> packages in jdk.hotspot.agent module provides features, and they >> require other packages. For example, sun.jvm.hotspot.oops.Oop requires >> sun.jvm.hotspot.types, and it requires sun.jvm.hotspot.debugger . >> It is difficult to track and to export minimally. >> (I worked for it in JDK-8157947, but I gave up...) >> >> Thus I guess it is a big challenge to export SA classes without >> refactoring. >> If we provide new API for SA plugin, I guess we need to work some >> refactoring. >> >> >> Yasumasa >> >> >> On 2019/12/11 15:00, Chris Plummer wrote: >>> On 12/10/19 9:56 PM, Yasumasa Suenaga wrote: >>>> On 2019/12/11 14:39, Krystal Mok wrote: >>>>> Hi?Yasumasa, >>>>> >>>>> That's a very nice idea. Basically what you're asking for is >>>>> exposing the Command interface [1] so that plugins can implement it >>>>> and get dynamically loaded / registered into CLHSDB / HSDB, right? >>>> >>>> Yes, but we also need proxy API to access internal SA objects e.g. >>>> CodeCache, JavaThread, TypeDataBase, etc... >>>> >>> Yes, or export them. I should have read this email before posting my >>> previous one. >>> >>> Chris >>>> >>>> Yasumasa >>>> >>>> >>>>> [1]: >>>>> http://hg.openjdk.java.net/jdk/jdk/file/c71ec1f09f21/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/CommandProcessor.java#l246 >>>>> >>>>> >>>>> - Kris >>>>> >>>>> On Tue, Dec 10, 2019 at 9:33 PM Yasumasa Suenaga >>>>> > wrote: >>>>> >>>>> ??? Hi Chris, >>>>> >>>>> ??? It's a sad proposal, but I agree with you. To maintain SA in JS >>>>> is difficult since Jigsaw. >>>>> ??? However I want SA to implement pluggable feature. >>>>> ??? I use custom script to list compiled codes in CodeCache. >>>>> >>>>> ??? 
I guess other troubleshooters also want similar feature (via >>>>> jsload) in future if they encounter JVM crash. >>>>> >>>>> >>>>> ??? Thanks, >>>>> >>>>> ??? Yasumasa >>>>> >>>>> >>>>> ??? On 2019/12/11 11:52, Chris Plummer wrote: >>>>> ???? > Hi, >>>>> ???? > >>>>> ???? > I like to propose the removal of SA javascript support. Few >>>>> people even realize this support exists, and hopefully even fewer >>>>> are using it since I'd like to remove it. Since I'm new to this >>>>> myself, let me first explain what I know about it's existence, and >>>>> then explain why I want to remove it. >>>>> ???? > >>>>> ???? > If you run "jhsdb clhsdb", there are jsload and jseval >>>>> commands. Don't look for them in anything post JDK 8. I'll explain >>>>> why later. jsload is used to load a javascript file. In that file >>>>> you can register new clhsdb commands that are written in >>>>> javascript. You can also evaluate javascript using the jseval >>>>> command. Some of this is explained in [1], which is the only place >>>>> I can find any reference to this support. It does not appear to be >>>>> officially supported, nor is there any oracle provided documentation. >>>>> ???? > >>>>> ???? > There also appear to be a few clhsdb commands that are >>>>> written in javascript. Doing a grep for "registerCommand" in sa.js >>>>> shows the following: >>>>> ???? > >>>>> ???? >? ?registerCommand("class", "class name", "jclass"); >>>>> ???? >? ?registerCommand("classes", "classes", "jclasses"); >>>>> ???? >? ?registerCommand("dumpclass", "dumpclass { address | name } >>>>> [ directory ]", "dclass"); >>>>> ???? >? ?registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap"); >>>>> ???? >? ?registerCommand("mem", "mem address [ length ]", "printMem"); >>>>> ???? >? ?registerCommand("sysprops", "sysprops", "sysProps"); >>>>> ???? >? ?registerCommand("whatis", "whatis address", "printWhatis"); >>>>> ???? > >>>>> ???? 
> Once again, don't go looking for these in anything newer >>>>> than JDK8. You won't find them. Again the only documentation I can >>>>> fine is [1]. >>>>> ???? > >>>>> ???? > The other use of Javascript is the SOQL command (Simple >>>>> Object Query Language), a tool used to query the heap, and also the >>>>> JSDB command. The only SOQL documentation I could find is the blog >>>>> reference [2]. I could not find HSDB documentation, but I believe >>>>> is is a javascript support for looking at hotspot. So once again, >>>>> neither of these seem to be officially supported or documented. >>>>> ???? > >>>>> ???? > The real purpose of the email is to propose removal of this >>>>> support. Here are the reasons: >>>>> ???? > >>>>> ???? > (1) It's broken, and has been since 9. See [3]. This is why >>>>> you don't see the javascript related commands in clhsdb. Javascript >>>>> fails to initialize, so none of the javascript related commands are >>>>> registered. >>>>> ???? > (2) Nashorn is deprecated and will be removed eventually. >>>>> ???? > (3) We have very little understanding of the javascript >>>>> support. >>>>> ???? > (4) No resources to work on it (unless there is a community >>>>> volunteer). >>>>> ???? > (5) Very questionable value (lack of users). The fact this >>>>> support has been broken since JDK 9 and no bug was filed until I >>>>> did so this week is a good indication of that. Another is that >>>>> there are no other SA Javascript related bugs filed. Lastly, the >>>>> lack of any official documentation and only minimal mention of it >>>>> on the web is another good indication of it's (lack of) value. >>>>> ???? > >>>>> ???? > Also, regarding the 7 commands listed above that would be >>>>> lost (but currently don't work now anyway), if they are really >>>>> wanted, they could be implemented in java instead of javascript. >>>>> ???? > >>>>> ???? > I'd like to remove javascript support in two steps. 
The >>>>> first is simply disable the clhsdb code that tries to initialize >>>>> the javascript support. I'd like to do this in 14 (actually as soon >>>>> as possible). I'd like to actually do this now even if we decide to >>>>> keep javascript support and eventually fix it because it will get >>>>> rid of the warning you see whenever you attach from clhsdb: >>>>> ???? > >>>>> ???? >? ???? Warning! JS Engine can't start, some commands will not >>>>> be available. >>>>> ???? > >>>>> ???? > This warning will become more of an issue for the clhsdb >>>>> tests after I push [4] because then you will also see the full >>>>> stacktrace for the underlying exception that caused the Javascript >>>>> to fail to start. Besides being unnecessary noise in passing test >>>>> cases, it can also be misleading in any test that fails because the >>>>> exception will be unrelated to the failure. This is actually what >>>>> got me going down this path of what the javascript support is all >>>>> about. >>>>> ???? > >>>>> ???? > The next step would be to strip out all Javascript related >>>>> code, including the SOQL and JSDB tools. This would be done in 15. >>>>> ???? > >>>>> ???? > Please let me know what you think. >>>>> ???? > >>>>> ???? > thanks, >>>>> ???? > >>>>> ???? > Chris >>>>> ???? > >>>>> ???? > [1] >>>>> https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html >>>>> >>>>> ???? > [2] >>>>> http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html >>>>> >>>>> ???? > [3] https://bugs.openjdk.java.net/browse/JDK-8235594 >>>>> ???? > [4] https://bugs.openjdk.java.net/browse/JDK-8234277 >>>>> ???? 
>
>>>>>
>>>
>>>

From richard.reingruber at sap.com Wed Dec 11 15:07:29 2019
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Wed, 11 Dec 2019 15:07:29 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
In-Reply-To:
References:
Message-ID:

Hi David,

> Most of the details here are in areas I can't comment on in detail, but I
> did take an initial general look at things.

Thanks for taking the time!

> The only thing that jumped out at me is that I think the
> DeoptimizeObjectsALotThread should be a hidden thread.
>
> + bool is_hidden_from_external_view() const { return true; }

Yes, it should. Will add the method like above.

> Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
> active testing this will just bit-rot.

DeoptimizeObjectsALot is meant for stress testing with a larger workload. I will add a minimal test to keep it fresh.

> Also on the tests I don't understand your @requires clause:
>
> @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> (vm.opt.TieredCompilation != true))
>
> This seems to require that TieredCompilation is disabled, but tiered is
> our normal mode of operation. ??
>

I removed the clause. I guess I wanted to target the tests towards the code they are supposed to test, and it's easier to analyze failures w/o tiered compilation and with just one compiler thread.

Additionally I will make use of compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.

Thanks, Richard.

-----Original Message-----
From: David Holmes
Sent: Wednesday, 11 December 2019 08:03
To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard,

On 11/12/2019 7:45 am, Reingruber, Richard wrote:
> Hi,
>
> I would like to get reviews please for
>
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>
> Corresponding RFE:
> https://bugs.openjdk.java.net/browse/JDK-8227745
>
> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>
> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues (thanks!). In addition the
> change is being tested at SAP since I posted the first RFR some months ago.
>
> The intention of this enhancement is to benefit performance wise from escape analysis even if JVMTI
> agents request capabilities that allow them to access local variable values. E.g. if you start-up
> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape analysis is disabled right
> from the beginning, well before a debugger attaches -- if ever one should do so. With the
> enhancement, escape analysis will remain enabled until and after a debugger attaches. EA based
> optimizations are reverted just before an agent acquires the reference to an object. In the JBS item
> you'll find more details.

Most of the details here are in areas I can't comment on in detail, but I did take an initial general look at things.

The only thing that jumped out at me is that I think the DeoptimizeObjectsALotThread should be a hidden thread.

+ bool is_hidden_from_external_view() const { return true; }

Also I don't see any testing of the DeoptimizeObjectsALotThread. Without active testing this will just bit-rot.
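For readers less familiar with what is at stake in the enhancement Richard describes, here is a minimal sketch of the kind of code escape analysis targets. The class and method names are made up for illustration and do not come from the webrev. Whether C2 actually eliminates the allocation depends on the JIT and its flags; the point is only that `p` never leaves the method, so it is a candidate for the EA based optimizations that, per the RFR, would have to be reverted before a JVMTI agent can obtain a reference to the object.

```java
public class EscapeExample {
    // Small value holder; the instances created below never leave sumOfSquares().
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        int lengthSquared() {
            return x * x + y * y;
        }
    }

    // The Point allocated in each iteration is only used locally, so it does
    // not escape. A JIT compiler with escape analysis may avoid the heap
    // allocation (an assumption about the JIT, not something this code checks).
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1);
            total += p.lengthSquared();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1_000));
    }
}
```

The result is the same with or without the optimization; only the allocation behaviour differs.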
Also on the tests I don't understand your @requires clause:

@requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & (vm.opt.TieredCompilation != true))

This seems to require that TieredCompilation is disabled, but tiered is our normal mode of operation. ??

Thanks,
David

> Thanks,
> Richard.
>
> [1] Experimental fix for JDK-8214584 based on JDK-8227745
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>

From dms at samersoff.net Wed Dec 11 15:34:02 2019
From: dms at samersoff.net (Dmitry Samersoff)
Date: Wed, 11 Dec 2019 18:34:02 +0300
Subject: PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To:
References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
Message-ID: <2c066d67-aa3c-83e2-632a-1ba3114d1538@samersoff.net>

Hello Yasumasa,

Please,

1. Consider using mmap for reading ELF sections.

2. Please move all platform-specific parts of the native code to a separate file/directory. The current patch will break the AARCH64 build.

3. I didn't find any tests here. How did you test the changes?

libproc_impl.c

131: The if is not necessary; free() handles a NULL pointer gracefully.

-Dmitry

On 04.12.19 03:54, Yasumasa Suenaga wrote:
> PING: Could you review it?
>
>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>
> This bug is targeted to JDK 14.
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2019/11/28 21:39, Yasumasa Suenaga wrote:
>> Hi,
>>
>> I refactored LinuxAMD64CFrame.java. It works fine in serviceability/sa tests and
>> all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>> Could you review the new webrev?
>>
>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>
>> The diff from the previous webrev is here:
>>    http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>> Hi all,
>>>
>>> Please review this change:
>>>
>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>
>>>
>>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64
>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
>>> for stack unwinding.
>>>
>>> As JDK-8022183 said, omit-frame-pointer is enabled by default since GCC 4.6, so system
>>> libraries (e.g. libc) might be compiled with this feature.
>>>
>>> However `jhsdb jstack --mixed` does not do so; it uses the base pointer register (RBP).
>>> So some stack frames might be missing.
>>>
>>> I guess JDK-8219201 is caused by the same issue.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>

From daniil.x.titov at oracle.com Wed Dec 11 17:12:40 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Wed, 11 Dec 2019 09:12:40 -0800 (PST)
Subject: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <072d6861-1374-8190-135d-e30ece2ee380@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> <072d6861-1374-8190-135d-e30ece2ee380@oracle.com>
Message-ID: <2A286F91-6CB8-4A32-B103-2C28D52E5A1A@oracle.com>

Hi Serguei,

Thank you for your comments. I will correct these nits before pushing the changes.

Hi Bob and David,

> [Mandy Chung]
>> I reviewed Metrics and Subsystem in this version.
>> I don't need to see a new webrev.

As I understand it, Mandy has finished reviewing this fix. I just wanted to confirm with you whether you are okay with that version of the fix (webrev.06)?
Mach5 testing: tier1-tier6 and open/test/hotspot/jtreg/containers/docker tests passed.

Thank you,
Daniil

On 12/9/19, 6:02 PM, "serguei.spitsyn at oracle.com" wrote:

    Hi Daniil,

    It is not a full review, just some minor comments.
    In fact, I do not see real problems yet.

    http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html

      55     public long getTotalSwapSpaceSize() {
      56         if (containerMetrics != null) {
      57             long limit = containerMetrics.getMemoryAndSwapLimit();
      58             // The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container)
      59             // or if a docker container was started without specifying a memory limit ( without '--memory='
      60             // Docker option). In latter case there is no limit on how much memory the container can use and
      61             // it can use as much memory as the host's OS allows.
      62             long memLimit = containerMetrics.getMemoryLimit();
      63             if (limit >= 0 && memLimit >= 0) {
      64                 return limit - memLimit;
      65             }
      66         }
      67         return getTotalSwapSpaceSize0();
      68     }

    Unneeded space after the opening bracket '('.

    Do we need to check if the (limit - memLimit) value is negative?
    The same question is for getFreeSwapSpaceSize():
        memSwapLimit - memLimit - (memSwapUsage - memUsage)
    and getFreeMemorySize():
     101     return limit - usage;

      81     // If this happens just retry the loop for a few iterations

    A dot is missing at the end of the comment.
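As a minimal sketch of the guard Serguei is asking about, assuming that falling back to the host value is the desired behaviour when the difference comes out negative: `SwapMath`, `totalSwap` and the `hostTotalSwap` parameter are hypothetical names used only for illustration. In the quoted code the two limits come from `containerMetrics` and the fallback is `getTotalSwapSpaceSize0()`. This is not the code from the webrev.

```java
public class SwapMath {
    /**
     * Hypothetical helper, not part of the webrev.
     *
     * @param memAndSwapLimit combined memory + swap limit; negative means unavailable
     * @param memLimit        memory limit; negative means unavailable
     * @param hostTotalSwap   value to fall back to (stands in for the host's swap size)
     * @return container swap size when both limits are usable and consistent,
     *         otherwise the host value
     */
    static long totalSwap(long memAndSwapLimit, long memLimit, long hostTotalSwap) {
        if (memAndSwapLimit >= 0 && memLimit >= 0) {
            long swap = memAndSwapLimit - memLimit;
            // Guard against inconsistent metrics producing a negative size.
            if (swap >= 0) {
                return swap;
            }
        }
        return hostTotalSwap;
    }

    public static void main(String[] args) {
        System.out.println(totalSwap(300L, 100L, 999L)); // both limits usable
        System.out.println(totalSwap(-1L, 100L, 999L));  // limit unavailable
        System.out.println(totalSwap(100L, 300L, 999L)); // inconsistent limits
    }
}
```

The same shape would apply to the free swap and free memory computations mentioned above, each with its own fallback value.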
    http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html

      34 System.out.println(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors()));
      35 System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors()));
      36 System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize()));
      37 System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize()));
      38 System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize()));
      39 System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize()));
      40 System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize()));
      41 System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize()));
      42 System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad()));
      43 System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad()));

    To make the above lines a little bit shorter I'd suggest defining a log() method like this:

      private static void log(String msg) { System.out.println(msg); }

      34 log(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors()));
      35 log(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors()));
      36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize()));
      37 log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize()));
      38 log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize()));
      39 log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize()));
      40 log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize()));
      41 log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize()));
      42 log(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad()));
      43 log(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad()));

    Thanks,
    Serguei

    On 12/6/19 17:41, Daniil Titov wrote:
    > Hi David, Mandy, and Bob,
    >
    > Thank you for reviewing this fix.
    >
    > Please review a new version of the fix [1] that includes the following changes compared to the previous version of the webrev (webrev.04):
    > 1. The changes in Javadoc made in webrev.04 compared to webrev.03 and to CSR [3] were discarded.
    > 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
    > was also reverted to the webrev.03 version, which returns the host's values if the metrics are unavailable or cannot be properly read.
    > I would like to mention that currently the native implementation of these methods de facto may return -1 in some circumstances,
    > but I agree that the changes proposed in the previous version of the webrev increase such probability.
    > I filed the follow-up issue [4] as Mandy suggested.
    > 3. The legacy methods were renamed as David suggested.
    >
    >
    >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
    >> ! static int initialized=1;
    >>
    >> Am I reading this right that the code currently fails to actually do the
    >> initialization because of this ???
    > Yes, currently the code fails to do the initialization, but this went unnoticed since method
    > get_cpuload_internal(...) was never called for a specific CPU; the first parameter "which"
    > was always -1.
> >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >> >> System.out.println(String.format(...) >> >> Why not simply >> >> System.out.printf(..) > As I tried explain it earlier it would make the tests unstable. > System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically. > Instead it parses the format string into a list of FormatString objects and then iterates over the list. > As a result, the other traces occasionally got printed between these iterations and break the pattern the test is expected to find > in the output. > > For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:" > and "1030762496". > > > [0.304s][trace][os,container] Memory Usage is: 42983424 > OperatingSystemMXBean.getFreeMemorySize: 1030758400 > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes > [0.305s][trace][os,container] Memory Usage is: 42979328 > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes > OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232 > 1030762496 > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176 > > > java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr > > at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306) > at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151) > at TestMemoryAwareness.main(TestMemoryAwareness.java:73) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 
at java.base/java.lang.reflect.Method.invoke(Method.java:564) > at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) > at java.base/java.lang.Thread.run(Thread.java:832) > > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 > [4] https://bugs.openjdk.java.net/browse/JDK-8235522 > > Thank you, > Daniil > > ?On 12/6/19, 1:38 PM, "Mandy Chung" wrote: > > > > On 12/6/19 5:59 AM, Bob Vandette wrote: > >> On Dec 6, 2019, at 2:49 AM, David Holmes wrote: > >> > >> > >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java > >> > >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. > > I thought that the error case we are referring to is limit == 0 which > indicates something unexpected goes wrong. So the compatibility concern > should be low. This is very specific to Metrics implementation for > cgroup v1 and let me know if I'm wrong. > > >> Surely there must always be some information available from the operating environment? I see from the impl file: > >> > >> // the host data, value 0 indicates that something went wrong while the metric was read and > >> // in this case we return "information unavailable" code -1. > >> > >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. 
Returning -1 is not an option here IMO. > > I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods. > > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others > > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no > > limits. > > > > It's important to consider carefully if the monitoring API indicates an > error vs unavailable and an application should continue to run when the > monitoring system fails to get the metrics. > > There are several choices to report "something goes wrong" scenarios > (should unlikely happen???): > 1. fall back to a random positive value (e.g. host value) > 2. return a negative value > 3. throw an exception > > #3 is not an option as the application is not expecting this. For #2, > the application can filter bad values if desirable. > > I'm okay if you want to file a JBS issue to follow up and thoroughly > look at the cases that the metrics are unavailable and the cases when > fails to obtain. > > >> --- > >> > >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java > >> > >> System.out.println(String.format(...) > >> > >> Why not simply > >> > >> System.out.printf(..) > >> > >> ? 
> > or simply (as I commented [1]) > System.out.format > > Mandy > [1] > https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html > > > >

From bob.vandette at oracle.com Wed Dec 11 17:21:11 2019
From: bob.vandette at oracle.com (Bob Vandette)
Date: Wed, 11 Dec 2019 12:21:11 -0500
Subject: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <2A286F91-6CB8-4A32-B103-2C28D52E5A1A@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> <072d6861-1374-8190-135d-e30ece2ee380@oracle.com> <2A286F91-6CB8-4A32-B103-2C28D52E5A1A@oracle.com>
Message-ID:

Yes, I defer to Mandy on the best way to express the various Java exceptions. I'm ok with the changes. Thanks for getting this done for JDK14! Bob.

> On Dec 11, 2019, at 12:12 PM, Daniil Titov wrote: > > Hi Serguei, > > Thank you for your comments. I will correct these nits before pushing the changes. > > Hi Bob and David, > >> [Mandy Chung] >>> I reviewed Metrics and Subsystem in this version. >>> I don't need to see a new webrev. > > As I understood Mandy finished reviewing this fix. Just wanted to confirm with you if you are okay with that version of the fix (webrev.06)? > > Mach5 testing: tier1-tier6 and open/test/hotspot/jtreg/containers/docker tests passed. > > Thank you, > Daniil > > > > On 12/9/19, 6:02 PM, "serguei.spitsyn at oracle.com" wrote: > > Hi Daniil, > > It is not a full review, just some minor comments. > In fact, I do not see real problems yet. 
> > http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html > > 55 public long getTotalSwapSpaceSize() { > 56 if (containerMetrics != null) { > 57 long limit = containerMetrics.getMemoryAndSwapLimit(); > 58 // The memory limit metrics is not available if JVM > runs on Linux host ( not in a docker container) > 59 // or if a docker container was started without > specifying a memory limit ( without '--memory=' > 60 // Docker option). In latter case there is no limit on > how much memory the container can use and > 61 // it can use as much memory as the host's OS allows. > 62 long memLimit = containerMetrics.getMemoryLimit(); > 63 if (limit >= 0 && memLimit >= 0) { > 64 return limit - memLimit; > 65 } > 66 } > 67 return getTotalSwapSpaceSize0(); > 68 } > > Unneeded space after brackets '('. > Do we need to check if the (limit - memLimit) value is negative? > The same question is for getFreeSwapSpaceSize(): > memSwapLimit - memLimit - (memSwapUsage - memUsage) > > and getFreeMemorySize(): > 101 return limit - usage; > > 81 // If this happens just retry the loop for > a few iterations > > Dot is missed at the end of comment. 
> > > http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html > > 34 System.out.println(String.format("Runtime.availableProcessors: > %d", Runtime.getRuntime().availableProcessors())); > 35 > System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: > %d", osBean.getAvailableProcessors())); > 36 > System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: > %d", osBean.getTotalMemorySize())); > 37 > System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: > %d", osBean.getTotalPhysicalMemorySize())); > 38 > System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: > %d", osBean.getFreeMemorySize())); > 39 > System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: > %d", osBean.getFreePhysicalMemorySize())); > 40 > System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: > %d", osBean.getTotalSwapSpaceSize())); > 41 > System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: > %d", osBean.getFreeSwapSpaceSize())); > 42 > System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f", > osBean.getCpuLoad())); > 43 > System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad: > %f", osBean.getSystemCpuLoad())); > > > To make the above lines a little bit shorter I'd suggest defining a > log() method like this: > private static void log(String msg) { System.out.println(msg); } > > 34 log(String.format("Runtime.availableProcessors: %d", > Runtime.getRuntime().availableProcessors())); > 35 log(String.format("OperatingSystemMXBean.getAvailableProcessors: > %d", osBean.getAvailableProcessors())); > 36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", > osBean.getTotalMemorySize())); > 37 > log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: > %d", osBean.getTotalPhysicalMemorySize())); > 38 
log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", > osBean.getFreeMemorySize())); > 39 > log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", > osBean.getFreePhysicalMemorySize())); > 40 log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: > %d", osBean.getTotalSwapSpaceSize())); > 41 log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: > %d", osBean.getFreeSwapSpaceSize())); > 42 log(String.format("OperatingSystemMXBean.getCpuLoad: %f", > osBean.getCpuLoad())); > 43 log(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", > osBean.getSystemCpuLoad())); > > > Thanks, > Serguei > > > > On 12/6/19 17:41, Daniil Titov wrote: >> Hi David, Mandy, and Bob, >> >> Thank you for reviewing this fix. >> >> Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04) >> 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded. >> 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize >> was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read. >> I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances, >> but I agree that the changes proposed in the previous version of the webrev increase such probability. >> I filed the follow-up issue [4] as Mandy suggested. >> 3. The legacy methods were renamed as David suggested. >> >> >>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c >>> ! static int initialized=1; >>> >>> Am I reading this right that the code currently fails to actually do the >>> initialization because of this ??? >> Yes, currently the code fails to do the initialization but it was unnoticed since method >> get_cpuload_internal(...) 
was never called for a specific CPU, the first parameter "which" >> was always -1. >> >>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >>> >>> System.out.println(String.format(...) >>> >>> Why not simply >>> >>> System.out.printf(..) >> As I tried explain it earlier it would make the tests unstable. >> System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically. >> Instead it parses the format string into a list of FormatString objects and then iterates over the list. >> As a result, the other traces occasionally got printed between these iterations and break the pattern the test is expected to find >> in the output. >> >> For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:" >> and "1030762496". >> >> >> [0.304s][trace][os,container] Memory Usage is: 42983424 >> OperatingSystemMXBean.getFreeMemorySize: 1030758400 >> [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes >> [0.305s][trace][os,container] Memory Usage is: 42979328 >> [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes >> OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232 >> 1030762496 >> OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176 >> >> >> java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr >> >> at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306) >> at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151) >> at TestMemoryAwareness.main(TestMemoryAwareness.java:73) >> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> at java.base/java.lang.reflect.Method.invoke(Method.java:564) >> at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) >> at java.base/java.lang.Thread.run(Thread.java:832) >> >> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. >> >> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 >> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 >> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 >> [4] https://bugs.openjdk.java.net/browse/JDK-8235522 >> >> Thank you, >> Daniil >> >> ?On 12/6/19, 1:38 PM, "Mandy Chung" wrote: >> >> >> >> On 12/6/19 5:59 AM, Bob Vandette wrote: >>>> On Dec 6, 2019, at 2:49 AM, David Holmes wrote: >>>> >>>> >>>> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java >>>> >>>> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. >> >> I thought that the error case we are referring to is limit == 0 which >> indicates something unexpected goes wrong. So the compatibility concern >> should be low. This is very specific to Metrics implementation for >> cgroup v1 and let me know if I'm wrong. >> >>>> Surely there must always be some information available from the operating environment? 
I see from the impl file: >>>> >>>> // the host data, value 0 indicates that something went wrong while the metric was read and >>>> // in this case we return "information unavailable" code -1. >>>> >>>> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO. >>> I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods. >>> Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others >>> are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no >>> limits. >>> >> >> It's important to consider carefully if the monitoring API indicates an >> error vs unavailable and an application should continue to run when the >> monitoring system fails to get the metrics. >> >> There are several choices to report "something goes wrong" scenarios >> (should unlikely happen???): >> 1. fall back to a random positive value (e.g. host value) >> 2. return a negative value >> 3. throw an exception >> >> #3 is not an option as the application is not expecting this. For #2, >> the application can filter bad values if desirable. >> >> I'm okay if you want to file a JBS issue to follow up and thoroughly >> look at the cases that the metrics are unavailable and the cases when >> fails to obtain. >> >>>> --- >>>> >>>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >>>> >>>> System.out.println(String.format(...) >>>> >>>> Why not simply >>>> >>>> System.out.printf(..) >>>> >>>> ? 
>> >> or simply (as I commented [1]) >> System.out.format >> >> Mandy >> [1] >> https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html >> >> >> >> > > > >

From daniil.x.titov at oracle.com Thu Dec 5 01:43:39 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Wed, 04 Dec 2019 17:43:39 -0800
Subject: RFR(S): 8234277:ClhsdbLauncher should enable verbose exceptions and do a better job of detecting SA failures
In-Reply-To: <0871624E-D09F-4F07-B49C-B4043EDCBEF8@oracle.com>
References: <8a972120-8ba3-a35e-b73f-e3d5faf68ce6@oracle.com> <0871624E-D09F-4F07-B49C-B4043EDCBEF8@oracle.com>
Message-ID:

Hi Chris, The change looks good to me. Best regards, Daniil

On 12/4/19, 5:39 PM, "serviceability-dev on behalf of Chris Plummer" wrote: Can I get one more review please? thanks, Chris On 12/3/19 1:10 PM, serguei.spitsyn at oracle.com wrote: > Hi Chris, > > It looks good. > > Thanks, > Serguei > > On 12/3/19 12:45 PM, Chris Plummer wrote: >> Hello, >> >> Please review the following: >> >> https://bugs.openjdk.java.net/browse/JDK-8234277 >> http://cr.openjdk.java.net/~cjplummer/8234277/webrev.00/ >> >> No longer redirect stderr for the jhsdb/clhsdb process. It results in >> not seeing attach failures in the output, so OutputAnalyzer can't >> check for them. >> >> Execute "verbose true" as the first clhsdb command after launching. >> This will result in verboseExceptions being true in >> CommandProcessor.java, so full exception traces will appear in the >> output. This will make debugging future SA test failures a lot easier. >> >> Add an extra check for any DebuggerException. This is mainly for >> detecting that the attach failed. This previously was going >> un-noticed, and instead the test would later fail because it noticed >> some other issue, like missing output, which isn't very informative. >> >> Add checks for other unexpected SA exceptions that are caught and >> printed by CommandProcessor. 
These will always have an "Error: " >> prefix, making them easy to detect. >> >> Problem list ClhsdbScanOops.java. With the new error checking, it >> will now always fail on windows due to JDK-8230731 and on macos and >> linux due to JDK-8235220. These failures are not "new" per se, but >> are just now being properly detected. >> >> thanks, >> >> Chris > From david.holmes at oracle.com Wed Dec 11 21:02:57 2019 From: david.holmes at oracle.com (David Holmes) Date: Thu, 12 Dec 2019 07:02:57 +1000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: Message-ID: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com> On 12/12/2019 1:07 am, Reingruber, Richard wrote: > Hi David, > > > Most of the details here are in areas I can comment on in detail, but I > > did take an initial general look at things. > > Thanks for taking the time! Apologies the above should read: "Most of the details here are in areas I *can't* comment on in detail ..." David > > The only thing that jumped out at me is that I think the > > DeoptimizeObjectsALotThread should be a hidden thread. > > > > + bool is_hidden_from_external_view() const { return true; } > > Yes, it should. Will add the method like above. > > > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without > > active testing this will just bit-rot. > > DeoptimizeObjectsALot is meant for stress testing with a larger workload. I will add a minimal test > to keep it fresh. > > > Also on the tests I don't understand your @requires clause: > > > > @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & > > (vm.opt.TieredCompilation != true)) > > > > This seems to require that TieredCompilation is disabled, but tiered is > > our normal mode of operation. ?? > > > > I removed the clause. 
I guess I wanted to target the tests towards the code they are supposed to > test, and it's easier to analyze failures w/o tiered compilation and with just one compiler thread. > > Additionally I will make use of compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests. > > Thanks, > Richard. > > -----Original Message----- > From: David Holmes > Sent: Mittwoch, 11. Dezember 2019 08:03 > To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net > Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents > > Hi Richard, > > On 11/12/2019 7:45 am, Reingruber, Richard wrote: >> Hi, >> >> I would like to get reviews please for >> >> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/ >> >> Corresponding RFE: >> https://bugs.openjdk.java.net/browse/JDK-8227745 >> >> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915 >> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1] >> >> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues (thanks!). In addition the >> change is being tested at SAP since I posted the first RFR some months ago. >> >> The intention of this enhancement is to benefit performance wise from escape analysis even if JVMTI >> agents request capabilities that allow them to access local variable values. E.g. if you start-up >> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape analysis is disabled right >> from the beginning, well before a debugger attaches -- if ever one should do so. With the >> enhancement, escape analysis will remain enabled until and after a debugger attaches. EA based >> optimizations are reverted just before an agent acquires the reference to an object. In the JBS item >> you'll find more details. 
> > Most of the details here are in areas I can comment on in detail, but I > did take an initial general look at things. > > The only thing that jumped out at me is that I think the > DeoptimizeObjectsALotThread should be a hidden thread. > > + bool is_hidden_from_external_view() const { return true; } > > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without > active testing this will just bit-rot. > > Also on the tests I don't understand your @requires clause: > > @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & > (vm.opt.TieredCompilation != true)) > > This seems to require that TieredCompilation is disabled, but tiered is > our normal mode of operation. ?? > > Thanks, > David > >> Thanks, >> Richard. >> >> [1] Experimental fix for JDK-8214584 based on JDK-8227745 >> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch >> From ioi.lam at oracle.com Wed Dec 11 22:24:02 2019 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 11 Dec 2019 14:24:02 -0800 Subject: Removal of SA javascript support In-Reply-To: <6faf0cb5-7b4a-5e35-ed7c-90b817235031@samersoff.net> References: <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com> <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com> <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com> <0944c2b1-a0b4-2631-24de-3789e6aeec7b@oracle.com> <6faf0cb5-7b4a-5e35-ed7c-90b817235031@samersoff.net> Message-ID: <6e887f3b-7930-728e-8f84-89b8498e446d@oracle.com> Regarding maintaining hotspot data structures in SA, I think we often break it without knowing, especially when we are adding data structures that are not currently exposed by SA. Does anyone have a sense of the state of SA in newer versions of the JDK. Is SA still doing what you expect, or do you see a declining level of usefulness because SA is getting more out-of-sync? 
Thanks On 12/11/19 7:03 AM, Dmitry Samersoff wrote: > Sundar, > > Supporting hotspot data structure in SA is already a maintenance > nightmare ;) > > So we can consider to provide high level API, like find_class_by_name to > script writer. > > It allows anybody who are interesting with quick prototyping write his > own program on top of SA with any language they want. > > -Dmitry > > On 11.12.19 15:47, sundararajan.athijegannathan at oracle.com wrote: >> Effectively you're asking for SA as API. I don't think that is a good >> idea. That implies supporting hotspot data structures as Java *API*. >> That will be maintainability nightmare - we've to keep tracking hotspot >> data structures in SA code. That itself is problematic. API would be >> next level nightmare. >> >> -Sundar >> >> On 11/12/19 11:57 am, Yasumasa Suenaga wrote: >>> Hi, >>> >>> IMHO we need to export all packages in SA if we do not provide new API >>> for SA. >>> sa.js in jdk.hotspot.agent could access all SA classes until JDK 8 >>> (before Jigsaw), so we could make various functions if we need. >>> >>> OTOH we cannot know what classes are needed by the SA users. All >>> packages in jdk.hotspot.agent module provides features, and they >>> require other packages. For example, sun.jvm.hotspot.oops.Oop requires >>> sun.jvm.hotspot.types, and it requires sun.jvm.hotspot.debugger . >>> It is difficult to track and to export minimally. >>> (I worked for it in JDK-8157947, but I gave up...) >>> >>> Thus I guess it is a big challenge to export SA classes without >>> refactoring. >>> If we provide new API for SA plugin, I guess we need to work some >>> refactoring. >>> >>> >>> Yasumasa >>> >>> >>> On 2019/12/11 15:00, Chris Plummer wrote: >>>> On 12/10/19 9:56 PM, Yasumasa Suenaga wrote: >>>>> On 2019/12/11 14:39, Krystal Mok wrote: >>>>>> Hi Yasumasa, >>>>>> >>>>>> That's a very nice idea. 
Basically what you're asking for is >>>>>> exposing the Command interface [1] so that plugins can implement it >>>>>> and get dynamically loaded / registered into CLHSDB / HSDB, right? >>>>> Yes, but we also need proxy API to access internal SA objects e.g. >>>>> CodeCache, JavaThread, TypeDataBase, etc... >>>>> >>>> Yes, or export them. I should have read this email before posting my >>>> previous one. >>>> >>>> Chris >>>>> Yasumasa >>>>> >>>>> >>>>>> [1]: >>>>>> http://hg.openjdk.java.net/jdk/jdk/file/c71ec1f09f21/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/CommandProcessor.java#l246 >>>>>> >>>>>> >>>>>> - Kris >>>>>> >>>>>> On Tue, Dec 10, 2019 at 9:33 PM Yasumasa Suenaga >>>>>> > wrote: >>>>>> >>>>>> Hi Chris, >>>>>> >>>>>> It's a sad proposal, but I agree with you. To maintain SA in JS >>>>>> is difficult since Jigsaw. >>>>>> However I want SA to implement pluggable feature. >>>>>> I use custom script to list compiled codes in CodeCache. >>>>>> >>>>>> I guess other troubleshooters also want similar feature (via >>>>>> jsload) in future if they encounter JVM crash. >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2019/12/11 11:52, Chris Plummer wrote: >>>>>> > Hi, >>>>>> > >>>>>> > I'd like to propose the removal of SA javascript support. Few >>>>>> people even realize this support exists, and hopefully even fewer >>>>>> are using it since I'd like to remove it. Since I'm new to this >>>>>> myself, let me first explain what I know about its existence, and >>>>>> then explain why I want to remove it. >>>>>> > >>>>>> > If you run "jhsdb clhsdb", there are jsload and jseval >>>>>> commands. Don't look for them in anything post JDK 8. I'll explain >>>>>> why later. jsload is used to load a javascript file. In that file >>>>>> you can register new clhsdb commands that are written in >>>>>> javascript. You can also evaluate javascript using the jseval >>>>>> command. 
Some of this is explained in [1], which is the only place >>>>>> I can find any reference to this support. It does not appear to be >>>>>> officially supported, nor is there any oracle provided documentation. >>>>>> > >>>>>> > There also appear to be a few clhsdb commands that are >>>>>> written in javascript. Doing a grep for "registerCommand" in sa.js >>>>>> shows the following: >>>>>> > >>>>>> > registerCommand("class", "class name", "jclass"); >>>>>> > registerCommand("classes", "classes", "jclasses"); >>>>>> > registerCommand("dumpclass", "dumpclass { address | name } >>>>>> [ directory ]", "dclass"); >>>>>> > registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap"); >>>>>> > registerCommand("mem", "mem address [ length ]", "printMem"); >>>>>> > registerCommand("sysprops", "sysprops", "sysProps"); >>>>>> > registerCommand("whatis", "whatis address", "printWhatis"); >>>>>> > >>>>>> > Once again, don't go looking for these in anything newer >>>>>> than JDK8. You won't find them. Again the only documentation I can >>>>>> find is [1]. >>>>>> > >>>>>> > The other use of Javascript is the SOQL command (Simple >>>>>> Object Query Language), a tool used to query the heap, and also the >>>>>> JSDB command. The only SOQL documentation I could find is the blog >>>>>> reference [2]. I could not find HSDB documentation, but I believe >>>>>> it is a javascript support for looking at hotspot. So once again, >>>>>> neither of these seem to be officially supported or documented. >>>>>> > >>>>>> > The real purpose of the email is to propose removal of this >>>>>> support. Here are the reasons: >>>>>> > >>>>>> > (1) It's broken, and has been since 9. See [3]. This is why >>>>>> you don't see the javascript related commands in clhsdb. Javascript >>>>>> fails to initialize, so none of the javascript related commands are >>>>>> registered. >>>>>> 
> (2) Nashorn is deprecated and will be removed eventually. >>>>>> > (3) We have very little understanding of the javascript >>>>>> support. >>>>>> > (4) No resources to work on it (unless there is a community >>>>>> volunteer). >>>>>> > (5) Very questionable value (lack of users). The fact this >>>>>> support has been broken since JDK 9 and no bug was filed until I >>>>>> did so this week is a good indication of that. Another is that >>>>>> there are no other SA Javascript related bugs filed. Lastly, the >>>>>> lack of any official documentation and only minimal mention of it >>>>>> on the web is another good indication of its (lack of) value. >>>>>> > >>>>>> > Also, regarding the 7 commands listed above that would be >>>>>> lost (but currently don't work now anyway), if they are really >>>>>> wanted, they could be implemented in java instead of javascript. >>>>>> > >>>>>> > I'd like to remove javascript support in two steps. The >>>>>> first is simply to disable the clhsdb code that tries to initialize >>>>>> the javascript support. I'd like to do this in 14 (actually as soon >>>>>> as possible). I'd like to actually do this now even if we decide to >>>>>> keep javascript support and eventually fix it because it will get >>>>>> rid of the warning you see whenever you attach from clhsdb: >>>>>> > >>>>>> > Warning! JS Engine can't start, some commands will not >>>>>> be available. >>>>>> > >>>>>> > This warning will become more of an issue for the clhsdb >>>>>> tests after I push [4] because then you will also see the full >>>>>> stacktrace for the underlying exception that caused the Javascript >>>>>> to fail to start. Besides being unnecessary noise in passing test >>>>>> cases, it can also be misleading in any test that fails because the >>>>>> exception will be unrelated to the failure. This is actually what >>>>>> got me going down this path of what the javascript support is all >>>>>> about. 
>>>>>>      >
>>>>>>      > The next step would be to strip out all Javascript related code, including the SOQL and JSDB tools. This would be done in 15.
>>>>>>      >
>>>>>>      > Please let me know what you think.
>>>>>>      >
>>>>>>      > thanks,
>>>>>>      >
>>>>>>      > Chris
>>>>>>      >
>>>>>>      > [1] https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
>>>>>>      > [2] http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
>>>>>>      > [3] https://bugs.openjdk.java.net/browse/JDK-8235594
>>>>>>      > [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>>>>>>      >
>>>>>>
>>>>

From serguei.spitsyn at oracle.com Wed Dec 11 23:13:34 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 11 Dec 2019 15:13:34 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <7720D36B-0505-49E3-8424-76ACEACAF0AB@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> <072d6861-1374-8190-135d-e30ece2ee380@oracle.com> <7720D36B-0505-49E3-8424-76ACEACAF0AB@oracle.com>
Message-ID: <3172f6e2-5330-91eb-35b4-03bec407ee5b@oracle.com>

Hi Daniil,

One of my concerns was a non-atomic read of multiple metrics before comparison. It creates a potential for a mismatch in the result. However, the probability of getting a negative value is pretty low, I think.
The other concern (if incorrect metrics are returned) is covered by JDK-8235522.
Revising all concerns in JDK-8235522 sounds good to me.

Thanks,
Serguei


On 12/10/19 10:29, Daniil Titov wrote:
> Hi Serguei,
>
>> Do we need to check if the (limit - memLimit) value is negative?
> >> The same question is for getFreeSwapSpaceSize():
> >>    memSwapLimit - memLimit - (memSwapUsage - memUsage)
> >>
> >> and getFreeMemorySize():
> >>   101 return limit - usage;
>
> I don't think we need such a check here. If it happens, it means a serious system malfunction, and a negative value returned by this method
> would indicate this (currently the native methods already return -1 if something went wrong). But we could revise it in the follow-up
> issue I created for that [1].
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8235522
>
> Thank you,
> Daniil
>
> On 12/9/19, 6:02 PM, "serguei.spitsyn at oracle.com" wrote:
>
>     Hi Daniil,
>
>     It is not a full review, just some minor comments.
>     In fact, I do not see real problems yet.
>
>     http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html
>
>        55     public long getTotalSwapSpaceSize() {
>        56         if (containerMetrics != null) {
>        57             long limit = containerMetrics.getMemoryAndSwapLimit();
>        58             // The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container)
>        59             // or if a docker container was started without specifying a memory limit ( without '--memory='
>        60             // Docker option). In latter case there is no limit on how much memory the container can use and
>        61             // it can use as much memory as the host's OS allows.
>        62             long memLimit = containerMetrics.getMemoryLimit();
>        63             if (limit >= 0 && memLimit >= 0) {
>        64                 return limit - memLimit;
>        65             }
>        66         }
>        67         return getTotalSwapSpaceSize0();
>        68     }
>
>     Unneeded space after brackets '('.
>     Do we need to check if the (limit - memLimit) value is negative?
>     The same question is for getFreeSwapSpaceSize():
>        memSwapLimit - memLimit - (memSwapUsage - memUsage)
>
>     and getFreeMemorySize():
>       101 return limit - usage;
>
>        81     // If this happens just retry the loop for a few iterations
>
>     A period is missing at the end of the comment.
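
The question raised in the review above, whether `limit - memLimit` can become negative, can be made concrete with a small sketch. It is only an illustration under stated assumptions: `MemoryMetrics`, `fixed()` and the `hostValue` supplier are hypothetical stand-ins for the JDK-internal container metrics object and the native `getTotalSwapSpaceSize0()` call, and the extra `limit >= memLimit` guard is the check being asked about, not something the webrev contains (the reply in this thread defers that decision to JDK-8235522).

```java
import java.util.function.LongSupplier;

public class SwapSizeSketch {

    /** Hypothetical stand-in for the container metrics object used in the quoted listing. */
    public interface MemoryMetrics {
        long getMemoryAndSwapLimit(); // assumed negative when no limit is available
        long getMemoryLimit();        // assumed negative when no limit is available
    }

    /** Hypothetical helper that builds a metrics object with fixed values. */
    public static MemoryMetrics fixed(long memAndSwapLimit, long memLimit) {
        return new MemoryMetrics() {
            @Override public long getMemoryAndSwapLimit() { return memAndSwapLimit; }
            @Override public long getMemoryLimit() { return memLimit; }
        };
    }

    /**
     * Returns the container swap size when both limits are readable and consistent,
     * otherwise falls back to the host value supplied by the caller.
     */
    public static long totalSwapSpaceSize(MemoryMetrics metrics, LongSupplier hostValue) {
        if (metrics != null) {
            long limit = metrics.getMemoryAndSwapLimit();
            long memLimit = metrics.getMemoryLimit();
            // First two conditions mirror the quoted listing; the third is the
            // additional (assumed) guard against a negative difference.
            if (limit >= 0 && memLimit >= 0 && limit >= memLimit) {
                return limit - memLimit;
            }
        }
        return hostValue.getAsLong();
    }

    public static void main(String[] args) {
        LongSupplier host = () -> 4096L; // pretend host swap size
        System.out.println(totalSwapSpaceSize(fixed(2048L, 1024L), host)); // container limits usable
        System.out.println(totalSwapSpaceSize(fixed(-1L, 1024L), host));   // limit unavailable
        System.out.println(totalSwapSpaceSize(fixed(512L, 1024L), host));  // inconsistent limits
        System.out.println(totalSwapSpaceSize(null, host));                // not in a container
    }
}
```

With the guard in place, an inconsistent pair of limits falls back to the host value instead of producing a negative size; without it, the method would return the negative difference, which the reply in this thread treats as a signal of a serious malfunction.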
>
>
>     http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html
>
>       34 System.out.println(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors()));
>       35 System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors()));
>       36 System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize()));
>       37 System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize()));
>       38 System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize()));
>       39 System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize()));
>       40 System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize()));
>       41 System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize()));
>       42 System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad()));
>       43 System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad()));
>
>
>     To make the above lines a little bit shorter I'd suggest defining a log() method like this:
>        private static void log(String msg) { System.out.println(msg); }
>
>       34 log(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors()));
>       35 log(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors()));
>       36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize()));
>       37 log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize()));
>       38 log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize()));
>       39 log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize()));
>       40 log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize()));
>       41 log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize()));
>       42 log(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad()));
>       43 log(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad()));
>
>
>     Thanks,
>     Serguei
>
>
>
>     On 12/6/19 17:41, Daniil Titov wrote:
>     > Hi David, Mandy, and Bob,
>     >
>     > Thank you for reviewing this fix.
>     >
>     > Please review a new version of the fix [1] that includes the following changes compared to the previous version of the webrev (webrev.04):
>     > 1. The changes in Javadoc made in webrev.04 compared to webrev.03 and to the CSR [3] were discarded.
>     > 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
>     >    was also reverted to the webrev.03 version, which returns the host's values if the metrics are unavailable or cannot be properly read.
>     >    I would like to mention that currently the native implementation of these methods de facto may return -1 in some circumstances,
>     >    but I agree that the changes proposed in the previous version of the webrev increase that probability.
>     >    I filed the follow-up issue [4] as Mandy suggested.
>     > 3. The legacy methods were renamed as David suggested.
>     >
>     >
>     >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
>     >> ! static int initialized=1;
>     >>
>     >> Am I reading this right that the code currently fails to actually do the
>     >> initialization because of this ???
>     >
>     > Yes, currently the code fails to do the initialization, but it went unnoticed since the method get_cpuload_internal(...)
>     > was never called for a specific CPU; the first parameter "which" was always -1.
>     >
>     >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>     >>
>     >> System.out.println(String.format(...)
>     >>
>     >> Why not simply
>     >>
>     >> System.out.printf(..)
>     >
>     > As I tried to explain earlier, it would make the tests unstable.
>     > System.out.printf(...) just delegates the call to System.out.format(...), which doesn't emit the string atomically.
>     > Instead it parses the format string into a list of FormatString objects and then iterates over the list.
>     > As a result, the other traces occasionally got printed between these iterations and broke the pattern the test expects to find
>     > in the output.
>     >
>     > For example, here is a sample of such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:"
>     > and "1030762496".
>     >
>     >
>     > [0.304s][trace][os,container] Memory Usage is: 42983424
>     > OperatingSystemMXBean.getFreeMemorySize: 1030758400
>     > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
>     > [0.305s][trace][os,container] Memory Usage is: 42979328
>     > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
>     > OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
>     > 1030762496
>     > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
>     >
>     >
>     > java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr
>     >
>     >         at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
>     >         at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
>     >         at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
>     >         at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     >         at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     >         at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     >         at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>     >         at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
>     >         at java.base/java.lang.Thread.run(Thread.java:832)
>     >
>     > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
>     >
>     > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
>     > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
>     > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
>     > [4] https://bugs.openjdk.java.net/browse/JDK-8235522
>     >
>     > Thank you,
>     > Daniil
>     >
>     > On 12/6/19, 1:38 PM, "Mandy Chung" wrote:
>     >
>     >
>     >
>     >     On 12/6/19 5:59 AM, Bob Vandette wrote:
>     >     >> On Dec 6, 2019, at 2:49 AM, David Holmes wrote:
>     >     >>
>     >     >>
>     >     >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
>     >     >>
>     >     >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
>     >
>     >     I thought that the error case we are referring to is limit == 0, which
>     >     indicates something unexpected went wrong. So the compatibility concern
>     >     should be low. This is very specific to the Metrics implementation for
>     >     cgroup v1; let me know if I'm wrong.
>     >
>     >     >> Surely there must always be some information available from the operating environment?
>     >     >> I see from the impl file:
>     >     >>
>     >     >> // the host data, value 0 indicates that something went wrong while the metric was read and
>     >     >> // in this case we return "information unavailable" code -1.
>     >     >>
>     >     >> I don't agree with this. If the container metrics are messed up somehow we should either fall back to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
>     >     >
>     >     > I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods.
>     >     > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
>     >     > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
>     >     > limits.
>     >     >
>     >
>     >     It's important to consider carefully whether the monitoring API indicates an
>     >     error vs. unavailable, and an application should continue to run when the
>     >     monitoring system fails to get the metrics.
>     >
>     >     There are several choices to report "something goes wrong" scenarios
>     >     (should unlikely happen???):
>     >     1. fall back to a random positive value (e.g. host value)
>     >     2. return a negative value
>     >     3. throw an exception
>     >
>     >     #3 is not an option as the application is not expecting this. For #2,
>     >     the application can filter bad values if desirable.
>     >
>     >     I'm okay if you want to file a JBS issue to follow up and thoroughly
>     >     look at the cases where the metrics are unavailable and the cases where
>     >     they fail to be obtained.
>     >
>     >     >> ---
>     >     >>
>     >     >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>     >     >>
>     >     >> System.out.println(String.format(...)
>     >     >>
>     >     >> Why not simply
>     >     >>
>     >     >> System.out.printf(..)
>     >     >>
>     >     >> ?
>     >
>     >     or simply (as I commented [1])
>     >        System.out.format
>     >
>     >     Mandy
>     >     [1] https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
>     >
>

From daniil.x.titov at oracle.com Wed Dec 11 23:33:05 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Wed, 11 Dec 2019 15:33:05 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <3172f6e2-5330-91eb-35b4-03bec407ee5b@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> <072d6861-1374-8190-135d-e30ece2ee380@oracle.com> <7720D36B-0505-49E3-8424-76ACEACAF0AB@oracle.com> <3172f6e2-5330-91eb-35b4-03bec407ee5b@oracle.com>
Message-ID:

Hi Serguei,

Thank you for reviewing this change.

Just wanted to add that the only "volatile" metrics are "usage" ones ( memoryUsage and memoryAndSwapLimit). The "limit" metrics (memoryLimit and memoryAndSwapLimit) are set when the container starts and are not subjects to change. The only method that reads more than one "volatile" metric is getFreeSwapSpaceSize() and it has a code that retries if the calculated swapUsage is negative as a result of non-atomic reads.

Thank you,
Daniil

From daniil.x.titov at oracle.com Wed Dec 11 23:35:10 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Wed, 11 Dec 2019 15:35:10 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To:
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> <072d6861-1374-8190-135d-e30ece2ee380@oracle.com> <7720D36B-0505-49E3-8424-76ACEACAF0AB@oracle.com> <3172f6e2-5330-91eb-35b4-03bec407ee5b@oracle.com>
Message-ID: <8E4D2767-A223-46AC-A541-D51CC5D3D7AF@oracle.com>

Typo fixed...

.. that the only "volatile" metrics are "usage" ones ( memoryUsage and *memoryAndSwapUsage*).

Best regards,
Daniil

On 12/11/19, 3:33 PM, "Daniil Titov" wrote:

Hi Serguei,

Thank you for reviewing this change.

Just wanted to add that the only "volatile" metrics are "usage" ones ( memoryUsage and memoryAndSwapLimit). The "limit" metrics (memoryLimit and memoryAndSwapLimit) are set when the container starts and are not subjects to change. The only method that reads more than one "volatile" metric is getFreeSwapSpaceSize() and it has a code that retries if the calculated swapUsage is negative as a result of non-atomic reads.

Thank you,
Daniil

On 12/11/19, 3:13 PM, "serguei.spitsyn at oracle.com" wrote:

Hi Daniil,

One my concerns was a non-atomic read of multiple metrics before comparison. It creates a potential to get a mismatch in result. However, the probability to get a negative value is pretty low, I think.
The other concern (if incorrect metrics are returned) is covered by JDK-8235522.
Revising all concerns in JDK-8235522 sounds good to me. Thanks, Serguei On 12/10/19 10:29, Daniil Titov wrote: > Hi Serguei, > >> Do we need to check if the (limit - memLimit) value is negative? >> The same question is for getFreeSwapSpaceSize(): >> memSwapLimit - memLimit - (memSwapUsage - memUsage) >> >> and getFreeMemorySize(): >> 101 return limit - usage; > I don't think we need such check here. If it happens in fact it means the serious system malfunction and a negative value this method > returns would indicate this (currently the native methods already returns -1 if something went wrong). But we could revise it in the follow > up issue I created for that [1]. > > [1] https://bugs.openjdk.java.net/browse/JDK-8235522 > > Thank you, > Daniil > > ?On 12/9/19, 6:02 PM, "serguei.spitsyn at oracle.com" wrote: > > Hi Daniil, > > It is not a full review, just some minor comments. > In fact, I do not see real problems yet. > > http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html > > 55 public long getTotalSwapSpaceSize() { > 56 if (containerMetrics != null) { > 57 long limit = containerMetrics.getMemoryAndSwapLimit(); > 58 // The memory limit metrics is not available if JVM > runs on Linux host ( not in a docker container) > 59 // or if a docker container was started without > specifying a memory limit ( without '--memory=' > 60 // Docker option). In latter case there is no limit on > how much memory the container can use and > 61 // it can use as much memory as the host's OS allows. > 62 long memLimit = containerMetrics.getMemoryLimit(); > 63 if (limit >= 0 && memLimit >= 0) { > 64 return limit - memLimit; > 65 } > 66 } > 67 return getTotalSwapSpaceSize0(); > 68 } > > Unneeded space after brackets '('. > Do we need to check if the (limit - memLimit) value is negative? 
> The same question is for getFreeSwapSpaceSize():
>     memSwapLimit - memLimit - (memSwapUsage - memUsage)
>
> and getFreeMemorySize():
> 101     return limit - usage;
>
> 81     // If this happens just retry the loop for a few iterations
>
> A period is missing at the end of the comment.
>
> http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html
>
> 34 System.out.println(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors()));
> 35 System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors()));
> 36 System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize()));
> 37 System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize()));
> 38 System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize()));
> 39 System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize()));
> 40 System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize()));
> 41 System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize()));
> 42 System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad()));
> 43 System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad()));
>
> To make the above lines a little bit shorter I'd suggest defining a log() method like this:
>     private static void log(String msg) { System.out.println(msg); }
>
> 34 log(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors()));
> 35 log(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors()));
> 36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize()));
> 37 log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize()));
> 38 log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize()));
> 39 log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize()));
> 40 log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize()));
> 41 log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize()));
> 42 log(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad()));
> 43 log(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad()));
>
> Thanks,
> Serguei
>
> On 12/6/19 17:41, Daniil Titov wrote:
> > Hi David, Mandy, and Bob,
> >
> > Thank you for reviewing this fix.
> >
> > Please review a new version of the fix [1] that includes the following changes compared to the previous version of the webrev (webrev.04):
> > 1. The changes in Javadoc made in webrev.04 compared to webrev.03 and to CSR [3] were discarded.
> > 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
> >    was also reverted to the webrev.03 version that returns the host's values if the metrics are unavailable or cannot be properly read.
> >    I would like to mention that currently the native implementation of these methods de facto may return -1 in some circumstances,
> >    but I agree that the changes proposed in the previous version of the webrev increase such probability.
> >    I filed the follow-up issue [4] as Mandy suggested.
> > 3. The legacy methods were renamed as David suggested.
> >
> >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
> >> ! static int initialized=1;
> >>
> >> Am I reading this right that the code currently fails to actually do the
> >> initialization because of this ???
> >
> > Yes, currently the code fails to do the initialization, but it went unnoticed since method
> > get_cpuload_internal(...) was never called for a specific CPU; the first parameter "which"
> > was always -1.
> >
> >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
> >>
> >> System.out.println(String.format(...)
> >>
> >> Why not simply
> >>
> >> System.out.printf(..)
> >
> > As I tried to explain earlier, it would make the tests unstable.
> > System.out.printf(...) just delegates the call to System.out.format(...), which doesn't emit the string atomically.
> > Instead it parses the format string into a list of FormatString objects and then iterates over the list.
> > As a result, the other traces occasionally got printed between these iterations and broke the pattern the test expects to find
> > in the output.
> >
> > For example, here is a sample of such output that has the trace message printed between "OperatingSystemMXBean.getFreePhysicalMemorySize:"
> > and "1030762496".
> >
> >
> > [0.304s][trace][os,container] Memory Usage is: 42983424
> > OperatingSystemMXBean.getFreeMemorySize: 1030758400
> > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
> > [0.305s][trace][os,container] Memory Usage is: 42979328
> > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
> > OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
> > 1030762496
> > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
> >
> >
> > java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr
> >
> >     at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
> >     at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
> >     at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
> >     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >     at java.base/java.lang.reflect.Method.invoke(Method.java:564)
> >     at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
> >     at java.base/java.lang.Thread.run(Thread.java:832)
> >
> > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
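The interleaving described in the quoted explanation can be observed in miniature by recording each write that reaches the stream underneath a PrintStream. The following is only a sketch: the class AtomicPrintDemo and its ChunkRecorder helper are hypothetical names that are not part of the webrev, and the exact number of writes issued by printf is a JDK implementation detail that may differ between releases. It illustrates that a pre-formatted string reaches the underlying stream with label and value together, which leaves no gap for JVM trace output written to the same file descriptor.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class AtomicPrintDemo {

    /** Hypothetical helper: records every write reaching the underlying stream as its own chunk. */
    static final class ChunkRecorder extends OutputStream {
        final List<String> chunks = new ArrayList<>();

        @Override
        public void write(int b) {
            chunks.add(new String(new byte[] {(byte) b}, StandardCharsets.UTF_8));
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            chunks.add(new String(buf, off, len, StandardCharsets.UTF_8));
        }
    }

    /** Runs the action against an auto-flushing PrintStream and returns the recorded chunks. */
    static List<String> chunksFor(Consumer<PrintStream> action) {
        ChunkRecorder recorder = new ChunkRecorder();
        // Auto-flush is enabled to mimic System.out.
        PrintStream ps = new PrintStream(recorder, true, StandardCharsets.UTF_8);
        action.accept(ps);
        ps.flush();
        return recorder.chunks;
    }

    public static void main(String[] args) {
        // Formatting first: label and value arrive at the stream in one piece.
        List<String> preformatted =
                chunksFor(ps -> ps.println(String.format("key: %d", 42)));
        // printf/format: the Formatter appends the pieces of the pattern one by one.
        List<String> viaPrintf =
                chunksFor(ps -> ps.printf("key: %d%n", 42));

        System.out.println("println(String.format(...)) chunks: " + preformatted);
        System.out.println("printf(...) chunks: " + viaPrintf);
    }
}
```

Any gap between two recorded chunks is a point where output written natively to the same descriptor could land, which is what the sample output above shows.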
> >
> > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
> > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
> > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
> > [4] https://bugs.openjdk.java.net/browse/JDK-8235522
> >
> > Thank you,
> > Daniil
> >
> > On 12/6/19, 1:38 PM, "Mandy Chung" wrote:
> >
> > On 12/6/19 5:59 AM, Bob Vandette wrote:
> > >> On Dec 6, 2019, at 2:49 AM, David Holmes wrote:
> > >>
> > >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
> > >>
> > >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
> >
> > I thought that the error case we are referring to is limit == 0 which
> > indicates something unexpected goes wrong. So the compatibility concern
> > should be low. This is very specific to Metrics implementation for
> > cgroup v1 and let me know if I'm wrong.
> >
> > >> Surely there must always be some information available from the operating environment? I see from the impl file:
> > >>
> > >> // the host data, value 0 indicates that something went wrong while the metric was read and
> > >> // in this case we return "information unavailable" code -1.
> > >>
> > >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
> > > I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods.
> > > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
> > > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
> > > limits.
> > >
> >
> > It's important to consider carefully if the monitoring API indicates an
> > error vs unavailable and an application should continue to run when the
> > monitoring system fails to get the metrics.
> >
> > There are several choices to report "something goes wrong" scenarios
> > (should unlikely happen???):
> > 1. fall back to a random positive value (e.g. host value)
> > 2. return a negative value
> > 3. throw an exception
> >
> > #3 is not an option as the application is not expecting this. For #2,
> > the application can filter bad values if desirable.
> >
> > I'm okay if you want to file a JBS issue to follow up and thoroughly
> > look at the cases that the metrics are unavailable and the cases when
> > fails to obtain.
> >
> > >> ---
> > >>
> > >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
> > >>
> > >> System.out.println(String.format(...)
> > >>
> > >> Why not simply
> > >>
> > >> System.out.printf(..)
> > >>
> > >> ?
> >
> > or simply (as I commented [1])
> > System.out.format
> >
> > Mandy
> > [1] https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
> >

From serguei.spitsyn at oracle.com Wed Dec 11 23:51:12 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 11 Dec 2019 15:51:12 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To:
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> <072d6861-1374-8190-135d-e30ece2ee380@oracle.com> <7720D36B-0505-49E3-8424-76ACEACAF0AB@oracle.com> <3172f6e2-5330-91eb-35b4-03bec407ee5b@oracle.com>
Message-ID:

Hi Daniil,

Got it, thanks!
Serguei

On 12/11/19 15:33, Daniil Titov wrote:
> Hi Serguei,
>
> Thank you for reviewing this change.
>
> Just wanted to add that the only "volatile" metrics are the "usage" ones (memoryUsage and
> memoryAndSwapUsage). The "limit" metrics (memoryLimit and memoryAndSwapLimit) are set
> when the container starts and are not subject to change. The only method that reads more than one
> "volatile" metric is getFreeSwapSpaceSize(), and it has code that retries if the calculated swapUsage
> is negative as a result of non-atomic reads.
>
> Thank you,
> Daniil
>
> On 12/11/19, 3:13 PM, "serguei.spitsyn at oracle.com" wrote:
>
> Hi Daniil,
>
> One of my concerns was a non-atomic read of multiple metrics before comparison.
> It creates a potential for a mismatch in the result.
> However, the probability of getting a negative value is pretty low, I think.
> The other concern (if incorrect metrics are returned) is covered by JDK-8235522.
> Revising all concerns in JDK-8235522 sounds good to me.
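The retry on inconsistent reads mentioned in the quoted reply could look roughly like the sketch below. It is not the code from the webrev: the class FreeSwapSketch, the method freeSwap, the retry bound, and the fallback to a host value after the retries are exhausted are all assumptions made for illustration. Only the formula memSwapLimit - memLimit - (memSwapUsage - memUsage) and the idea of retrying when the computed swap usage is negative come from the thread.

```java
import java.util.function.LongSupplier;

public class FreeSwapSketch {

    // Assumption: the thread only says "a few iterations"; 10 is an arbitrary bound.
    static final int MAX_RETRIES = 10;

    /**
     * Computes free swap from two fixed limits and two usage metrics that are
     * read one after the other, so a single pair of reads may be inconsistent.
     */
    static long freeSwap(long memSwapLimit, long memLimit,
                         LongSupplier memSwapUsage, LongSupplier memUsage,
                         LongSupplier hostFallback) {
        if (memSwapLimit >= 0 && memLimit >= 0) {
            for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
                long swapAndMemUsage = memSwapUsage.getAsLong();
                long usage = memUsage.getAsLong();
                long swapUsage = swapAndMemUsage - usage;
                if (swapUsage >= 0) {
                    // Consistent snapshot: total swap minus swap in use.
                    return memSwapLimit - memLimit - swapUsage;
                }
                // Negative swap usage means the two reads raced with a change; retry.
            }
        }
        // Assumption: fall back to the host value when limits are unavailable
        // or no consistent pair of reads was obtained.
        return hostFallback.getAsLong();
    }

    public static void main(String[] args) {
        // First pair of reads is inconsistent (100 - 120 < 0), second is fine (130 - 120 = 10).
        long[] swapReads = {100, 130};
        int[] next = {0};
        long free = freeSwap(1000, 600,
                () -> swapReads[next[0]++], () -> 120, () -> -1);
        System.out.println("free swap: " + free);
    }
}
```

Because the two limits are fixed when the container starts, only the two usage reads need to be repeated.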
>
> Thanks,
> Serguei

From suenaga at oss.nttdata.com Thu Dec 12 01:07:37 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Thu, 12 Dec 2019 10:07:37 +0900
Subject: PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <2c066d67-aa3c-83e2-632a-1ba3114d1538@samersoff.net>
References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com> <2c066d67-aa3c-83e2-632a-1ba3114d1538@samersoff.net>
Message-ID: <635516c4-8f89-c9e6-75c4-9debd8a315be@oss.nttdata.com>

Hi Dmitry,

Thanks for your comment!

On 2019/12/12 0:34, Dmitry Samersoff wrote:
> Hello Yasumasa,
>
> Please,
>
> 1. Consider to use mmap for reading elf sections.

Do you mean `read_section_data()`?

  lib->eh_frame.data = read_section_data(lib->fd, &ehdr, sh);

I do not change the implementation of `read_section_data()`.
If you want to change it to use mmap, I think it should be fixed as another issue.

> 2. Please move all platform-specific parts of native code to a separate
> file/directory. Current patch will break AARCH64 build.

Unfortunately JDK libraries (shared libraries except HotSpot) seem not to care about CPU type in makefiles.

  http://hg.openjdk.java.net/jdk/jdk/file/f22d91b2d072/make/common/JdkNativeCompilation.gmk#l38

I believe my patch does not call platform-specific function(s).
Can you share your concern?

> 3. I didn't find any tests here. How did you test the changes?

It can be tested in TestJhsdbJstackMixed and ClhsdbPstack whether mixed jstack can work without error.
We can add a test for whether native frames exist in the result, but I found the same issue on Windows. So I do not want to add it now.

> libproc_impl.c
>
> 131: If is not necessary, free handles NULLPTR gracefully.

Thanks, I will fix it.
Yasumasa

> -Dmitry
>
> On 04.12.19 03:54, Yasumasa Suenaga wrote:
>> PING: Could you review it?
>>
>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>
>> This bug is targeted to JDK 14.
>>
>> Thanks,
>>
>> Yasumasa
>>
>> On 2019/11/28 21:39, Yasumasa Suenaga wrote:
>>> Hi,
>>>
>>> I refactored LinuxAMD64CFrame.java . It works fine in serviceability/sa tests and
>>> all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>> Could you review new webrev?
>>>
>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>
>>> The diff from previous webrev is here:
>>>   http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>> Hi all,
>>>>
>>>> Please review this change:
>>>>
>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>
>>>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64
>>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
>>>> for stack unwinding.
>>>>
>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default since GCC 4.6, so system
>>>> library (e.g. libc) might be compiled with this feature.
>>>>
>>>> However `jhsdb jstack --mixed` does not do so, it uses base pointer register (RBP).
>>>> So it might be lack of stack frames.
>>>>
>>>> I guess JDK-8219201 is caused by same issue.
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf

From suenaga at oss.nttdata.com Thu Dec 12 01:27:07 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Thu, 12 Dec 2019 10:27:07 +0900
Subject: Removal of SA javascript support
In-Reply-To: <0944c2b1-a0b4-2631-24de-3789e6aeec7b@oracle.com>
References: <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com> <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com> <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com> <0944c2b1-a0b4-2631-24de-3789e6aeec7b@oracle.com>
Message-ID: <1e497ba4-7897-8b23-a53a-255fa3ee0aea@oss.nttdata.com>

I discussed this with Kris at the OpenJDK Committers' Workshop last year.
In the case of .NET Core, SOS is provided to integrate the runtime debugging feature into the native debugger.
If the same feature is provided, I will be very happy!

For example, GDB and WinDbg provide a remote debug server. If (CL)HSDB can connect to a native debugger, we can gather data in the Java layer via them. I think we can delegate most of the memory access to the native debugger.
Also we can run custom scripts on GDB. Then I think we need minimum support for Java call frames, OOP, and SymbolTable.

Yasumasa

On 2019/12/11 21:47, sundararajan.athijegannathan at oracle.com wrote:
> Effectively you're asking for SA as API. I don't think that is a good idea. That implies supporting hotspot data structures as Java *API*. That will be maintainability nightmare - we've to keep tracking hotspot data structures in SA code. That itself is problematic. API would be next level nightmare.
>
> -Sundar
>
> On 11/12/19 11:57 am, Yasumasa Suenaga wrote:
>> Hi,
>>
>> IMHO we need to export all packages in SA if we do not provide new API for SA.
>> sa.js in jdk.hotspot.agent could access all SA classes until JDK 8 (before Jigsaw), so we could make various functions if we need.
>>
>> OTOH we cannot know what classes are needed by the SA users.
>> All packages in the jdk.hotspot.agent module provide features, and they require other packages. For example, sun.jvm.hotspot.oops.Oop requires sun.jvm.hotspot.types, and it requires sun.jvm.hotspot.debugger .
>> It is difficult to track and to export minimally.
>> (I worked for it in JDK-8157947, but I gave up...)
>>
>> Thus I guess it is a big challenge to export SA classes without refactoring.
>> If we provide new API for SA plugin, I guess we need to work some refactoring.
>>
>> Yasumasa
>>
>> On 2019/12/11 15:00, Chris Plummer wrote:
>>> On 12/10/19 9:56 PM, Yasumasa Suenaga wrote:
>>>> On 2019/12/11 14:39, Krystal Mok wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> That's a very nice idea. Basically what you're asking for is exposing the Command interface [1] so that plugins can implement it and get dynamically loaded / registered into CLHSDB / HSDB, right?
>>>>
>>>> Yes, but we also need proxy API to access internal SA objects e.g. CodeCache, JavaThread, TypeDataBase, etc...
>>>>
>>> Yes, or export them. I should have read this email before posting my previous one.
>>>
>>> Chris
>>>>
>>>> Yasumasa
>>>>
>>>>> [1]: http://hg.openjdk.java.net/jdk/jdk/file/c71ec1f09f21/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/CommandProcessor.java#l246
>>>>>
>>>>> - Kris
>>>>>
>>>>> On Tue, Dec 10, 2019 at 9:33 PM Yasumasa Suenaga wrote:
>>>>>
>>>>>     Hi Chris,
>>>>>
>>>>>     It's a sad proposal, but I agree with you. To maintain SA in JS is difficult since Jigsaw.
>>>>>     However I want SA to implement pluggable feature.
>>>>>     I use custom script to list compiled codes in CodeCache.
>>>>>
>>>>>     I guess other troubleshooters also want similar feature (via jsload) in future if they encounter JVM crash.
>>>>>
>>>>>     Thanks,
>>>>>
>>>>>     Yasumasa
>>>>>
>>>>>     On 2019/12/11 11:52, Chris Plummer wrote:
>>>>>      > Hi,
>>>>>      >
>>>>>      > I'd like to propose the removal of SA javascript support.
>>>>>      > Few people even realize this support exists, and hopefully even fewer are using it since I'd like to remove it. Since I'm new to this myself, let me first explain what I know about its existence, and then explain why I want to remove it.
>>>>>      >
>>>>>      > If you run "jhsdb clhsdb", there are jsload and jseval commands. Don't look for them in anything post JDK 8. I'll explain why later. jsload is used to load a javascript file. In that file you can register new clhsdb commands that are written in javascript. You can also evaluate javascript using the jseval command. Some of this is explained in [1], which is the only place I can find any reference to this support. It does not appear to be officially supported, nor is there any Oracle-provided documentation.
>>>>>      >
>>>>>      > There also appear to be a few clhsdb commands that are written in javascript. Doing a grep for "registerCommand" in sa.js shows the following:
>>>>>      >
>>>>>      >   registerCommand("class", "class name", "jclass");
>>>>>      >   registerCommand("classes", "classes", "jclasses");
>>>>>      >   registerCommand("dumpclass", "dumpclass { address | name } [ directory ]", "dclass");
>>>>>      >   registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap");
>>>>>      >   registerCommand("mem", "mem address [ length ]", "printMem");
>>>>>      >   registerCommand("sysprops", "sysprops", "sysProps");
>>>>>      >   registerCommand("whatis", "whatis address", "printWhatis");
>>>>>      >
>>>>>      > Once again, don't go looking for these in anything newer than JDK8. You won't find them. Again the only documentation I can find is [1].
>>>>>      >
>>>>>      > The other use of Javascript is the SOQL command (Simple Object Query Language), a tool used to query the heap, and also the JSDB command. The only SOQL documentation I could find is the blog reference [2]. I could not find HSDB documentation, but I believe it is javascript support for looking at hotspot. So once again, neither of these seem to be officially supported or documented.
>>>>>      >
>>>>>      > The real purpose of the email is to propose removal of this support. Here are the reasons:
>>>>>      >
>>>>>      > (1) It's broken, and has been since 9. See [3]. This is why you don't see the javascript related commands in clhsdb. Javascript fails to initialize, so none of the javascript related commands are registered.
>>>>>      > (2) Nashorn is deprecated and will be removed eventually.
>>>>>      > (3) We have very little understanding of the javascript support.
>>>>>      > (4) No resources to work on it (unless there is a community volunteer).
>>>>>      > (5) Very questionable value (lack of users). The fact this support has been broken since JDK 9 and no bug was filed until I did so this week is a good indication of that. Another is that there are no other SA Javascript related bugs filed. Lastly, the lack of any official documentation and only minimal mention of it on the web is another good indication of its (lack of) value.
>>>>>      >
>>>>>      > Also, regarding the 7 commands listed above that would be lost (but currently don't work now anyway), if they are really wanted, they could be implemented in java instead of javascript.
>>>>>      >
>>>>>      > I'd like to remove javascript support in two steps. The first is simply disable the clhsdb code that tries to initialize the javascript support. I'd like to do this in 14 (actually as soon as possible). I'd like to actually do this now even if we decide to keep javascript support and eventually fix it because it will get rid of the warning you see whenever you attach from clhsdb:
>>>>>      >
>>>>>      >       Warning! JS Engine can't start, some commands will not be available.
>>>>>      >
>>>>>      > This warning will become more of an issue for the clhsdb tests after I push [4] because then you will also see the full stacktrace for the underlying exception that caused the Javascript to fail to start. Besides being unnecessary noise in passing test cases, it can also be misleading in any test that fails because the exception will be unrelated to the failure. This is actually what got me going down this path of what the javascript support is all about.
>>>>>      >
>>>>>      > The next step would be to strip out all Javascript related code, including the SOQL and JSDB tools. This would be done in 15.
>>>>>      >
>>>>>      > Please let me know what you think.
>>>>>      >
>>>>>      > thanks,
>>>>>      >
>>>>>      > Chris
>>>>>      >
>>>>>      > [1] https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
>>>>>      > [2] http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
>>>>>      > [3] https://bugs.openjdk.java.net/browse/JDK-8235594
>>>>>      > [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>>>>>      >
>>>>>
>>>
>>>

From fairoz.matte at oracle.com Thu Dec 12 03:10:56 2019
From: fairoz.matte at oracle.com (Fairoz Matte)
Date: Wed, 11 Dec 2019 19:10:56 -0800 (PST)
Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled
Message-ID:

Hi,

Please review this small change. It updates the error handling to make sure "lib_base_diff = 0" is still a valid scenario even after the calc_prelinked_load_address() call.

JBS - https://bugs.openjdk.java.net/browse/JDK-8235637
Webrev - http://cr.openjdk.java.net/~fmatte/8235637/webrev.00/

This patch is provided by Yasumasa Suenaga

Thanks,
Fairoz

From daniil.x.titov at oracle.com Thu Dec 12 03:25:49 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Wed, 11 Dec 2019 19:25:49 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To:
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com> <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com> <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com> <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com> <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com> <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com> <072d6861-1374-8190-135d-e30ece2ee380@oracle.com> <2A286F91-6CB8-4A32-B103-2C28D52E5A1A@oracle.com>
Message-ID: <0569FA2A-AEB3-41E1-8306-11290159B030@oracle.com>

Hi Bob, David, Mandy, and Serguei,

Thank you for reviewing this change!

Best regards,
Daniil

On 12/11/19, 9:21 AM, "Bob Vandette" wrote:

Yes, I defer to Mandy on the best way to express the various Java exceptions.

I'm ok with the changes. Thanks for getting this done for JDK14!

Bob.

> On Dec 11, 2019, at 12:12 PM, Daniil Titov wrote:
>
> Hi Serguei,
>
> Thank you for your comments. I will correct these nits before pushing the changes.
>
> Hi Bob and David,
>
>> [Mandy Chung]
>>> I reviewed Metrics and Subsystem in this version.
>>> I don't need to see a new webrev.
>
> As I understood, Mandy finished reviewing this fix. Just wanted to confirm with you if you are okay with that version of the fix (webrev.06)?
>
> Mach5 testing: tier1-tier6 and open/test/hotspot/jtreg/containers/docker tests passed.
>
> Thank you,
> Daniil
>
>
>
> On 12/9/19, 6:02 PM, "serguei.spitsyn at oracle.com" wrote:
>
> Hi Daniil,
>
> It is not a full review, just some minor comments.
> In fact, I do not see real problems yet.
> > http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html > > 55 public long getTotalSwapSpaceSize() { > 56 if (containerMetrics != null) { > 57 long limit = containerMetrics.getMemoryAndSwapLimit(); > 58 // The memory limit metrics is not available if JVM > runs on Linux host ( not in a docker container) > 59 // or if a docker container was started without > specifying a memory limit ( without '--memory=' > 60 // Docker option). In latter case there is no limit on > how much memory the container can use and > 61 // it can use as much memory as the host's OS allows. > 62 long memLimit = containerMetrics.getMemoryLimit(); > 63 if (limit >= 0 && memLimit >= 0) { > 64 return limit - memLimit; > 65 } > 66 } > 67 return getTotalSwapSpaceSize0(); > 68 } > > Unneeded space after brackets '('. > Do we need to check if the (limit - memLimit) value is negative? > The same question is for getFreeSwapSpaceSize(): > memSwapLimit - memLimit - (memSwapUsage - memUsage) > > and getFreeMemorySize(): > 101 return limit - usage; > > 81 // If this happens just retry the loop for > a few iterations > > Dot is missed at the end of comment. 
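
A minimal sketch of one way the negative-difference question above could be handled, assuming the same fallback-to-host behaviour that the quoted method already uses when a limit is unavailable. The class and method names (SwapSpaceMath, totalSwapSpace) and the hostValue parameter are hypothetical and are not taken from the webrev; this only illustrates the guard, not the change that was actually pushed.

```java
// Hypothetical helper, not part of the JDK or the webrev under review.
// It models the "limit - memLimit" computation with an extra guard so that
// a negative difference is never returned to the caller.
final class SwapSpaceMath {
    private SwapSpaceMath() {}

    /**
     * Returns memAndSwapLimit - memLimit when both limits are available
     * (non-negative) and the difference is non-negative; otherwise returns
     * hostValue, which stands in for the host-level reading.
     */
    static long totalSwapSpace(long memAndSwapLimit, long memLimit, long hostValue) {
        if (memAndSwapLimit >= 0 && memLimit >= 0) {
            long swap = memAndSwapLimit - memLimit;
            if (swap >= 0) {
                return swap;
            }
        }
        return hostValue;
    }

    public static void main(String[] args) {
        // Both limits set: 2048 - 1024
        System.out.println(totalSwapSpace(2048, 1024, 4096)); // prints 1024
        // Combined limit unavailable: falls back to the host value
        System.out.println(totalSwapSpace(-1, 1024, 4096));   // prints 4096
        // Inconsistent limits (difference would be negative): host value
        System.out.println(totalSwapSpace(512, 1024, 4096));  // prints 4096
    }
}
```

Whether falling back to the host value or returning 0 is the better choice for the inconsistent case is a design decision that this sketch does not settle.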
> > > http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html > > 34 System.out.println(String.format("Runtime.availableProcessors: > %d", Runtime.getRuntime().availableProcessors())); > 35 > System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: > %d", osBean.getAvailableProcessors())); > 36 > System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: > %d", osBean.getTotalMemorySize())); > 37 > System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: > %d", osBean.getTotalPhysicalMemorySize())); > 38 > System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: > %d", osBean.getFreeMemorySize())); > 39 > System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: > %d", osBean.getFreePhysicalMemorySize())); > 40 > System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: > %d", osBean.getTotalSwapSpaceSize())); > 41 > System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: > %d", osBean.getFreeSwapSpaceSize())); > 42 > System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f", > osBean.getCpuLoad())); > 43 > System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad: > %f", osBean.getSystemCpuLoad())); > > > To make the above lines a little bit shorter I'd suggest to define a > log() method like this: > private static void log(String msg) ( System.out.println(msg(; } > > 34 log(String.format("Runtime.availableProcessors: %d", > Runtime.getRuntime().availableProcessors())); > 35 log(String.format("OperatingSystemMXBean.getAvailableProcessors: > %d", osBean.getAvailableProcessors())); > 36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", > osBean.getTotalMemorySize())); > 37 > log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: > %d", osBean.getTotalPhysicalMemorySize())); > 38 
log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", > osBean.getFreeMemorySize())); > 39 > log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", > osBean.getFreePhysicalMemorySize())); > 40 log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: > %d", osBean.getTotalSwapSpaceSize())); > 41 log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: > %d", osBean.getFreeSwapSpaceSize())); > 42 log(String.format("OperatingSystemMXBean.getCpuLoad: %f", > osBean.getCpuLoad())); > 43 log(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", > osBean.getSystemCpuLoad())); > > > Thanks, > Serguei > > > > On 12/6/19 17:41, Daniil Titov wrote: >> Hi David, Mandy, and Bob, >> >> Thank you for reviewing this fix. >> >> Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04) >> 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded. >> 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize >> was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read. >> I would like to mention that currently the native implementation of these methods de-facto may return -1 at some circumstances, >> but I agree that the changes proposed in the previous version of the webrev increase such probability. >> I filed the follow-up issue [4] as Mandy suggested. >> 3. The legacy methods were renamed as David suggested. >> >> >>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c >>> ! static int initialized=1; >>> >>> Am I reading this right that the code currently fails to actually do the >>> initialization because of this ??? >> Yes, currently the code fails to do the initialization but it was unnoticed since method >> get_cpuload_internal(...) 
was never called for a specific CPU, the first parameter "which" >> was always -1. >> >>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >>> >>> System.out.println(String.format(...) >>> >>> Why not simply >>> >>> System.out.printf(..) >> As I tried explain it earlier it would make the tests unstable. >> System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically. >> Instead it parses the format string into a list of FormatString objects and then iterates over the list. >> As a result, the other traces occasionally got printed between these iterations and break the pattern the test is expected to find >> in the output. >> >> For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:" >> and "1030762496". >> >> >> [0.304s][trace][os,container] Memory Usage is: 42983424 >> OperatingSystemMXBean.getFreeMemorySize: 1030758400 >> [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes >> [0.305s][trace][os,container] Memory Usage is: 42979328 >> [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes >> OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232 >> 1030762496 >> OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176 >> >> >> java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr >> >> at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306) >> at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151) >> at TestMemoryAwareness.main(TestMemoryAwareness.java:73) >> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> at java.base/java.lang.reflect.Method.invoke(Method.java:564) >> at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298) >> at java.base/java.lang.Thread.run(Thread.java:832) >> >> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running. >> >> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 >> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575 >> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 >> [4] https://bugs.openjdk.java.net/browse/JDK-8235522 >> >> Thank you, >> Daniil >> >> ?On 12/6/19, 1:38 PM, "Mandy Chung" wrote: >> >> >> >> On 12/6/19 5:59 AM, Bob Vandette wrote: >>>> On Dec 6, 2019, at 2:49 AM, David Holmes wrote: >>>> >>>> >>>> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java >>>> >>>> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue. >> >> I thought that the error case we are referring to is limit == 0 which >> indicates something unexpected goes wrong. So the compatibility concern >> should be low. This is very specific to Metrics implementation for >> cgroup v1 and let me know if I'm wrong. >> >>>> Surely there must always be some information available from the operating environment? 
I see from the impl file: >>>> >>>> // the host data, value 0 indicates that something went wrong while the metric was read and >>>> // in this case we return "information unavailable" code -1. >>>> >>>> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO. >>> I agree with David on the compatibility concern. I originally thought that -1 was already a specified return for all of these methods. >>> Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others >>> are, I?d suggest we keep Daniil?s original logic to fall back to the host value since a disabled subsystem is equivalent to no >>> limits. >>> >> >> It's important to consider carefully if the monitoring API indicates an >> error vs unavailable and an application should continue to run when the >> monitoring system fails to get the metrics. >> >> There are several choices to report "something goes wrong" scenarios >> (should unlikely happen???): >> 1. fall back to a random positive value (e.g. host value) >> 2. return a negative value >> 3. throw an exception >> >> #3 is not an option as the application is not expecting this. For #2, >> the application can filter bad values if desirable. >> >> I'm okay if you want to file a JBS issue to follow up and thoroughly >> look at the cases that the metrics are unavailable and the cases when >> fails to obtain. >> >>>> --- >>>> >>>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java >>>> >>>> System.out.println(String.format(...) >>>> >>>> Why not simply >>>> >>>> System.out.printf(..) >>>> >>>> ? 
>> >> or simply (as I commented [1]) >> System.out.format >> >> Mandy >> [1] >> https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html >> >> >> >> > > > > From suenaga at oss.nttdata.com Thu Dec 12 03:43:40 2019 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Thu, 12 Dec 2019 12:43:40 +0900 Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled In-Reply-To: References: Message-ID: <2ec88d9c-ddd1-9da1-f919-781ccfa8099f@oss.nttdata.com> Hi Fairoz, Looks good! I want you to backport this change to both jdk11u and 8u. Thanks, Yasumasa On 2019/12/12 12:10, Fairoz Matte wrote: > Hi, > > Please review this small change, > Updating error handling, to make sure "lib_base_diff = 0" is still a valid scenario even after calc_prelinked_load_address() call. > > JBS - https://bugs.openjdk.java.net/browse/JDK-8235637 > Webrev - http://cr.openjdk.java.net/~fmatte/8235637/webrev.00/ > > This patch is provided by Yasumasa Suenaga > > Thanks, > Fairoz > From fairoz.matte at oracle.com Thu Dec 12 08:30:13 2019 From: fairoz.matte at oracle.com (Fairoz Matte) Date: Thu, 12 Dec 2019 00:30:13 -0800 (PST) Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled In-Reply-To: <2ec88d9c-ddd1-9da1-f919-781ccfa8099f@oss.nttdata.com> References: <2ec88d9c-ddd1-9da1-f919-781ccfa8099f@oss.nttdata.com> Message-ID: Hi Yasumasa, Thanks for the review. Sure, I will get them on 8u and 11u. Thanks, Fairoz > -----Original Message----- > From: Yasumasa Suenaga > Sent: Thursday, December 12, 2019 9:14 AM > To: Fairoz Matte ; serviceability- > dev at openjdk.java.net > Subject: Re: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if > prelink is enabled > > Hi Fairoz, > > Looks good! > I want you to backport this change to both jdk11u and 8u. 
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2019/12/12 12:10, Fairoz Matte wrote:
> > Hi,
> >
> > Please review this small change,
> > Updating error handling, to make sure "lib_base_diff = 0" is still a valid scenario even after calc_prelinked_load_address() call.
> >
> > JBS - https://bugs.openjdk.java.net/browse/JDK-8235637
> > Webrev - http://cr.openjdk.java.net/~fmatte/8235637/webrev.00/
> >
> > This patch is provided by Yasumasa Suenaga
> >
> > Thanks,
> > Fairoz
> >

From christoph.langer at sap.com Thu Dec 12 10:00:30 2019
From: christoph.langer at sap.com (Langer, Christoph)
Date: Thu, 12 Dec 2019 10:00:30 +0000
Subject: RFR [XS]: 8234968: check calloc rv in libinstrument InvocationAdapter
In-Reply-To:
References:
Message-ID:

Hi Matthias,

I think your current patch is good as it is - at least it wouldn't make things worse, AFAICS.

Further improvements can probably be done under another issue.

Cheers
Christoph

From: serviceability-dev On Behalf Of Baesken, Matthias
Sent: Friday, 29 November 2019 08:18
To: Thomas Stüfe
Cc: serviceability-dev at openjdk.java.net
Subject: [CAUTION] RE: RFR [XS]: 8234968: check calloc rv in libinstrument InvocationAdapter

Hi Thomas, Christoph, thanks for the comments.
Of course the init of *decodedLen must be added.

In case of returning NULL from decodePath, we would have tmp == NULL (in char* tmp = func;), assign tmp to res, and then we jplis_assert, see:

#define TRANSFORM(res,func) {                  \
    char* tmp = func;                          \
    if (tmp != res) {                          \
        free(res);                             \
        res = tmp;                             \
    }                                          \
    jplis_assert((void*)res != (void*)NULL);   \
}

...
TRANSFORM(path, decodePath(path,&len));

New webrev:

http://cr.openjdk.java.net/~mbaesken/webrevs/8234968.2/

Best regards, Matthias

From: Thomas Stüfe
Sent: Friday, 29 November 2019 07:30
To: Baesken, Matthias
Cc: serviceability-dev at openjdk.java.net
Subject: Re: RFR [XS]: 8234968: check calloc rv in libinstrument InvocationAdapter

Hi Matthias,

I am not certain the callers are prepared to handle NULL. This is used in a chain of TRANSFORM macro calls which AFAICS do not handle NULL; e.g., at 872, we pass the returned pointer to convertUft8ToPlatformString which passes it on (on Windows) to MultiByteToWideChar, which does not handle NULL input.

So I wonder whether a clear error message with an exit would be better in this case. Otherwise we may get a crash just some instructions later.

Cheers, Thomas

On Thu, Nov 28, 2019 at 5:21 PM Baesken, Matthias wrote:

Hello, please review this small patch.
It adds return value checking for calloc at one place where it is missing.

Thanks, Matthias

Bug/webrev:

https://bugs.openjdk.java.net/browse/JDK-8234968
http://cr.openjdk.java.net/~mbaesken/webrevs/8234968.1/

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Alan.Bateman at oracle.com Thu Dec 12 11:37:33 2019
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Thu, 12 Dec 2019 11:37:33 +0000
Subject: RFR [XS]: 8234968: check calloc rv in libinstrument InvocationAdapter
In-Reply-To:
References:
Message-ID: <155511e7-f336-5280-2d9d-06c48270b1f2@oracle.com>

On 12/12/2019 10:00, Langer, Christoph wrote:
>
> Hi Matthias,
>
> I think your current patch is good as it is - at least it wouldn't
> make things worse, AFAICS.
>
> Further improvements can probably be done under another issue.
>
>
Yes, another issue is fine. If decodePath can't allocate during the onload phase (-javaagent case) then it would be better to have the VM initialization abort. The late binding agent case is trickier but it is wrong to continue with the Boot-Class-Path attribute ignored.

-Alan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stefan.karlsson at oracle.com Thu Dec 12 12:01:05 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 12 Dec 2019 13:01:05 +0100
Subject: RFR: 8226797: serviceability/tmtools/jstat/GcCapacityTest.java fails with Exception: java.lang.RuntimeException: OGCMN > OGCMX (min generation capacity > max generation capacity)
Message-ID: <93c9ec14-a371-0121-d758-7e213cde9c85@oracle.com>

Hi all,

Please review this patch to fix a problem with uninitialized values in our generation counters.

https://cr.openjdk.java.net/~stefank/8226797/webrev.01/
https://bugs.openjdk.java.net/browse/JDK-8226797

The jstat values NGCMN and OGCMN both return uninitialized values.

I stumbled upon this while creating a patch to remove the GenerationSpec class.

GenerationSpec::_min_size is never initialized, and then used to create the generations:

    case Generation::DefNew:
      return new DefNewGeneration(rs, _init_size, _min_size, _max_size);

    case Generation::MarkSweepCompact:
      return new TenuredGeneration(rs, _init_size, _min_size, _max_size, remset);

That in turn uses it to initialize the perf counters:

DefNewGeneration::DefNewGeneration(ReservedSpace rs,
                                   size_t initial_size,
                                   size_t min_size,
                                   size_t max_size,
                                   const char* policy)
...
  _gen_counters = new GenerationCounters("new", 0, 3,
      min_size, max_size, &_virtual_space);

I'm setting the value to _init_size, because it reflects how MinNewSize and MinOldSize relate to NewSize and OldSize.
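
As a rough illustration only, the idea described above (default the minimum size to the initial size so that the reported minimum capacity can never exceed the maximum) can be modelled in a few lines of Java. This is not the HotSpot code, which is C++; the class GenerationSizes and its members are hypothetical names chosen for the sketch.

```java
// Hypothetical Java model of the fix described above; the real change is
// in HotSpot's C++ GenerationSpec. All names here are assumptions.
final class GenerationSizes {
    final long minSize;
    final long initSize;
    final long maxSize;

    GenerationSizes(long initSize, long maxSize) {
        if (initSize < 0 || initSize > maxSize) {
            throw new IllegalArgumentException("need 0 <= initSize <= maxSize");
        }
        this.initSize = initSize;
        this.maxSize = maxSize;
        // Mirrors the described fix: the minimum is explicitly set to the
        // initial size instead of being left without a defined value.
        this.minSize = initSize;
    }

    /** The relation the failing jstat test checks: min capacity <= max capacity. */
    boolean capacityInvariantHolds() {
        return minSize <= maxSize;
    }

    public static void main(String[] args) {
        GenerationSizes sizes = new GenerationSizes(64L << 20, 256L << 20);
        System.out.println(sizes.capacityInvariantHolds()); // prints true
    }
}
```

With the minimum tied to the initial size, the invariant follows directly from the constructor's argument check.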
Thanks, StefanK From stefan.karlsson at oracle.com Thu Dec 12 15:23:09 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 12 Dec 2019 16:23:09 +0100 Subject: RFR: 8226797: serviceability/tmtools/jstat/GcCapacityTest.java fails with Exception: java.lang.RuntimeException: OGCMN > OGCMX (min generation capacity > max generation capacity) In-Reply-To: <93c9ec14-a371-0121-d758-7e213cde9c85@oracle.com> References: <93c9ec14-a371-0121-d758-7e213cde9c85@oracle.com> Message-ID: <1fc22506-3e64-fb98-8f7d-acb460c986f5@oracle.com> In the interest to get this integrated before the RDP cut-off I'm going to push this ASAP. This has gone through tier1-tier3 testing. StefanK On 2019-12-12 13:01, Stefan Karlsson wrote: > Hi all, > > Please review this patch to fix a problem with unintialized values in > our generation counters. > > https://cr.openjdk.java.net/~stefank/8226797/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8226797 > > The jstat values NGCMN and OGCMN both return uninitialized values. > > I stumbled upon this while creating a patch to remove the GenerationSpec > class. > > GenerationSpec::_min_size is never initialized, and then used to create > the generations: > > ??? case Generation::DefNew: > ????? return new DefNewGeneration(rs, _init_size, _min_size, _max_size); > > ??? case Generation::MarkSweepCompact: > ????? return new TenuredGeneration(rs, _init_size, _min_size, > _max_size, remset); > > That in turn uses it to initialize the perf counters: > DefNewGeneration::DefNewGeneration(ReservedSpace rs, > ?????????????????????????????????? size_t initial_size, > ?????????????????????????????????? size_t min_size, > ?????????????????????????????????? size_t max_size, > ?????????????????????????????????? const char* policy) > ... > ? _gen_counters = new GenerationCounters("new", 0, 3, > ????? 
min_size, max_size, &_virtual_space); > > I'm setting the value to _init_size, because it reflects how MinNewSize > and MinOldSize relates to NewSize and OldSize. > > Thanks, > StefanK From daniel.daugherty at oracle.com Thu Dec 12 16:06:14 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 12 Dec 2019 11:06:14 -0500 Subject: RFR: 8226797: serviceability/tmtools/jstat/GcCapacityTest.java fails with Exception: java.lang.RuntimeException: OGCMN > OGCMX (min generation capacity > max generation capacity) In-Reply-To: <1fc22506-3e64-fb98-8f7d-acb460c986f5@oracle.com> References: <93c9ec14-a371-0121-d758-7e213cde9c85@oracle.com> <1fc22506-3e64-fb98-8f7d-acb460c986f5@oracle.com> Message-ID: <8ee949e7-b7c2-b8b7-e7fc-eaaff444e59f@oracle.com> src/hotspot/share/gc/shared/generationSpec.hpp ??? No comments. test/hotspot/jtreg/serviceability/tmtools/jstat/utils/JstatGcCapacityResults.java ??? No comments. Thumbs up. Dan On 12/12/19 10:23 AM, Stefan Karlsson wrote: > In the interest to get this integrated before the RDP cut-off I'm > going to push this ASAP. This has gone through tier1-tier3 testing. > > StefanK > > On 2019-12-12 13:01, Stefan Karlsson wrote: >> Hi all, >> >> Please review this patch to fix a problem with unintialized values in >> our generation counters. >> >> https://cr.openjdk.java.net/~stefank/8226797/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8226797 >> >> The jstat values NGCMN and OGCMN both return uninitialized values. >> >> I stumbled upon this while creating a patch to remove the >> GenerationSpec class. >> >> GenerationSpec::_min_size is never initialized, and then used to >> create the generations: >> >> ???? case Generation::DefNew: >> ?????? return new DefNewGeneration(rs, _init_size, _min_size, >> _max_size); >> >> ???? case Generation::MarkSweepCompact: >> ?????? 
return new TenuredGeneration(rs, _init_size, _min_size,
>> _max_size, remset);
>>
>> That in turn uses it to initialize the perf counters:
>> DefNewGeneration::DefNewGeneration(ReservedSpace rs,
>>                                    size_t initial_size,
>>                                    size_t min_size,
>>                                    size_t max_size,
>>                                    const char* policy)
>> ...
>>   _gen_counters = new GenerationCounters("new", 0, 3,
>>       min_size, max_size, &_virtual_space);
>>
>> I'm setting the value to _init_size, because it reflects how
>> MinNewSize and MinOldSize relates to NewSize and OldSize.
>>
>> Thanks,
>> StefanK

From chris.plummer at oracle.com Thu Dec 12 16:18:13 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 12 Dec 2019 08:18:13 -0800
Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled
In-Reply-To:
References:
Message-ID: <0c5f8a55-e7e6-ba87-58a8-e6a7edd55f01@oracle.com>

Can you use a macro for -1L? Maybe INVALID_LOAD_ADDRESS or LOAD_ADDRESS_ERROR.

Chris

On 12/11/19 7:10 PM, Fairoz Matte wrote:
> Hi,
>
> Please review this small change,
> Updating error handling, to make sure "lib_base_diff = 0" is still a valid scenario even after calc_prelinked_load_address() call.
> > JBS - https://bugs.openjdk.java.net/browse/JDK-8235637 > Webrev - http://cr.openjdk.java.net/~fmatte/8235637/webrev.00/ > > This patch is provided by Yasumasa Suenaga > > Thanks, > Fairoz From stefan.karlsson at oracle.com Thu Dec 12 16:19:23 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 12 Dec 2019 17:19:23 +0100 Subject: RFR: 8226797: serviceability/tmtools/jstat/GcCapacityTest.java fails with Exception: java.lang.RuntimeException: OGCMN > OGCMX (min generation capacity > max generation capacity) In-Reply-To: <8ee949e7-b7c2-b8b7-e7fc-eaaff444e59f@oracle.com> References: <93c9ec14-a371-0121-d758-7e213cde9c85@oracle.com> <1fc22506-3e64-fb98-8f7d-acb460c986f5@oracle.com> <8ee949e7-b7c2-b8b7-e7fc-eaaff444e59f@oracle.com> Message-ID: <50c7ec6f-5cd1-3363-02be-1a9058942d89@oracle.com> Thanks, Dan. StefanK On 2019-12-12 17:06, Daniel D. Daugherty wrote: > src/hotspot/share/gc/shared/generationSpec.hpp > ??? No comments. > > test/hotspot/jtreg/serviceability/tmtools/jstat/utils/JstatGcCapacityResults.java > > ??? No comments. > > Thumbs up. > > Dan > > > On 12/12/19 10:23 AM, Stefan Karlsson wrote: >> In the interest to get this integrated before the RDP cut-off I'm >> going to push this ASAP. This has gone through tier1-tier3 testing. >> >> StefanK >> >> On 2019-12-12 13:01, Stefan Karlsson wrote: >>> Hi all, >>> >>> Please review this patch to fix a problem with unintialized values in >>> our generation counters. >>> >>> https://cr.openjdk.java.net/~stefank/8226797/webrev.01/ >>> https://bugs.openjdk.java.net/browse/JDK-8226797 >>> >>> The jstat values NGCMN and OGCMN both return uninitialized values. >>> >>> I stumbled upon this while creating a patch to remove the >>> GenerationSpec class. >>> >>> GenerationSpec::_min_size is never initialized, and then used to >>> create the generations: >>> >>> ???? case Generation::DefNew: >>> ?????? return new DefNewGeneration(rs, _init_size, _min_size, >>> _max_size); >>> >>> ???? 
case Generation::MarkSweepCompact: >>> ?????? return new TenuredGeneration(rs, _init_size, _min_size, >>> _max_size, remset); >>> >>> That in turn uses it to initialize the perf counters: >>> DefNewGeneration::DefNewGeneration(ReservedSpace rs, >>> ??????????????????????????????????? size_t initial_size, >>> ??????????????????????????????????? size_t min_size, >>> ??????????????????????????????????? size_t max_size, >>> ??????????????????????????????????? const char* policy) >>> ... >>> ?? _gen_counters = new GenerationCounters("new", 0, 3, >>> ?????? min_size, max_size, &_virtual_space); >>> >>> I'm setting the value to _init_size, because it reflects how >>> MinNewSize and MinOldSize relates to NewSize and OldSize. >>> >>> Thanks, >>> StefanK > From vladimir.kozlov at oracle.com Thu Dec 12 18:20:25 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 12 Dec 2019 10:20:25 -0800 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: <863a7bfc-5656-2a4d-6c22-c1fc22968d11@oracle.com> References: <863a7bfc-5656-2a4d-6c22-c1fc22968d11@oracle.com> Message-ID: <00ac0f71-fe90-9ead-8696-6e7f96ff6a17@oracle.com> Hi David, Tiered is disabled because we don't want to see compilations and outputs from C1 compiler which does not have EA. The test is specifically written for C2 only (not for C1 or Graal) to verify its Escape Analysis optimization. I did not look in great details into test's code but its analysis may be affected if C1 compiler is also used. Richard may clarify this. thanks, Vladimir On 12/11/19 1:04 PM, David Holmes wrote: > On 12/12/2019 5:21 am, Vladimir Kozlov wrote: >> I will do full review later. I want to comment about test command line. >> >> You don't need vm.opt.TieredCompilation != true in @requires because >> you specified -XX:-TieredCompilation in @run command. > > And per my comment this should be being tested with tiered as well. 
> > David > >> Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip >> test from running in Interpreter mode too. >> >> Thanks, >> Vladimir >> >> On 12/11/19 7:07 AM, Reingruber, Richard wrote: >>> Hi David, >>> >>> ?? > Most of the details here are in areas I can comment on in >>> detail, but I >>> ?? > did take an initial general look at things. >>> >>> Thanks for taking the time! >>> >>> ?? > The only thing that jumped out at me is that I think the >>> ?? > DeoptimizeObjectsALotThread should be a hidden thread. >>> ?? > >>> ?? > +? bool is_hidden_from_external_view() const { return true; } >>> >>> Yes, it should. Will add the method like above. >>> >>> ?? > Also I don't see any testing of the DeoptimizeObjectsALotThread. >>> Without >>> ?? > active testing this will just bit-rot. >>> >>> DeoptimizeObjectsALot is meant for stress testing with a larger >>> workload. I will add a minimal test >>> to keep it fresh. >>> >>> ?? > Also on the tests I don't understand your @requires clause: >>> ?? > >>> ?? >?? @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & >>> ?? > (vm.opt.TieredCompilation != true)) >>> ?? > >>> ?? > This seems to require that TieredCompilation is disabled, but >>> tiered is >>> ?? > our normal mode of operation. ?? >>> ?? > >>> >>> I removed the clause. I guess I wanted to target the tests towards >>> the code they are supposed to >>> test, and it's easier to analyze failures w/o tiered compilation and >>> with just one compiler thread. >>> >>> Additionally I will make use of >>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests. >>> >>> Thanks, >>> Richard. >>> >>> -----Original Message----- >>> From: David Holmes >>> Sent: Mittwoch, 11. 
Dezember 2019 08:03 >>> To: Reingruber, Richard ; >>> serviceability-dev at openjdk.java.net; >>> hotspot-compiler-dev at openjdk.java.net; >>> hotspot-runtime-dev at openjdk.java.net >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>> Performance in the Presence of JVMTI Agents >>> >>> Hi Richard, >>> >>> On 11/12/2019 7:45 am, Reingruber, Richard wrote: >>>> Hi, >>>> >>>> I would like to get reviews please for >>>> >>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/ >>>> >>>> Corresponding RFE: >>>> https://bugs.openjdk.java.net/browse/JDK-8227745 >>>> >>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915 >>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1] >>>> >>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without >>>> issues (thanks!). In addition the >>>> change is being tested at SAP since I posted the first RFR some >>>> months ago. >>>> >>>> The intention of this enhancement is to benefit performance wise >>>> from escape analysis even if JVMTI >>>> agents request capabilities that allow them to access local variable >>>> values. E.g. if you start-up >>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then >>>> escape analysis is disabled right >>>> from the beginning, well before a debugger attaches -- if ever one >>>> should do so. With the >>>> enhancement, escape analysis will remain enabled until and after a >>>> debugger attaches. EA based >>>> optimizations are reverted just before an agent acquires the >>>> reference to an object. In the JBS item >>>> you'll find more details. >>> >>> Most of the details here are in areas I can comment on in detail, but I >>> did take an initial general look at things. >>> >>> The only thing that jumped out at me is that I think the >>> DeoptimizeObjectsALotThread should be a hidden thread. >>> >>> +? 
bool is_hidden_from_external_view() const { return true; }
>>>
>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>> active testing this will just bit-rot.
>>>
>>> Also on the tests I don't understand your @requires clause:
>>>
>>>    @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>> (vm.opt.TieredCompilation != true))
>>>
>>> This seems to require that TieredCompilation is disabled, but tiered is
>>> our normal mode of operation. ??
>>>
>>> Thanks,
>>> David
>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>>>>
>>>>

From richard.reingruber at sap.com Thu Dec 12 23:02:26 2019
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Thu, 12 Dec 2019 23:02:26 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
In-Reply-To: <00ac0f71-fe90-9ead-8696-6e7f96ff6a17@oracle.com>
References: <863a7bfc-5656-2a4d-6c22-c1fc22968d11@oracle.com>
 <00ac0f71-fe90-9ead-8696-6e7f96ff6a17@oracle.com>
Message-ID:

Hello Vladimir,

thanks for having a look.

  > Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip
  > test from running in Interpreter mode too.

Done.

  > You don't need vm.opt.TieredCompilation != true in @requires because you
  > specified -XX:-TieredCompilation in @run command.

Ok.

  > The test is specifically written for C2 only (not for C1 or Graal) to
  > verify its Escape Analysis optimization.
  > I did not look in great details into test's code but its analysis may be
  > affected if C1 compiler is also used.
  >
  > Richard may clarify this.

The test cases aim to get their test method 'dontinline_testMethod' compiled
by C2. Whether it gets compiled by C1 beforehand doesn't matter all that much.
I have a slight preference for disabling tiered compilation, for simplicity.

Thanks, Richard.
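
For illustration only, and not taken from the webrev: a minimal sketch of what
a jtreg test header following the two suggestions above might look like. The
class name EATestSketch, the Point helper and the warm-up count are
hypothetical, and the actual tests in the change may well differ. The @requires
properties and the -XX:-TieredCompilation flag are the ones named in the thread.

```java
/*
 * Hypothetical sketch, not the test from the webrev.
 *
 * @test
 * @summary Sketch of a test header that targets C2 in mixed mode only.
 * @requires vm.compMode == "Xmixed" & vm.compiler2.enabled
 * @run main/othervm -XX:-TieredCompilation EATestSketch
 */
public class EATestSketch {

    // Assumed warm-up count; a real test could take its threshold from
    // a shared constant instead of hard-coding a number.
    static final int WARMUP_ITERATIONS = 20_000;

    // Small value holder that never leaves the test method.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // The name mirrors the method mentioned in the thread. In this sketch
    // nothing actually prevents inlining; the name is only a convention.
    static int dontinline_testMethod(int i) {
        Point p = new Point(i, i + 1); // candidate for scalar replacement
        return p.x + p.y;              // equals 2 * i + 1
    }

    public static void main(String[] args) {
        long sum = 0;
        for (int i = 0; i < WARMUP_ITERATIONS; i++) {
            sum += dontinline_testMethod(i);
        }
        // The sum of (2 * i + 1) for i in [0, n) is n * n.
        long expected = (long) WARMUP_ITERATIONS * WARMUP_ITERATIONS;
        if (sum != expected) {
            throw new AssertionError("expected " + expected + " but got " + sum);
        }
        System.out.println("sum = " + sum);
    }
}
```

With the flag given in @run, no tiered-compilation condition is needed in
@requires, which is the point made above. Run outside jtreg, the tags are
plain comments and the class behaves like any other Java program.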
-----Original Message----- From: Vladimir Kozlov Sent: Donnerstag, 12. Dezember 2019 19:20 To: David Holmes ; hotspot-runtime-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; serviceability-dev at openjdk.java.net; Reingruber, Richard Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents Hi David, Tiered is disabled because we don't want to see compilations and outputs from C1 compiler which does not have EA. The test is specifically written for C2 only (not for C1 or Graal) to verify its Escape Analysis optimization. I did not look in great details into test's code but its analysis may be affected if C1 compiler is also used. Richard may clarify this. thanks, Vladimir On 12/11/19 1:04 PM, David Holmes wrote: > On 12/12/2019 5:21 am, Vladimir Kozlov wrote: >> I will do full review later. I want to comment about test command line. >> >> You don't need vm.opt.TieredCompilation != true in @requires because >> you specified -XX:-TieredCompilation in @run command. > > And per my comment this should be being tested with tiered as well. > > David > >> Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip >> test from running in Interpreter mode too. >> >> Thanks, >> Vladimir >> >> On 12/11/19 7:07 AM, Reingruber, Richard wrote: >>> Hi David, >>> >>> ?? > Most of the details here are in areas I can comment on in >>> detail, but I >>> ?? > did take an initial general look at things. >>> >>> Thanks for taking the time! >>> >>> ?? > The only thing that jumped out at me is that I think the >>> ?? > DeoptimizeObjectsALotThread should be a hidden thread. >>> ?? > >>> ?? > +? bool is_hidden_from_external_view() const { return true; } >>> >>> Yes, it should. Will add the method like above. >>> >>> ?? > Also I don't see any testing of the DeoptimizeObjectsALotThread. >>> Without >>> ?? > active testing this will just bit-rot. 
>>> >>> DeoptimizeObjectsALot is meant for stress testing with a larger >>> workload. I will add a minimal test >>> to keep it fresh. >>> >>> ?? > Also on the tests I don't understand your @requires clause: >>> ?? > >>> ?? >?? @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & >>> ?? > (vm.opt.TieredCompilation != true)) >>> ?? > >>> ?? > This seems to require that TieredCompilation is disabled, but >>> tiered is >>> ?? > our normal mode of operation. ?? >>> ?? > >>> >>> I removed the clause. I guess I wanted to target the tests towards >>> the code they are supposed to >>> test, and it's easier to analyze failures w/o tiered compilation and >>> with just one compiler thread. >>> >>> Additionally I will make use of >>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests. >>> >>> Thanks, >>> Richard. >>> >>> -----Original Message----- >>> From: David Holmes >>> Sent: Mittwoch, 11. Dezember 2019 08:03 >>> To: Reingruber, Richard ; >>> serviceability-dev at openjdk.java.net; >>> hotspot-compiler-dev at openjdk.java.net; >>> hotspot-runtime-dev at openjdk.java.net >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>> Performance in the Presence of JVMTI Agents >>> >>> Hi Richard, >>> >>> On 11/12/2019 7:45 am, Reingruber, Richard wrote: >>>> Hi, >>>> >>>> I would like to get reviews please for >>>> >>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/ >>>> >>>> Corresponding RFE: >>>> https://bugs.openjdk.java.net/browse/JDK-8227745 >>>> >>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915 >>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1] >>>> >>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without >>>> issues (thanks!). In addition the >>>> change is being tested at SAP since I posted the first RFR some >>>> months ago. 
>>>> >>>> The intention of this enhancement is to benefit performance wise >>>> from escape analysis even if JVMTI >>>> agents request capabilities that allow them to access local variable >>>> values. E.g. if you start-up >>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then >>>> escape analysis is disabled right >>>> from the beginning, well before a debugger attaches -- if ever one >>>> should do so. With the >>>> enhancement, escape analysis will remain enabled until and after a >>>> debugger attaches. EA based >>>> optimizations are reverted just before an agent acquires the >>>> reference to an object. In the JBS item >>>> you'll find more details. >>> >>> Most of the details here are in areas I can comment on in detail, but I >>> did take an initial general look at things. >>> >>> The only thing that jumped out at me is that I think the >>> DeoptimizeObjectsALotThread should be a hidden thread. >>> >>> +? bool is_hidden_from_external_view() const { return true; } >>> >>> Also I don't see any testing of the DeoptimizeObjectsALotThread. Without >>> active testing this will just bit-rot. >>> >>> Also on the tests I don't understand your @requires clause: >>> >>> ?? @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & >>> (vm.opt.TieredCompilation != true)) >>> >>> This seems to require that TieredCompilation is disabled, but tiered is >>> our normal mode of operation. ?? >>> >>> Thanks, >>> David >>> >>>> Thanks, >>>> Richard. 
>>>>
>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>>>>
>>>>

From david.holmes at oracle.com Thu Dec 12 23:32:40 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 13 Dec 2019 09:32:40 +1000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
In-Reply-To:
References: <863a7bfc-5656-2a4d-6c22-c1fc22968d11@oracle.com>
 <00ac0f71-fe90-9ead-8696-6e7f96ff6a17@oracle.com>
Message-ID: <57c09482-7f0a-e666-2bd5-f4b43ae8b32a@oracle.com>

On 13/12/2019 9:02 am, Reingruber, Richard wrote:
> Hello Vladimir,
>
> thanks for having a look.
>
>   > Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip
>   > test from running in Interpreter mode too.
>
> Done.
>
>   > You don't need vm.opt.TieredCompilation != true in @requires because you
>   > specified -XX:-TieredCompilation in @run command.
>
> Ok.
>
>   > The test is specifically written for C2 only (not for C1 or Graal) to
>   > verify its Escape Analysis optimization.
>   > I did not look in great details into test's code but its analysis may be
>   > affected if C1 compiler is also used.
>   >
>   > Richard may clarify this.
>
> The test cases aim to get their testmethod 'dontinline_testMethod' compiled by C2. If they get C1
> compiled before doesn't matter all that much. I've got a slight preference to disabled tiered
> compilation for simplicity.

My concern - perhaps unfounded - is that this seems to be being tested
only in a pure C2 environment when the actual changes will have to
operate correctly in a tiered environment (and JVMCI).

Thanks,
David

> Thanks, Richard.
>
> -----Original Message-----
> From: Vladimir Kozlov
> Sent: Donnerstag, 12.
Dezember 2019 19:20 > To: David Holmes ; hotspot-runtime-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; serviceability-dev at openjdk.java.net; Reingruber, Richard > Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents > > Hi David, > > Tiered is disabled because we don't want to see compilations and outputs > from C1 compiler which does not have EA. > > The test is specifically written for C2 only (not for C1 or Graal) to > verify its Escape Analysis optimization. > I did not look in great details into test's code but its analysis may be > affected if C1 compiler is also used. > > Richard may clarify this. > > thanks, > Vladimir > > On 12/11/19 1:04 PM, David Holmes wrote: >> On 12/12/2019 5:21 am, Vladimir Kozlov wrote: >>> I will do full review later. I want to comment about test command line. >>> >>> You don't need vm.opt.TieredCompilation != true in @requires because >>> you specified -XX:-TieredCompilation in @run command. >> >> And per my comment this should be being tested with tiered as well. >> >> David >> >>> Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip >>> test from running in Interpreter mode too. >>> >>> Thanks, >>> Vladimir >>> >>> On 12/11/19 7:07 AM, Reingruber, Richard wrote: >>>> Hi David, >>>> >>>> ?? > Most of the details here are in areas I can comment on in >>>> detail, but I >>>> ?? > did take an initial general look at things. >>>> >>>> Thanks for taking the time! >>>> >>>> ?? > The only thing that jumped out at me is that I think the >>>> ?? > DeoptimizeObjectsALotThread should be a hidden thread. >>>> ?? > >>>> ?? > +? bool is_hidden_from_external_view() const { return true; } >>>> >>>> Yes, it should. Will add the method like above. >>>> >>>> ?? > Also I don't see any testing of the DeoptimizeObjectsALotThread. >>>> Without >>>> ?? > active testing this will just bit-rot. 
>>>> >>>> DeoptimizeObjectsALot is meant for stress testing with a larger >>>> workload. I will add a minimal test >>>> to keep it fresh. >>>> >>>> ?? > Also on the tests I don't understand your @requires clause: >>>> ?? > >>>> ?? >?? @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & >>>> ?? > (vm.opt.TieredCompilation != true)) >>>> ?? > >>>> ?? > This seems to require that TieredCompilation is disabled, but >>>> tiered is >>>> ?? > our normal mode of operation. ?? >>>> ?? > >>>> >>>> I removed the clause. I guess I wanted to target the tests towards >>>> the code they are supposed to >>>> test, and it's easier to analyze failures w/o tiered compilation and >>>> with just one compiler thread. >>>> >>>> Additionally I will make use of >>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests. >>>> >>>> Thanks, >>>> Richard. >>>> >>>> -----Original Message----- >>>> From: David Holmes >>>> Sent: Mittwoch, 11. Dezember 2019 08:03 >>>> To: Reingruber, Richard ; >>>> serviceability-dev at openjdk.java.net; >>>> hotspot-compiler-dev at openjdk.java.net; >>>> hotspot-runtime-dev at openjdk.java.net >>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>>> Performance in the Presence of JVMTI Agents >>>> >>>> Hi Richard, >>>> >>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote: >>>>> Hi, >>>>> >>>>> I would like to get reviews please for >>>>> >>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/ >>>>> >>>>> Corresponding RFE: >>>>> https://bugs.openjdk.java.net/browse/JDK-8227745 >>>>> >>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915 >>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1] >>>>> >>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without >>>>> issues (thanks!). In addition the >>>>> change is being tested at SAP since I posted the first RFR some >>>>> months ago. 
>>>>> >>>>> The intention of this enhancement is to benefit performance wise >>>>> from escape analysis even if JVMTI >>>>> agents request capabilities that allow them to access local variable >>>>> values. E.g. if you start-up >>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then >>>>> escape analysis is disabled right >>>>> from the beginning, well before a debugger attaches -- if ever one >>>>> should do so. With the >>>>> enhancement, escape analysis will remain enabled until and after a >>>>> debugger attaches. EA based >>>>> optimizations are reverted just before an agent acquires the >>>>> reference to an object. In the JBS item >>>>> you'll find more details. >>>> >>>> Most of the details here are in areas I can comment on in detail, but I >>>> did take an initial general look at things. >>>> >>>> The only thing that jumped out at me is that I think the >>>> DeoptimizeObjectsALotThread should be a hidden thread. >>>> >>>> +? bool is_hidden_from_external_view() const { return true; } >>>> >>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. Without >>>> active testing this will just bit-rot. >>>> >>>> Also on the tests I don't understand your @requires clause: >>>> >>>> ?? @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & >>>> (vm.opt.TieredCompilation != true)) >>>> >>>> This seems to require that TieredCompilation is disabled, but tiered is >>>> our normal mode of operation. ?? >>>> >>>> Thanks, >>>> David >>>> >>>>> Thanks, >>>>> Richard. 
>>>>>
>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>>>>>
>>>>>

From david.holmes at oracle.com Thu Dec 12 23:55:35 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 13 Dec 2019 09:55:35 +1000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
In-Reply-To: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
Message-ID:

Hi Richard,

Some further queries/concerns:

src/hotspot/share/runtime/objectMonitor.cpp

Can you please explain the changes to ObjectMonitor::wait:

!   _recursions = save      // restore the old recursion count
!                 + jt->get_and_reset_relock_count_after_wait(); // increased by the deferred relock count

what is the "deferred relock count"? I gather it relates to

"The code was extended to be able to deoptimize objects of a frame that
is not the top frame and to let another thread than the owning thread do
it."

which I don't like the sound of at all when it comes to ObjectMonitor
state. So I'd like to understand in detail exactly what is going on here
and why. This is a very intrusive change that seems to badly break
encapsulation and impacts future changes to ObjectMonitor that are under
investigation.

---

src/hotspot/share/runtime/thread.cpp

Can you please explain why JavaThread::wait_for_object_deoptimization
has to be handcrafted in this way rather than using proper transitions.

We got rid of "deopt suspend" some time ago and it is disturbing to see
it being added back (effectively). This seems like it may be something
that handshakes could be used for.

Thanks,
David
-----

On 12/12/2019 7:02 am, David Holmes wrote:
> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
>> Hi David,
>>
>>    > Most of the details here are in areas I can comment on in detail,
>> but I
>>    > did take an initial general look at things.
>> >> Thanks for taking the time! > > Apologies the above should read: > > "Most of the details here are in areas I *can't* comment on in detail ..." > > David > >> ?? > The only thing that jumped out at me is that I think the >> ?? > DeoptimizeObjectsALotThread should be a hidden thread. >> ?? > >> ?? > +? bool is_hidden_from_external_view() const { return true; } >> >> Yes, it should. Will add the method like above. >> >> ?? > Also I don't see any testing of the DeoptimizeObjectsALotThread. >> Without >> ?? > active testing this will just bit-rot. >> >> DeoptimizeObjectsALot is meant for stress testing with a larger >> workload. I will add a minimal test >> to keep it fresh. >> >> ?? > Also on the tests I don't understand your @requires clause: >> ?? > >> ?? >?? @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & >> ?? > (vm.opt.TieredCompilation != true)) >> ?? > >> ?? > This seems to require that TieredCompilation is disabled, but >> tiered is >> ?? > our normal mode of operation. ?? >> ?? > >> >> I removed the clause. I guess I wanted to target the tests towards the >> code they are supposed to >> test, and it's easier to analyze failures w/o tiered compilation and >> with just one compiler thread. >> >> Additionally I will make use of >> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests. >> >> Thanks, >> Richard. >> >> -----Original Message----- >> From: David Holmes >> Sent: Mittwoch, 11. 
Dezember 2019 08:03 >> To: Reingruber, Richard ; >> serviceability-dev at openjdk.java.net; >> hotspot-compiler-dev at openjdk.java.net; >> hotspot-runtime-dev at openjdk.java.net >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >> Performance in the Presence of JVMTI Agents >> >> Hi Richard, >> >> On 11/12/2019 7:45 am, Reingruber, Richard wrote: >>> Hi, >>> >>> I would like to get reviews please for >>> >>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/ >>> >>> Corresponding RFE: >>> https://bugs.openjdk.java.net/browse/JDK-8227745 >>> >>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915 >>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1] >>> >>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without >>> issues (thanks!). In addition the >>> change is being tested at SAP since I posted the first RFR some >>> months ago. >>> >>> The intention of this enhancement is to benefit performance wise from >>> escape analysis even if JVMTI >>> agents request capabilities that allow them to access local variable >>> values. E.g. if you start-up >>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then >>> escape analysis is disabled right >>> from the beginning, well before a debugger attaches -- if ever one >>> should do so. With the >>> enhancement, escape analysis will remain enabled until and after a >>> debugger attaches. EA based >>> optimizations are reverted just before an agent acquires the >>> reference to an object. In the JBS item >>> you'll find more details. >> >> Most of the details here are in areas I can comment on in detail, but I >> did take an initial general look at things. >> >> The only thing that jumped out at me is that I think the >> DeoptimizeObjectsALotThread should be a hidden thread. >> >> +? bool is_hidden_from_external_view() const { return true; } >> >> Also I don't see any testing of the DeoptimizeObjectsALotThread. 
Without
>> active testing this will just bit-rot.
>>
>> Also on the tests I don't understand your @requires clause:
>>
>>    @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>> (vm.opt.TieredCompilation != true))
>>
>> This seems to require that TieredCompilation is disabled, but tiered is
>> our normal mode of operation. ??
>>
>> Thanks,
>> David
>>
>>> Thanks,
>>> Richard.
>>>
>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>
>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>>>
>>>

From serguei.spitsyn at oracle.com Fri Dec 13 00:41:12 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 12 Dec 2019 16:41:12 -0800
Subject: RFR: 8226797: serviceability/tmtools/jstat/GcCapacityTest.java fails with Exception: java.lang.RuntimeException: OGCMN > OGCMX (min generation capacity > max generation capacity)
In-Reply-To: <1fc22506-3e64-fb98-8f7d-acb460c986f5@oracle.com>
References: <93c9ec14-a371-0121-d758-7e213cde9c85@oracle.com>
 <1fc22506-3e64-fb98-8f7d-acb460c986f5@oracle.com>
Message-ID: <35c25ed3-948b-482d-2f21-1ffffdf1afd9@oracle.com>

Hi Stefan,

It looks good to me.
Sorry, I was in a meeting, wrote this email and forgot to push the
'send' button.
I only just now discovered that it was never actually sent. :(

Thanks,
Serguei

On 12/12/19 07:23, Stefan Karlsson wrote:
> In the interest to get this integrated before the RDP cut-off I'm
> going to push this ASAP. This has gone through tier1-tier3 testing.
>
> StefanK
>
> On 2019-12-12 13:01, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this patch to fix a problem with uninitialized values in
>> our generation counters.
>>
>> https://cr.openjdk.java.net/~stefank/8226797/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8226797
>>
>> The jstat values NGCMN and OGCMN both return uninitialized values.
>>
>> I stumbled upon this while creating a patch to remove the
>> GenerationSpec class.
>>
>> GenerationSpec::_min_size is never initialized, and then used to
>> create the generations:
>>
>>     case Generation::DefNew:
>>       return new DefNewGeneration(rs, _init_size, _min_size,
>> _max_size);
>>
>>     case Generation::MarkSweepCompact:
>>       return new TenuredGeneration(rs, _init_size, _min_size,
>> _max_size, remset);
>>
>> That in turn uses it to initialize the perf counters:
>> DefNewGeneration::DefNewGeneration(ReservedSpace rs,
>>                                    size_t initial_size,
>>                                    size_t min_size,
>>                                    size_t max_size,
>>                                    const char* policy)
>> ...
>>   _gen_counters = new GenerationCounters("new", 0, 3,
>>       min_size, max_size, &_virtual_space);
>>
>> I'm setting the value to _init_size, because it reflects how
>> MinNewSize and MinOldSize relate to NewSize and OldSize.
>>
>> Thanks,
>> StefanK

From vladimir.kozlov at oracle.com Fri Dec 13 00:56:16 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 12 Dec 2019 16:56:16 -0800
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
In-Reply-To: <57c09482-7f0a-e666-2bd5-f4b43ae8b32a@oracle.com>
References: <863a7bfc-5656-2a4d-6c22-c1fc22968d11@oracle.com>
 <00ac0f71-fe90-9ead-8696-6e7f96ff6a17@oracle.com>
 <57c09482-7f0a-e666-2bd5-f4b43ae8b32a@oracle.com>
Message-ID:

Yes, David

You are correct: these changes touch all parts of the VM and may affect
Graal (which also has EA) too.
Changes should be tested in all our modes: tiered, C1 only, Graal,
Interpreter. And I realized that I only ran tier3-graal testing, so I
have now submitted the rest of Graal's tiers.

I had assumed that our current testing (I ran everything from tier1 to
tier8) should exercise all paths in the VM that these changes touch.
But I may be wrong, and it is correct to ask the author to add testing
in all VM modes to make sure the new code in the VM's runtime and JVMTI
is tested.

I would like to keep what the current test is doing with C2. Maybe add
another test for the other modes, or modify the current one so that it
can run in other modes.

Thanks,
Vladimir

On 12/12/19 3:32 PM, David Holmes wrote:
> On 13/12/2019 9:02 am, Reingruber, Richard wrote:
>> Hello Vladimir,
>>
>> thanks for having a look.
>>
>>    > Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip
>>    > test from running in Interpreter mode too.
>>
>> Done.
>>
>>    > You don't need vm.opt.TieredCompilation != true in @requires because you
>>    > specified -XX:-TieredCompilation in @run command.
>>
>> Ok.
>>
>>    > The test is specifically written for C2 only (not for C1 or Graal) to
>>    > verify its Escape Analysis optimization.
>>    > I did not look in great details into test's code but its analysis may be
>>    > affected if C1 compiler is also used.
>>    >
>>    > Richard may clarify this.
>>
>> The test cases aim to get their testmethod 'dontinline_testMethod' compiled by C2. If they get C1
>> compiled before doesn't matter all that much. I've got a slight preference to disabled tiered
>> compilation for simplicity.
>
> My concern - perhaps unfounded - is that this seems to be being tested only in a pure C2 environment
> when the actual changes will have to operate correctly in a tiered environment (and JVMCI).
>
> Thanks,
> David
>
>> Thanks, Richard.
>>
>> -----Original Message-----
>> From: Vladimir Kozlov
>> Sent: Donnerstag, 12.
Dezember 2019 19:20 >> To: David Holmes ; hotspot-runtime-dev at openjdk.java.net; >> hotspot-compiler-dev at openjdk.java.net; serviceability-dev at openjdk.java.net; Reingruber, Richard >> >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of >> JVMTI Agents >> >> Hi David, >> >> Tiered is disabled because we don't want to see compilations and outputs >> from C1 compiler which does not have EA. >> >> The test is specifically written for C2 only (not for C1 or Graal) to >> verify its Escape Analysis optimization. >> I did not look in great details into test's code but its analysis may be >> affected if C1 compiler is also used. >> >> Richard may clarify this. >> >> thanks, >> Vladimir >> >> On 12/11/19 1:04 PM, David Holmes wrote: >>> On 12/12/2019 5:21 am, Vladimir Kozlov wrote: >>>> I will do full review later. I want to comment about test command line. >>>> >>>> You don't need vm.opt.TieredCompilation != true in @requires because >>>> you specified -XX:-TieredCompilation in @run command. >>> >>> And per my comment this should be being tested with tiered as well. >>> >>> David >>> >>>> Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip >>>> test from running in Interpreter mode too. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 12/11/19 7:07 AM, Reingruber, Richard wrote: >>>>> Hi David, >>>>> >>>>> ??? > Most of the details here are in areas I can comment on in >>>>> detail, but I >>>>> ??? > did take an initial general look at things. >>>>> >>>>> Thanks for taking the time! >>>>> >>>>> ??? > The only thing that jumped out at me is that I think the >>>>> ??? > DeoptimizeObjectsALotThread should be a hidden thread. >>>>> ??? > >>>>> ??? > +? bool is_hidden_from_external_view() const { return true; } >>>>> >>>>> Yes, it should. Will add the method like above. >>>>> >>>>> ??? > Also I don't see any testing of the DeoptimizeObjectsALotThread. >>>>> Without >>>>> ??? 
> active testing this will just bit-rot. >>>>> >>>>> DeoptimizeObjectsALot is meant for stress testing with a larger >>>>> workload. I will add a minimal test >>>>> to keep it fresh. >>>>> >>>>> ??? > Also on the tests I don't understand your @requires clause: >>>>> ??? > >>>>> ??? >?? @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & >>>>> ??? > (vm.opt.TieredCompilation != true)) >>>>> ??? > >>>>> ??? > This seems to require that TieredCompilation is disabled, but >>>>> tiered is >>>>> ??? > our normal mode of operation. ?? >>>>> ??? > >>>>> >>>>> I removed the clause. I guess I wanted to target the tests towards >>>>> the code they are supposed to >>>>> test, and it's easier to analyze failures w/o tiered compilation and >>>>> with just one compiler thread. >>>>> >>>>> Additionally I will make use of >>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests. >>>>> >>>>> Thanks, >>>>> Richard. >>>>> >>>>> -----Original Message----- >>>>> From: David Holmes >>>>> Sent: Mittwoch, 11. Dezember 2019 08:03 >>>>> To: Reingruber, Richard ; >>>>> serviceability-dev at openjdk.java.net; >>>>> hotspot-compiler-dev at openjdk.java.net; >>>>> hotspot-runtime-dev at openjdk.java.net >>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>>>> Performance in the Presence of JVMTI Agents >>>>> >>>>> Hi Richard, >>>>> >>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote: >>>>>> Hi, >>>>>> >>>>>> I would like to get reviews please for >>>>>> >>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/ >>>>>> >>>>>> Corresponding RFE: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745 >>>>>> >>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915 >>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1] >>>>>> >>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without >>>>>> issues (thanks!). 
In addition the >>>>>> change is being tested at SAP since I posted the first RFR some >>>>>> months ago. >>>>>> >>>>>> The intention of this enhancement is to benefit performance wise >>>>>> from escape analysis even if JVMTI >>>>>> agents request capabilities that allow them to access local variable >>>>>> values. E.g. if you start-up >>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then >>>>>> escape analysis is disabled right >>>>>> from the beginning, well before a debugger attaches -- if ever one >>>>>> should do so. With the >>>>>> enhancement, escape analysis will remain enabled until and after a >>>>>> debugger attaches. EA based >>>>>> optimizations are reverted just before an agent acquires the >>>>>> reference to an object. In the JBS item >>>>>> you'll find more details. >>>>> >>>>> Most of the details here are in areas I can comment on in detail, but I >>>>> did take an initial general look at things. >>>>> >>>>> The only thing that jumped out at me is that I think the >>>>> DeoptimizeObjectsALotThread should be a hidden thread. >>>>> >>>>> +? bool is_hidden_from_external_view() const { return true; } >>>>> >>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. Without >>>>> active testing this will just bit-rot. >>>>> >>>>> Also on the tests I don't understand your @requires clause: >>>>> >>>>> ??? @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & >>>>> (vm.opt.TieredCompilation != true)) >>>>> >>>>> This seems to require that TieredCompilation is disabled, but tiered is >>>>> our normal mode of operation. ?? >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Thanks, >>>>>> Richard. 
>>>>>>
>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>>>>>>
>>>>>>

From david.holmes at oracle.com Fri Dec 13 01:52:48 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 13 Dec 2019 11:52:48 +1000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
In-Reply-To:
References: <863a7bfc-5656-2a4d-6c22-c1fc22968d11@oracle.com>
 <00ac0f71-fe90-9ead-8696-6e7f96ff6a17@oracle.com>
 <57c09482-7f0a-e666-2bd5-f4b43ae8b32a@oracle.com>
Message-ID: <01509185-7b0b-a269-deb1-799444cf082f@oracle.com>

On 13/12/2019 10:56 am, Vladimir Kozlov wrote:
> Yes, David
>
> You are correct these changes touch all part of VM and may affect Graal
> (which also has EA) too.
> Changes should be tested in all our modes: tiered, C1 only, Graal,
> Interpreter. And I realized that I only ran tier3-graal testing so I
> submitted the rest of Graal's tiers now.
>
> I had assumed that our current testing (I ran all from tier1 to tier8)
> should exercise all paths in VM these changes touch. But I may be wrong
> and it is correct to ask author to add testing in all VM modes to make
> sure new code in VM's runtime and JVMTI is tested.

It may be that our existing JVM TI tests will exercise this adequately
and that the new tests are more "whitebox" testing than general
functional tests. But it is not obvious to me that we do have the
coverage we need.

Cheers,
David

> I do like to keep what current test is doing with C2. May be add an
> other test for other modes or modify current one to enable to run it in
> other modes.
>
> Thanks,
> Vladimir
>
> On 12/12/19 3:32 PM, David Holmes wrote:
>> On 13/12/2019 9:02 am, Reingruber, Richard wrote:
>>> Hello Vladimir,
>>>
>>> thanks for having a look.
>>>
>>>    > Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to
>>> skip
>>>    > test from running in Interpreter mode too.
>>>
>>> Done.
>>> >>> ?? > You don't need vm.opt.TieredCompilation != true in @requires >>> because you >>> ?? > specified -XX:-TieredCompilation in @run command. >>> >>> Ok. >>> >>> ?? > The test is specifically written for C2 only (not for C1 or >>> Graal) to >>> ?? > verify its Escape Analysis optimization. >>> ?? > I did not look in great details into test's code but its >>> analysis may be >>> ?? > affected if C1 compiler is also used. >>> ?? > >>> ?? > Richard may clarify this. >>> >>> The test cases aim to get their testmethod 'dontinline_testMethod' >>> compiled by C2. If they get C1 >>> compiled before doesn't matter all that much. I've got a slight >>> preference to disabled tiered >>> compilation for simplicity. >> >> My concern - perhaps unfounded - is that this seems to be being tested >> only in a pure C2 environment when the actual changes will have to >> operate correctly in a tiered environment (and JVMCI). >> >> Thanks, >> David >> >>> Thanks, Richard. >>> >>> -----Original Message----- >>> From: Vladimir Kozlov >>> Sent: Donnerstag, 12. Dezember 2019 19:20 >>> To: David Holmes ; >>> hotspot-runtime-dev at openjdk.java.net; >>> hotspot-compiler-dev at openjdk.java.net; >>> serviceability-dev at openjdk.java.net; Reingruber, Richard >>> >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>> Performance in the Presence of JVMTI Agents >>> >>> Hi David, >>> >>> Tiered is disabled because we don't want to see compilations and outputs >>> from C1 compiler which does not have EA. >>> >>> The test is specifically written for C2 only (not for C1 or Graal) to >>> verify its Escape Analysis optimization. >>> I did not look in great details into test's code but its analysis may be >>> affected if C1 compiler is also used. >>> >>> Richard may clarify this. >>> >>> thanks, >>> Vladimir >>> >>> On 12/11/19 1:04 PM, David Holmes wrote: >>>> On 12/12/2019 5:21 am, Vladimir Kozlov wrote: >>>>> I will do full review later. 
I want to comment about test command >>>>> line. >>>>> >>>>> You don't need vm.opt.TieredCompilation != true in @requires because >>>>> you specified -XX:-TieredCompilation in @run command. >>>> >>>> And per my comment this should be being tested with tiered as well. >>>> >>>> David >>>> >>>>> Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip >>>>> test from running in Interpreter mode too. >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 12/11/19 7:07 AM, Reingruber, Richard wrote: >>>>>> Hi David, >>>>>> >>>>>> ??? > Most of the details here are in areas I can comment on in >>>>>> detail, but I >>>>>> ??? > did take an initial general look at things. >>>>>> >>>>>> Thanks for taking the time! >>>>>> >>>>>> ??? > The only thing that jumped out at me is that I think the >>>>>> ??? > DeoptimizeObjectsALotThread should be a hidden thread. >>>>>> ??? > >>>>>> ??? > +? bool is_hidden_from_external_view() const { return true; } >>>>>> >>>>>> Yes, it should. Will add the method like above. >>>>>> >>>>>> ??? > Also I don't see any testing of the >>>>>> DeoptimizeObjectsALotThread. >>>>>> Without >>>>>> ??? > active testing this will just bit-rot. >>>>>> >>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger >>>>>> workload. I will add a minimal test >>>>>> to keep it fresh. >>>>>> >>>>>> ??? > Also on the tests I don't understand your @requires clause: >>>>>> ??? > >>>>>> ??? >?? @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & >>>>>> ??? > (vm.opt.TieredCompilation != true)) >>>>>> ??? > >>>>>> ??? > This seems to require that TieredCompilation is disabled, but >>>>>> tiered is >>>>>> ??? > our normal mode of operation. ?? >>>>>> ??? > >>>>>> >>>>>> I removed the clause. I guess I wanted to target the tests towards >>>>>> the code they are supposed to >>>>>> test, and it's easier to analyze failures w/o tiered compilation and >>>>>> with just one compiler thread. 
>>>>>> >>>>>> Additionally I will make use of >>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests. >>>>>> >>>>>> Thanks, >>>>>> Richard. >>>>>> >>>>>> -----Original Message----- >>>>>> From: David Holmes >>>>>> Sent: Mittwoch, 11. Dezember 2019 08:03 >>>>>> To: Reingruber, Richard ; >>>>>> serviceability-dev at openjdk.java.net; >>>>>> hotspot-compiler-dev at openjdk.java.net; >>>>>> hotspot-runtime-dev at openjdk.java.net >>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>>>>> Performance in the Presence of JVMTI Agents >>>>>> >>>>>> Hi Richard, >>>>>> >>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote: >>>>>>> Hi, >>>>>>> >>>>>>> I would like to get reviews please for >>>>>>> >>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/ >>>>>>> >>>>>>> Corresponding RFE: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745 >>>>>>> >>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915 >>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1] >>>>>>> >>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without >>>>>>> issues (thanks!). In addition the >>>>>>> change is being tested at SAP since I posted the first RFR some >>>>>>> months ago. >>>>>>> >>>>>>> The intention of this enhancement is to benefit performance wise >>>>>>> from escape analysis even if JVMTI >>>>>>> agents request capabilities that allow them to access local variable >>>>>>> values. E.g. if you start-up >>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then >>>>>>> escape analysis is disabled right >>>>>>> from the beginning, well before a debugger attaches -- if ever one >>>>>>> should do so. With the >>>>>>> enhancement, escape analysis will remain enabled until and after a >>>>>>> debugger attaches. EA based >>>>>>> optimizations are reverted just before an agent acquires the >>>>>>> reference to an object. In the JBS item >>>>>>> you'll find more details. 
>>>>>>
>>>>>> Most of the details here are in areas I can comment on in detail, but I
>>>>>> did take an initial general look at things.
>>>>>>
>>>>>> The only thing that jumped out at me is that I think the
>>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
>>>>>>
>>>>>> +  bool is_hidden_from_external_view() const { return true; }
>>>>>>
>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>>>>> active testing this will just bit-rot.
>>>>>>
>>>>>> Also on the tests I don't understand your @requires clause:
>>>>>>
>>>>>>     @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>>> (vm.opt.TieredCompilation != true))
>>>>>>
>>>>>> This seems to require that TieredCompilation is disabled, but tiered is
>>>>>> our normal mode of operation. ??
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>> Thanks,
>>>>>>> Richard.
>>>>>>>
>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>>>>>>>
>>>>>>>
>>>>>>>

From linzang at tencent.com Fri Dec 13 06:22:16 2019
From: linzang at tencent.com (linzang(臧琳))
Date: Fri, 13 Dec 2019 06:22:16 +0000
Subject: Discuss the design of parallel and incremental jmap histo.
Message-ID: <72276240-61D3-4101-A435-717F16CC6FC3@tencent.com>

Dear All,

I want to re-activate the discussion thread about the implementation of parallel and incremental "jmap -histo". The goal of these changes is to solve the problem that "jmap -histo" may "time out or be killed by a timer" when the heap is large. In addition, the result of "jmap -histo" is "all or nothing", which means that if it gets killed before it exits, the user gets no information about the heap. The "incremental" part means that jmap -histo dumps intermediate results while it is iterating the heap, so if it is interrupted, the user still gets some meaningful information. The "parallel"
part aims to speed up the heap iteration with multi-threading.

Originally I implemented the "incremental dump" so that it dumps the intermediate data into a separate file like , and the final result is saved to another file . So when jmap -histo gets interrupted, the user can get information from , and if jmap -histo works fine, the final result is in . The parallel dump has multiple threads working on heap iteration, and each thread generates intermediate data in a timely manner.

The main reason for using a separate file for the incremental dump is the parallel incremental dump implementation: every heap-iteration thread can dump its own data into a separate file, which avoids using a file lock. However, it seems that the original design might confuse users by producing two or more result files (intermediate result and final result). So I want to ask for your help in discussing it:

1. For incremental dump without parallel, the intermediate result and the final result are dumped to the same file: in this case, the intermediate data are generated in the middle of heap iteration and are written to the file at the same time. If jmap -histo exits normally, the final result is also dumped to , then all intermediate data are flushed.
2. For parallel dump without incremental: every thread generates its own thread-local dump buffer, and all thread-local dumps are merged and written to the file at the end. There is no incremental support, so the result is "all or nothing".
3. For parallel + incremental dump, I think it's a little complicated because of the intermediate data processing:
   * Every thread has its own thread-local intermediate data buffer, and all the thread-local buffers are written to the file while holding a file lock. So only one data file is generated, and if jmap -histo is interrupted, the intermediate data are saved in the same file. The problem is that the file write lock can be heavy, which may make the parallel heap dump slow.
   * Every thread has its own thread-local intermediate data buffer, and every thread saves its result in a temp file named . So there is no file lock, and the parallel dump can be fast. The problem is that multiple files are generated to hold the thread-local intermediate results, and this might confuse the user.
   * Every thread has its own thread-local intermediate data buffer, and an additional "data-merging-thread" is created. The parallel threads write data to their thread-local buffers and enqueue a buffer when its data reaches some threshold. The "data-merging-thread" consumes the queue, merges the data from the different threads, and saves the merged data to the result file. In this case, only one file is generated. No file lock is needed, but there is a queue lock and a separate "merging thread" implementation.

Do you think this is a reasonable solution? May I ask for your suggestions?

Details of the previous discussion can be found at https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-June/028276.html

Thanks!
BRs,
Lin
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From gnu.andrew at redhat.com Fri Dec 13 06:35:38 2019
From: gnu.andrew at redhat.com (Andrew John Hughes)
Date: Fri, 13 Dec 2019 06:35:38 +0000
Subject: [8u] RFR: 8195088: [TEST_BUG] StartManagementAgent got unexpected exception
In-Reply-To: <85c52e84-e827-9af1-3cec-6dff26daeda0@oracle.com>
References: <85c52e84-e827-9af1-3cec-6dff26daeda0@oracle.com>
Message-ID: <446f1d58-eb0d-4944-55b1-5f691d566f6a@redhat.com>

On 03/10/2019 00:47, serguei.spitsyn at oracle.com wrote:
> Hi Severin,
>
> It looks good and applies cleanly.
> So, I'm not sure you really need a review for this.
>
> Thanks,
> Serguei
>
> On 10/1/19 06:01, Severin Gehwolf wrote:
>> Hi,
>>
>> Please review this OpenJDK 8u vs. Oracle JDK 8 parity patch. I wasn't
>> sure whether I need review for this one as the bug in question is a JDK
>> 8-only bug and the patch applies as-is.
Anyway, here it is: >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8195088 >> webrev: >> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8195088/01/webrev/ >> >> Testing: StartManagementAgent.java test fails prior and passes after >> this patch. >> >> Thoughts? >> >> Thanks, >> Severin >> > What "applies cleanly"? As far as I can see, this is part of JDK-8165736. As the rest of 8165736 is not being applied, then yes, it needs review. Approved. I'll push it as part of b05. Thanks, -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. (http://www.redhat.com) PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 https://keybase.io/gnu_andrew -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 228 bytes Desc: OpenPGP digital signature URL: From stefan.karlsson at oracle.com Fri Dec 13 08:39:29 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 13 Dec 2019 09:39:29 +0100 Subject: RFR: 8226797: serviceability/tmtools/jstat/GcCapacityTest.java fails with Exception: java.lang.RuntimeException: OGCMN > OGCMX (min generation capacity > max generation capacity) In-Reply-To: <35c25ed3-948b-482d-2f21-1ffffdf1afd9@oracle.com> References: <93c9ec14-a371-0121-d758-7e213cde9c85@oracle.com> <1fc22506-3e64-fb98-8f7d-acb460c986f5@oracle.com> <35c25ed3-948b-482d-2f21-1ffffdf1afd9@oracle.com> Message-ID: <5b5bbdd0-b1a0-4c1c-ee13-ce8adbbca592@oracle.com> Hi Serguei, On 2019-12-13 01:41, serguei.spitsyn at oracle.com wrote: > Hi Stefan, > > It looks good to me. Thanks for reviewing. > > Sorry, I was on the meeting, wrote this email and forgot to push 'send' > button. > Just now discovered that it has not been really sent. :( No problem. I pushed this yesterday to make the JDK 14 fork cut-off. 
Thanks,
StefanK

>
> Thanks,
> Serguei
>
>
> On 12/12/19 07:23, Stefan Karlsson wrote:
>> In the interest to get this integrated before the RDP cut-off I'm
>> going to push this ASAP. This has gone through tier1-tier3 testing.
>>
>> StefanK
>>
>> On 2019-12-12 13:01, Stefan Karlsson wrote:
>>> Hi all,
>>>
>>> Please review this patch to fix a problem with uninitialized values in
>>> our generation counters.
>>>
>>> https://cr.openjdk.java.net/~stefank/8226797/webrev.01/
>>> https://bugs.openjdk.java.net/browse/JDK-8226797
>>>
>>> The jstat values NGCMN and OGCMN both return uninitialized values.
>>>
>>> I stumbled upon this while creating a patch to remove the
>>> GenerationSpec class.
>>>
>>> GenerationSpec::_min_size is never initialized, and then used to
>>> create the generations:
>>>
>>>      case Generation::DefNew:
>>>        return new DefNewGeneration(rs, _init_size, _min_size,
>>> _max_size);
>>>
>>>      case Generation::MarkSweepCompact:
>>>        return new TenuredGeneration(rs, _init_size, _min_size,
>>> _max_size, remset);
>>>
>>> That in turn uses it to initialize the perf counters:
>>> DefNewGeneration::DefNewGeneration(ReservedSpace rs,
>>>                                     size_t initial_size,
>>>                                     size_t min_size,
>>>                                     size_t max_size,
>>>                                     const char* policy)
>>> ...
>>>    _gen_counters = new GenerationCounters("new", 0, 3,
>>>        min_size, max_size, &_virtual_space);
>>>
>>> I'm setting the value to _init_size, because it reflects how
>>> MinNewSize and MinOldSize relate to NewSize and OldSize.
>>> >>> Thanks, >>> StefanK > From fairoz.matte at oracle.com Fri Dec 13 13:02:21 2019 From: fairoz.matte at oracle.com (Fairoz Matte) Date: Fri, 13 Dec 2019 05:02:21 -0800 (PST) Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled In-Reply-To: <0c5f8a55-e7e6-ba87-58a8-e6a7edd55f01@oracle.com> References: <0c5f8a55-e7e6-ba87-58a8-e6a7edd55f01@oracle.com> Message-ID: Hi Chris, Thanks for the review, Please find the webrev.01 with usage of INAVLID_LOAD_ADDRESS macro for -1L. I have also added one more macro for ZERO_LOAD_ADDRESS for 0x0L. http://cr.openjdk.java.net/~fmatte/8235637/webrev.01/ Thanks, Fairoz -----Original Message----- From: Chris Plummer Sent: Thursday, December 12, 2019 9:48 PM To: Fairoz Matte ; serviceability-dev at openjdk.java.net Subject: Re: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled Can you use a macro for -1L? Maybe INAVLID_LOAD_ADDRESS or LOAD_ADDRESS_ERROR. Chris On 12/11/19 7:10 PM, Fairoz Matte wrote: > Hi, > > Please review this small change, > Updating error handling, to make sure "lib_base_diff = 0" is still a valid scenario even after calc_prelinked_load_address() call. 
> > JBS - https://bugs.openjdk.java.net/browse/JDK-8235637 > Webrev - http://cr.openjdk.java.net/~fmatte/8235637/webrev.00/ > > This patch is provided by Yasumasa Suenaga > > Thanks, > Fairoz From richard.reingruber at sap.com Fri Dec 13 14:17:00 2019 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Fri, 13 Dec 2019 14:17:00 +0000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: <01509185-7b0b-a269-deb1-799444cf082f@oracle.com> References: <863a7bfc-5656-2a4d-6c22-c1fc22968d11@oracle.com> <00ac0f71-fe90-9ead-8696-6e7f96ff6a17@oracle.com> <57c09482-7f0a-e666-2bd5-f4b43ae8b32a@oracle.com> <01509185-7b0b-a269-deb1-799444cf082f@oracle.com> Message-ID: Hi David, Vladimir, The tests are very targeted and customized towards the issues they solve. IMHO they should be run in the configuration they are tailored for, but as I said, I'm ok with removing the tiered options/conditions. The enhancement should be covered also by existing JVMTI, JDI, JDWP tests, assuming they are also executed with Xcomp. If running the tests with Graal as C2 replacement you'll get failures, because the JVMCI compiler does not provide the debug info required at runtime (see compiledVFrame::not_global_escape_in_scope() and compiledVFrame::arg_escape). Still it would be possible to change the tests to expect these failures when executed with Graal. Perhaps I should do this? Thanks, Richard. -----Original Message----- From: David Holmes Sent: Freitag, 13. Dezember 2019 02:53 To: Vladimir Kozlov ; Reingruber, Richard ; hotspot-runtime-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; serviceability-dev at openjdk.java.net Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents On 13/12/2019 10:56 am, Vladimir Kozlov wrote: > Yes, David > > You are correct these changes touch all part of VM and may affect Graal > (which also has EA) too. 
> Changes should be tested in all our modes: tiered, C1 only, Graal, > Interpreter. And I realized that I only ran tier3-graal testing so I > submitted the rest of Graal's tiers now. > > I had assumed that our current testing (I ran all from tier1 to tier8) > should exercise all paths in VM these changes touch. But I may be wrong > and it is correct to ask author to add testing in all VM modes to make > sure new code in VM's runtime and JVMTI is tested. It may be that our existing JVM TI tests will exercise this adequately and that the new tests are more "whitebox" testing than general functional tests. But it is not obvious to me that we do have the coverage we need. Cheers, David > I do like to keep what current test is doing with C2. May be add an > other test for other modes or modify current one to enable to run it in > other modes. > > Thanks, > Vladimir > > On 12/12/19 3:32 PM, David Holmes wrote: >> On 13/12/2019 9:02 am, Reingruber, Richard wrote: >>> Hello Vladimir, >>> >>> thanks for having a look. >>> >>> ?? > Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to >>> skip >>> ?? > test from running in Interpreter mode too. >>> >>> Done. >>> >>> ?? > You don't need vm.opt.TieredCompilation != true in @requires >>> because you >>> ?? > specified -XX:-TieredCompilation in @run command. >>> >>> Ok. >>> >>> ?? > The test is specifically written for C2 only (not for C1 or >>> Graal) to >>> ?? > verify its Escape Analysis optimization. >>> ?? > I did not look in great details into test's code but its >>> analysis may be >>> ?? > affected if C1 compiler is also used. >>> ?? > >>> ?? > Richard may clarify this. >>> >>> The test cases aim to get their testmethod 'dontinline_testMethod' >>> compiled by C2. If they get C1 >>> compiled before doesn't matter all that much. I've got a slight >>> preference to disabled tiered >>> compilation for simplicity. 
>> >> My concern - perhaps unfounded - is that this seems to be being tested >> only in a pure C2 environment when the actual changes will have to >> operate correctly in a tiered environment (and JVMCI). >> >> Thanks, >> David >> >>> Thanks, Richard. >>> >>> -----Original Message----- >>> From: Vladimir Kozlov >>> Sent: Donnerstag, 12. Dezember 2019 19:20 >>> To: David Holmes ; >>> hotspot-runtime-dev at openjdk.java.net; >>> hotspot-compiler-dev at openjdk.java.net; >>> serviceability-dev at openjdk.java.net; Reingruber, Richard >>> >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>> Performance in the Presence of JVMTI Agents >>> >>> Hi David, >>> >>> Tiered is disabled because we don't want to see compilations and outputs >>> from C1 compiler which does not have EA. >>> >>> The test is specifically written for C2 only (not for C1 or Graal) to >>> verify its Escape Analysis optimization. >>> I did not look in great details into test's code but its analysis may be >>> affected if C1 compiler is also used. >>> >>> Richard may clarify this. >>> >>> thanks, >>> Vladimir >>> >>> On 12/11/19 1:04 PM, David Holmes wrote: >>>> On 12/12/2019 5:21 am, Vladimir Kozlov wrote: >>>>> I will do full review later. I want to comment about test command >>>>> line. >>>>> >>>>> You don't need vm.opt.TieredCompilation != true in @requires because >>>>> you specified -XX:-TieredCompilation in @run command. >>>> >>>> And per my comment this should be being tested with tiered as well. >>>> >>>> David >>>> >>>>> Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip >>>>> test from running in Interpreter mode too. >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 12/11/19 7:07 AM, Reingruber, Richard wrote: >>>>>> Hi David, >>>>>> >>>>>> ??? > Most of the details here are in areas I can comment on in >>>>>> detail, but I >>>>>> ??? > did take an initial general look at things. >>>>>> >>>>>> Thanks for taking the time! >>>>>> >>>>>> ??? 
> The only thing that jumped out at me is that I think the >>>>>> ??? > DeoptimizeObjectsALotThread should be a hidden thread. >>>>>> ??? > >>>>>> ??? > +? bool is_hidden_from_external_view() const { return true; } >>>>>> >>>>>> Yes, it should. Will add the method like above. >>>>>> >>>>>> ??? > Also I don't see any testing of the >>>>>> DeoptimizeObjectsALotThread. >>>>>> Without >>>>>> ??? > active testing this will just bit-rot. >>>>>> >>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger >>>>>> workload. I will add a minimal test >>>>>> to keep it fresh. >>>>>> >>>>>> ??? > Also on the tests I don't understand your @requires clause: >>>>>> ??? > >>>>>> ??? >?? @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & >>>>>> ??? > (vm.opt.TieredCompilation != true)) >>>>>> ??? > >>>>>> ??? > This seems to require that TieredCompilation is disabled, but >>>>>> tiered is >>>>>> ??? > our normal mode of operation. ?? >>>>>> ??? > >>>>>> >>>>>> I removed the clause. I guess I wanted to target the tests towards >>>>>> the code they are supposed to >>>>>> test, and it's easier to analyze failures w/o tiered compilation and >>>>>> with just one compiler thread. >>>>>> >>>>>> Additionally I will make use of >>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests. >>>>>> >>>>>> Thanks, >>>>>> Richard. >>>>>> >>>>>> -----Original Message----- >>>>>> From: David Holmes >>>>>> Sent: Mittwoch, 11. 
Dezember 2019 08:03 >>>>>> To: Reingruber, Richard ; >>>>>> serviceability-dev at openjdk.java.net; >>>>>> hotspot-compiler-dev at openjdk.java.net; >>>>>> hotspot-runtime-dev at openjdk.java.net >>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>>>>> Performance in the Presence of JVMTI Agents >>>>>> >>>>>> Hi Richard, >>>>>> >>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote: >>>>>>> Hi, >>>>>>> >>>>>>> I would like to get reviews please for >>>>>>> >>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/ >>>>>>> >>>>>>> Corresponding RFE: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745 >>>>>>> >>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915 >>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1] >>>>>>> >>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without >>>>>>> issues (thanks!). In addition the >>>>>>> change is being tested at SAP since I posted the first RFR some >>>>>>> months ago. >>>>>>> >>>>>>> The intention of this enhancement is to benefit performance wise >>>>>>> from escape analysis even if JVMTI >>>>>>> agents request capabilities that allow them to access local variable >>>>>>> values. E.g. if you start-up >>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then >>>>>>> escape analysis is disabled right >>>>>>> from the beginning, well before a debugger attaches -- if ever one >>>>>>> should do so. With the >>>>>>> enhancement, escape analysis will remain enabled until and after a >>>>>>> debugger attaches. EA based >>>>>>> optimizations are reverted just before an agent acquires the >>>>>>> reference to an object. In the JBS item >>>>>>> you'll find more details. >>>>>> >>>>>> Most of the details here are in areas I can comment on in detail, >>>>>> but I >>>>>> did take an initial general look at things. 
>>>>>> >>>>>> The only thing that jumped out at me is that I think the >>>>>> DeoptimizeObjectsALotThread should be a hidden thread. >>>>>> >>>>>> +? bool is_hidden_from_external_view() const { return true; } >>>>>> >>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. >>>>>> Without >>>>>> active testing this will just bit-rot. >>>>>> >>>>>> Also on the tests I don't understand your @requires clause: >>>>>> >>>>>> ??? @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & >>>>>> (vm.opt.TieredCompilation != true)) >>>>>> >>>>>> This seems to require that TieredCompilation is disabled, but >>>>>> tiered is >>>>>> our normal mode of operation. ?? >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> Thanks, >>>>>>> Richard. >>>>>>> >>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745 >>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch >>>>>>> >>>>>>> >>>>>>> From chris.plummer at oracle.com Fri Dec 13 17:39:20 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 13 Dec 2019 09:39:20 -0800 Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled In-Reply-To: References: <0c5f8a55-e7e6-ba87-58a8-e6a7edd55f01@oracle.com> Message-ID: Looks good. thanks, Chris On 12/13/19 5:02 AM, Fairoz Matte wrote: > Hi Chris, > > Thanks for the review, > > Please find the webrev.01 with usage of INAVLID_LOAD_ADDRESS macro for -1L. > I have also added one more macro for ZERO_LOAD_ADDRESS for 0x0L. > http://cr.openjdk.java.net/~fmatte/8235637/webrev.01/ > > Thanks, > Fairoz > > -----Original Message----- > From: Chris Plummer > Sent: Thursday, December 12, 2019 9:48 PM > To: Fairoz Matte ; serviceability-dev at openjdk.java.net > Subject: Re: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled > > Can you use a macro for -1L? Maybe INAVLID_LOAD_ADDRESS or LOAD_ADDRESS_ERROR. 
> > Chris > > On 12/11/19 7:10 PM, Fairoz Matte wrote: >> Hi, >> >> Please review this small change, >> Updating error handling, to make sure "lib_base_diff = 0" is still a valid scenario even after calc_prelinked_load_address() call. >> >> JBS - https://bugs.openjdk.java.net/browse/JDK-8235637 >> Webrev - http://cr.openjdk.java.net/~fmatte/8235637/webrev.00/ >> >> This patch is provided by Yasumasa Suenaga >> >> Thanks, >> Fairoz From chris.plummer at oracle.com Fri Dec 13 18:44:57 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 13 Dec 2019 10:44:57 -0800 Subject: Discuss the design of parallel and incremental jmap histo. In-Reply-To: <72276240-61D3-4101-A435-717F16CC6FC3@tencent.com> References: <72276240-61D3-4101-A435-717F16CC6FC3@tencent.com> Message-ID: An HTML attachment was scrubbed... URL: From richard.reingruber at sap.com Fri Dec 13 19:01:57 2019 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Fri, 13 Dec 2019 19:01:57 +0000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com> Message-ID: Hi David, > Some further queries/concerns: > > src/hotspot/share/runtime/objectMonitor.cpp > > Can you please explain the changes to ObjectMonitor::wait: > > ! _recursions = save // restore the old recursion count > ! + jt->get_and_reset_relock_count_after_wait(); // > increased by the deferred relock count > > what is the "deferred relock count"? I gather it relates to > > "The code was extended to be able to deoptimize objects of a frame that > is not the top frame and to let another thread than the owning thread do > it." Yes, these relate. Currently EA based optimizations are reverted, when a compiled frame is replaced with corresponding interpreter frames. Part of this is relocking objects with eliminated locking. 
New with the enhancement is that we also do this just before object references are acquired through JVMTI. In this case we also deoptimize the owning compiled frame C and we register the deoptimized objects as deferred updates. When control returns to C it gets deoptimized, we notice that the objects are already deoptimized (reallocated and relocked), so we don't do it again (relocking twice would be incorrect of course). Deferred updates are copied into the new interpreter frames.

Problem: relocking is not possible if the target thread T is waiting on the monitor that needs to be relocked. This happens only with non-local objects with EliminateNestedLocks. Instead relocking is deferred until T owns the monitor again. This is what the piece of code above does.

> which I don't like the sound of at all when it comes to ObjectMonitor
> state. So I'd like to understand in detail exactly what is going on here
> and why. This is a very intrusive change that seems to badly break
> encapsulation and impacts future changes to ObjectMonitor that are under
> investigation.

I would not regard this as breaking encapsulation. Certainly not badly.

I've added a property relock_count_after_wait to JavaThread. The property is well encapsulated. Future ObjectMonitor implementations have to deal with recursion too. They are free to choose a way to do that as long as that property is taken into account. This is hardly a limitation.

Note also that the property is a straightforward extension of the existing concept of deferred local updates. It is embedded into the structure holding them. So not even the footprint of a JavaThread is enlarged if no deferred updates are generated.

> ---
>
> src/hotspot/share/runtime/thread.cpp
>
> Can you please explain why JavaThread::wait_for_object_deoptimization
> has to be handcrafted in this way rather than using proper transitions.
>

I wrote wait_for_object_deoptimization taking JavaThread::java_suspend_self_with_safepoint_check as template.
So in short: for the same reasons :)

Threads reach both methods as part of thread state transitions, therefore special handling is required to change thread state on top of ongoing transitions.

> We got rid of "deopt suspend" some time ago and it is disturbing to see
> it being added back (effectively). This seems like it may be something
> that handshakes could be used for.

Deopt suspend used to be something rather different with a similar name[1]. It is not being added back.

I'm actually duplicating the existing external suspend mechanism, because a thread can be suspended at most once. And hey, I don't like that either! But it seems not unlikely that the duplicate can be removed together with the original, and that the new type of handshakes that will be used for thread suspend can be used for object deoptimization too. See today's discussion in JDK-8227745 [2].

Thanks, Richard.

[1] Deopt suspend was something like an async. handshake for architectures with register windows, where patching the return pc for deoptimization of a compiled frame was racy if the owner thread was in native code. Instead a "deopt" suspend flag was set on which the thread patched its own frame upon return from native. So no thread was suspended. It got its name only from the name of the flags.

[2] Discussion about using handshakes to sync. with the target thread: https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727

-----Original Message-----
From: David Holmes
Sent: Freitag, 13.
December 2019 00:56
To: Reingruber, Richard; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard,

Some further queries/concerns:

src/hotspot/share/runtime/objectMonitor.cpp

Can you please explain the changes to ObjectMonitor::wait:

!   _recursions = save      // restore the old recursion count
!                 + jt->get_and_reset_relock_count_after_wait(); // increased by the deferred relock count

what is the "deferred relock count"? I gather it relates to

"The code was extended to be able to deoptimize objects of a frame that
is not the top frame and to let another thread than the owning thread do
it."

which I don't like the sound of at all when it comes to ObjectMonitor
state. So I'd like to understand in detail exactly what is going on here
and why. This is a very intrusive change that seems to badly break
encapsulation and impacts future changes to ObjectMonitor that are under
investigation.

---

src/hotspot/share/runtime/thread.cpp

Can you please explain why JavaThread::wait_for_object_deoptimization
has to be handcrafted in this way rather than using proper transitions.

We got rid of "deopt suspend" some time ago and it is disturbing to see
it being added back (effectively). This seems like it may be something
that handshakes could be used for.

Thanks,
David
-----

On 12/12/2019 7:02 am, David Holmes wrote:
> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
>> Hi David,
>>
>>    > Most of the details here are in areas I can comment on in detail, but I
>>    > did take an initial general look at things.
>>
>> Thanks for taking the time!
>
> Apologies the above should read:
>
> "Most of the details here are in areas I *can't* comment on in detail ..."
>
> David
>
>>    > The only thing that jumped out at me is that I think the
>>    > DeoptimizeObjectsALotThread should be a hidden thread.
>>
>>    > +  bool is_hidden_from_external_view() const { return true; }
>>
>> Yes, it should. Will add the method like above.
>>
>>    > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>    > active testing this will just bit-rot.
>>
>> DeoptimizeObjectsALot is meant for stress testing with a larger
>> workload. I will add a minimal test to keep it fresh.
>>
>>    > Also on the tests I don't understand your @requires clause:
>>    >
>>    >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>    > (vm.opt.TieredCompilation != true))
>>    >
>>    > This seems to require that TieredCompilation is disabled, but tiered is
>>    > our normal mode of operation. ??
>>    >
>>
>> I removed the clause. I guess I wanted to target the tests towards the
>> code they are supposed to test, and it's easier to analyze failures w/o
>> tiered compilation and with just one compiler thread.
>>
>> Additionally I will make use of
>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
>>
>> Thanks,
>> Richard.
>>
>> -----Original Message-----
>> From: David Holmes
>> Sent: Wednesday, 11 December 2019 08:03
>> To: Reingruber, Richard; serviceability-dev at openjdk.java.net;
>> hotspot-compiler-dev at openjdk.java.net;
>> hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>> Performance in the Presence of JVMTI Agents
>>
>> Hi Richard,
>>
>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
>>> Hi,
>>>
>>> I would like to get reviews please for
>>>
>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>>
>>> Corresponding RFE:
>>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>>
>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>>
>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
>>> issues (thanks!).
>>> In addition the change is being tested at SAP since I posted the
>>> first RFR some months ago.
>>>
>>> The intention of this enhancement is to benefit performance wise from
>>> escape analysis even if JVMTI agents request capabilities that allow
>>> them to access local variable values. E.g. if you start-up with
>>> -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape
>>> analysis is disabled right from the beginning, well before a debugger
>>> attaches -- if ever one should do so. With the enhancement, escape
>>> analysis will remain enabled until and after a debugger attaches. EA
>>> based optimizations are reverted just before an agent acquires the
>>> reference to an object. In the JBS item you'll find more details.
>>
>> Most of the details here are in areas I can comment on in detail, but I
>> did take an initial general look at things.
>>
>> The only thing that jumped out at me is that I think the
>> DeoptimizeObjectsALotThread should be a hidden thread.
>>
>> +  bool is_hidden_from_external_view() const { return true; }
>>
>> Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>> active testing this will just bit-rot.
>>
>> Also on the tests I don't understand your @requires clause:
>>
>>    @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>> (vm.opt.TieredCompilation != true))
>>
>> This seems to require that TieredCompilation is disabled, but tiered is
>> our normal mode of operation. ??
>>
>> Thanks,
>> David
>>
>>> Thanks,
>>> Richard.
>>> >>> [1] Experimental fix for JDK-8214584 based on JDK-8227745 >>> >>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch >>> >>> From harold.seigel at oracle.com Fri Dec 13 19:35:17 2019 From: harold.seigel at oracle.com (Harold Seigel) Date: Fri, 13 Dec 2019 14:35:17 -0500 Subject: RFR(T) 8235922: [TESTBUG]TestRecordAttrGenericSig.java and TestRecordAttr.java are failing Message-ID: <18eb6f8c-125f-1ec0-e618-8416aea35d9b@oracle.com> Hi, Please review this trivial fix to prevent java/lang/instrument/... TestRecordAttr.java and TestRecordAttrGenericSig.java from failing.? The fix replaces hard-wired JDK version 14 with mechanisms that get the latest JDK version. Open Webrev: http://cr.openjdk.java.net/~hseigel/bug_8235922/webrev/index.html JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235922 The fix was tested by running the tests locally on Linux-x64. Thanks, Harold From daniel.daugherty at oracle.com Fri Dec 13 19:45:13 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Fri, 13 Dec 2019 14:45:13 -0500 Subject: RFR(T) 8235922: [TESTBUG]TestRecordAttrGenericSig.java and TestRecordAttr.java are failing In-Reply-To: <18eb6f8c-125f-1ec0-e618-8416aea35d9b@oracle.com> References: <18eb6f8c-125f-1ec0-e618-8416aea35d9b@oracle.com> Message-ID: On 12/13/19 2:35 PM, Harold Seigel wrote: > Hi, > > Please review this trivial fix to prevent java/lang/instrument/... > TestRecordAttr.java and TestRecordAttrGenericSig.java from failing.? > The fix replaces hard-wired JDK version 14 with mechanisms that get > the latest JDK version. > > Open Webrev: > http://cr.openjdk.java.net/~hseigel/bug_8235922/webrev/index.html test/jdk/java/lang/instrument/RedefineRecordAttr/TestRecordAttr.java ??? No comments. test/jdk/java/lang/instrument/RedefineRecordAttrGenericSig/TestRecordAttrGenericSig.java ??? No comments. Thumbs up! I agree that this is a trivial fix. Thanks for fixing this so quickly! 
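
As a side note on replacing a hard-wired release number: one way a test can
track the running JDK is Runtime.version().feature() (available since JDK 10).
The sketch below only illustrates that idea and is an assumption about what
such a mechanism could look like, not a copy of the webrev; the class
LatestReleaseSketch and its method names are made up for this example.

```java
/** Hypothetical helper, for illustration only; not taken from the webrev. */
final class LatestReleaseSketch {
    /** Feature release of the running JDK, for example 14 on JDK 14. */
    static int featureRelease() {
        return Runtime.version().feature();
    }

    /** javac options that follow the running JDK instead of a hard-wired "14". */
    static String[] previewCompileOptions() {
        return new String[] {
            "--enable-preview", "--release", Integer.toString(featureRelease())
        };
    }
}
```

With this shape the options keep working after the JDK version is bumped,
which is the property the fix is after.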
Dan > > JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235922 > > The fix was tested by running the tests locally on Linux-x64. > > Thanks, Harold > From harold.seigel at oracle.com Fri Dec 13 19:46:50 2019 From: harold.seigel at oracle.com (Harold Seigel) Date: Fri, 13 Dec 2019 14:46:50 -0500 Subject: RFR(T) 8235922: [TESTBUG]TestRecordAttrGenericSig.java and TestRecordAttr.java are failing In-Reply-To: References: <18eb6f8c-125f-1ec0-e618-8416aea35d9b@oracle.com> Message-ID: <1d159e3f-ca09-bdfa-be63-7b33997e56e6@oracle.com> Thanks Dan! Harold On 12/13/2019 2:45 PM, Daniel D. Daugherty wrote: > On 12/13/19 2:35 PM, Harold Seigel wrote: >> Hi, >> >> Please review this trivial fix to prevent java/lang/instrument/... >> TestRecordAttr.java and TestRecordAttrGenericSig.java from failing.? >> The fix replaces hard-wired JDK version 14 with mechanisms that get >> the latest JDK version. >> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8235922/webrev/index.html > > test/jdk/java/lang/instrument/RedefineRecordAttr/TestRecordAttr.java > ??? No comments. > > test/jdk/java/lang/instrument/RedefineRecordAttrGenericSig/TestRecordAttrGenericSig.java > > ??? No comments. > > Thumbs up! I agree that this is a trivial fix. > > Thanks for fixing this so quickly! > > Dan > >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235922 >> >> The fix was tested by running the tests locally on Linux-x64. >> >> Thanks, Harold >> > From serguei.spitsyn at oracle.com Fri Dec 13 21:50:45 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 13 Dec 2019 13:50:45 -0800 Subject: RFR(T) 8235922: [TESTBUG]TestRecordAttrGenericSig.java and TestRecordAttr.java are failing In-Reply-To: References: <18eb6f8c-125f-1ec0-e618-8416aea35d9b@oracle.com> Message-ID: Hi Harold, +1 Thanks, Serguei On 12/13/19 11:45 AM, Daniel D. 
Daugherty wrote: > On 12/13/19 2:35 PM, Harold Seigel wrote: >> Hi, >> >> Please review this trivial fix to prevent java/lang/instrument/... >> TestRecordAttr.java and TestRecordAttrGenericSig.java from failing.? >> The fix replaces hard-wired JDK version 14 with mechanisms that get >> the latest JDK version. >> >> Open Webrev: >> http://cr.openjdk.java.net/~hseigel/bug_8235922/webrev/index.html > > test/jdk/java/lang/instrument/RedefineRecordAttr/TestRecordAttr.java > ??? No comments. > > test/jdk/java/lang/instrument/RedefineRecordAttrGenericSig/TestRecordAttrGenericSig.java > > ??? No comments. > > Thumbs up! I agree that this is a trivial fix. > > Thanks for fixing this so quickly! > > Dan > >> >> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235922 >> >> The fix was tested by running the tests locally on Linux-x64. >> >> Thanks, Harold >> > From serguei.spitsyn at oracle.com Fri Dec 13 22:01:56 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 13 Dec 2019 14:01:56 -0800 Subject: RFR [XS]: 8234968: check calloc rv in libinstrument InvocationAdapter In-Reply-To: References: Message-ID: <97f4c96f-73bb-b5fd-1b2f-91ef18012269@oracle.com> Hi Matthias, +1 Thanks, Serguei On 12/12/19 2:00 AM, Langer, Christoph wrote: > > Hi Matthias, > > I think your current patch is good as it is ? at least it wouldn?t > make things worse, AFAICS. > > Further improvements can probably be done under another issue. > > Cheers > > Christoph > > *From:* serviceability-dev > *On Behalf Of *Baesken, > Matthias > *Sent:* Freitag, 29. November 2019 08:18 > *To:* Thomas St?fe > *Cc:* serviceability-dev at openjdk.java.net > *Subject:* [CAUTION] RE: RFR [XS]: 8234968: check calloc rv in > libinstrument InvocationAdapter > > Hi Thomas, Christoph, thanks for the comments .? Of course the init > of? * decodedLen ?must be added . > > In? case of ?returning? NULL? from ?decodePath ??,?? we would have? > tmp == NULL? (in char* tmp = func;? )?? ??, assign? 
> tmp to res and then we "jplis_assert", see:
>
> #define TRANSFORM(res,func) {    \
>     char* tmp = func;            \
>     if (tmp != res) {            \
>         free(res);               \
>         res = tmp;               \
>     }                            \
>     jplis_assert((void*)res != (void*)NULL);     \
> }
>
> ...
>
> TRANSFORM(path, decodePath(path,&len));
>
> New webrev:
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8234968.2/
>
> Best regards, Matthias
>
> *From:* Thomas Stüfe
> *Sent:* Friday, 29 November 2019 07:30
> *To:* Baesken, Matthias
> *Cc:* serviceability-dev at openjdk.java.net
> *Subject:* Re: RFR [XS]: 8234968: check calloc rv in libinstrument
> InvocationAdapter
>
> Hi Matthias,
>
> I am not certain the callers are prepared to handle NULL.
>
> This is used in a chain of TRANSFORM macro calls which AFAICS do not
> handle NULL; e.g., at 872, we pass the returned pointer to
> convertUft8ToPlatformString which passes it on (on Windows) to
> MultiByteToWideChar, which does not handle NULL input.
>
> So I wonder whether a clear error message with an exit would be better
> in this case. Otherwise we may get a crash just some instructions later.
>
> Cheers, Thomas
>
> On Thu, Nov 28, 2019 at 5:21 PM Baesken, Matthias wrote:
>
> Hello, please review this small patch.
>
> It adds return value checking for calloc at one place where it is
> missing.
>
> Thanks, Matthias
>
> Bug/webrev:
>
> https://bugs.openjdk.java.net/browse/JDK-8234968
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8234968.1/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From serguei.spitsyn at oracle.com Sat Dec 14 01:02:13 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 13 Dec 2019 17:02:13 -0800
Subject: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
Message-ID: <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>

Hi Yasumasa,

This is a nice move in general.
Thank you for working on this!

http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html

  96     long libptr = dbg.findLibPtrByAddress(pc);
  97     if (libptr == 0L) { // Java frame
  98       Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
  99       if (rbp == null) {
 100         return null;
 101       }
 102       return new LinuxAMD64CFrame(dbg, rbp, pc, null);
 103     } else { // Native frame
 104       DwarfParser dwarf;
 105       try {
 106         dwarf = new DwarfParser(libptr);
 107       } catch (DebuggerException e) {
 108         Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
 109         if (rbp == null) {
 110           return null;
 111         }
 112         return new LinuxAMD64CFrame(dbg, rbp, pc, null);
 113       }
 114       dwarf.processDwarf(pc);
 115       Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
 116                      !dwarf.isBPOffsetAvailable())
 117                         ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
 118                         : context.getRegisterAsAddress(dwarf.getCFARegister())
 119                                  .addOffsetTo(dwarf.getCFAOffset());
 120       if (cfa == null) {
 121         return null;
 122       }
 123       return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
 124     }

I'd suggest simplifying the logic by refactoring to something like below:

          long libptr = dbg.findLibPtrByAddress(pc);
          Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
          DwarfParser dwarf = null;

          if (libptr != 0L) { // Native frame
            try {
              dwarf = new DwarfParser(libptr);
              dwarf.processDwarf(pc);
              // Assign to the cfa declared above; redeclaring it here would not compile.
              cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
                     !dwarf.isBPOffsetAvailable())
                        ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
                        : context.getRegisterAsAddress(dwarf.getCFARegister())
                                 .addOffsetTo(dwarf.getCFAOffset());
            } catch (DebuggerException e) { // bail out to Java frame case
            }
          }
          if (cfa == null) {
            return null;
          }
          return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);

http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html

  58     long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()

  Better to rename 'ofs' => 'offs'.

  77     nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());

  Extra space after '-' sign.

  71   private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {

  It feels like the logic has to be somehow refactored/simplified, as
  several typical fragments appear in slightly different contexts.
  But it is not easy to understand what it is.
  Could you please add some comments to key places explaining this logic?
  Then I'll check if it is possible to make it a little bit simpler.

 109   private CFrame javaSender(ThreadContext context) {
 110     Address nextCFA;
 111     Address nextPC;
 112
 113     nextPC = getNextPC(false);
 114     if (nextPC == null) {
 115       return null;
 116     }
 117
 118     DwarfParser nextDwarf = null;
 119     long libptr = dbg.findLibPtrByAddress(nextPC);
 120     if (libptr != 0L) { // Native frame
 121       try {
 122         nextDwarf = new DwarfParser(libptr);
 123       } catch (DebuggerException e) {
 124         nextCFA = getNextCFA(null, context);
 125         return (nextCFA == null) ?
                   null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
 126       }
 127       nextDwarf.processDwarf(nextPC);
 128     }
 129
 130     nextCFA = getNextCFA(nextDwarf, context);
 131     return (nextCFA == null) ? null
 132                              : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
 133   }

The above can be simplified if a DebuggerException cannot be thrown from processDwarf(nextPC):

     private CFrame javaSender(ThreadContext context) {
       Address nextPC = getNextPC(false);
       if (nextPC == null) {
         return null;
       }
       long libptr = dbg.findLibPtrByAddress(nextPC);
       DwarfParser nextDwarf = null;

       if (libptr != 0L) { // Native frame
         try {
           nextDwarf = new DwarfParser(libptr);
           nextDwarf.processDwarf(nextPC);
         } catch (DebuggerException e) { // Bail out to Java frame
         }
       }
       Address nextCFA = getNextCFA(nextDwarf, context);
       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
     }

 135   public CFrame sender(ThreadProxy thread) {
 136     ThreadContext context = thread.getContext();
 137
 138     if (dwarf == null) { // Java frame
 139       return javaSender(context);
 140     }
 141
 142     Address nextPC = getNextPC(true);
 143     if (nextPC == null) {
 144       return null;
 145     }
 146
 147     Address nextCFA;
 148     DwarfParser nextDwarf = dwarf;
 149     if (!dwarf.isIn(nextPC)) {
 150       long libptr = dbg.findLibPtrByAddress(nextPC);
 151       if (libptr == 0L) {
 152         // Next frame might be Java frame
 153         nextCFA = getNextCFA(null, context);
 154         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
 155       }
 156       try {
 157         nextDwarf = new DwarfParser(libptr);
 158       } catch (DebuggerException e) {
 159         nextCFA = getNextCFA(null, context);
 160         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
 161       }
 162     }
 163
 164     nextDwarf.processDwarf(nextPC);
 165     nextCFA = getNextCFA(nextDwarf, context);
 166     return (nextCFA == null) ?
                   null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
 167   }

This one can also be simplified a little:

     public CFrame sender(ThreadProxy thread) {
       ThreadContext context = thread.getContext();

       if (dwarf == null) { // Java frame
         return javaSender(context);
       }
       Address nextPC = getNextPC(true);
       if (nextPC == null) {
         return null;
       }
       DwarfParser nextDwarf = null;
       if (!dwarf.isIn(nextPC)) {
         long libptr = dbg.findLibPtrByAddress(nextPC);
         if (libptr != 0L) {
           try {
             nextDwarf = new DwarfParser(libptr);
             nextDwarf.processDwarf(nextPC);
           } catch (DebuggerException e) { // Bail out to Java frame
           }
         }
       }
       Address nextCFA = getNextCFA(nextDwarf, context);
       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
     }

Finally, it looks like just one method could replace both
sender(ThreadProxy thread) and javaSender(ThreadContext context):

     private CFrame commonSender(ThreadProxy thread) {
       ThreadContext context = thread.getContext();
       Address nextPC = getNextPC(false);
       if (nextPC == null) {
         return null;
       }
       DwarfParser nextDwarf = null;

       if (dwarf == null || !dwarf.isIn(nextPC)) {
         long libptr = dbg.findLibPtrByAddress(nextPC);
         if (libptr != 0L) {
           try {
             nextDwarf = new DwarfParser(libptr);
             nextDwarf.processDwarf(nextPC);
           } catch (DebuggerException e) { // Bail out to Java frame
           }
         }
       }
       Address nextCFA = getNextCFA(nextDwarf, context);
       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
     }

I'm still reviewing the dwarf parser files.
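
For what it's worth, the "one sender method" idea above can be tried out in
isolation. The following is only a minimal sketch under assumptions:
UnwindInfo, UnwindLookup and FrameSketch are hypothetical stand-ins invented
for this example (they are not the SA DwarfParser or LinuxAMD64CFrame API),
addresses are plain longs, and a RuntimeException plays the role of
DebuggerException. Unlike the snippets above, the sketch keeps using the
current unwind data when it still covers the next PC.

```java
/** Hypothetical stand-in for per-library unwind data; not the SA DwarfParser API. */
interface UnwindInfo {
    /** True if pc lies inside the library this data belongs to. */
    boolean covers(long pc);

    /** Canonical frame address for pc; throws a RuntimeException if the data is unusable. */
    long cfaFor(long pc);
}

/** Hypothetical lookup; returns null when pc is not inside any native library. */
interface UnwindLookup {
    UnwindInfo find(long pc);
}

/** One frame of the walk; info == null means the frame-pointer fallback was used. */
final class FrameSketch {
    final long cfa;
    final long pc;
    final UnwindInfo info;

    FrameSketch(long cfa, long pc, UnwindInfo info) {
        this.cfa = cfa;
        this.pc = pc;
        this.info = info;
    }

    /** Single sender path: prefer unwind data, otherwise fall back to the frame pointer. */
    static FrameSketch sender(UnwindInfo current, UnwindLookup lookup,
                              long nextPc, long framePointer) {
        // Reuse the current library's data if it still covers the next PC.
        UnwindInfo next = (current != null && current.covers(nextPc))
                ? current
                : lookup.find(nextPc);
        if (next != null) {
            try {
                return new FrameSketch(next.cfaFor(nextPc), nextPc, next);
            } catch (RuntimeException e) {
                // Unwind data unusable: fall through to the frame-pointer case.
            }
        }
        return new FrameSketch(framePointer, nextPc, null);
    }
}
```

The point of the shape is that both the "no library found" case and the
"parser failed" case end in the same fallback line, so the duplicated
return statements disappear.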
Thanks, Serguei On 11/28/19 4:39 AM, Yasumasa Suenaga wrote: > Hi, > > I refactored LinuxAMD64CFrame.java . It works fine in > serviceability/sa tests and > all tests on submit repo > (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923). > Could you review new webrev? > > ? http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/ > > The diff from previous webrev is here: > ? http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b > > > Thanks, > > Yasumasa > > > On 2019/11/25 14:08, Yasumasa Suenaga wrote: >> Hi all, >> >> Please review this change: >> >> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/ >> >> >> According to 2.7 Stack Unwind Algorithm in System V Application >> Binary Interface AMD64 >> Architecture Processor Supplement [1], we need to use DWARF in >> .eh_frame or .debug_frame >> for stack unwinding. >> >> As JDK-8022183 said, omit-frame-pointer is enabled by default since >> GCC 4.6, so system >> library (e.g. libc) might be compiled with this feature. >> >> However `jhsdb jstack --mixed` does not do so, it uses base pointer >> register (RBP). >> So it might be lack of stack frames. >> >> I guess JDK-8219201 is caused by same issue. >> >> >> Thanks, >> >> Yasumasa >> >> >> [1] >> https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf -------------- next part -------------- An HTML attachment was scrubbed... URL: From suenaga at oss.nttdata.com Sun Dec 15 01:51:36 2019 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Sun, 15 Dec 2019 10:51:36 +0900 Subject: RFR: 8234624: jstack mixed mode should refer DWARF In-Reply-To: <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com> References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com> <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com> Message-ID: Hi Serguei, Thanks for your comment! I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev. 
Also I fixed to free lib->eh_frame.data in libproc_impl.c as Dmitry said. http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/ This change has been passed all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487). Thanks, Yasumasa On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote: > Hi Yasumasa, > > This is nice move in general. > Thank you for working on this! > > http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html > > 96 long libptr = dbg.findLibPtrByAddress(pc); 97 if (libptr == 0L) { // Java frame 98 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 99 if (rbp == null) { 100 return null; 101 } 102 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 103 } else { // Native frame 104 DwarfParser dwarf; 105 try { 106 dwarf = new DwarfParser(libptr); 107 } catch (DebuggerException e) { 108 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 109 if (rbp == null) { 110 return null; 111 } 112 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 113 } 114 dwarf.processDwarf(pc); 115 Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && 116 !dwarf.isBPOffsetAvailable()) 117 ? context.getRegisterAsAddress(AMD64ThreadContext.RBP) 118 : context.getRegisterAsAddress(dwarf.getCFARegister()) 119 .addOffsetTo(dwarf.getCFAOffset()); 120 if (cfa == null) { 121 return null; 122 } 123 return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); 124 } > > > I'd suggest to simplify the logic by refactoring to something like below: > > ????????? long libptr = dbg.findLibPtrByAddress(pc); > ????????? Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame > ????????? DwarfParser dwarf = null; > > ????????? if (libptr != 0L) { // Native frame > ??????????? try { > ????????????? dwarf = new DwarfParser(libptr); > ????????????? dwarf.processDwarf(pc); > ????????????? 
Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && > ???????????????????????????? !dwarf.isBPOffsetAvailable()) > ??????????????????????????????? ? context.getRegisterAsAddress(AMD64ThreadContext.RBP) > ??????????????????????????????? : context.getRegisterAsAddress(dwarf.getCFARegister()) > .addOffsetTo(dwarf.getCFAOffset()); > > ?????????? } catch (DebuggerException e) { // bail out to Java frame case > ?????????? } > ???????? } > ???????? if (cfa == null) { > ?????????? return null; > ???????? } > ???????? return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); > > http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html > > 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA() > > ? Better to rename 'ofs' => 'offs'. > > 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA()); > > ? Extra space after '-' sign. > > 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) { > > ? It feels like the logic has to be somehow refactored/simplified as > ? several typical fragments appears in slightly different contexts. > ? But it is not easy to understand what it is. > ? Could you, please, add some comments to key places explaining this logic. > ? Then I'll check if it is possible to make it a little bit simpler. > > 109 private CFrame javaSender(ThreadContext context) { 110 Address nextCFA; 111 Address nextPC; 112 113 nextPC = getNextPC(false); 114 if (nextPC == null) { 115 return null; 116 } 117 118 DwarfParser nextDwarf = null; 119 long libptr = dbg.findLibPtrByAddress(nextPC); 120 if (libptr != 0L) { // Native frame 121 try { 122 nextDwarf = new DwarfParser(libptr); 123 } catch (DebuggerException e) { 124 nextCFA = getNextCFA(null, context); 125 return (nextCFA == null) ? 
null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 126 } 127 nextDwarf.processDwarf(nextPC); 128 } 129 130 nextCFA = getNextCFA(nextDwarf, context); 131 return (nextCFA == null) ? null 132 : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 133 } > > ?The above can be simplified if a DebuggerException can not be thrown from processDwarf(nextPC): > ???? private CFrame javaSender(ThreadContext context) { > ?????? Address nextPC = getNextPC(false); > ?????? if (nextPC == null) { > ???????? return null; > ?????? } > ?????? long libptr = dbg.findLibPtrByAddress(nextPC); > ?????? DwarfParser nextDwarf = null; > > ?????? if (libptr != 0L) { // Native frame > ???????? try { > ?????????? nextDwarf = new DwarfParser(libptr); > ?????????? nextDwarf.processDwarf(nextPC); > ???????? } catch (DebuggerException e) { // Bail out to Java frame > ???????? } > ?????? } > ?????? Address nextCFA = getNextCFA(nextDwarf, context); > ?????? return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); > ???? } > > 135 public CFrame sender(ThreadProxy thread) { 136 ThreadContext context = thread.getContext(); 137 138 if (dwarf == null) { // Java frame 139 return javaSender(context); 140 } 141 142 Address nextPC = getNextPC(true); 143 if (nextPC == null) { 144 return null; 145 } 146 147 Address nextCFA; 148 DwarfParser nextDwarf = dwarf; 149 if (!dwarf.isIn(nextPC)) { 150 long libptr = dbg.findLibPtrByAddress(nextPC); 151 if (libptr == 0L) { 152 // Next frame might be Java frame 153 nextCFA = getNextCFA(null, context); 154 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 155 } 156 try { 157 nextDwarf = new DwarfParser(libptr); 158 } catch (DebuggerException e) { 159 nextCFA = getNextCFA(null, context); 160 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 161 } 162 } 163 164 nextDwarf.processDwarf(nextPC); 165 nextCFA = getNextCFA(nextDwarf, context); 166 return (nextCFA == null) ? 
null : new LinuxAMD64CFrame(dbg, > nextCFA, nextPC, nextDwarf); 167 } > > ?This one can be also simplified a little: > > ???? public CFrame sender(ThreadProxy thread) { > ?????? ThreadContext context = thread.getContext(); > > ?????? if (dwarf == null) { // Java frame > ???????? return javaSender(context); > ?????? } > ?????? Address nextPC = getNextPC(true); > ?????? if (nextPC == null) { > ???????? return null; > ?????? } > ?????? DwarfParser nextDwarf = null; > ?????? if (!dwarf.isIn(nextPC)) { > ???????? long libptr = dbg.findLibPtrByAddress(nextPC); > ???????? if (libptr != 0L) { > ?????????? try { > ???????????? nextDwarf = new DwarfParser(libptr); > ???????????? nextDwarf.processDwarf(nextPC); > ?????????? } catch (DebuggerException e) { // Bail out to Java frame > ?????????? } > ???????? } > ?????? } > ?????? Address nextCFA = getNextCFA(nextDwarf, context); > ?????? return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); > ???? } > > Finally, it looks like just one method could replace both > sender(ThreadProxy thread) and javaSender(ThreadContext context): > > ???? private CFrame commonSender(ThreadProxy thread) { > ?????? ThreadContext context = thread.getContext(); > ?????? Address nextPC = getNextPC(false); > ?????? if (nextPC == null) { > ???????? return null; > ?????? } > ?????? DwarfParser nextDwarf = null; > > ?????? long libptr = dbg.findLibPtrByAddress(nextPC); > ?????? if (dwarf == null || !dwarf.isIn(nextPC)) { > ???????? long libptr = dbg.findLibPtrByAddress(nextPC); > ???????? if (libptr != 0L) { > ?????????? try { > ???????????? nextDwarf = new DwarfParser(libptr); > ???????????? nextDwarf.processDwarf(nextPC); > ?????????? } catch (DebuggerException e) { // Bail out to Java frame > ?????????? } > ???????? } > ?????? } > ?????? Address nextCFA = getNextCFA(nextDwarf, context); > ?????? return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); > ???? 
} > > I'm still reviewing the dwarf parser files. > > Thanks, > Serguei > > > On 11/28/19 4:39 AM, Yasumasa Suenaga wrote: >> Hi, >> >> I refactored LinuxAMD64CFrame.java . It works fine in serviceability/sa tests and >> all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923). >> Could you review new webrev? >> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/ >> >> The diff from previous webrev is here: >> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2019/11/25 14:08, Yasumasa Suenaga wrote: >>> Hi all, >>> >>> Please review this change: >>> >>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/ >>> >>> >>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64 >>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame >>> for stack unwinding. >>> >>> As JDK-8022183 said, omit-frame-pointer is enabled by default since GCC 4.6, so system >>> library (e.g. libc) might be compiled with this feature. >>> >>> However `jhsdb jstack --mixed` does not do so, it uses base pointer register (RBP). >>> So it might be lack of stack frames. >>> >>> I guess JDK-8219201 is caused by same issue. >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf > From linzang at tencent.com Mon Dec 16 01:38:23 2019 From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=) Date: Mon, 16 Dec 2019 01:38:23 +0000 Subject: Discuss the design of parallel and incremental jmap histo.(Internet mail) In-Reply-To: References: <72276240-61D3-4101-A435-717F16CC6FC3@tencent.com> Message-ID: Dear Chris, >> why jmap is getting stuck or "killed by timer" as you mention below. Shouldn't this be considered a bug and addressed directly. This ?timer? 
is usually another process. In my experience with HDFS and ZKFC, the ZKFC pings its NameNode periodically, and when the NameNode's heap is large (~180GB in my case), the heap iteration by jmap can cause the process to get stuck, so the ZKFC cannot get a response from the NameNode, and the NameNode gets killed.

>> How useful are intermediate results? How often can users come to reasonable conclusions about heap usage when the data is incomplete.

From my experience, I usually use jmap -histo to get information about the distribution of objects, and I have found that the object distribution of part of the heap is usually similar to the distribution of the whole heap. I agree that this does not hold in all cases, but at present jmap -histo gives results only when it exits normally, and I think information about a partial heap is better than nothing, especially for memory leak analysis.

>> Is there even an indication given of how much of the heap is accounted for in the output?

Yes, the incremental dump information shows the number of objects and the total bytes that have been iterated.

Thanks!

BRs,
Lin

From: Chris Plummer
Date: Saturday, December 14, 2019 at 2:46 AM
To: "linzang(臧琳)" , "serviceability-dev at openjdk.java.net"
Subject: Re: Discuss the design of parallel and incremental jmap histo.(Internet mail)

Hi Lin,

I have a question regarding the need for incremental support. The CSR states:

Problem: Now, the "JMap -histo" tool can not dump intermediate result, which is useful if the heap is large and dumping the whole heap can be stuck.

Two questions. The first is why jmap is getting stuck or "killed by timer" as you mention below. Shouldn't this be considered a bug and addressed directly. Second question is how useful are intermediate results? How often can users come to reasonable conclusions about heap usage when the data is incomplete. Is there even an indication given of how much of the heap is accounted for in the output?

thanks,

Chris

On 12/12/19 10:22 PM, linzang(臧琳)
wrote:

Dear All,

I want to re-activate the discussion thread about the implementation of parallel and incremental "jmap -histo". The goal of these changes is to solve the problem that "jmap -histo" may "timeout or killed by timer" when the heap is large. Also, the result of "jmap -histo" is "one or nothing", which means that if it gets killed before it exits, the user gets no information about the heap.

"Incremental" means that jmap -histo dumps intermediate results while it is iterating the heap, so if it is interrupted, the user can still get some meaningful information. "Parallel" aims to speed up the heap iteration with multi-threading.

Originally I implemented the "incremental dump" so that it dumps the intermediate data into a separate file like , and the final result is saved to another file . So when jmap -histo gets interrupted, the user can get information from , and if jmap -histo works fine, the final result is in . The parallel dump has multiple threads working on the heap iteration, and each thread generates intermediate data timely.

The main reason for using a separate file for the incremental dump is the parallel incremental dump implementation: every heap-iteration thread could dump its own data to a separate file, which avoids a file lock. However, it seems that the original design might confuse users by producing two or more result files (intermediate result and final result). So I want to ask for your help in discussing it:

1. For incremental dump without parallel, the intermediate result and the final result are dumped to the same file: In this case, the intermediate data are generated in the middle of heap iteration and written to the file at the same time. If jmap -histo exits normally, the final result is also dumped to , and then all intermediate data are flushed.

2. For parallel dump without incremental: Every thread generates its own thread-local dump buffer, and all thread-local dumps are merged and written to the file at the end. There is no incremental support, so the result is "one or nothing".

3. For parallel + incremental dump, I think it's a little complicated because of the intermediate data processing:

   * Every thread has its own thread-local intermediate data buffer, and all the thread-local buffers are written to the file while holding a file lock. So only one data file is generated, and if jmap -histo is interrupted, the intermediate data are saved in the same file. The problem is that the file write lock can be heavy, which may make the parallel heap dump slow.

   * Every thread has its own thread-local intermediate data buffer, and every thread saves its result in a temp file named . So there is no file lock, and the parallel dump can be fast. But the problem is that multiple files are generated to hold the thread-local intermediate results, which might confuse the user.

   * Every thread has its own thread-local intermediate data buffer, and an additional "data-merging-thread" is created. The parallel threads write data to their thread-local buffers and enqueue a buffer when its data reaches some threshold. The "data-merging-thread" consumes the queue, merges the data from the different threads, and saves the merged data to the result file. In this case only one file is generated, and no file lock is needed, but there is a queue lock and a separate "merging thread" implementation. Do you think this is a reasonable solution?

So may I ask for your suggestions?

Details of the previous discussion can be found at https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-June/028276.html

Thanks!

BRs,
Lin
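
For illustration only: a minimal Java sketch of the third parallel + incremental option described above (thread-local buffers handed to a single merging thread through a queue). It is not taken from the webrev or from HotSpot, where the real code would be C++. HistoMergeSketch, histogram, threshold and DONE are made-up names, class-name strings stand in for heap objects, and the in-memory merged map stands in for the result file.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: each "slice" is the part of the heap one worker iterates.
final class HistoMergeSketch {
    // Sentinel a worker enqueues once it has finished its slice.
    private static final Map<String, Long> DONE = new HashMap<>();

    static Map<String, Long> histogram(List<List<String>> slices, int threshold) {
        BlockingQueue<Map<String, Long>> queue = new LinkedBlockingQueue<>();
        Map<String, Long> merged = new HashMap<>();
        int workerCount = slices.size();

        // The single merging thread is the only writer of the merged result.
        Thread merger = new Thread(() -> {
            int finished = 0;
            try {
                while (finished < workerCount) {
                    Map<String, Long> buf = queue.take();
                    if (buf == DONE) {
                        finished++;
                    } else {
                        buf.forEach((cls, n) -> merged.merge(cls, n, Long::sum));
                        // A real tool could rewrite the result file here.
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        merger.start();

        List<Thread> workers = new ArrayList<>();
        for (List<String> slice : slices) {
            Thread worker = new Thread(() -> {
                Map<String, Long> local = new HashMap<>(); // thread-local buffer
                int pending = 0;
                for (String cls : slice) {
                    local.merge(cls, 1L, Long::sum);
                    if (++pending >= threshold) {
                        queue.add(local);       // hand the full buffer over
                        local = new HashMap<>();
                        pending = 0;
                    }
                }
                if (!local.isEmpty()) {
                    queue.add(local);           // flush the remainder
                }
                queue.add(DONE);
            });
            workers.add(worker);
            worker.start();
        }

        try {
            for (Thread w : workers) {
                w.join();
            }
            merger.join(); // after this, reading merged from this thread is safe
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while counting", e);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<String>> slices = List.of(
            List.of("java.lang.String", "byte[]", "java.lang.String"),
            List.of("byte[]", "java.util.HashMap"));
        System.out.println(histogram(slices, 2));
    }
}
```

In this sketch the only shared state is the queue, so no file lock is needed; the cost is the queue synchronization and the extra thread, which matches the trade-off noted for this option.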

From fairoz.matte at oracle.com Mon Dec 16 02:47:43 2019
From: fairoz.matte at oracle.com (Fairoz Matte)
Date: Sun, 15 Dec 2019 18:47:43 -0800 (PST)
Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled
In-Reply-To:
References: <0c5f8a55-e7e6-ba87-58a8-e6a7edd55f01@oracle.com>
Message-ID: <4376860c-173f-4954-aba2-dae39986dbf2@default>

Thanks Chris,

> -----Original Message-----
> From: Chris Plummer
> Sent: Friday, December 13, 2019 11:09 PM
> To: Fairoz Matte ; serviceability-dev at openjdk.java.net
> Subject: Re: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if
> prelink is enabled
>
> Looks good.
>
> thanks,
>
> Chris
>
> On 12/13/19 5:02 AM, Fairoz Matte wrote:
> > Hi Chris,
> >
> > Thanks for the review,
> >
> > Please find the webrev.01 with usage of INAVLID_LOAD_ADDRESS macro for -1L.
> > I have also added one more macro for ZERO_LOAD_ADDRESS for 0x0L.
> > http://cr.openjdk.java.net/~fmatte/8235637/webrev.01/
> >
> > Thanks,
> > Fairoz
> >
> > -----Original Message-----
> > From: Chris Plummer
> > Sent: Thursday, December 12, 2019 9:48 PM
> > To: Fairoz Matte ; serviceability-dev at openjdk.java.net
> > Subject: Re: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled
> >
> > Can you use a macro for -1L? Maybe INAVLID_LOAD_ADDRESS or LOAD_ADDRESS_ERROR.
> >
> > Chris
> >
> > On 12/11/19 7:10 PM, Fairoz Matte wrote:
> >> Hi,
> >>
> >> Please review this small change,
> >> Updating error handling, to make sure "lib_base_diff = 0" is still a valid
> >> scenario even after calc_prelinked_load_address() call.
> >>
> >> JBS - https://bugs.openjdk.java.net/browse/JDK-8235637
> >> Webrev - http://cr.openjdk.java.net/~fmatte/8235637/webrev.00/
> >>
> >> This patch is provided by Yasumasa Suenaga
> >>
> >> Thanks,
> >> Fairoz
>

From ioi.lam at oracle.com Mon Dec 16 06:21:24 2019
From: ioi.lam at oracle.com (Ioi Lam)
Date: Sun, 15 Dec 2019 22:21:24 -0800
Subject: RFR(S) 8235970 [TESTBUG] Remove dependency of sun.tools.jar from RedefineClassHelper
Message-ID:

https://bugs.openjdk.java.net/browse/JDK-8235970
http://cr.openjdk.java.net/~iklam/jdk15/8235970-RedefineClassHelper-no-sun-tools-jar.v01/

test/lib/RedefineClassHelper.java uses the internal sun.tools.jar.Main class directly, causing a warning from javac. As a result, all tests that use RedefineClassHelper need to have this line for the additional module dependency.

    @modules jdk.jartool/sun.tools.jar

The fix is to rewrite RedefineClassHelper to use ClassFileInstaller instead.

I removed "@modules jdk.jartool/sun.tools.jar" for all users of RedefineClassHelper, except for the following (which use sun.tools.jar.Main directly).

    test/hotspot/jtreg/serviceability/jvmti/RedefineClasses/*

-----

Testing with hs-tier1,hs-tier2,hs-tier5-svc which cover all the affected test cases.

Thanks
- Ioi

From Alan.Bateman at oracle.com Mon Dec 16 07:22:07 2019
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Mon, 16 Dec 2019 07:22:07 +0000
Subject: RFR(S) 8235970 [TESTBUG] Remove dependency of sun.tools.jar from RedefineClassHelper
In-Reply-To:
References:
Message-ID: <6e7b0a33-ab1b-6c9c-4e73-b0df4f7e2bc3@oracle.com>

On 16/12/2019 06:21, Ioi Lam wrote:
> :
>
> The fix is to rewrite RedefineClassHelper to use ClassFileInstaller
> instead.

This looks okay but just to point out that the jar tool can be obtained via ToolProvider, e.g.

    ToolProvider jarTool = ToolProvider.findFirst("jar").orElseThrow();

so RedefineClassHelper, or better still ClassFileInstaller, could use that for cases where JAR files need to be created or updated in ways that would be easier if the jar tool could be used in the test. Avoids using some of the prickly APIs in java.util.zip|jar.

-Alan

From robbin.ehn at oracle.com Mon Dec 16 09:47:33 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 16 Dec 2019 10:47:33 +0100
Subject: RFR(s): 8235912: JvmtiBreakpoint remove oops_do and metadata_do
Message-ID:

Hi all, please review.

From issue, https://bugs.openjdk.java.net/browse/JDK-8235912:

JvmtiBreakpoints are walked via VMThread oops_do (the breakpoint is in a vm operation) before they are installed in the safepoint and after they have been installed, walked with JvmtiCurrentBreakpoints::oops_do().
By putting the class holder inside oopStorage there is no need for this.

JvmtiCurrentBreakpoints::metadata_do is not needed because redefine classes actually removes the breakpoints before updating them (so there is no breakpoints to update).
We can just remove metadata_do.

I also removed some unused code.

Changeset:
http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/

Passes several runs of nsk jvmti/jdi and t1-7.

Thanks, Robbin

From robbin.ehn at oracle.com Mon Dec 16 10:20:39 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 16 Dec 2019 11:20:39 +0100
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
In-Reply-To:
References:
Message-ID: <2fbd9ac0-1cb0-98c4-c2c4-becd1cf49fef@oracle.com>

Hi Richard,

as mentioned it would be better if you could do this with handshakes, instead of using _suspend_flag (since they are going away).
But I can't think of a way doing it without blocking safepoints, so we need to add some more features in handshakes first.
When possible I hope you are willing to move this code to handshakes instead.

You could stop one thread with, e.g.:

class EscapeBarrierSuspendHandshake : public HandshakeClosure {
  Semaphore _is_waiting;
  Semaphore _wait;
  bool _started;
 public:
  EscapeBarrierSuspendHandshake() : HandshakeClosure("EscapeBarrierSuspend"),
  _wait(0), _started(false) { }
  void do_thread(Thread* th) {
    _is_waiting.signal();
    _wait.wait();
    Atomic::store(&_started, true);
  }
  void wait_until_eb_stopped() { _is_waiting.wait(); }
  void start_thread() {
    _wait.signal();
    while(!Atomic::load(&_started)) {
      os::naked_yield();
    }
  }
};

But it would block safepoints.

Thanks, Robbin

On 12/10/19 10:45 PM, Reingruber, Richard wrote:
> Hi,
>
> I would like to get reviews please for
>
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>
> Corresponding RFE:
> https://bugs.openjdk.java.net/browse/JDK-8227745
>
> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>
> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues (thanks!). In addition the
> change is being tested at SAP since I posted the first RFR some months ago.
>
> The intention of this enhancement is to benefit performance wise from escape analysis even if JVMTI
> agents request capabilities that allow them to access local variable values. E.g. if you start-up
> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape analysis is disabled right
> from the beginning, well before a debugger attaches -- if ever one should do so. With the
> enhancement, escape analysis will remain enabled until and after a debugger attaches. EA based
> optimizations are reverted just before an agent acquires the reference to an object. In the JBS item
> you'll find more details.
>
> Thanks,
> Richard.
>
> [1] Experimental fix for JDK-8214584 based on JDK-8227745
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>

From coleen.phillimore at oracle.com Mon Dec 16 11:41:05 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 16 Dec 2019 06:41:05 -0500
Subject: [14] RFR 8235829: graal crashes with Zombie.java test
Message-ID:

Summary: Start ServiceThread before compiler threads, and run nmethod barriers for zgc before adding to the service thread queue, or posting the events on the java thread queue.

See bug for description of the problems found with the new Zombie.java test.

open webrev at http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8235829

Ran tier1 all platforms, and tier2-8 testing, as well as rerunning original test failure from bug https://bugs.openjdk.java.net/browse/JDK-8173361.

Thanks,
Coleen

From ralf.schmelter at sap.com Mon Dec 16 12:27:58 2019
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Mon, 16 Dec 2019 12:27:58 +0000
Subject: RFR (M) 8234510: Remove file seeking requirement for writing a heap dump
In-Reply-To:
References:
Message-ID:

I forgot to post the updated webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8234510/webrev.2/

In addition to the changes requested by Thomas, I also renamed the entries in the heap dump segment from entries to sub-records, since that is what they are called in the comment describing the format.

Best regards,
Ralf

From coleen.phillimore at oracle.com Mon Dec 16 12:32:56 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 16 Dec 2019 07:32:56 -0500
Subject: RFR(s): 8235912: JvmtiBreakpoint remove oops_do and metadata_do
In-Reply-To:
References:
Message-ID:

I have to think about this. Could there be breakpoints in old emcp methods that we do not remove?
The metadata_do function is trying to keep old Methods from being deleted while there are still references to them.

http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/src/hotspot/share/prims/jvmtiImpl.hpp.udiff.html

+ oop* _class_holder; // keeps _method memory from being deallocated

We created the class OopHandle to encapsulate strong oopStorage references, although it's missing oop_store. Can you use that?

Coleen

On 12/16/19 4:47 AM, Robbin Ehn wrote:
> Hi all, please review.
>
> From issue, https://bugs.openjdk.java.net/browse/JDK-8235912:
>
> JvmtiBreakpoints are walked via VMThread oops_do (the breakpoint is in
> a vm operation) before they are installed in the safepoint and after
> they have been installed, walked with JvmtiCurrentBreakpoints::oops_do().
> By putting the class holder inside oopStorage there is no need for this.
>
> JvmtiCurrentBreakpoints::metadata_do is not needed because redefine
> classes actually removes the breakpoints before updating them (so
> there is no breakpoints to update).
> We can just remove metadata_do.
>
>
> I also removed some unused code.
>
> Changeset:
> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/
>
> Passes several runs of nsk jvmti/jdi and t1-7.
>
> Thanks, Robbin

From david.holmes at oracle.com Mon Dec 16 13:04:34 2019
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 16 Dec 2019 23:04:34 +1000
Subject: [14] RFR 8235829: graal crashes with Zombie.java test
In-Reply-To:
References:
Message-ID: <447990f9-d276-56c3-aebe-fa84d0cfcc27@oracle.com>

Hi Coleen,

On 16/12/2019 9:41 pm, coleen.phillimore at oracle.com wrote:
> Summary: Start ServiceThread before compiler threads, and run nmethod
> barriers for zgc before adding to the service thread queue, or posting
> the events on the java thread queue.
I can't comment on most of this but the earlier starting of the service thread has some concerns: - there is a lot of JDK level initialization which now will not have happened before the service thread is started and it is far from obvious that all possible initialization dependencies will be satisfied - current starting of the service thread in Management::initialize is guarded by "#if INCLUDE_MANAGEMENT", but now you are starting the service thread unconditionally for all builds. Hmm just saw your latest comment to the bug report - so the service thread is now (for quite some time?) being used for other than management tasks and so should always be present even if INCLUDE_MANAGEMENT is not enabled. Is that sufficient or are there likely to be other changes needed to actually ensure that all works correctly e.g. any code the service thread executes that is only defined for INCLUDE_MANAGEMENT will need to be compiled out explicitly. - the service thread and the notification thread are (were?) closely related but now started at completely different times The bug report states the problem as: "The graal crash is because compiled_method_load events are added to the ServiceThread's deferred event queue before the ServiceThread is created so are not walked to keep them from being zombied." so why isn't the solution to ensure the deferred event queue is walked? I'm not clear how starting the service thread relates to walking the queue. Thanks, David > See bug for description of the problems found with the new Zombie.java > test. > > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8235829 > > Ran tier1 all platforms, and tier2-8 testing, as well as rerunning > original test failure from bug > https://bugs.openjdk.java.net/browse/JDK-8173361. 
> > Thanks, > Coleen From robbin.ehn at oracle.com Mon Dec 16 13:13:18 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 16 Dec 2019 14:13:18 +0100 Subject: RFR(s): 8235912: JvmtiBreakpoint remove oops_do and metadata_do In-Reply-To: References: Message-ID: Hi Coleen, in VM_RedefineClasses::doit: This updates the breakpoints: MetadataOnStackMark md_on_stack(/*walk_all_metadata*/true, /*redefinition_walk*/true); And this removes breakpoints: for (int i = 0; i < _class_count; i++) { redefine_single_class(_class_defs[i].klass, _scratch_classes[i], thread); } So we skip updating, since we do remove them after we updated them. But you are the expert here. Let me know if there is something I missed. OopHandle just adds more code. Thanks for having a look, Robbin On 12/16/19 1:32 PM, coleen.phillimore at oracle.com wrote: > > I have to think about this.?? Could there be breakpoints in old emcp methods > that we do not remove??? The metadata_do function is trying to keep old Methods > from being deleted while there are still references to them. > > http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/src/hotspot/share/prims/jvmtiImpl.hpp.udiff.html > > > + oop* _class_holder; // keeps _method memory from being deallocated > > > We created the class OopHandle to encapsulate strong oopStorage references, > although it's missing oop_store.? Can you use that? > > Coleen > > On 12/16/19 4:47 AM, Robbin Ehn wrote: >> Hi all, please review. >> >> From issue, https://bugs.openjdk.java.net/browse/JDK-8235912: >> >> JvmtiBreakpoints are walked via VMThread oops_do (the breakpoint is in a vm >> operation) before they are installed in the safeopint and after they have been >> installed, walked with JvmtiCurrentBreakpoints::oops_do(). >> By putting the class holder inside oopStorage there is no need for this. 
>>
>> JvmtiCurrentBreakpoints::metadata_do is not needed because redefine classes
>> actually removes the breakpoints before updating them (so there is no
>> breakpoints to update).
>> We can just remove metadata_do.
>>
>>
>> I also removed some unused code.
>>
>> Changeset:
>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/
>>
>> Passes several runs of nsk jvmti/jdi and t1-7.
>>
>> Thanks, Robbin
>

From coleen.phillimore at oracle.com Mon Dec 16 13:26:58 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 16 Dec 2019 08:26:58 -0500
Subject: [14] RFR 8235829: graal crashes with Zombie.java test
In-Reply-To: <447990f9-d276-56c3-aebe-fa84d0cfcc27@oracle.com>
References: <447990f9-d276-56c3-aebe-fa84d0cfcc27@oracle.com>
Message-ID: <975b6e46-95c1-8a37-b160-b6a57a9633a8@oracle.com>

On 12/16/19 8:04 AM, David Holmes wrote:
> Hi Coleen,
>
> On 16/12/2019 9:41 pm, coleen.phillimore at oracle.com wrote:
>> Summary: Start ServiceThread before compiler threads, and run nmethod
>> barriers for zgc before adding to the service thread queue, or
>> posting the events on the java thread queue.
>
> I can't comment on most of this but the earlier starting of the
> service thread has some concerns:
>
> - there is a lot of JDK level initialization which now will not have
> happened before the service thread is started and it is far from
> obvious that all possible initialization dependencies will be satisfied

I agree that the order of initialization is very sensitive. From the actions that the service thread does, the one that I found was a problem was that events were posted before the LIVE phase (see comment in has_events()), which could have happened with the existing code, but the window for the race is a lot smaller. The other actions can be run if there's a GC before initialization but would be a bug in the initialization code, and I didn't find these bugs in all my testing.
There are some ordering dependencies that do have odd side effects (between the compiler thread startup and initialization jsr292 classes) which have comments. This patch doesn't touch those.

>
> - current starting of the service thread in Management::initialize is
> guarded by "#if INCLUDE_MANAGEMENT", but now you are starting the
> service thread unconditionally for all builds. Hmm just saw your
> latest comment to the bug report - so the service thread is now (for
> quite some time?) being used for other than management tasks and so
> should always be present even if INCLUDE_MANAGEMENT is not enabled. Is
> that sufficient or are there likely to be other changes needed to
> actually ensure that all works correctly e.g. any code the service
> thread executes that is only defined for INCLUDE_MANAGEMENT will need
> to be compiled out explicitly.
>

I asked Jie offline to check the minimal build. I don't think there are other INCLUDE_MANAGEMENT actions in the service thread and I'm not sure why it was initialized there in the first place. The minimal vm would have been broken, i.e. hashtables would not have been cleaned up, etc, but I'm not sure how well that is tested or if one would notice.

> - the service thread and the notification thread are (were?) closely
> related but now started at completely different times

The notification thread is limited to "services" so it makes sense where it is. The ServiceThread does lots of other things. Maybe it needs renaming in 15.

>
> The bug report states the problem as:
>
> "The graal crash is because compiled_method_load events are added to
> the ServiceThread's deferred event queue before the ServiceThread is
> created so are not walked to keep them from being zombied."
>
> so why isn't the solution to ensure the deferred event queue is
> walked? I'm not clear how starting the service thread relates to
> walking the queue.
>

The service thread is responsible for walking the deferred event queue.
See ServiceThread::oops_do/nmethods_do. The design could be changed to have some global walk somewhere of this queue, but essentially this queue is processed by the service thread.

I had an additional change to make the queue non-static but want to limit the change at this point.

Thanks,
Coleen

> Thanks,
> David
>
>> See bug for description of the problems found with the new
>> Zombie.java test.
>>
>> open webrev at
>> http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8235829
>>
>> Ran tier1 all platforms, and tier2-8 testing, as well as rerunning
>> original test failure from bug
>> https://bugs.openjdk.java.net/browse/JDK-8173361.
>>
>> Thanks,
>> Coleen

From richard.reingruber at sap.com Mon Dec 16 13:41:49 2019
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Mon, 16 Dec 2019 13:41:49 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
In-Reply-To: <2fbd9ac0-1cb0-98c4-c2c4-becd1cf49fef@oracle.com>
References: <2fbd9ac0-1cb0-98c4-c2c4-becd1cf49fef@oracle.com>
Message-ID:

Hi Robbin,

first of all: thanks a lot for providing feedback. I do appreciate it.

I am absolutely willing to move this to handshakes. Only I still can't see how to achieve it.

Could you explain the drafted class EscapeBarrierSuspendHandshake a little bit? [1]

I'd like to look at it by example of JvmtiEnv::GetOwnedMonitorStackDepthInfo() where calling_thread T1 would apply it on another thread T2.

1. L13: is wait_until_eb_stopped to be called by T1 to wait until T2 cannot move anymore?

2. Handshakes between two threads are synchronous, correct? If so, then T1 will block handshaking T2, because either T2 or the VMThread will block in L10.

I cannot figure out, how you mean this. Only if a helper thread H would handshake T2 then T1 could continue and call wait_until_eb_stopped().
But returning from there T1 would block if reallocating objects triggers GC or attempting to execute the vm operation in JvmtiEnv::GetOwnedMonitorStackDepthInfo().

It might be impossible to replace my suspend flag with handshakes that are available today, because if it was you could replace all the suspend flags right away, couldn't you?

Or I'm simply missing something... quite possible... :)

Thanks, Richard.

[1] Drafted by Robbin (thanks!)

 1 class EscapeBarrierSuspendHandshake : public HandshakeClosure {
 2   Semaphore _is_waiting;
 3   Semaphore _wait;
 4   bool _started;
 5  public:
 6   EscapeBarrierSuspendHandshake() : HandshakeClosure("EscapeBarrierSuspend"),
 7   _wait(0), _started(false) { }
 8   void do_thread(Thread* th) {
 9     _is_waiting.signal();
10     _wait.wait();
11     Atomic::store(&_started, true);
12   }
13   void wait_until_eb_stopped() { _is_waiting.wait(); }
14   void start_thread() {
15     _wait.signal();
16     while(!Atomic::load(&_started)) {
17       os::naked_yield();
18     }
19   }
20 };

-----Original Message-----
From: Robbin Ehn
Sent: Monday, December 16, 2019 11:21
To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard,

as mentioned it would be better if you could do this with handshakes, instead of using _suspend_flag (since they are going away).
But I can't think of a way doing it without blocking safepoints, so we need to add some more features in handshakes first.
When possible I hope you are willing to move this code to handshakes instead.

You could stop one thread with, e.g.:

class EscapeBarrierSuspendHandshake : public HandshakeClosure {
  Semaphore _is_waiting;
  Semaphore _wait;
  bool _started;
 public:
  EscapeBarrierSuspendHandshake() : HandshakeClosure("EscapeBarrierSuspend"),
  _wait(0), _started(false) { }
  void do_thread(Thread* th) {
    _is_waiting.signal();
    _wait.wait();
    Atomic::store(&_started, true);
  }
  void wait_until_eb_stopped() { _is_waiting.wait(); }
  void start_thread() {
    _wait.signal();
    while(!Atomic::load(&_started)) {
      os::naked_yield();
    }
  }
};

But it would block safepoints.

Thanks, Robbin

On 12/10/19 10:45 PM, Reingruber, Richard wrote:
> Hi,
>
> I would like to get reviews please for
>
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>
> Corresponding RFE:
> https://bugs.openjdk.java.net/browse/JDK-8227745
>
> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>
> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues (thanks!). In addition the
> change is being tested at SAP since I posted the first RFR some months ago.
>
> The intention of this enhancement is to benefit performance wise from escape analysis even if JVMTI
> agents request capabilities that allow them to access local variable values. E.g. if you start-up
> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape analysis is disabled right
> from the beginning, well before a debugger attaches -- if ever one should do so. With the
> enhancement, escape analysis will remain enabled until and after a debugger attaches. EA based
> optimizations are reverted just before an agent acquires the reference to an object. In the JBS item
> you'll find more details.
>
> Thanks,
> Richard.
> > [1] Experimental fix for JDK-8214584 based on JDK-8227745 > http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch > From kevin.walls at oracle.com Mon Dec 16 15:13:36 2019 From: kevin.walls at oracle.com (Kevin Walls) Date: Mon, 16 Dec 2019 15:13:36 +0000 Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled In-Reply-To: References: <0c5f8a55-e7e6-ba87-58a8-e6a7edd55f01@oracle.com> Message-ID: Nice to know the difference between something that is zero, and something that has failed. 8-) ...oops, that says INAVLID_LOAD_ADDRESS ..not INVALID, so other that the typo yes it looks good. --- Kevin On 13/12/2019 13:02, Fairoz Matte wrote: > Hi Chris, > > Thanks for the review, > > Please find the webrev.01 with usage of INAVLID_LOAD_ADDRESS macro for -1L. > I have also added one more macro for ZERO_LOAD_ADDRESS for 0x0L. > http://cr.openjdk.java.net/~fmatte/8235637/webrev.01/ > > Thanks, > Fairoz > > -----Original Message----- > From: Chris Plummer > Sent: Thursday, December 12, 2019 9:48 PM > To: Fairoz Matte ; serviceability-dev at openjdk.java.net > Subject: Re: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled > > Can you use a macro for -1L? Maybe INAVLID_LOAD_ADDRESS or LOAD_ADDRESS_ERROR. > > Chris > > On 12/11/19 7:10 PM, Fairoz Matte wrote: >> Hi, >> >> Please review this small change, >> Updating error handling, to make sure "lib_base_diff = 0" is still a valid scenario even after calc_prelinked_load_address() call. 
>> >> JBS - https://bugs.openjdk.java.net/browse/JDK-8235637 >> Webrev - http://cr.openjdk.java.net/~fmatte/8235637/webrev.00/ >> >> This patch is provided by Yasumasa Suenaga >> >> Thanks, >> Fairoz From fairoz.matte at oracle.com Mon Dec 16 15:36:34 2019 From: fairoz.matte at oracle.com (Fairoz Matte) Date: Mon, 16 Dec 2019 07:36:34 -0800 (PST) Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled In-Reply-To: References: <0c5f8a55-e7e6-ba87-58a8-e6a7edd55f01@oracle.com> Message-ID: <43fd053e-7fd4-41fe-8bf3-7382cf77ca33@default> Oh yes, Thanks Kevin for the review. Corrected the same - http://cr.openjdk.java.net/~fmatte/8235637/webrev.02 Thanks, Fairoz -----Original Message----- From: Kevin Walls Sent: Monday, December 16, 2019 8:44 PM To: Fairoz Matte ; Chris Plummer ; serviceability-dev at openjdk.java.net Subject: Re: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled Nice to know the difference between something that is zero, and something that has failed. 8-) ...oops, that says INAVLID_LOAD_ADDRESS ..not INVALID, so other that the typo yes it looks good. --- Kevin On 13/12/2019 13:02, Fairoz Matte wrote: > Hi Chris, > > Thanks for the review, > > Please find the webrev.01 with usage of INAVLID_LOAD_ADDRESS macro for -1L. > I have also added one more macro for ZERO_LOAD_ADDRESS for 0x0L. > http://cr.openjdk.java.net/~fmatte/8235637/webrev.01/ > > Thanks, > Fairoz > > -----Original Message----- > From: Chris Plummer > Sent: Thursday, December 12, 2019 9:48 PM > To: Fairoz Matte ; > serviceability-dev at openjdk.java.net > Subject: Re: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work > if prelink is enabled > > Can you use a macro for -1L? Maybe INAVLID_LOAD_ADDRESS or LOAD_ADDRESS_ERROR. 
> > Chris > > On 12/11/19 7:10 PM, Fairoz Matte wrote: >> Hi, >> >> Please review this small change, >> Updating error handling, to make sure "lib_base_diff = 0" is still a valid scenario even after calc_prelinked_load_address() call. >> >> JBS - https://bugs.openjdk.java.net/browse/JDK-8235637 >> Webrev - http://cr.openjdk.java.net/~fmatte/8235637/webrev.00/ >> >> This patch is provided by Yasumasa Suenaga >> >> Thanks, >> Fairoz From kevin.walls at oracle.com Mon Dec 16 16:35:26 2019 From: kevin.walls at oracle.com (Kevin Walls) Date: Mon, 16 Dec 2019 16:35:26 +0000 Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled In-Reply-To: <43fd053e-7fd4-41fe-8bf3-7382cf77ca33@default> References: <0c5f8a55-e7e6-ba87-58a8-e6a7edd55f01@oracle.com> <43fd053e-7fd4-41fe-8bf3-7382cf77ca33@default> Message-ID: <068d2c84-d865-3c78-1b53-eab3f3af660d@oracle.com> Great! 8-) On 16/12/2019 15:36, Fairoz Matte wrote: > Oh yes, > Thanks Kevin for the review. > > Corrected the same - http://cr.openjdk.java.net/~fmatte/8235637/webrev.02 > > Thanks, > Fairoz > From robbin.ehn at oracle.com Mon Dec 16 17:20:50 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 16 Dec 2019 18:20:50 +0100 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <2fbd9ac0-1cb0-98c4-c2c4-becd1cf49fef@oracle.com> Message-ID: <9f24ec2c-d737-f9b7-8821-5905264971a7@oracle.com> Hi Richard, On 2019-12-16 14:41, Reingruber, Richard wrote: > Hi Robbin, > > first of all: thanks a lot for providing feedback. I do appreciate it. > > I am absolutely willing to move this to handshakes. Only I still can't see how to achieve it. > > Could you explain the drafted class EscapeBarrierSuspendHandshake a little bit? [1] > > I'd like to look at it by example of JvmtiEnv::GetOwnedMonitorStackDepthInfo() where calling_thread > T1 would apply it on another thread T2. 
Sorry I don't immediately see what issue there is in doing a handshake instead of: VM_GetOwnedMonitorInfo op(this, calling_thread, java_thread, owned_monitors_list); > > 1. L13: is wait_until_eb_stopped to be called by T1 to wait until T2 cannot move anymore? > > 2. Handshakes between two threads are synchronous, correct? If so, then T1 will block handshaking > T2, because either T2 or the VMThread will block in L10. Yes, sorry, I forgot/confused myself about asynch handshake. (I have a test prototype for that, which removes suspend flag) > > I cannot figure out, how you mean this. Only if a helper thread H would handshake T2 then T1 could > continue and call wait_until_eb_stopped(). But returning from there T1 would block if reallocating > objects triggers GC or attempting to execute the vm operation in > JvmtiEnv::GetOwnedMonitorStackDepthInfo(). > > It might be impossible to replace my suspend flag with handshakes that are available today, because > if it was you could replace all the suspend flags right away, couldn't you? So adding asynch handshakes and a per thread handshake queue, we can. (which this test prototype does) The issue I'm thinking of is if we need selective polling first. Suspend flags are not checked in every transition, e.g. vm->native. A JVM TI agent don't expect to suspend it's own thread when suspending all threads. (that thread would be suspended when trying to get back to agent code when it does vm->native transition) > > Or I'm simply missing something... quite possible... :) No I think you got it right. Thanks, Robbin > > Thanks, Richard. > > [1] Drafted by Robbin (thanks!) 
>
>  1 class EscapeBarrierSuspendHandshake : public HandshakeClosure {
>  2   Semaphore _is_waiting;
>  3   Semaphore _wait;
>  4   bool _started;
>  5 public:
>  6   EscapeBarrierSuspendHandshake() : HandshakeClosure("EscapeBarrierSuspend"),
>  7     _wait(0), _started(false) { }
>  8   void do_thread(Thread* th) {
>  9     _is_waiting.signal();
> 10     _wait.wait();
> 11     Atomic::store(&_started, true);
> 12   }
> 13   void wait_until_eb_stopped() { _is_waiting.wait(); }
> 14   void start_thread() {
> 15     _wait.signal();
> 16     while(!Atomic::load(&_started)) {
> 17       os::naked_yield();
> 18     }
> 19   }
> 20 };
>
> -----Original Message-----
> From: Robbin Ehn
> Sent: Montag, 16. Dezember 2019 11:21
> To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
>
> Hi Richard, as mentioned it would be better if you could do this with
> handshakes, instead of using _suspend_flag (since they are going away).
> But I can't think of a way doing it without blocking safepoints, so we need to
> add some more features in handshakes first.
> When possible I hope you are willing to move this code to handshakes instead.
>
> You could stop one thread with, e.g.:
> class EscapeBarrierSuspendHandshake : public HandshakeClosure {
>   Semaphore _is_waiting;
>   Semaphore _wait;
>   bool _started;
> public:
>   EscapeBarrierSuspendHandshake() : HandshakeClosure("EscapeBarrierSuspend"),
>     _wait(0), _started(false) { }
>   void do_thread(Thread* th) {
>     _is_waiting.signal();
>     _wait.wait();
>     Atomic::store(&_started, true);
>   }
>   void wait_until_eb_stopped() { _is_waiting.wait(); }
>   void start_thread() {
>     _wait.signal();
>     while(!Atomic::load(&_started)) {
>       os::naked_yield();
>     }
>   }
> };
>
> But it would block safepoints.
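For readers without a HotSpot build at hand, the two-semaphore hand-off in the drafted closure can be tried out in portable C++. This is only a sketch under stated assumptions: SketchSemaphore and SuspendHandoffSketch are hypothetical stand-ins for HotSpot's Semaphore and the drafted HandshakeClosure, there is no VM, handshake machinery or safepoint involved, and a plain std::thread plays the role of the target thread.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Minimal counting semaphore (hypothetical stand-in, not HotSpot's Semaphore).
class SketchSemaphore {
  std::mutex _m;
  std::condition_variable _cv;
  unsigned _count;
 public:
  explicit SketchSemaphore(unsigned initial = 0) : _count(initial) {}
  void signal() {
    { std::lock_guard<std::mutex> g(_m); ++_count; }
    _cv.notify_one();
  }
  void wait() {
    std::unique_lock<std::mutex> l(_m);
    _cv.wait(l, [this] { return _count > 0; });
    --_count;
  }
};

// Hypothetical analogue of the drafted closure: the target parks itself in
// do_thread() until the requester releases it again.
class SuspendHandoffSketch {
  SketchSemaphore _is_waiting;
  SketchSemaphore _wait;
  std::atomic<bool> _started{false};
 public:
  void do_thread() {                 // runs on the target thread
    _is_waiting.signal();            // announce that the target is parked
    _wait.wait();                    // block until the requester releases us
    _started.store(true);
  }
  void wait_until_stopped() { _is_waiting.wait(); }  // requester side
  bool started() const { return _started.load(); }
  void start_thread() {              // requester side
    _wait.signal();
    while (!_started.load()) {
      std::this_thread::yield();
    }
  }
};

// Drives one hand-off; returns true once the target has resumed.
bool run_handoff_sketch() {
  SuspendHandoffSketch hs;
  std::thread target([&hs] { hs.do_thread(); });
  hs.wait_until_stopped();
  assert(!hs.started());             // target cannot get past _wait before release
  hs.start_thread();
  bool resumed = hs.started();
  target.join();
  return resumed;
}
```

The sketch only shows the ordering between the two sides; it says nothing about the open question in this thread, namely who executes do_thread() and what is blocked while the target is parked.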
> > Thanks, Robbin > > On 12/10/19 10:45 PM, Reingruber, Richard wrote: >> Hi, >> >> I would like to get reviews please for >> >> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/ >> >> Corresponding RFE: >> https://bugs.openjdk.java.net/browse/JDK-8227745 >> >> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915 >> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1] >> >> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues (thanks!). In addition the >> change is being tested at SAP since I posted the first RFR some months ago. >> >> The intention of this enhancement is to benefit performance wise from escape analysis even if JVMTI >> agents request capabilities that allow them to access local variable values. E.g. if you start-up >> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape analysis is disabled right >> from the beginning, well before a debugger attaches -- if ever one should do so. With the >> enhancement, escape analysis will remain enabled until and after a debugger attaches. EA based >> optimizations are reverted just before an agent acquires the reference to an object. In the JBS item >> you'll find more details. >> >> Thanks, >> Richard. >> >> [1] Experimental fix for JDK-8214584 based on JDK-8227745 >> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch >> From ioi.lam at oracle.com Mon Dec 16 18:40:51 2019 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 16 Dec 2019 10:40:51 -0800 Subject: RFR(S) 8235970 [TESTBUG] Remove dependency of sun.tools.jar from RedefineClassHelper In-Reply-To: <6e7b0a33-ab1b-6c9c-4e73-b0df4f7e2bc3@oracle.com> References: <6e7b0a33-ab1b-6c9c-4e73-b0df4f7e2bc3@oracle.com> Message-ID: <3ca2fd1a-cbde-62e2-e8b6-dcce04742091@oracle.com> Hi Alan, Thanks for the review and the tip. 
I will use ToolProvider for JDK-8236028 [TESTBUG] Remove dependency of sun.tools.jar from appcds/JarBuilder

- Ioi

On 12/15/19 11:22 PM, Alan Bateman wrote:
> On 16/12/2019 06:21, Ioi Lam wrote:
>> :
>>
>> The fix is to rewrite RedefineClassHelper to use ClassFileInstaller
>> instead.
> This looks okay but just to point out that the jar tool can be
> obtained via ToolProvider, e.g.
>    ToolProvider jarTool = ToolProvider.findFirst("jar").orElseThrow();
>
> so RedefineClassHelper, or better still ClassFileInstaller, could use
> that for cases where JAR files need to be created or updated in ways
> that would be easier if the jar tool could be used in the test. Avoids
> using some of the prickly APIs in java.util.zip|jar.
>
> -Alan

From harold.seigel at oracle.com  Mon Dec 16 18:48:16 2019
From: harold.seigel at oracle.com (Harold Seigel)
Date: Mon, 16 Dec 2019 13:48:16 -0500
Subject: RFR(T) 8235922: [TESTBUG]TestRecordAttrGenericSig.java and TestRecordAttr.java are failing
In-Reply-To:
References: <18eb6f8c-125f-1ec0-e618-8416aea35d9b@oracle.com>
Message-ID:

Thanks Serguei!

Harold

On 12/13/2019 4:50 PM, serguei.spitsyn at oracle.com wrote:
> Hi Harold,
>
> +1
>
> Thanks,
> Serguei
>
> On 12/13/19 11:45 AM, Daniel D. Daugherty wrote:
>> On 12/13/19 2:35 PM, Harold Seigel wrote:
>>> Hi,
>>>
>>> Please review this trivial fix to prevent java/lang/instrument/...
>>> TestRecordAttr.java and TestRecordAttrGenericSig.java from failing.
>>> The fix replaces hard-wired JDK version 14 with mechanisms that get
>>> the latest JDK version.
>>>
>>> Open Webrev:
>>> http://cr.openjdk.java.net/~hseigel/bug_8235922/webrev/index.html
>>
>> test/jdk/java/lang/instrument/RedefineRecordAttr/TestRecordAttr.java
>>     No comments.
>>
>> test/jdk/java/lang/instrument/RedefineRecordAttrGenericSig/TestRecordAttrGenericSig.java
>>
>>     No comments.
>>
>> Thumbs up! I agree that this is a trivial fix.
>>
>> Thanks for fixing this so quickly!
>> >> Dan >> >>> >>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235922 >>> >>> The fix was tested by running the tests locally on Linux-x64. >>> >>> Thanks, Harold >>> >> > From chris.plummer at oracle.com Mon Dec 16 19:09:10 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 16 Dec 2019 11:09:10 -0800 Subject: Discuss the design of parallel and incremental jmap histo.(Internet mail) In-Reply-To: References: <72276240-61D3-4101-A435-717F16CC6FC3@tencent.com> Message-ID: <8b179abc-cda8-0ec7-88f3-5bfe1af78eb3@oracle.com> On 12/15/19 5:38 PM, linzang(??) wrote: > > Dear Chris, > > >> why jmap is getting stuck or "killed by timer" as you mention below. > Shouldn't this be considered a bug and addressed directly. > > This ?timer? is usually another process, my experience is HDFS and > ZKFC, the ZKFC pings it?s NameNode periodically, and when the > NameNode?s heap is large (~180GB in my case), the heap iteration by > jmap can cause the process stuck, so ZFKC can not get response from > NameNode, so the NameNode got killed. > This is the first I've heard mentioned of any of these. I assume NameNode is the process you are getting the heap dump from, and while dumping the heap it can't respond to ZKFC? That still sounds like something to me that should be addressed directly, and not worked around with the incremental solution. The parallel solution is ok because it also has a performance benefit, so if as a side affect it helps prevent the timeout issue, then that's ok. > > >> ?How useful are intermediate results? How often can users come to > reasonable conclusions about heap usage when the data is incomplete. > > From my experience, ?I usually use jmap -histo to get the information > about the distribution of objects, and I found usually the object > distribution of part of the heap is similar about the distribution of > the whole heap. 
I agree that this is not correct for all cases, but > since jmap -histo give results only when it?s exit normally at > present, ?and I think maybe info of partial heap is better than > nothing, especially for memory leak analysis. > Ok, but I still think avoiding the need for incremental dumps would be a better approach. thanks, Chris > > >> Is there even an indication given of how much of the heap is > accounted for in the output? > > Yes, the incremental dump information shows the number of the objects > and the totally bytes have been iterated. > > Thanks! > > BRs, > > Lin > > *From: *Chris Plummer > *Date: *Saturday, December 14, 2019 at 2:46 AM > *To: *"linzang(??)" , > "serviceability-dev at openjdk.java.net" > > *Subject: *Re: Discuss the design of parallel and incremental jmap > histo.(Internet mail) > > Hi Lin, > > I have a question regarding the need for incremental support. The CSR > states: > > Problem: Now, the "JMap -histo" tool can not dump intermediate result, > which is useful if the heap is large and dumping the whole heap can be > stuck. > > Two questions. The first is why jmap is getting stuck or "killed by > timer" as you mention below. Shouldn't this be considered a bug and > addressed directly. Second question is how useful are intermediate > results? How often can users come to reasonable conclusions about heap > usage when the data is incomplete. Is there even an indication given > of how much of the heap is accounted for in the output? > > thanks, > > Chris > > On 12/12/19 10:22 PM, linzang(??) wrote: > > Dear All, > > ?? I want to re-activate the thread of discussion about the > implementation of parallel and incremental ?Jmap -histo?. > > ???The target of these changes is to solve the problems that ?jmap > -histo? may ? timeout or killed by timer? when heap is large. And > the result of ?jmap -histo? is ?one or nothing?, which means if it > gets killed before exit, user gets no information about the heap. > > ?? The ?incremental? 
means that jmap -histo dumps the intermediate > results when it is iterating the heap, so if it is interrupted, > user can get some meaningful information. > > ?? The ?parallel? targets to help speed up the heap iteration with > multi-threading. > > Originally I have implemented the ?incremental dump? that dump the > intermediate data into a separate file like > , and the final result will be saved to > another file . so when jmap -histo get > interrupted, user can get information from > , and if jmap -histo works fine, the final > result would be in . > > ?? And the parallel dump will have multiple thread working on heap > iteration, each thread generates intermediate data timely. > > ?? The main reason of using separate file for incremental dump is > due to the consideration of parallel incremental dump > implementation, so that every heap-iteration thread could dump its > own data in separate file, to avoid using file lock. > > However, it seems that the original design might confuse user by > having two or more result files (intermediated result and final > result).? So I want to ask your help to discuss it: > > 1. For incremental dump without parallel, Intermediate result and > the final result are dumped to the same file: > > In this case, the intermediate data are generated in the middle of > heap iteration, they are written to file at the > same time. And if jmap -histo exits normally, the final result > will be also dump to , then all intermediate > data are flushed. > > 2. For parallel dump without incremental: > > Every thread generates its own thread-local dump buffer, and all > thread local dump are merged and write to the > file at the end. > > There is no incremental support, so the result is ?one or nothing?. > > 3. For parallel + incremental dump, I think it?s a little > complicated because of intermediate data processing: > > 1. 
Every thread has its own thread-local intermediate data
>        buffer, and all the thread-local buffers will be written
>        to file while holding file lock. So
>        there is only one data file generated, and if jmap -histo
>        is interrupted, the intermediate data are saved in the
>        same file.
>
>     The problem is that the file write lock can be heavy, which may
>     make the parallel heap dump slow.
>
>     2. Every thread has its own thread-local intermediate data
>        buffer, and every thread saves its result in a temp file
>        named .
>
>     So there is no file lock. The parallel dump can be fast. But the
>     problem is that there will be multiple files generated to save the
>     thread-local intermediate results. And this might confuse the user.
>
>     3. Every thread has its own thread-local intermediate data
>        buffer, and another "data-merging-thread" will be created.
>
>     The parallel threads write data to their thread-local buffers, and
>     enqueue a buffer when its data reaches some threshold. The
>     "data-merging-thread" consumes the queue, merges the data from
>     the different threads, and saves the merged data to the result file.
>
>     In this case, there is only one file generated.
>     And there is no file lock needed, but there is a queue lock, and a
>     separate "merging thread" implementation. Do you think this is a
>     reasonable solution?
>
>     So may I ask for your suggestions?
>
>     Details of previous discussion can be found at
>     https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-June/028276.html
>
>     Thanks!
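The third option above (thread-local buffers, a queue, and one merging thread) can be sketched in portable C++ as follows. This is a minimal illustration under stated assumptions, not the proposed jmap change: the "heap" is just a vector of class names, and PartialQueue, parallel_histo_sketch and the threshold parameter are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <map>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

using Histogram = std::map<std::string, std::size_t>;  // class name -> instance count

// Queue of partial histograms shared by the workers and the merging thread.
class PartialQueue {
  std::mutex _m;
  std::condition_variable _cv;
  std::queue<Histogram> _q;
  bool _closed = false;
 public:
  void push(Histogram h) {
    { std::lock_guard<std::mutex> g(_m); _q.push(std::move(h)); }
    _cv.notify_one();
  }
  void close() {
    { std::lock_guard<std::mutex> g(_m); _closed = true; }
    _cv.notify_all();
  }
  // Returns false once the queue has been closed and fully drained.
  bool pop(Histogram& out) {
    std::unique_lock<std::mutex> l(_m);
    _cv.wait(l, [this] { return !_q.empty() || _closed; });
    if (_q.empty()) return false;
    out = std::move(_q.front());
    _q.pop();
    return true;
  }
};

// Each worker scans its share of the "heap" and enqueues its local buffer
// whenever it holds `threshold` objects; only the merger touches `merged`.
Histogram parallel_histo_sketch(const std::vector<std::string>& heap,
                                unsigned workers, std::size_t threshold) {
  PartialQueue queue;
  Histogram merged;
  std::thread merger([&queue, &merged] {
    Histogram part;
    while (queue.pop(part)) {
      for (const auto& e : part) merged[e.first] += e.second;
      // An incremental dump could rewrite the single result file here.
    }
  });
  std::vector<std::thread> pool;
  for (unsigned w = 0; w < workers; ++w) {
    pool.emplace_back([&queue, &heap, w, workers, threshold] {
      Histogram local;
      std::size_t pending = 0;
      for (std::size_t i = w; i < heap.size(); i += workers) {
        ++local[heap[i]];
        if (++pending >= threshold) {
          queue.push(std::move(local));
          local.clear();               // reset the moved-from buffer
          pending = 0;
        }
      }
      if (pending > 0) queue.push(std::move(local));
    });
  }
  for (auto& t : pool) t.join();
  queue.close();                       // no more partial results will arrive
  merger.join();
  return merged;
}
```

The only shared lock is the one inside the queue, and it is held just long enough to move a buffer in or out, which is the trade-off the third option describes against a file lock or per-thread files.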
> > BRs, > > Lin > > > From chris.plummer at oracle.com Mon Dec 16 19:12:24 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 16 Dec 2019 11:12:24 -0800 Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled In-Reply-To: <068d2c84-d865-3c78-1b53-eab3f3af660d@oracle.com> References: <0c5f8a55-e7e6-ba87-58a8-e6a7edd55f01@oracle.com> <43fd053e-7fd4-41fe-8bf3-7382cf77ca33@default> <068d2c84-d865-3c78-1b53-eab3f3af660d@oracle.com> Message-ID: <3306a3c9-c26c-d72e-983d-3f7e4c890abe@oracle.com> +1 On 12/16/19 8:35 AM, Kevin Walls wrote: > Great! 8-) > > On 16/12/2019 15:36, Fairoz Matte wrote: >> Oh yes, >> Thanks Kevin for the review. >> >> Corrected the same - >> http://cr.openjdk.java.net/~fmatte/8235637/webrev.02 >> >> Thanks, >> Fairoz >> From coleen.phillimore at oracle.com Mon Dec 16 20:21:31 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 16 Dec 2019 15:21:31 -0500 Subject: RFR(s): 8235912: JvmtiBreakpoint remove oops_do and metadata_do In-Reply-To: References: Message-ID: On 12/16/19 8:13 AM, Robbin Ehn wrote: > Hi Coleen, in VM_RedefineClasses::doit: > > This updates the breakpoints: > ? MetadataOnStackMark md_on_stack(/*walk_all_metadata*/true, > /*redefinition_walk*/true); > > And this removes breakpoints: > ? for (int i = 0; i < _class_count; i++) { > ??? redefine_single_class(_class_defs[i].klass, _scratch_classes[i], > thread); > ? } > > So we skip updating, since we do remove them after we updated them. > But you are the expert here. Let me know if there is something I missed. > No, you are correct. The JVMTI spec says that the breakpoints are all deleted.? I'm remembering code that sets/clears breakpoints that has to walk emcp methods, and set them there also.? But redefinition does clear them. If the old Method* is still executing or referenced somehow, the other metadata walking would find it anyway.? So maybe this was never needed. > OopHandle just adds more code. 
>

It doesn't.  And if we want to make all native memory never point
directly to oops and point to oopStorage instead, having some
encapsulation makes that easier.  It also makes it so that we don't have
to stare at oop* in data structures and wonder if we're going to miss
the mumble-fratz access and decorators that we need.  ie:

http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.udiff.html

+ NativeAccess<>::oop_store(_class_holder, class_holder_oop);

This should probably be:

41 NativeAccess::oop_store(handle, obj);

You can leave out using OopHandle.  I have a patch to add the missing
functionality and add it to your code.  Actually, I was looking to see
how much OopHandle is used to see if it's helping anything and there is
a lot of code using it.  Most of it is to hide oop* in ClassLoaderData.

This change otherwise looks great.

Thanks,
Coleen

> Thanks for having a look, Robbin
>
> On 12/16/19 1:32 PM, coleen.phillimore at oracle.com wrote:
>>
>> I have to think about this.  Could there be breakpoints in old emcp
>> methods that we do not remove?  The metadata_do function is trying
>> to keep old Methods from being deleted while there are still
>> references to them.
>>
>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/src/hotspot/share/prims/jvmtiImpl.hpp.udiff.html
>>
>>
>> + oop* _class_holder; // keeps _method memory from being deallocated
>>
>>
>> We created the class OopHandle to encapsulate strong oopStorage
>> references, although it's missing oop_store.  Can you use that?
>
>
>>
>> Coleen
>>
>> On 12/16/19 4:47 AM, Robbin Ehn wrote:
>>> Hi all, please review.
>>>
>>> From issue, https://bugs.openjdk.java.net/browse/JDK-8235912:
>>>
>>> JvmtiBreakpoints are walked via VMThread oops_do (the breakpoint is
>>> in a vm operation) before they are installed in the safepoint and
>>> after they have been installed, walked with
>>> JvmtiCurrentBreakpoints::oops_do().
>>> By putting the class holder inside oopStorage there is no need for >>> this. >>> >>> JvmtiCurrentBreakpoints::metadata_do is not needed because redefine >>> classes actually removes the breakpoints before updating them (so >>> there is no breakpoints to update). >>> We can just remove metadata_do. >>> >>> >>> I also removed some unused code. >>> >>> Changeset: >>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/ >>> >>> Passes several runs of nsk jvmti/jdi and t1-7. >>> >>> Thanks, Robbin >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.holmes at oracle.com Mon Dec 16 22:51:00 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 17 Dec 2019 08:51:00 +1000 Subject: [14] RFR 8235829: graal crashes with Zombie.java test In-Reply-To: <975b6e46-95c1-8a37-b160-b6a57a9633a8@oracle.com> References: <447990f9-d276-56c3-aebe-fa84d0cfcc27@oracle.com> <975b6e46-95c1-8a37-b160-b6a57a9633a8@oracle.com> Message-ID: <06d6e9f4-b919-3062-7d18-1245e66b27b2@oracle.com> Hi Coleen, A quick initial response ... On 16/12/2019 11:26 pm, coleen.phillimore at oracle.com wrote: > > > On 12/16/19 8:04 AM, David Holmes wrote: >> Hi Coleen, >> >> On 16/12/2019 9:41 pm, coleen.phillimore at oracle.com wrote: >>> Summary: Start ServiceThread before compiler threads, and run nmethod >>> barriers for zgc before adding to the service thread queue, or >>> posting the events on the java thread queue. >> >> I can't comment on most of this but the earlier starting of the >> service thread has some concerns: >> >> - there is a lot of JDK level initialization which now will not have >> happened before the service thread is started and it is far from >> obvious that all possible initialization dependencies will be satisfied > > I agree that the order of initialization is very sensitive.? 
From the > actions that the service thread does, the one that I found was a problem > was that events were posted before the LIVE phase (see comment in > has_events()), which could have happened with the existing code, but the > window for the race is a lot smaller. ? The other actions can be run if > there's a GC before initialization but would be a bug in the > initialization code, and I didn't find these bugs in all my testing. > There are some ordering dependencies that do have odd side effects > (between the compiler thread startup and initialization jsr292 classes) > which have comments.? This patch doesn't touch those. > >> >> - current starting of the service thread in Management::initialize is >> guarded by "#if INCLUDE_MANAGEMENT", but now you are starting the >> service thread unconditionally for all builds. Hmm just saw your >> latest comment to the bug report - so the service thread is now (for >> quite some time?) being used for other than management tasks and so >> should always be present even if INCLUDE_MANAGEMENT is not enabled. Is >> that sufficient or are there likely to be other changes needed to >> actually ensure that all works correctly? e.g. any code the service >> thread executes that is only defined for INCLUDE_MANAGEMENT will need >> to be compiled out explicitly. >> > > I asked Jie offline to check the minimal build.? I don't think there are > other INCLUDE_MANAGEMENT actions in the service thread and I'm not sure > why it was initialized there in the first place.? The minimal vm would > have been broken ie. hashtables would not have been cleaned up, etc, but > I'm not sure how well that is tested or if one would notice. >> - the service thread and the notification thread are (were?) closely >> related but now started at completely different times > > The notification thread is limited to "services" so it makes sense where > it is.? The ServiceThread does lots of other things.? Maybe it needs > renaming in 15. 
>> >> The bug report states the problem as: >> >> "The graal crash is because compiled_method_load events are added to >> the ServiceThread's deferred event queue before the ServiceThread is >> created so are not walked to keep them from being zombied." >> >> so why isn't the solution to ensure the deferred event queue is >> walked? I'm not clear how starting the service thread relates to >> walking the queue. >> > > The service thread is responsible for walking the deferred event > queue.?? See ServiceThread::oops_do/nmethods_do.?? The design could be > changed to have some global walk somewhere of this queue, but > essentially this queue is processed by the service thread. Sorry I don't follow. I thought "oops_do" and friends are for the GC threads and/or VMThread to call to process oops when GC updates them. David ----- > I had an additional change to make the queue non-static but want to > limit the change at this point. > > Thanks, > Coleen >> Thanks, >> David >> >>> See bug for description of the problems found with the new >>> Zombie.java test. >>> >>> open webrev at >>> http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8235829 >>> >>> Ran tier1 all platforms, and tier2-8 testing, as well as rerunning >>> original test failure from bug >>> https://bugs.openjdk.java.net/browse/JDK-8173361. >>> >>> Thanks, >>> Coleen > From serguei.spitsyn at oracle.com Mon Dec 16 23:33:31 2019 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 16 Dec 2019 15:33:31 -0800 Subject: RFR(S) 8235970 [TESTBUG] Remove dependency of sun.tools.jar from RedefineClassHelper In-Reply-To: <6e7b0a33-ab1b-6c9c-4e73-b0df4f7e2bc3@oracle.com> References: <6e7b0a33-ab1b-6c9c-4e73-b0df4f7e2bc3@oracle.com> Message-ID: <6d09f46e-d253-c37d-dad7-e048f6aafb47@oracle.com> Hi Ioi, It looks good. It is nice to get rid of unneeded dependency. 
Thanks, Serguei On 12/15/19 23:22, Alan Bateman wrote: > On 16/12/2019 06:21, Ioi Lam wrote: >> : >> >> The fix is to rewrite RedefineClassHelper to use ClassFileInstaller >> instead. > This looks okay but just to point out that the jar tool can be > obtained via ToolProvider, e.g. > ?? ToolProvider jarTool = ToolProvider.findFirst("jar").orElseThrow(); > > so RedefineClassHelper, or better still ClassFileInstaller, could use > that for cases where JAR files need to be created or updated in ways > that would be easier if the jar tool could be used in the test. Avoids > using some of the prickly APIs in java.util.zip|jar. > > -Alan From coleen.phillimore at oracle.com Tue Dec 17 02:40:57 2019 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 16 Dec 2019 21:40:57 -0500 Subject: [14] RFR 8235829: graal crashes with Zombie.java test In-Reply-To: <06d6e9f4-b919-3062-7d18-1245e66b27b2@oracle.com> References: <447990f9-d276-56c3-aebe-fa84d0cfcc27@oracle.com> <975b6e46-95c1-8a37-b160-b6a57a9633a8@oracle.com> <06d6e9f4-b919-3062-7d18-1245e66b27b2@oracle.com> Message-ID: <39716b03-dd5e-580e-2c91-004da64bcc19@oracle.com> Short answer below. On 12/16/19 5:51 PM, David Holmes wrote: > Hi Coleen, > > A quick initial response ... > > On 16/12/2019 11:26 pm, coleen.phillimore at oracle.com wrote: >> >> >> On 12/16/19 8:04 AM, David Holmes wrote: >>> Hi Coleen, >>> >>> On 16/12/2019 9:41 pm, coleen.phillimore at oracle.com wrote: >>>> Summary: Start ServiceThread before compiler threads, and run >>>> nmethod barriers for zgc before adding to the service thread queue, >>>> or posting the events on the java thread queue. 
>>> >>> I can't comment on most of this but the earlier starting of the >>> service thread has some concerns: >>> >>> - there is a lot of JDK level initialization which now will not have >>> happened before the service thread is started and it is far from >>> obvious that all possible initialization dependencies will be satisfied >> >> I agree that the order of initialization is very sensitive. From the >> actions that the service thread does, the one that I found was a >> problem was that events were posted before the LIVE phase (see >> comment in has_events()), which could have happened with the existing >> code, but the window for the race is a lot smaller. ? The other >> actions can be run if there's a GC before initialization but would be >> a bug in the initialization code, and I didn't find these bugs in all >> my testing. There are some ordering dependencies that do have odd >> side effects (between the compiler thread startup and initialization >> jsr292 classes) which have comments.? This patch doesn't touch those. >> >>> >>> - current starting of the service thread in Management::initialize >>> is guarded by "#if INCLUDE_MANAGEMENT", but now you are starting the >>> service thread unconditionally for all builds. Hmm just saw your >>> latest comment to the bug report - so the service thread is now (for >>> quite some time?) being used for other than management tasks and so >>> should always be present even if INCLUDE_MANAGEMENT is not enabled. >>> Is that sufficient or are there likely to be other changes needed to >>> actually ensure that all works correctly? e.g. any code the service >>> thread executes that is only defined for INCLUDE_MANAGEMENT will >>> need to be compiled out explicitly. >>> >> >> I asked Jie offline to check the minimal build.? I don't think there >> are other INCLUDE_MANAGEMENT actions in the service thread and I'm >> not sure why it was initialized there in the first place.? The >> minimal vm would have been broken ie. 
hashtables would not have been >> cleaned up, etc, but I'm not sure how well that is tested or if one >> would notice. >>> - the service thread and the notification thread are (were?) closely >>> related but now started at completely different times >> >> The notification thread is limited to "services" so it makes sense >> where it is.? The ServiceThread does lots of other things.? Maybe it >> needs renaming in 15. >>> >>> The bug report states the problem as: >>> >>> "The graal crash is because compiled_method_load events are added to >>> the ServiceThread's deferred event queue before the ServiceThread is >>> created so are not walked to keep them from being zombied." >>> >>> so why isn't the solution to ensure the deferred event queue is >>> walked? I'm not clear how starting the service thread relates to >>> walking the queue. >>> >> >> The service thread is responsible for walking the deferred event >> queue.?? See ServiceThread::oops_do/nmethods_do.?? The design could >> be changed to have some global walk somewhere of this queue, but >> essentially this queue is processed by the service thread. > > Sorry I don't follow. I thought "oops_do" and friends are for the GC > threads and/or VMThread to call to process oops when GC updates them. The oops_do and nmethods_do() can be called by a thread walk in handshakes (by the sweeper thread) and by parallel GC thread walks. There isn't a single entry to do the thread-specific closures that we need to do for these deferred event queues.?? I tried a version that walked the queues with a static call but missed some places where it would be needed to make this call (didn't work).? Keeping this associated with the ServiceThread simplifies a lot. thanks, Coleen > > David > ----- > >> I had an additional change to make the queue non-static but want to >> limit the change at this point. >> >> Thanks, >> Coleen >>> Thanks, >>> David >>> >>>> See bug for description of the problems found with the new >>>> Zombie.java test. 
>>>> >>>> open webrev at >>>> http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8235829 >>>> >>>> Ran tier1 all platforms, and tier2-8 testing, as well as rerunning >>>> original test failure from bug >>>> https://bugs.openjdk.java.net/browse/JDK-8173361. >>>> >>>> Thanks, >>>> Coleen >> From linzang at tencent.com Tue Dec 17 02:57:08 2019 From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=) Date: Tue, 17 Dec 2019 02:57:08 +0000 Subject: Discuss the design of parallel and incremental jmap histo. In-Reply-To: <8b179abc-cda8-0ec7-88f3-5bfe1af78eb3@oracle.com> References: <72276240-61D3-4101-A435-717F16CC6FC3@tencent.com> <8b179abc-cda8-0ec7-88f3-5bfe1af78eb3@oracle.com> Message-ID: Dear Chris, I think I can first make the patch of parallel jmap. it seems to me that if parallel is fast enough, there is no need for incremental. So l will not work on it until I found new cases that show it is necessary, then we can discuss it again. Thanks! BRs, Lin On Dec 17, 2019, at 3:09 AM, Chris Plummer > wrote: On 12/15/19 5:38 PM, linzang(??) wrote: Dear Chris, >> why jmap is getting stuck or "killed by timer" as you mention below. Shouldn't this be considered a bug and addressed directly. This ?timer? is usually another process, my experience is HDFS and ZKFC, the ZKFC pings it?s NameNode periodically, and when the NameNode?s heap is large (~180GB in my case), the heap iteration by jmap can cause the process stuck, so ZFKC can not get response from NameNode, so the NameNode got killed. This is the first I've heard mentioned of any of these. I assume NameNode is the process you are getting the heap dump from, and while dumping the heap it can't respond to ZKFC? That still sounds like something to me that should be addressed directly, and not worked around with the incremental solution. 
The parallel solution is ok because it also has a performance benefit, so if as a side effect it helps prevent the timeout issue, then that's ok. >> How useful are intermediate results? How often can users come to reasonable conclusions about heap usage when the data is incomplete. In my experience, I usually use jmap -histo to get information about the distribution of objects, and I have found that the object distribution of part of the heap is usually similar to the distribution of the whole heap. I agree that this is not correct for all cases, but at present jmap -histo gives results only when it exits normally, and I think info about a partial heap is better than nothing, especially for memory leak analysis. Ok, but I still think avoiding the need for incremental dumps would be a better approach. thanks, Chris >> Is there even an indication given of how much of the heap is accounted for in the output? Yes, the incremental dump information shows the number of objects and the total bytes that have been iterated. Thanks! BRs, Lin *From: *Chris Plummer *Date: *Saturday, December 14, 2019 at 2:46 AM *To: *"linzang(臧琳)", "serviceability-dev at openjdk.java.net" *Subject: *Re: Discuss the design of parallel and incremental jmap histo.(Internet mail) Hi Lin, I have a question regarding the need for incremental support. The CSR states: Problem: Now, the "JMap -histo" tool can not dump intermediate result, which is useful if the heap is large and dumping the whole heap can be stuck. Two questions. The first is why jmap is getting stuck or "killed by timer" as you mention below. Shouldn't this be considered a bug and addressed directly. Second question is how useful are intermediate results? How often can users come to reasonable conclusions about heap usage when the data is incomplete. Is there even an indication given of how much of the heap is accounted for in the output? thanks, Chris On 12/12/19 10:22 PM, linzang(臧琳)
wrote:

Dear All,

I want to re-activate the discussion thread about the implementation of parallel and incremental "jmap -histo". The goal of these changes is to solve the problem that "jmap -histo" may "timeout or killed by timer" when the heap is large. In addition, the result of "jmap -histo" is "all or nothing": if it gets killed before it exits, the user gets no information about the heap.

"Incremental" means that jmap -histo dumps intermediate results while it is iterating the heap, so if it is interrupted, the user still gets some meaningful information. "Parallel" aims to speed up the heap iteration with multi-threading.

Originally I implemented the "incremental dump" so that it dumps the intermediate data into a separate file like , and the final result is saved to another file . So when jmap -histo gets interrupted, the user can get information from , and if jmap -histo works fine, the final result would be in . The parallel dump has multiple threads working on heap iteration, and each thread generates intermediate data periodically.

The main reason for using a separate file for the incremental dump is the parallel incremental dump implementation: every heap-iteration thread can dump its own data to a separate file, which avoids a file lock. However, it seems that the original design might confuse users by producing two or more result files (intermediate result and final result). So I want to ask for your help in discussing the following:

1. For incremental dump without parallel: the intermediate result and the final result are dumped to the same file. In this case, the intermediate data are generated in the middle of heap iteration and written to the file at the same time. If jmap -histo exits normally, the final result will also be dumped to , and then all intermediate data are flushed.

2. For parallel dump without incremental: every thread generates its own thread-local dump buffer, and all thread-local dumps are merged and written to the file at the end. There is no incremental support, so the result is "all or nothing".

3. For parallel + incremental dump, I think it's a little complicated because of intermediate data processing:

   1. Every thread has its own thread-local intermediate data buffer, and all the thread-local buffers are written to the file while holding a file lock. So only one data file is generated, and if jmap -histo is interrupted, the intermediate data are saved in the same file. The problem is that the file write lock can be heavy, which may make the parallel heap dump slow.

   2. Every thread has its own thread-local intermediate data buffer, and every thread saves its result in a temp file named . So there is no file lock, and the parallel dump can be fast. But the problem is that multiple files are generated to save the thread-local intermediate results, and this might confuse the user.

   3. Every thread has its own thread-local intermediate data buffer, and another "data-merging-thread" is created. The parallel threads write data to their thread-local buffers and enqueue a buffer when its data reaches some threshold. The "data-merging-thread" consumes the queue, merges the data from the different threads, and saves the merged data to the result file. In this case only one file is generated and no file lock is needed, but there is a queue lock and a separate "merging thread" implementation. Do you think this is a reasonable solution?

So may I ask for your suggestions? Details of the previous discussion can be found at https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-June/028276.html

Thanks!
BRs,
Lin
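Option 3.3 in the proposal above (thread-local buffers handed to a single merging thread through a queue) can be sketched roughly as follows. This is only an illustration in Java, not the HotSpot C++ code under discussion: the class name `HistoMergeSketch`, the `histogram` method, the use of lists of class names in place of heap regions, and the `threshold` parameter are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class HistoMergeSketch {
    // Sentinel buffer that tells the merging thread all workers have finished.
    private static final Map<String, Long> DONE = new HashMap<>();

    /** Counts class names per partition on worker threads; one merger thread combines them. */
    static Map<String, Long> histogram(List<List<String>> partitions, int threshold)
            throws InterruptedException {
        BlockingQueue<Map<String, Long>> queue = new LinkedBlockingQueue<>();
        Map<String, Long> merged = new HashMap<>(); // written only by the merger thread

        Thread merger = new Thread(() -> {
            try {
                for (Map<String, Long> buf = queue.take(); buf != DONE; buf = queue.take()) {
                    buf.forEach((name, count) -> merged.merge(name, count, Long::sum));
                    // An incremental dump would rewrite the single result file here.
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        merger.start();

        Thread[] workers = new Thread[partitions.size()];
        for (int i = 0; i < workers.length; i++) {
            List<String> part = partitions.get(i);
            workers[i] = new Thread(() -> {
                Map<String, Long> local = new HashMap<>(); // thread-local buffer
                int pending = 0;
                for (String className : part) {
                    local.merge(className, 1L, Long::sum);
                    if (++pending >= threshold) {   // buffer is full: hand it to the merger
                        queue.add(local);
                        local = new HashMap<>();
                        pending = 0;
                    }
                }
                if (!local.isEmpty()) {
                    queue.add(local);               // flush the remainder
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        queue.add(DONE);   // all buffers are queued; let the merger drain and stop
        merger.join();     // join() makes the merger's writes visible to the caller
        return merged;
    }
}
```

The only shared structure is the queue, so the workers never contend on the output file; the cost is the queue lock and the extra thread that the proposal already mentions.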
From linzang at tencent.com Tue Dec 17 03:18:38 2019 From: linzang at tencent.com (linzang(臧琳)) Date: Tue, 17 Dec 2019 03:18:38 +0000 Subject: Jhsdb jmap --heap print large value of MaxMetaspaceSize Message-ID: <6EDC6A6E-668A-4E7B-847C-E9155F6E6598@tencent.com> Dear All, I found that jhsdb jmap --heap prints the value of uint_max (17592186044415 MB) when MaxMetaspaceSize is not set by the user. This number confused me a little. I also found that jcmd VM.metaspace prints "unlimited" if MaxMetaspaceSize is not set, which seems more reasonable. So do you think it is OK if I make "jhsdb jmap" print the same "unlimited" value as jcmd does for MaxMetaspaceSize? BRs, Lin From david.holmes at oracle.com Tue Dec 17 04:04:16 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 17 Dec 2019 14:04:16 +1000 Subject: [14] RFR 8235829: graal crashes with Zombie.java test In-Reply-To: <39716b03-dd5e-580e-2c91-004da64bcc19@oracle.com> References: <447990f9-d276-56c3-aebe-fa84d0cfcc27@oracle.com> <975b6e46-95c1-8a37-b160-b6a57a9633a8@oracle.com> <06d6e9f4-b919-3062-7d18-1245e66b27b2@oracle.com> <39716b03-dd5e-580e-2c91-004da64bcc19@oracle.com> Message-ID: Clarification ... On 17/12/2019 12:40 pm, coleen.phillimore at oracle.com wrote: > > Short answer below. > > On 12/16/19 5:51 PM, David Holmes wrote: >> Hi Coleen, >> >> A quick initial response ... >> >> On 16/12/2019 11:26 pm, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 12/16/19 8:04 AM, David Holmes wrote: >>>> Hi Coleen, >>>> >>>> On 16/12/2019 9:41 pm, coleen.phillimore at oracle.com wrote: >>>>> Summary: Start ServiceThread before compiler threads, and run >>>>> nmethod barriers for zgc before adding to the service thread queue, >>>>> or posting the events on the java thread queue.
>>>> >>>> I can't comment on most of this but the earlier starting of the >>>> service thread has some concerns: >>>> >>>> - there is a lot of JDK level initialization which now will not have >>>> happened before the service thread is started and it is far from >>>> obvious that all possible initialization dependencies will be satisfied >>> >>> I agree that the order of initialization is very sensitive. From the >>> actions that the service thread does, the one that I found was a >>> problem was that events were posted before the LIVE phase (see >>> comment in has_events()), which could have happened with the existing >>> code, but the window for the race is a lot smaller. The other >>> actions can be run if there's a GC before initialization but would be >>> a bug in the initialization code, and I didn't find these bugs in all >>> my testing. There are some ordering dependencies that do have odd >>> side effects (between the compiler thread startup and initialization >>> jsr292 classes) which have comments. This patch doesn't touch those. >>> >>>> >>>> - current starting of the service thread in Management::initialize >>>> is guarded by "#if INCLUDE_MANAGEMENT", but now you are starting the >>>> service thread unconditionally for all builds. Hmm just saw your >>>> latest comment to the bug report - so the service thread is now (for >>>> quite some time?) being used for other than management tasks and so >>>> should always be present even if INCLUDE_MANAGEMENT is not enabled. >>>> Is that sufficient or are there likely to be other changes needed to >>>> actually ensure that all works correctly? e.g. any code the service >>>> thread executes that is only defined for INCLUDE_MANAGEMENT will >>>> need to be compiled out explicitly. >>>> >>> >>> I asked Jie offline to check the minimal build. I don't think there >>> are other INCLUDE_MANAGEMENT actions in the service thread and I'm >>> not sure why it was initialized there in the first place. The
The >>> minimal vm would have been broken ie. hashtables would not have been >>> cleaned up, etc, but I'm not sure how well that is tested or if one >>> would notice. >>>> - the service thread and the notification thread are (were?) closely >>>> related but now started at completely different times >>> >>> The notification thread is limited to "services" so it makes sense >>> where it is.? The ServiceThread does lots of other things.? Maybe it >>> needs renaming in 15. >>>> >>>> The bug report states the problem as: >>>> >>>> "The graal crash is because compiled_method_load events are added to >>>> the ServiceThread's deferred event queue before the ServiceThread is >>>> created so are not walked to keep them from being zombied." >>>> >>>> so why isn't the solution to ensure the deferred event queue is >>>> walked? I'm not clear how starting the service thread relates to >>>> walking the queue. >>>> >>> >>> The service thread is responsible for walking the deferred event >>> queue.?? See ServiceThread::oops_do/nmethods_do.?? The design could >>> be changed to have some global walk somewhere of this queue, but >>> essentially this queue is processed by the service thread. >> >> Sorry I don't follow. I thought "oops_do" and friends are for the GC >> threads and/or VMThread to call to process oops when GC updates them. > > The oops_do and nmethods_do() can be called by a thread walk in > handshakes (by the sweeper thread) and by parallel GC thread walks. > There isn't a single entry to do the thread-specific closures that we > need to do for these deferred event queues.?? I tried a version that > walked the queues with a static call but missed some places where it > would be needed to make this call (didn't work).? Keeping this > associated with the ServiceThread simplifies a lot. Just to clarify that further, the thread walk requires the thread appears in ALL_JAVA_THREADS but that only happens after the ServiceThread has been started. 
So in essence we don't really need the ServiceThread to have commenced execution earlier, but we need it to have been created. Those two steps are combined in practice. Cheers, David > thanks, > Coleen > >> >> David >> ----- >> >>> I had an additional change to make the queue non-static but want to >>> limit the change at this point. >>> >>> Thanks, >>> Coleen >>>> Thanks, >>>> David >>>> >>>>> See bug for description of the problems found with the new >>>>> Zombie.java test. >>>>> >>>>> open webrev at >>>>> http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8235829 >>>>> >>>>> Ran tier1 all platforms, and tier2-8 testing, as well as rerunning >>>>> original test failure from bug >>>>> https://bugs.openjdk.java.net/browse/JDK-8173361. >>>>> >>>>> Thanks, >>>>> Coleen >>> > From chris.plummer at oracle.com Tue Dec 17 05:36:44 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 16 Dec 2019 21:36:44 -0800 Subject: [14]RFR(XS): 8236062: Disable clhsdb initialization of SA javascript support since it will always fail, and will likely be removed soon Message-ID: Hello, Please review the following: https://bugs.openjdk.java.net/browse/JDK-8236062 http://cr.openjdk.java.net/~cjplummer/8236062/webrev.00/ Since SA javascript support is broken as described in [1] JDK-8235594, and we'll likely remove it, I'd like to at least get it disabled now. I'd like to get this into 14 mostly because I really want to get [2] JDK-8234048 fixed in 14 because we'll start seeing the clhsdb test failures on macos 10.14 and 10.15 more often over the coming months as we deploy more macosx test hosts with those versions. 
However, [2] JDK-8234048 is blocked by [3] JDK-8234277 (which improves error checking and failure output for the clhsdb tests), and [3] JDK-8234277 is blocked by this CR because the exceptions produced when javascript fails to initialize end up cluttering the clhsdb test logs, even when the test passes (and is misleading when the test fails). Sorry about all the bug references and inter-dependencies. It's taken a while myself to get my head wrapped around how I wanted to approach fixing them all in a meaningful order. thanks, Chris [1] https://bugs.openjdk.java.net/browse/JDK-8235594 [2] https://bugs.openjdk.java.net/browse/JDK-8234048 [3] https://bugs.openjdk.java.net/browse/JDK-8234277 From david.holmes at oracle.com Tue Dec 17 07:03:00 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 17 Dec 2019 17:03:00 +1000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com> Message-ID: <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com> David On 17/12/2019 4:57 pm, David Holmes wrote: > Hi Richard, > > On 14/12/2019 5:01 am, Reingruber, Richard wrote: >> Hi David, >> >>    > Some further queries/concerns: >>    > >>    > src/hotspot/share/runtime/objectMonitor.cpp >>    > >>    > Can you please explain the changes to ObjectMonitor::wait: >>    > >>    > !   _recursions = save      // restore the old recursion count >>    > !                 + jt->get_and_reset_relock_count_after_wait(); // >>    > increased by the deferred relock count >>    > >>    > what is the "deferred relock count"? I gather it relates to >>    > >>    > "The code was extended to be able to deoptimize objects of a >> frame that >>    > is not the top frame and to let another thread than the owning >> thread do >>    > it." >> >> Yes, these relate.
Currently EA based optimizations are reverted, when >> a compiled frame is replaced >> with corresponding interpreter frames. Part of this is relocking >> objects with eliminated >> locking. New with the enhancement is that we do this also just before >> object references are acquired >> through JVMTI. In this case we deoptimize also the owning compiled >> frame C and we register >> deoptimized objects as deferred updates. When control returns to C it >> gets deoptimized, we notice >> that objects are already deoptimized (reallocated and relocked), so we >> don't do it again (relocking >> twice would be incorrect of course). Deferred updates are copied into >> the new interpreter frames. >> >> Problem: relocking is not possible if the target thread T is waiting >> on the monitor that needs to be >> relocked. This happens only with non-local objects with >> EliminateNestedLocks. Instead relocking is >> deferred until T owns the monitor again. This is what the piece of >> code above does. > > Sorry I need some more detail here. How can you wait() on an object > monitor if the object allocation and/or locking was optimised away? And > what is a "non-local object" in this context? Isn't EA restricted to > thread-confined objects? > > Is it just that some of the locking gets optimized away e.g. > > synchronised(obj) { >   synchronised(obj) { >     synchronised(obj) { >       obj.wait(); >     } >   } > } > > If this is reduced to a form as-if it were a single lock of the monitor > (due to EA) and the wait() triggers a JVM TI event which leads to the > escape of "obj" then we need to reconstruct the true lock state, and so > when the wait() internally unblocks and reacquires the monitor it has to > set the true recursion count to 3, not the 1 that it appeared to be when > wait() was initially called. Is that the scenario? > > If so I find this truly awful. Anyone using wait() in a realistic form > requires a notification and so the object cannot be thread confined.
In > which case I would strongly argue that upon hitting the wait() the deopt > should occur unconditionally and so the lock state is correct before we > wait and so we don't need to mess with the recursion count internally > when we reacquire the monitor. > >> >>    > which I don't like the sound of at all when it comes to >> ObjectMonitor >>    > state. So I'd like to understand in detail exactly what is going >> on here >>    > and why. This is a very intrusive change that seems to badly break >>    > encapsulation and impacts future changes to ObjectMonitor that >> are under >>    > investigation. >> >> I would not regard this as breaking encapsulation. Certainly not badly. >> >> I've added a property relock_count_after_wait to JavaThread. The >> property is well >> encapsulated. Future ObjectMonitor implementations have to deal with >> recursion too. They are free in >> choosing a way to do that as long as that property is taken into >> account. This is hardly a >> limitation. > > I do think this badly breaks encapsulation as you have to add a callout > from the guts of the ObjectMonitor code to reach into the thread to get > this lock count adjustment. I understand why you have had to do this but > I would much rather see a change to the EA optimisation strategy so that > this is not needed. > >> Note also that the property is a straight forward extension of the >> existing concept of deferred >> local updates. It is embedded into the structure holding them. So not >> even the footprint of a >> JavaThread is enlarged if no deferred updates are generated. >> >>    > --- >>    > >>    > src/hotspot/share/runtime/thread.cpp >>    > >>    > Can you please explain why >> JavaThread::wait_for_object_deoptimization >>    > has to be handcrafted in this way rather than using proper >> transitions. >>    > >> >> I wrote wait_for_object_deoptimization taking >> JavaThread::java_suspend_self_with_safepoint_check >> as template.
So in short: for the same reasons :) >> >> Threads reach both methods as part of thread state transitions, >> therefore special handling is >> required to change thread state on top of ongoing transitions. >> >>    > We got rid of "deopt suspend" some time ago and it is disturbing >> to see >>    > it being added back (effectively). This seems like it may be >> something >>    > that handshakes could be used for. >> >> Deopt suspend used to be something rather different with a similar >> name[1]. It is not being added back. > > I stand corrected. Despite comments in the code to the contrary > deopt_suspend didn't actually cause a self-suspend. I was doing a lot of > cleanup in this area 13 years ago :) > >> >> I'm actually duplicating the existing external suspend mechanism, >> because a thread can be suspended >> at most once. And hey, I don't like that either! But it seems not >> unlikely that the duplicate can >> be removed together with the original and the new type of handshakes >> that will be used for >> thread suspend can be used for object deoptimization too. See today's >> discussion in JDK-8227745 [2]. > > I hope that discussion bears some fruit, at the moment it seems not to > be possible to use handshakes here. :( > > The external suspend mechanism is a royal pain in the proverbial that we > have to carefully live with. The idea that we're duplicating that for > use in another fringe area of functionality does not thrill me at all. > > To be clear, I understand the problem that exists and that you wish to > solve, but for the runtime parts I balk at the complexity cost of > solving it. > > Thanks, > David > ----- > >> Thanks, Richard. >> >> [1] Deopt suspend was something like an async. handshake for >> architectures with register windows, >>      where patching the return pc for deoptimization of a compiled >> frame was racy if the owner thread >>      was in native code.
Instead a "deopt" suspend flag was set on >> which the thread patched its own >>      frame upon return from native. So no thread was suspended. It got >> its name only from the name of >>      the flags. >> >> [2] Discussion about using handshakes to sync. with the target thread: >> >> https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727 >> >> >> -----Original Message----- >> From: David Holmes >> Sent: Friday, 13 December 2019 00:56 >> To: Reingruber, Richard ; >> serviceability-dev at openjdk.java.net; >> hotspot-compiler-dev at openjdk.java.net; >> hotspot-runtime-dev at openjdk.java.net >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >> Performance in the Presence of JVMTI Agents >> >> Hi Richard, >> >> Some further queries/concerns: >> >> src/hotspot/share/runtime/objectMonitor.cpp >> >> Can you please explain the changes to ObjectMonitor::wait: >> >> !   _recursions = save      // restore the old recursion count >> !                 + jt->get_and_reset_relock_count_after_wait(); // >> increased by the deferred relock count >> >> what is the "deferred relock count"? I gather it relates to >> >> "The code was extended to be able to deoptimize objects of a frame that >> is not the top frame and to let another thread than the owning thread do >> it." >> >> which I don't like the sound of at all when it comes to ObjectMonitor >> state. So I'd like to understand in detail exactly what is going on here >> and why. This is a very intrusive change that seems to badly break >> encapsulation and impacts future changes to ObjectMonitor that are under >> investigation. >> >> --- >> >> src/hotspot/share/runtime/thread.cpp >> >> Can you please explain why JavaThread::wait_for_object_deoptimization >> has to be handcrafted in this way rather than using proper transitions.
>> >> We got rid of "deopt suspend" some time ago and it is disturbing to see >> it being added back (effectively). This seems like it may be something >> that handshakes could be used for. >> >> Thanks, >> David >> ----- >> >> On 12/12/2019 7:02 am, David Holmes wrote: >>> On 12/12/2019 1:07 am, Reingruber, Richard wrote: >>>> Hi David, >>>> >>>>     > Most of the details here are in areas I can comment on in detail, >>>> but I >>>>     > did take an initial general look at things. >>>> >>>> Thanks for taking the time! >>> >>> Apologies the above should read: >>> >>> "Most of the details here are in areas I *can't* comment on in detail >>> ..." >>> >>> David >>> >>>>     > The only thing that jumped out at me is that I think the >>>>     > DeoptimizeObjectsALotThread should be a hidden thread. >>>>     > >>>>     > +  bool is_hidden_from_external_view() const { return true; } >>>> >>>> Yes, it should. Will add the method like above. >>>> >>>>     > Also I don't see any testing of the DeoptimizeObjectsALotThread. >>>> Without >>>>     > active testing this will just bit-rot. >>>> >>>> DeoptimizeObjectsALot is meant for stress testing with a larger >>>> workload. I will add a minimal test >>>> to keep it fresh. >>>> >>>>     > Also on the tests I don't understand your @requires clause: >>>>     > >>>>     >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & >>>>     > (vm.opt.TieredCompilation != true)) >>>>     > >>>>     > This seems to require that TieredCompilation is disabled, but >>>> tiered is >>>>     > our normal mode of operation. ?? >>>>     > >>>> >>>> I removed the clause. I guess I wanted to target the tests towards the >>>> code they are supposed to >>>> test, and it's easier to analyze failures w/o tiered compilation and >>>> with just one compiler thread. >>>> >>>> Additionally I will make use of >>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests. >>>> >>>> Thanks, >>>> Richard.
>>>> >>>> -----Original Message----- >>>> From: David Holmes >>>> Sent: Wednesday, 11 December 2019 08:03 >>>> To: Reingruber, Richard ; >>>> serviceability-dev at openjdk.java.net; >>>> hotspot-compiler-dev at openjdk.java.net; >>>> hotspot-runtime-dev at openjdk.java.net >>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>>> Performance in the Presence of JVMTI Agents >>>> >>>> Hi Richard, >>>> >>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote: >>>>> Hi, >>>>> >>>>> I would like to get reviews please for >>>>> >>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/ >>>>> >>>>> Corresponding RFE: >>>>> https://bugs.openjdk.java.net/browse/JDK-8227745 >>>>> >>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915 >>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1] >>>>> >>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without >>>>> issues (thanks!). In addition the >>>>> change is being tested at SAP since I posted the first RFR some >>>>> months ago. >>>>> >>>>> The intention of this enhancement is to benefit performance wise from >>>>> escape analysis even if JVMTI >>>>> agents request capabilities that allow them to access local variable >>>>> values. E.g. if you start-up >>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then >>>>> escape analysis is disabled right >>>>> from the beginning, well before a debugger attaches -- if ever one >>>>> should do so. With the >>>>> enhancement, escape analysis will remain enabled until and after a >>>>> debugger attaches. EA based >>>>> optimizations are reverted just before an agent acquires the >>>>> reference to an object. In the JBS item >>>>> you'll find more details. >>>> >>>> Most of the details here are in areas I can comment on in detail, but I >>>> did take an initial general look at things.
>>>> >>>> The only thing that jumped out at me is that I think the >>>> DeoptimizeObjectsALotThread should be a hidden thread. >>>> >>>> +  bool is_hidden_from_external_view() const { return true; } >>>> >>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. >>>> Without >>>> active testing this will just bit-rot. >>>> >>>> Also on the tests I don't understand your @requires clause: >>>> >>>>     @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & >>>> (vm.opt.TieredCompilation != true)) >>>> >>>> This seems to require that TieredCompilation is disabled, but tiered is >>>> our normal mode of operation. ?? >>>> >>>> Thanks, >>>> David >>>> >>>>> Thanks, >>>>> Richard. >>>>> >>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745 >>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch >>>>> >>>>> >>>>> From suenaga at oss.nttdata.com Tue Dec 17 07:36:12 2019 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Tue, 17 Dec 2019 16:36:12 +0900 Subject: [14]RFR(XS): 8236062: Disable clhsdb initialization of SA javascript support since it will always fail, and will likely be removed soon In-Reply-To: References: Message-ID: <286e679f-e887-dee3-afe0-b98cd67d9f0a@oss.nttdata.com> Hi Chris, Looks good. BTW do you have any plan to provide alternative(s) on CLHSDB? Thanks, Yasumasa On 2019/12/17 14:36, Chris Plummer wrote: > Hello, > > Please review the following: > > https://bugs.openjdk.java.net/browse/JDK-8236062 > http://cr.openjdk.java.net/~cjplummer/8236062/webrev.00/ > > Since SA javascript support is broken as described in [1] JDK-8235594, and we'll likely remove it, I'd like to at least get it disabled now. I'd like to get this into 14 mostly because I really want to get [2] JDK-8234048 fixed in 14 because we'll start seeing the clhsdb test failures on macos 10.14 and 10.15 more often over the coming months as we deploy more macosx test hosts with those versions.
However, [2] JDK-8234048 is blocked by [3] JDK-8234277 (which improves error checking and failure output for the clhsdb tests), and [3] JDK-8234277 is blocked by this CR because the exceptions produced when javascript fails to initialize end up cluttering the clhsdb test logs, even when the test passes (and is misleading when the test fails). > > Sorry about all the bug references and inter-dependencies. It's taken a while myself to get my head wrapped around how I wanted to approach fixing them all in a meaningful order. > > thanks, > > Chris > > [1] https://bugs.openjdk.java.net/browse/JDK-8235594 > [2] https://bugs.openjdk.java.net/browse/JDK-8234048 > [3] https://bugs.openjdk.java.net/browse/JDK-8234277 From chris.plummer at oracle.com Tue Dec 17 07:55:06 2019 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 16 Dec 2019 23:55:06 -0800 Subject: [14]RFR(XS): 8236062: Disable clhsdb initialization of SA javascript support since it will always fail, and will likely be removed soon In-Reply-To: <286e679f-e887-dee3-afe0-b98cd67d9f0a@oss.nttdata.com> References: <286e679f-e887-dee3-afe0-b98cd67d9f0a@oss.nttdata.com> Message-ID: Hi Yasumasa, Thanks for the review. If you mean plans for an alternate extension mechanism, we have none at the moment. We are of course open to suggestions. Some have been discussed on the recent thread regarding this topic, but there doesn't seem to be much consensus on how to approach this. thanks, Chris On 12/16/19 11:36 PM, Yasumasa Suenaga wrote: > Hi Chris, > > Looks good. > > BTW do you have any plan to provide alternative(s) on CLHSDB? > > > Thanks, > > Yasumasa > > > On 2019/12/17 14:36, Chris Plummer wrote: >> Hello, >> >> Please review the following: >> >> https://bugs.openjdk.java.net/browse/JDK-8236062 >> http://cr.openjdk.java.net/~cjplummer/8236062/webrev.00/ >> >> Since SA javascript support is broken as described in [1] >> JDK-8235594, and we'll likely remove it, I'd like to at least get it >> disabled now. 
I'd like to get this into 14 mostly because I really >> want to get [2] JDK-8234048 fixed in 14 because we'll start seeing >> the clhsdb test failures on macos 10.14 and 10.15 more often over the >> coming months as we deploy more macosx test hosts with those >> versions. However, [2] JDK-8234048 is blocked by [3] JDK-8234277 >> (which improves error checking and failure output for the clhsdb >> tests), and [3] JDK-8234277 is blocked by this CR because the >> exceptions produced when javascript fails to initialize end up >> cluttering the clhsdb test logs, even when the test passes (and is >> misleading when the test fails). >> >> Sorry about all the bug references and inter-dependencies. It's taken >> a while myself to get my head wrapped around how I wanted to approach >> fixing them all in a meaningful order. >> >> thanks, >> >> Chris >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8235594 >> [2] https://bugs.openjdk.java.net/browse/JDK-8234048 >> [3] https://bugs.openjdk.java.net/browse/JDK-8234277 From per.liden at oracle.com Tue Dec 17 08:14:37 2019 From: per.liden at oracle.com (Per Liden) Date: Tue, 17 Dec 2019 09:14:37 +0100 Subject: [14] RFR 8235829: graal crashes with Zombie.java test In-Reply-To: References: Message-ID: <7535d8d8-f245-78a0-fe58-f0625af43b3a@oracle.com> Hi Coleen, The "nmethod entry barrier"-part looks good to me. Just one minor nit, maybe JvmtiDeferredEventQueue::run_nmethod_entry_barrier should have an "s" on it (i.e. JvmtiDeferredEventQueue::run_nmethod_entry_barriers) since it loops over all entries in the queue? But I don't dare to comment on the ServiceThread initialization order. cheers, Per On 12/16/19 12:41 PM, coleen.phillimore at oracle.com wrote: > Summary: Start ServiceThread before compiler threads, and run nmethod > barriers for zgc before adding to the service thread queue, or posting > the events on the java thread queue. > > See bug for description of the problems found with the new Zombie.java > test. 
> > open webrev at http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8235829 > > Ran tier1 all platforms, and tier2-8 testing, as well as rerunning > original test failure from bug > https://bugs.openjdk.java.net/browse/JDK-8173361. > > Thanks, > Coleen From robbin.ehn at oracle.com Tue Dec 17 09:21:32 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 17 Dec 2019 10:21:32 +0100 Subject: RFR(s): 8235912: JvmtiBreakpoint remove oops_do and metadata_do In-Reply-To: References: Message-ID: <05aa6993-e1e0-c43e-0040-8fee71401f4c@oracle.com> Hi Coleen, On 12/16/19 9:21 PM, coleen.phillimore at oracle.com wrote: > > http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.udiff.html > > + NativeAccess<>::oop_store(_class_holder, class_holder_oop); > > > This should probably be: > > 41 NativeAccess::oop_store(handle, obj); > I have not seen any stores to oopStorage that use that? oopStorage should be 'initialized'. So I prefer not adding another decorator if it's not needed. That would just be confusing. > > You can leave out using OopHandle. I have a patch to add the missing > functionality and add it to your code. Actually, I was looking to see how much > OopHandle is used to see if it's helping anything and there is a lot of code > using it. Most of it is to hide oop* in ClassLoaderData. > > This change otherwise looks great. Thanks, Robbin > Thanks, > Coleen > > >> Thanks for having a look, Robbin >> >> On 12/16/19 1:32 PM, coleen.phillimore at oracle.com wrote: >>> >>> I have to think about this. Could there be breakpoints in old emcp methods >>> that we do not remove? The metadata_do function is trying to keep old >>> Methods from being deleted while there are still references to them.
>>> >>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/src/hotspot/share/prims/jvmtiImpl.hpp.udiff.html >>> >>> >>> + oop* _class_holder; // keeps _method memory from being deallocated >>> >>> >>> We created the class OopHandle to encapsulate strong oopStorage references, >>> although it's missing oop_store. Can you use that? >> >> >>> >>> Coleen >>> >>> On 12/16/19 4:47 AM, Robbin Ehn