From coleen.phillimore at oracle.com  Mon Dec  2 13:42:09 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 2 Dec 2019 08:42:09 -0500
Subject: RFR (S) 8173361: various crashes in
 JvmtiExport::post_compiled_method_load
In-Reply-To: <f2ea5361-9d66-d8c3-c82d-816001855939@oracle.com>
References: <adcc73a3-2647-2a7b-b032-8fe97fe293ab@oracle.com>
 <1bce8841-8e23-8702-d2df-6f6c58a01bbf@oracle.com>
 <399eb99f-08ba-59e1-2fdc-ba5fc66d9ae5@oracle.com>
 <4f9f5421-29a6-d191-b03e-093a219e41cb@oracle.com>
 <40068f02-812c-b4bc-c150-d12e2fa03f01@oracle.com>
 <bad60054-4cdc-097b-3948-548a7db55995@oracle.com>
 <3f95ae42-346e-3caa-06bd-0facfb939225@oracle.com>
 <2cb4cdb4-4acf-2abf-af79-445bdfc651b5@oracle.com>
 <eca6dfa3-4c25-0f2f-781b-0515c4e48e0a@oracle.com>
 <24fc9c1c-bfd1-5edf-2231-d9ba0e0885f5@oracle.com>
 <020d7c80-4d77-ee96-5a7b-74acdbd54f86@oracle.com>
 <568c2562-0a56-73ac-c0af-43339d701b19@oracle.com>
 <200cb839-9019-58f1-17e5-7a0426a6035b@oracle.com>
 <f2ea5361-9d66-d8c3-c82d-816001855939@oracle.com>
Message-ID: <c216522c-9b57-9b2e-a1c4-0f0410ff7e33@oracle.com>

Thanks Erik!
Coleen

On 11/25/19 9:37 AM, Erik Österlund wrote:
> Hi Coleen,
>
> Still good BTW!
>
> Thanks,
> /Erik
>
> On 2019-11-25 14:47, coleen.phillimore at oracle.com wrote:
>> Thanks for the code review, Serguei!
>> Coleen
>>
>> On 11/22/19 6:34 PM, serguei.spitsyn at oracle.com wrote:
>>> Hi Coleen,
>>>
>>> +1
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 11/22/19 14:53, Daniel D. Daugherty wrote:
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.04.incr/webrev
>>>>
>>>> src/hotspot/share/prims/jvmtiImpl.cpp
>>>>     No comments.
>>>>
>>>> src/hotspot/share/prims/jvmtiImpl.hpp
>>>>     No comments.
>>>>
>>>> src/hotspot/share/runtime/serviceThread.cpp
>>>>     No comments.
>>>>
>>>> Thumbs up.
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 11/22/19 2:15 PM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> Dan, Thank you for reviewing this!
>>>>>
>>>>> On 11/22/19 12:49 PM, Daniel D. Daugherty wrote:
>>>>>> Hi Coleen,
>>>>>>
>>>>>> Sorry for the delay in getting back to this re-review.
>>>>>>
>>>>>>
>>>>>> On 11/21/19 9:12 AM, coleen.phillimore at oracle.com wrote:
>>>>>>>
>>>>>>> Please review a new version of this change that keeps the 
>>>>>>> nmethod from being unloaded, after it is added to the deferred 
>>>>>>> event queue:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/index.html
>>>>>>
>>>>>> src/hotspot/share/code/nmethod.cpp
>>>>>>     No comments.
>>>>>>
>>>>>> src/hotspot/share/oops/instanceKlass.cpp
>>>>>>     No comments.
>>>>>>
>>>>>> src/hotspot/share/prims/jvmtiExport.cpp
>>>>>>     No comments.
>>>>>>
>>>>>> src/hotspot/share/prims/jvmtiImpl.cpp
>>>>>>     Nice solution with the new oops_do() and nmethods_do() functions!
>>>>> Erik's insistence!
>>>>>>
>>>>>>     old L988: void JvmtiDeferredEventQueue::enqueue(const JvmtiDeferredEvent& event) {
>>>>>>     new L998: void JvmtiDeferredEventQueue::enqueue(JvmtiDeferredEvent event) {
>>>>>>         Not sure why this was changed.
>>>>>>
>>>>>>         Update: Looks like Serguei raised the issue and Coleen has already
>>>>>>         resolved it.
>>>>>
>>>>> Yes.
>>>>>>
>>>>>> src/hotspot/share/prims/jvmtiImpl.hpp
>>>>>>     old L494:     QueueNode(const JvmtiDeferredEvent& event)
>>>>>>     new L498:     QueueNode(JvmtiDeferredEvent& event)
>>>>>>         Why was this changed?
>>>>>>
>>>>>>         Update: Not clear if this was covered by Coleen's reply to Serguei.
>>>>>>
>>>>>>     old L497:     const JvmtiDeferredEvent& event() const { return _event; }
>>>>>>     new L501:     JvmtiDeferredEvent& event() { return _event; }
>>>>>>         Why was this changed?
>>>>>>
>>>>>>         Update: Coleen's reply to Serguei explained this. Perhaps add:
>>>>>>                   // Not const because of oops_do() and nmethods_do().
>>>>>>
>>>>>>     old L509:   static void enqueue(const JvmtiDeferredEvent& event) NOT_JVMTI_RETURN;
>>>>>>     new L513:   static void enqueue(JvmtiDeferredEvent event) NOT_JVMTI_RETURN;
>>>>>>         Why was this changed?
>>>>>>
>>>>>>         Update: Looks like Serguei raised the issue and Coleen has already
>>>>>>         resolved it.
>>>>>
>>>>> Yes, I fixed these.
>>>>>>
>>>>>> src/hotspot/share/runtime/mutexLocker.cpp
>>>>>>     This change is going to require some testing to make sure we don't
>>>>>>     have any new deadlock scenarios.
>>>>>
>>>>> Luckily, I've previously added an implicit NoSafepointVerifier to 
>>>>> locks that are _allow_vm_block = true, like this one:
>>>>> + def(JmethodIdCreation_lock , PaddedMutex , leaf, true, _safepoint_check_never); // used for creating jmethodIDs.
>>>>> This prevents one class of deadlock. If we take out another lock 
>>>>> with a higher rank, we'll get the ranking assert.
>>>>>
>>>>> This lock guards insertion into an array, and makes few outside 
>>>>> calls.
>>>>>
>>>>> I'm running tests in tier 1-6 but any code that travels through 
>>>>> this should get these assertion checks, rather than deadlocking.
>>>>>
>>>>>>
>>>>>> src/hotspot/share/runtime/serviceThread.cpp
>>>>>>     L50 - nit - why the extra blank line?
>>>>>
>>>>> To separate static data member definitions from functions.  I 
>>>>> removed it.
>>>>>>
>>>>>> src/hotspot/share/runtime/serviceThread.hpp
>>>>>>     Thanks for cleaning up the static:
>>>>>>
>>>>>>       ServiceThread::is_service_thread(Thread* thread)
>>>>>>
>>>>>>     stuff. Having it be different than the other threads was
>>>>>>     a bit jarring.
>>>>>>
>>>>>> src/hotspot/share/runtime/thread.hpp
>>>>>>     No comments.
>>>>>>
>>>>>> Thumbs up. My only comments are nits so I don't need to see a
>>>>>> new webrev if you decide to fix them.
>>>>>
>>>>> So it turns out that, in stress testing my fix for 
>>>>> https://bugs.openjdk.java.net/browse/JDK-8212160 (because I was in 
>>>>> the area and thought this was a duplicate of that bug; it is not), 
>>>>> I found that calling oops_do and nmethods_do for the ServiceThread 
>>>>> needs to hold the Service_lock, because other threads can be adding 
>>>>> things to the global queue while the sweeper thread is calling this 
>>>>> in a handshake.
>>>>>
>>>>> I am now retesting this change with the changes above, and with 
>>>>> the Service_lock.  So far my stress tests for JDK-8212160 and 
>>>>> the stress test for this bug pass, but I'm going to run through 
>>>>> all the tiers 1-6 over the weekend.
>>>>>
>>>>> Please have a look at the changes in the meantime.
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.04.incr/webrev
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.04/webrev
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>>
>>>>>>> Ran the test that failed 100 times without failure, tier1 on 
>>>>>>> Oracle supported platforms, and tier2-3 including jvmti and jdi 
>>>>>>> tests locally.
>>>>>>>
>>>>>>> See bug for more details about the crash.
>>>>>>>
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8173361
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>>>
>>>>>>> On 11/18/19 10:09 PM, coleen.phillimore at oracle.com wrote:
>>>>>>>>
>>>>>>>> Hi Serguei,
>>>>>>>>
>>>>>>>> Sorry for not sending an update.  I talked to Erik and am 
>>>>>>>> working on a version that keeps the nmethod from being unloaded 
>>>>>>>> while it's in the deferred event queue, with a version that the 
>>>>>>>> GC people will like, and I like.  I'm testing it out now.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>> Coleen
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/18/19 10:03 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Coleen,
>>>>>>>>>
>>>>>>>>> Sorry for the latency, I had to investigate it a little bit.
>>>>>>>>> I still have some doubt that your fix is the right thing to do.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 11/16/19 04:55, coleen.phillimore at oracle.com wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 11/15/19 11:17 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>> Hi Coleen,
>>>>>>>>>>>
>>>>>>>>>>> On 11/15/19 2:12 PM, coleen.phillimore at oracle.com wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi, I've been working on answers to these questions, so 
>>>>>>>>>>>> I'll start with this one.
>>>>>>>>>>>>
>>>>>>>>>>>> The nmethodLocker keeps the nmethod from being reclaimed 
>>>>>>>>>>>> (made_zombie or memory released) by the sweeper, but the 
>>>>>>>>>>>> nmethod could be unloaded.  Unloading the nmethod clears 
>>>>>>>>>>>> the Method* _method field.
>>>>>>>>>>>
>>>>>>>>>>> Yes, I see it is done in the nmethod::make_unloaded().
>>>>>>>>>>>
>>>>>>>>>>>> The post_compiled_method_load event needs the _method field 
>>>>>>>>>>>> to look at things like inlining and ScopeDesc fields.  If 
>>>>>>>>>>>> the nmethod is unloaded, some of the oops are dead.  There 
>>>>>>>>>>>> are "holder" oops that correspond to the metadata in the 
>>>>>>>>>>>> nmethod. If these oops are dead, causing the nmethod to get 
>>>>>>>>>>>> unloaded, then the metadata may not be valid.
>>>>>>>>>>>>
>>>>>>>>>>>> So my change 02 looks for a NULL nmethod._method field to 
>>>>>>>>>>>> tell whether we can post information about the nmethod.
>>>>>>>>>>>>
>>>>>>>>>>>> There's code in nmethod.cpp like:
>>>>>>>>>>>>
>>>>>>>>>>>> jmethodID nmethod::get_and_cache_jmethod_id() {
>>>>>>>>>>>>   if (_jmethod_id == NULL) {
>>>>>>>>>>>>     // Cache the jmethod_id since it can no longer be looked up once the
>>>>>>>>>>>>     // method itself has been marked for unloading.
>>>>>>>>>>>>     _jmethod_id = method()->jmethod_id();
>>>>>>>>>>>>   }
>>>>>>>>>>>>   return _jmethod_id;
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> Which was added when post_method_load and unload were 
>>>>>>>>>>>> turned into deferred events.
>>>>>>>>>>>
>>>>>>>>>>> Could we cache the jmethodID in the 
>>>>>>>>>>> JvmtiDeferredEvent::compiled_method_load_event
>>>>>>>>>>> similarly as we do in the 
>>>>>>>>>>> JvmtiDeferredEvent::compiled_method_unload_event?
>>>>>>>>>>> This would help to get rid of the dependency on the 
>>>>>>>>>>> nmethod::_method.
>>>>>>>>>>> Do we depend on any other nmethod fields?
>>>>>>>>>>
>>>>>>>>>> Yes, there is other nmethod metadata that we rely on to 
>>>>>>>>>> print inline information, and the function 
>>>>>>>>>> JvmtiCodeBlobEvents::build_jvmti_addr_location_map also 
>>>>>>>>>> depends on it because it uses the ScopeDesc data in the nmethod.
>>>>>>>>>
>>>>>>>>> One possible approach is to prepare and cache all this information
>>>>>>>>> in the nmethod::post_compiled_method_load_event() before the
>>>>>>>>> JvmtiDeferredEvent::compiled_method_load_event() is called.
>>>>>>>>> The event parameters are:
>>>>>>>>> typedef struct {
>>>>>>>>>      const void* start_address;
>>>>>>>>>      jlocation location;
>>>>>>>>> } jvmtiAddrLocationMap;
>>>>>>>>> CompiledMethodLoad(jvmtiEnv *jvmti_env,
>>>>>>>>>              jmethodID method,
>>>>>>>>>              jint code_size,
>>>>>>>>>              const void* code_addr,
>>>>>>>>>              jint map_length,
>>>>>>>>>              const jvmtiAddrLocationMap* map,
>>>>>>>>>              const void* compile_info)
>>>>>>>>> Some of the addresses above might not be accessible when an 
>>>>>>>>> event is posted.
>>>>>>>>> Not sure yet if that is okay.
>>>>>>>>> The question is whether this kind of refactoring is worth it 
>>>>>>>>> and the right thing to do.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> We do cache the jmethodID but that's not good enough.  See my 
>>>>>>>>>> last comment in the bug report. The jmethodID can point to an 
>>>>>>>>>> unloaded method.
>>>>>>>>>
>>>>>>>>> This looks like it is done a little bit late.
>>>>>>>>> It'd be better to do it before the event is deferred (see above).
>>>>>>>>>
>>>>>>>>>> I tried a version of keeping the nmethod alive, but the GC 
>>>>>>>>>> folks will hate it.  And it doesn't work and I hate it.
>>>>>>>>>
>>>>>>>>> From a serviceability point of view this is the best and most 
>>>>>>>>> consistent approach.
>>>>>>>>> It seems to me it was initially designed this way.
>>>>>>>>> The downside is that it adds some extra complexity to the GC.
>>>>>>>>>
>>>>>>>>>> My version 01 is the best, with the caveat that maybe it 
>>>>>>>>>> should check for _method == NULL instead of 
>>>>>>>>>> nmethod->is_alive().  I have to talk to Erik to see if 
>>>>>>>>>> there's a race with concurrent class unloading.
>>>>>>>>>>
>>>>>>>>>> Any application that depends on a compiled method loading 
>>>>>>>>>> event on a class that could be unloaded is a buggy 
>>>>>>>>>> application.  Applications should not rely on when the JIT 
>>>>>>>>>> compiler decides to compile a method!  This happens to us for 
>>>>>>>>>> a stress test.  Most applications will get most of their 
>>>>>>>>>> compiled method loading events as they normally do.
>>>>>>>>>
>>>>>>>>> It is not an application that relies on the compiled method 
>>>>>>>>> loading event.
>>>>>>>>> It is about profiling tools being able to get correct 
>>>>>>>>> information about what is going on with compilations.
>>>>>>>>> My concern is that if we skip such compiled method load events 
>>>>>>>>> then profilers have no way
>>>>>>>>> to find out that there are many unneeded compilations that are 
>>>>>>>>> thrown away without any real use.
>>>>>>>>> Also, it is not clear what happens with the subsequent 
>>>>>>>>> compiled method unload events.
>>>>>>>>> Are they going to be skipped as well, or can they appear and 
>>>>>>>>> confuse profilers?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Coleen
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>>> I put more debugging in the bug to show this crash was from 
>>>>>>>>>>>> an unloaded nmethod.
>>>>>>>>>>>>
>>>>>>>>>>>> Coleen
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 11/15/19 4:45 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>> Hi Coleen,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have some questions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Both the compiled method load and unload are posted as 
>>>>>>>>>>>>> deferred events.
>>>>>>>>>>>>> Both events keep the nmethod alive until the ServiceThread 
>>>>>>>>>>>>> processes the event.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The implementation is:
>>>>>>>>>>>>>
>>>>>>>>>>>>> JvmtiDeferredEvent JvmtiDeferredEvent::compiled_method_load_event(nmethod* nm) {
>>>>>>>>>>>>>   . . .
>>>>>>>>>>>>>   // Keep the nmethod alive until the ServiceThread can process
>>>>>>>>>>>>>   // this deferred event.
>>>>>>>>>>>>>   nmethodLocker::lock_nmethod(nm);
>>>>>>>>>>>>>   return event;
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> JvmtiDeferredEvent JvmtiDeferredEvent::compiled_method_unload_event(
>>>>>>>>>>>>>     nmethod* nm, jmethodID id, const void* code) {
>>>>>>>>>>>>>   . . .
>>>>>>>>>>>>>   // Keep the nmethod alive until the ServiceThread can process
>>>>>>>>>>>>>   // this deferred event. This will keep the memory for the
>>>>>>>>>>>>>   // generated code from being reused too early. We pass
>>>>>>>>>>>>>   // zombie_ok == true here so that our nmethod that was just
>>>>>>>>>>>>>   // made into a zombie can be locked.
>>>>>>>>>>>>>   nmethodLocker::lock_nmethod(nm, true /* zombie_ok */);
>>>>>>>>>>>>>   return event;
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> void JvmtiDeferredEvent::post() {
>>>>>>>>>>>>>   assert(ServiceThread::is_service_thread(Thread::current()),
>>>>>>>>>>>>>          "Service thread must post enqueued events");
>>>>>>>>>>>>>   switch(_type) {
>>>>>>>>>>>>>     case TYPE_COMPILED_METHOD_LOAD: {
>>>>>>>>>>>>>       nmethod* nm = _event_data.compiled_method_load;
>>>>>>>>>>>>>       JvmtiExport::post_compiled_method_load(nm);
>>>>>>>>>>>>>       // done with the deferred event so unlock the nmethod
>>>>>>>>>>>>>       nmethodLocker::unlock_nmethod(nm);
>>>>>>>>>>>>>       break;
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     case TYPE_COMPILED_METHOD_UNLOAD: {
>>>>>>>>>>>>>       nmethod* nm = _event_data.compiled_method_unload.nm;
>>>>>>>>>>>>>       JvmtiExport::post_compiled_method_unload(
>>>>>>>>>>>>>         _event_data.compiled_method_unload.method_id,
>>>>>>>>>>>>>         _event_data.compiled_method_unload.code_begin);
>>>>>>>>>>>>>       // done with the deferred event so unlock the nmethod
>>>>>>>>>>>>>       nmethodLocker::unlock_nmethod(nm);
>>>>>>>>>>>>>       break;
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     . . .
>>>>>>>>>>>>>   }
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> Then I wonder how it is possible for the nmethod to not be 
>>>>>>>>>>>>> alive here:
>>>>>>>>>>>>> void JvmtiExport::post_compiled_method_load(nmethod *nm) {
>>>>>>>>>>>>>   . . .
>>>>>>>>>>>>>   // It's not safe to look at metadata for unloaded methods.
>>>>>>>>>>>>>   if (!nm->is_alive()) {
>>>>>>>>>>>>>     return;
>>>>>>>>>>>>>   }
>>>>>>>>>>>>> At least, it looks like something else is broken.
>>>>>>>>>>>>> Am I missing something important here?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote:
>>>>>>>>>>>>>> Summary: Don't post information which uses metadata from 
>>>>>>>>>>>>>> unloaded nmethods
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Tested tier1-3 and 100 times with test that failed 
>>>>>>>>>>>>>> (reproduced failure without the fix).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> open webrev at 
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev
>>>>>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Coleen
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

From coleen.phillimore at oracle.com  Mon Dec  2 14:43:38 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 2 Dec 2019 09:43:38 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com>
Message-ID: <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com>


On 11/26/19 7:03 PM, David Holmes wrote:
> (adding runtime as well)
>
> Hi Coleen,
>
> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote:
>> Summary: Add local deferred event list to thread to post events 
>> outside CodeCache_lock.
>>
>> This patch builds on the patch for JDK-8173361.  With this patch, I 
>> made the JvmtiDeferredEventQueue an instance class (not AllStatic) 
>> and have one per thread.  The CodeBlob event, which used to drop the 
>> CodeCache_lock and race with the sweeper thread, now adds the events 
>> it wants to post to its thread-local list and processes the list 
>> outside the lock.  The list is walked in GC and by the sweeper to keep 
>> the nmethods from being unloaded and zombied, respectively.
>
> Sorry I don't understand why we would want/need a deferred event queue 
> for every JavaThread? Isn't this only relevant for non-JavaThreads 
> that need to have the ServiceThread process the deferred event?

I thought I'd written this in the bug but I had only discussed this with 
Erik.  I've added a comment to the bug to explain why I added the 
per-JavaThread queue.  In order to process these events after the 
CodeCache_lock is dropped, I have to queue them somewhere safe. The 
ServiceThread queue is safe, *but* the ServiceThread can't keep up with 
the events, especially from this test case.  So the test case gets a 
native OOM.

So I've added the safe queue as a field to each JavaThread because 
multiple JavaThreads could be posting these events at the same time, and 
there didn't seem to be a better safe place to cache them, without 
adding another layer of queuing code.
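
A minimal sketch of the collect-under-lock, post-after-unlock pattern 
described above, in plain standard C++ rather than HotSpot code. The 
names DeferredEvent, code_cache_lock and collect_then_post are 
hypothetical stand-ins chosen for illustration; they are not the types 
or functions in the webrev.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for a JvmtiDeferredEvent; not the HotSpot type.
struct DeferredEvent {
  std::string description;
};

// Hypothetical stand-in for CodeCache_lock.
static std::mutex code_cache_lock;

// Queues one event per blob name while the lock is held, then posts the
// queued events after the lock has been released. Returns the number posted.
inline int collect_then_post(
    const std::vector<std::string>& blobs,
    const std::function<void(const DeferredEvent&)>& post) {
  // One queue per calling thread, so concurrent callers never share it.
  thread_local std::vector<DeferredEvent> local_queue;
  {
    std::lock_guard<std::mutex> guard(code_cache_lock);
    for (const std::string& name : blobs) {
      local_queue.push_back(DeferredEvent{name});  // defer, do not post yet
    }
  }  // lock released here

  // Detach the pending events before running any callback.
  std::vector<DeferredEvent> to_post;
  to_post.swap(local_queue);

  int posted = 0;
  for (const DeferredEvent& e : to_post) {
    post(e);  // callback runs without the lock held
    ++posted;
  }
  assert(local_queue.empty() || posted > 0);
  return posted;
}
```

Because each thread owns its queue, no extra synchronization is needed 
between posting threads; the only shared state in this sketch is the lock 
itself. It does not model the GC or sweeper walking the queue.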

I did write comments to this effect here:

http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html

Thanks,
Coleen

>
> David
>
>> Also, the jmethod_id field in nmethod was only used as a boolean so 
>> don't create a jmethod_id until needed for post_compiled_method_unload.
>>
>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed in 
>> the original bug report.
>>
>> open webrev at 
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>
>> Thanks,
>> Coleen


From david.holmes at oracle.com  Tue Dec  3 04:52:28 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 3 Dec 2019 14:52:28 +1000
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com>
 <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com>
Message-ID: <df7f9aeb-7b42-0fd5-375b-db7ade297bdb@oracle.com>

Hi Coleen,

On 3/12/2019 12:43 am, coleen.phillimore at oracle.com wrote:
> 
> 
> On 11/26/19 7:03 PM, David Holmes wrote:
>> (adding runtime as well)
>>
>> Hi Coleen,
>>
>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote:
>>> Summary: Add local deferred event list to thread to post events 
>>> outside CodeCache_lock.
>>>
>>> This patch builds on the patch for JDK-8173361.  With this patch, I 
>>> made the JvmtiDeferredEventQueue an instance class (not AllStatic) 
>>> and have one per thread.  The CodeBlob event, which used to drop the 
>>> CodeCache_lock and race with the sweeper thread, now adds the events 
>>> it wants to post to its thread-local list and processes the list 
>>> outside the lock.  The list is walked in GC and by the sweeper to keep 
>>> the nmethods from being unloaded and zombied, respectively.
>>
>> Sorry I don't understand why we would want/need a deferred event queue 
>> for every JavaThread? Isn't this only relevant for non-JavaThreads 
>> that need to have the ServiceThread process the deferred event?
> 
> I thought I'd written this in the bug but I had only discussed this with 
> Erik.  I've added a comment to the bug to explain why I added the 
> per-JavaThread queue.  In order to process these events after the 
> CodeCache_lock is dropped, I have to queue them somewhere safe. The 
> ServiceThread queue is safe, *but* the ServiceThread can't keep up with 
> the events, especially from this test case.  So the test case gets a 
> native OOM.
> 
> So I've added the safe queue as a field to each JavaThread because 
> multiple JavaThreads could be posting these events at the same time, and 
> there didn't seem to be a better safe place to cache them, without 
> adding another layer of queuing code.

I think I'm getting the picture now. At the time the events are 
generated we can't post them directly because the current thread is 
inside compiler code. Hence the events must be deferred. Using the 
ServiceThread to handle the deferred events is one way to deal with this 
- but it can't keep up in this scenario. So instead we store the events 
in the current thread and when the current thread returns to code where 
it is safe to post the events, it does so itself. Is that generally correct?

I admit I'm not keen on adding this additional field per-thread just for 
a temporary usage. Some kind of stack allocated helper would be 
preferable, but would need to be passed through the call chain so that 
the events could be added to it.

Also I'm not clear why we aggressively delete the _jvmti_event_queue 
after posting the events. I'd be worried about the overhead we are 
introducing for creating and deleting this queue. When the 
JvmtiDeferredEventQueue data structure was intended only for use by the 
ServiceThread its dynamic node allocation may have made more sense. But 
now that seems like a liability to me - if JvmtiDeferredEvents could be 
linked directly we wouldn't need dynamic nodes, nor dynamic per-thread 
queues (just a per-thread pointer).
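
To make the "linked directly" idea concrete, here is a rough sketch of an 
intrusive FIFO in plain C++, where each event carries its own next 
pointer and the queue holds only head and tail pointers. LinkedEvent and 
IntrusiveEventQueue are hypothetical names for illustration, not 
existing HotSpot classes, and the sketch leaves open who owns the event 
storage.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical event type that carries its own link (intrusive list).
struct LinkedEvent {
  int id;
  LinkedEvent* next;
  explicit LinkedEvent(int i) : id(i), next(nullptr) {}
};

// FIFO queue that stores only head/tail pointers; it never allocates.
class IntrusiveEventQueue {
  LinkedEvent* _head = nullptr;
  LinkedEvent* _tail = nullptr;

 public:
  bool is_empty() const { return _head == nullptr; }

  // Appends an event owned by the caller to the end of the queue.
  void enqueue(LinkedEvent* e) {
    assert(e != nullptr);
    e->next = nullptr;
    if (_tail == nullptr) {
      _head = e;
    } else {
      _tail->next = e;
    }
    _tail = e;
  }

  // Removes and returns the oldest event, or nullptr if the queue is empty.
  LinkedEvent* dequeue() {
    LinkedEvent* e = _head;
    if (e != nullptr) {
      _head = e->next;
      if (_head == nullptr) {
        _tail = nullptr;
      }
      e->next = nullptr;
    }
    return e;
  }
};
```

The trade-off in this sketch is that the events themselves must live 
somewhere stable while they are queued, since the queue no longer copies 
them into nodes of its own.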

Just some thoughts.

Thanks,
David

> I did write comments to this effect here:
> 
> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html 
> 
> 
> Thanks,
> Coleen
> 
>>
>> David
>>
>>> Also, the jmethod_id field in nmethod was only used as a boolean so 
>>> don't create a jmethod_id until needed for post_compiled_method_unload.
>>>
>>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed in 
>>> the original bug report.
>>>
>>> open webrev at 
>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>
>>> Thanks,
>>> Coleen
> 

From erik.osterlund at oracle.com  Tue Dec  3 12:48:21 2019
From: erik.osterlund at oracle.com (erik.osterlund at oracle.com)
Date: Tue, 3 Dec 2019 13:48:21 +0100
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com>
 <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com>
Message-ID: <30df2087-a20d-f745-8c19-fb0173a50421@oracle.com>

Hi Coleen,

This looks great. Thanks for sorting this out!

/Erik

On 12/2/19 3:43 PM, coleen.phillimore at oracle.com wrote:
>
>
> On 11/26/19 7:03 PM, David Holmes wrote:
>> (adding runtime as well)
>>
>> Hi Coleen,
>>
>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote:
>>> Summary: Add local deferred event list to thread to post events 
>>> outside CodeCache_lock.
>>>
>>> This patch builds on the patch for JDK-8173361.  With this patch, I 
>>> made the JvmtiDeferredEventQueue an instance class (not AllStatic) 
>>> and have one per thread.  The CodeBlob event, which used to drop the 
>>> CodeCache_lock and race with the sweeper thread, now adds the events 
>>> it wants to post to its thread-local list and processes the list 
>>> outside the lock.  The list is walked in GC and by the sweeper to keep 
>>> the nmethods from being unloaded and zombied, respectively.
>>
>> Sorry I don't understand why we would want/need a deferred event 
>> queue for every JavaThread? Isn't this only relevant for 
>> non-JavaThreads that need to have the ServiceThread process the 
>> deferred event?
>
> I thought I'd written this in the bug but I had only discussed this 
> with Erik.  I've added a comment to the bug to explain why I added the 
> per-JavaThread queue.  In order to process these events after the 
> CodeCache_lock is dropped, I have to queue them somewhere safe. The 
> ServiceThread queue is safe, *but* the ServiceThread can't keep up 
> with the events, especially from this test case.  So the test case 
> gets a native OOM.
>
> So I've added the safe queue as a field to each JavaThread because 
> multiple JavaThreads could be posting these events at the same time, 
> and there didn't seem to be a better safe place to cache them, without 
> adding another layer of queuing code.
>
> I did write comments to this effect here:
>
> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html 
>
>
> Thanks,
> Coleen
>
>>
>> David
>>
>>> Also, the jmethod_id field in nmethod was only used as a boolean so 
>>> don't create a jmethod_id until needed for post_compiled_method_unload.
>>>
>>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed 
>>> in the original bug report.
>>>
>>> open webrev at 
>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>
>>> Thanks,
>>> Coleen
>


From coleen.phillimore at oracle.com  Tue Dec  3 13:08:25 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 3 Dec 2019 08:08:25 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <df7f9aeb-7b42-0fd5-375b-db7ade297bdb@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com>
 <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com>
 <df7f9aeb-7b42-0fd5-375b-db7ade297bdb@oracle.com>
Message-ID: <af198cb0-8c9e-0db7-76c2-cd9f06c78288@oracle.com>


On 12/2/19 11:52 PM, David Holmes wrote:
> Hi Coleen,
>
> On 3/12/2019 12:43 am, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 11/26/19 7:03 PM, David Holmes wrote:
>>> (adding runtime as well)
>>>
>>> Hi Coleen,
>>>
>>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote:
>>>> Summary: Add local deferred event list to thread to post events 
>>>> outside CodeCache_lock.
>>>>
>>>> This patch builds on the patch for JDK-8173361.  With this patch, I 
>>>> made the JvmtiDeferredEventQueue an instance class (not AllStatic) 
>>>> and have one per thread.  The CodeBlob event, which used to drop the 
>>>> CodeCache_lock and race with the sweeper thread, now adds the events 
>>>> it wants to post to its thread-local list and processes the list 
>>>> outside the lock.  The list is walked in GC and by the sweeper to 
>>>> keep the nmethods from being unloaded and zombied, respectively.
>>>
>>> Sorry I don't understand why we would want/need a deferred event 
>>> queue for every JavaThread? Isn't this only relevant for 
>>> non-JavaThreads that need to have the ServiceThread process the 
>>> deferred event?
>>
>> I thought I'd written this in the bug but I had only discussed this 
>> with Erik.  I've added a comment to the bug to explain why I added 
>> the per-JavaThread queue.  In order to process these events after the 
>> CodeCache_lock is dropped, I have to queue them somewhere safe. The 
>> ServiceThread queue is safe, *but* the ServiceThread can't keep up 
>> with the events, especially from this test case.  So the test case 
>> gets a native OOM.
>>
>> So I've added the safe queue as a field to each JavaThread because 
>> multiple JavaThreads could be posting these events at the same time, 
>> and there didn't seem to be a better safe place to cache them, 
>> without adding another layer of queuing code.
>
> I think I'm getting the picture now. At the time the events are 
> generated we can't post them directly because the current thread is 
> inside compiler code. Hence the events must be deferred. Using the 
> ServiceThread to handle the deferred events is one way to deal with 
> this - but it can't keep up in this scenario. So instead we store the 
> events in the current thread and when the current thread returns to 
> code where it is safe to post the events, it does so itself. Is that 
> generally correct?

Yes.
>
> I admit I'm not keen on adding this additional field per-thread just 
> for a temporary usage. Some kind of stack allocated helper would be 
> preferable, but would need to be passed through the call chain so that 
> the events could be added to it.

Right, and the GC and nmethods_do have to find it somehow.  It wasn't my 
first choice of where to put it either, because there are too many things 
in JavaThread.  Might be time for a future cleanup of Thread.
>
> Also I'm not clear why we aggressively delete the _jvmti_event_queue 
> after posting the events. I'd be worried about the overhead we are 
> introducing for creating and deleting this queue. When the 
> JvmtiDeferredEventQueue data structure was intended only for use by 
> the ServiceThread its dynamic node allocation may have made more 
> sense. But now that seems like a liability to me - if 
> JvmtiDeferredEvents could be linked directly we wouldn't need dynamic 
> nodes, nor dynamic per-thread queues (just a per-thread pointer).

I'm not following.? The queue is for multiple events that might be 
posted while in the CodeCache_lock, so they need to be in order and 
linked together.? While we post them and take them off, if the callback 
safepoints (maybe calls back into the JVM), we don't want to have GC or 
nmethods_do walk the one that's been posted already. So a queue seems to 
make sense.

One thing that I experimented with was to have the ServiceThread take
ownership of the queue in its local thread queue and post them all,
which could be a future enhancement. It didn't help my OOM situation.

Deleting the queue after all the events are posted means that
JavaThread::oops_do and nmethods_do need only a null check to deal with
this jvmti wart.

Thanks,
Coleen
>
> Just some thoughts.
>
> Thanks,
> David
>
>> I did write comments to this effect here:
>>
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html 
>>
>>
>> Thanks,
>> Coleen
>>
>>>
>>> David
>>>
>>>> Also, the jmethod_id field in nmethod was only used as a boolean so 
>>>> don't create a jmethod_id until needed for 
>>>> post_compiled_method_unload.
>>>>
>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed 
>>>> in the original bug report.
>>>>
>>>> open webrev at 
>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>
>>>> Thanks,
>>>> Coleen
>>


From coleen.phillimore at oracle.com  Tue Dec  3 13:11:12 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 3 Dec 2019 08:11:12 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <30df2087-a20d-f745-8c19-fb0173a50421@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com>
 <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com>
 <30df2087-a20d-f745-8c19-fb0173a50421@oracle.com>
Message-ID: <71c63f79-a893-6e3d-f418-bc16670c65ca@oracle.com>


Thanks Erik!
Coleen

On 12/3/19 7:48 AM, erik.osterlund at oracle.com wrote:
> Hi Coleen,
>
> This looks great. Thanks for sorting this out!
>
> /Erik
>
> On 12/2/19 3:43 PM, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 11/26/19 7:03 PM, David Holmes wrote:
>>> (adding runtime as well)
>>>
>>> Hi Coleen,
>>>
>>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote:
>>>> Summary: Add local deferred event list to thread to post events 
>>>> outside CodeCache_lock.
>>>>
>>>> This patch builds on the patch for JDK-8173361. With this patch, I
>>>> made the JvmtiDeferredEventQueue an instance class (not AllStatic)
>>>> and have one per thread. The CodeBlob event that used to drop the
>>>> CodeCache_lock and raced with the sweeper thread, adds the events
>>>> it wants to post to its thread local list, and processes it outside
>>>> the lock. The list is walked in GC and by the sweeper to keep the
>>>> nmethods from being unloaded and zombied, respectively.
>>>
>>> Sorry I don't understand why we would want/need a deferred event 
>>> queue for every JavaThread? Isn't this only relevant for 
>>> non-JavaThreads that need to have the ServiceThread process the 
>>> deferred event?
>>
>> I thought I'd written this in the bug but I had only discussed this 
>> with Erik. I've added a comment to the bug to explain why I added
>> the per-JavaThread queue. In order to process these events after the
>> CodeCache_lock is dropped, I have to queue them somewhere safe. The
>> ServiceThread queue is safe, *but* the ServiceThread can't keep up
>> with the events, especially from this test case. So the test case
>> gets a native OOM.
>>
>> So I've added the safe queue as a field to each JavaThread because 
>> multiple JavaThreads could be posting these events at the same time, 
>> and there didn't seem to be a better safe place to cache them, 
>> without adding another layer of queuing code.
>>
>> I did write comments to this effect here:
>>
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html 
>>
>>
>> Thanks,
>> Coleen
>>
>>>
>>> David
>>>
>>>> Also, the jmethod_id field in nmethod was only used as a boolean so 
>>>> don't create a jmethod_id until needed for 
>>>> post_compiled_method_unload.
>>>>
>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed 
>>>> in the original bug report.
>>>>
>>>> open webrev at 
>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>
>>>> Thanks,
>>>> Coleen
>>
>


From david.holmes at oracle.com  Tue Dec  3 13:31:22 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 3 Dec 2019 23:31:22 +1000
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <af198cb0-8c9e-0db7-76c2-cd9f06c78288@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com>
 <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com>
 <df7f9aeb-7b42-0fd5-375b-db7ade297bdb@oracle.com>
 <af198cb0-8c9e-0db7-76c2-cd9f06c78288@oracle.com>
Message-ID: <b5634ad7-3e62-ad72-7458-0b73e26ba59a@oracle.com>

On 3/12/2019 11:08 pm, coleen.phillimore at oracle.com wrote:
> 
> 
> On 12/2/19 11:52 PM, David Holmes wrote:
>> Hi Coleen,
>>
>> On 3/12/2019 12:43 am, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>> On 11/26/19 7:03 PM, David Holmes wrote:
>>>> (adding runtime as well)
>>>>
>>>> Hi Coleen,
>>>>
>>>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote:
>>>>> Summary: Add local deferred event list to thread to post events 
>>>>> outside CodeCache_lock.
>>>>>
>>>>> This patch builds on the patch for JDK-8173361. With this patch, I
>>>>> made the JvmtiDeferredEventQueue an instance class (not AllStatic)
>>>>> and have one per thread. The CodeBlob event that used to drop the
>>>>> CodeCache_lock and raced with the sweeper thread, adds the events
>>>>> it wants to post to its thread local list, and processes it outside
>>>>> the lock. The list is walked in GC and by the sweeper to keep the
>>>>> nmethods from being unloaded and zombied, respectively.
>>>>
>>>> Sorry I don't understand why we would want/need a deferred event 
>>>> queue for every JavaThread? Isn't this only relevant for 
>>>> non-JavaThreads that need to have the ServiceThread process the 
>>>> deferred event?
>>>
>>> I thought I'd written this in the bug but I had only discussed this 
>>> with Erik. I've added a comment to the bug to explain why I added
>>> the per-JavaThread queue. In order to process these events after the
>>> CodeCache_lock is dropped, I have to queue them somewhere safe. The
>>> ServiceThread queue is safe, *but* the ServiceThread can't keep up
>>> with the events, especially from this test case. So the test case
>>> gets a native OOM.
>>>
>>> So I've added the safe queue as a field to each JavaThread because 
>>> multiple JavaThreads could be posting these events at the same time, 
>>> and there didn't seem to be a better safe place to cache them, 
>>> without adding another layer of queuing code.
>>
>> I think I'm getting the picture now. At the time the events are 
>> generated we can't post them directly because the current thread is 
>> inside compiler code. Hence the events must be deferred. Using the 
>> ServiceThread to handle the deferred events is one way to deal with 
>> this - but it can't keep up in this scenario. So instead we store the 
>> events in the current thread and when the current thread returns to 
>> code where it is safe to post the events, it does so itself. Is that 
>> generally correct?
> 
> Yes.
>>
>> I admit I'm not keen on adding this additional field per-thread just 
>> for a temporary usage. Some kind of stack allocated helper would be 
>> preferable, but would need to be passed through the call chain so that 
>> the events could be added to it.
> 
> Right, and the GC and nmethods_do have to find it somehow. It wasn't my
> first choice of where to put it, also because there are too many things in
> JavaThread. Might be time for a future cleanup of Thread.

I see.

>>
>> Also I'm not clear why we aggressively delete the _jvmti_event_queue 
>> after posting the events. I'd be worried about the overhead we are 
>> introducing for creating and deleting this queue. When the 
>> JvmtiDeferredEventQueue data structure was intended only for use by 
>> the ServiceThread its dynamic node allocation may have made more 
>> sense. But now that seems like a liability to me - if 
>> JvmtiDeferredEvents could be linked directly we wouldn't need dynamic 
>> nodes, nor dynamic per-thread queues (just a per-thread pointer).
> 
> I'm not following. The queue is for multiple events that might be
> posted while in the CodeCache_lock, so they need to be in order and
> linked together. While we post them and take them off, if the callback
> safepoints (maybe calls back into the JVM), we don't want to have GC or 
> nmethods_do walk the one that's been posted already. So a queue seems to 
> make sense.

Yes but you can make a queue just by having each event have a _next 
pointer, rather than dynamically creating nodes to hold the event. Each 
event is its own queue node implicitly.
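
As an illustration of the suggestion above, here is a minimal sketch of an intrusive FIFO in standard C++. Event and IntrusiveEventQueue are hypothetical names, not the JvmtiDeferredEvent or JvmtiDeferredEventQueue classes, and the sketch assumes the caller owns the storage for each event.

```cpp
#include <cassert>

// Hypothetical event type; the _next link lives inside the event itself.
struct Event {
    int id;
    Event* _next = nullptr;
    explicit Event(int i) : id(i) {}
};

// Intrusive FIFO: holds only head and tail pointers and never allocates nodes.
class IntrusiveEventQueue {
    Event* _head = nullptr;
    Event* _tail = nullptr;

public:
    bool is_empty() const { return _head == nullptr; }

    // Append an event; the event's own _next field is the link.
    void enqueue(Event* e) {
        e->_next = nullptr;
        if (_tail == nullptr) {
            _head = e;
        } else {
            _tail->_next = e;
        }
        _tail = e;
    }

    // Remove and return the oldest event, or nullptr if the queue is empty.
    // After the last event is removed the head is already nullptr, so there
    // is nothing left to delete or reset.
    Event* dequeue() {
        Event* e = _head;
        if (e == nullptr) {
            return nullptr;
        }
        _head = e->_next;
        if (_head == nullptr) {
            _tail = nullptr;
        }
        e->_next = nullptr;
        return e;
    }
};
```

With this layout a walker only has to test the head pointer against nullptr to know whether there is anything to visit. Where the events themselves are allocated is left open here.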

> One thing that I experimented with was to have the ServiceThread take
> ownership of the queue in its local thread queue and post them all,
> which could be a future enhancement. It didn't help my OOM situation.

Your OOM situation seems to be a basic case of overwhelming the 
ServiceThread. A single ServiceThread will always have a limit on how
many events it can handle. Maybe this test is being too unrealistic in 
its expectations of the current design?

> Deleting the queue after all the events are posted means that
> JavaThread::oops_do and nmethods_do need only a null check to deal with
> this jvmti wart.

If the nodes are not dynamically allocated you don't need to delete; you
just set the queue-head pointer to NULL - actually it will already be
NULL once the last event has been processed.

David
-----

> Thanks,
> Coleen
>>
>> Just some thoughts.
>>
>> Thanks,
>> David
>>
>>> I did write comments to this effect here:
>>>
>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html 
>>>
>>>
>>> Thanks,
>>> Coleen
>>>
>>>>
>>>> David
>>>>
>>>>> Also, the jmethod_id field in nmethod was only used as a boolean so 
>>>>> don't create a jmethod_id until needed for 
>>>>> post_compiled_method_unload.
>>>>>
>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed 
>>>>> in the original bug report.
>>>>>
>>>>> open webrev at 
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>
> 

From coleen.phillimore at oracle.com  Tue Dec  3 13:35:58 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 3 Dec 2019 08:35:58 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <b5634ad7-3e62-ad72-7458-0b73e26ba59a@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com>
 <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com>
 <df7f9aeb-7b42-0fd5-375b-db7ade297bdb@oracle.com>
 <af198cb0-8c9e-0db7-76c2-cd9f06c78288@oracle.com>
 <b5634ad7-3e62-ad72-7458-0b73e26ba59a@oracle.com>
Message-ID: <a867c025-673d-999c-4690-cdf7f3099c19@oracle.com>


On 12/3/19 8:31 AM, David Holmes wrote:
> On 3/12/2019 11:08 pm, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 12/2/19 11:52 PM, David Holmes wrote:
>>> Hi Coleen,
>>>
>>> On 3/12/2019 12:43 am, coleen.phillimore at oracle.com wrote:
>>>>
>>>>
>>>> On 11/26/19 7:03 PM, David Holmes wrote:
>>>>> (adding runtime as well)
>>>>>
>>>>> Hi Coleen,
>>>>>
>>>>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote:
>>>>>> Summary: Add local deferred event list to thread to post events 
>>>>>> outside CodeCache_lock.
>>>>>>
>>>>>> This patch builds on the patch for JDK-8173361. With this patch,
>>>>>> I made the JvmtiDeferredEventQueue an instance class (not
>>>>>> AllStatic) and have one per thread. The CodeBlob event that used
>>>>>> to drop the CodeCache_lock and raced with the sweeper thread,
>>>>>> adds the events it wants to post to its thread local list, and
>>>>>> processes it outside the lock. The list is walked in GC and by
>>>>>> the sweeper to keep the nmethods from being unloaded and zombied, 
>>>>>> respectively.
>>>>>
>>>>> Sorry I don't understand why we would want/need a deferred event 
>>>>> queue for every JavaThread? Isn't this only relevant for 
>>>>> non-JavaThreads that need to have the ServiceThread process the 
>>>>> deferred event?
>>>>
>>>> I thought I'd written this in the bug but I had only discussed this 
>>>> with Erik. I've added a comment to the bug to explain why I added
>>>> the per-JavaThread queue. In order to process these events after
>>>> the CodeCache_lock is dropped, I have to queue them somewhere safe.
>>>> The ServiceThread queue is safe, *but* the ServiceThread can't keep
>>>> up with the events, especially from this test case. So the test
>>>> case gets a native OOM.
>>>>
>>>> So I've added the safe queue as a field to each JavaThread because 
>>>> multiple JavaThreads could be posting these events at the same 
>>>> time, and there didn't seem to be a better safe place to cache 
>>>> them, without adding another layer of queuing code.
>>>
>>> I think I'm getting the picture now. At the time the events are 
>>> generated we can't post them directly because the current thread is 
>>> inside compiler code. Hence the events must be deferred. Using the 
>>> ServiceThread to handle the deferred events is one way to deal with 
>>> this - but it can't keep up in this scenario. So instead we store 
>>> the events in the current thread and when the current thread returns 
>>> to code where it is safe to post the events, it does so itself. Is 
>>> that generally correct?
>>
>> Yes.
>>>
>>> I admit I'm not keen on adding this additional field per-thread just 
>>> for a temporary usage. Some kind of stack allocated helper would be 
>>> preferable, but would need to be passed through the call chain so 
>>> that the events could be added to it.
>>
>> Right, and the GC and nmethods_do have to find it somehow. It wasn't
>> my first choice of where to put it, also because there are too many
>> things in JavaThread. Might be time for a future cleanup of Thread.
>
> I see.
>
>>>
>>> Also I'm not clear why we aggressively delete the _jvmti_event_queue 
>>> after posting the events. I'd be worried about the overhead we are 
>>> introducing for creating and deleting this queue. When the 
>>> JvmtiDeferredEventQueue data structure was intended only for use by 
>>> the ServiceThread its dynamic node allocation may have made more 
>>> sense. But now that seems like a liability to me - if 
>>> JvmtiDeferredEvents could be linked directly we wouldn't need 
>>> dynamic nodes, nor dynamic per-thread queues (just a per-thread 
>>> pointer).
>>
>> I'm not following. The queue is for multiple events that might be
>> posted while in the CodeCache_lock, so they need to be in order and
>> linked together. While we post them and take them off, if the
>> callback safepoints (maybe calls back into the JVM), we don't want to 
>> have GC or nmethods_do walk the one that's been posted already. So a 
>> queue seems to make sense.
>
> Yes but you can make a queue just by having each event have a _next 
> pointer, rather than dynamically creating nodes to hold the event. 
> Each event is its own queue node implicitly.
>
>> One thing that I experimented with was to have the ServiceThread take
>> ownership of the queue in its local thread queue and post them all,
>> which could be a future enhancement. It didn't help my OOM situation.
>
> Your OOM situation seems to be a basic case of overwhelming the 
> ServiceThread. A single ServiceThread will always have a limit on how
> many events it can handle. Maybe this test is being too unrealistic in 
> its expectations of the current design?

I think the JVMTI API where you can generate a COMPILED_METHOD_LOAD for
all the events in the queue is going to be overwhelming unless it waits
for the events to be posted.
>
>> Deleting the queue after all the events are posted means that
>> JavaThread::oops_do and nmethods_do need only a null check to deal
>> with this jvmti wart.
>
> If the nodes are not dynamically allocated you don't need to delete;
> you just set the queue-head pointer to NULL - actually it will already 
> be NULL once the last event has been processed.

I could revisit the data structure as a future RFE. The goal was to
reuse code that's already there, and I don't think there's a significant
difference in performance. I did some measurement of the stress case
and the times were equivalent, actually better in the new code.

Thanks,
Coleen
>
> David
> -----
>
>> Thanks,
>> Coleen
>>>
>>> Just some thoughts.
>>>
>>> Thanks,
>>> David
>>>
>>>> I did write comments to this effect here:
>>>>
>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html 
>>>>
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>>>
>>>>> David
>>>>>
>>>>>> Also, the jmethod_id field in nmethod was only used as a boolean 
>>>>>> so don't create a jmethod_id until needed for 
>>>>>> post_compiled_method_unload.
>>>>>>
>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that 
>>>>>> crashed in the original bug report.
>>>>>>
>>>>>> open webrev at 
>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>
>>


From coleen.phillimore at oracle.com  Tue Dec  3 18:21:15 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 3 Dec 2019 13:21:15 -0500
Subject: RFR (XS) 8235273: nmethodLocker not needed for COMPILED_METHOD_UNLOAD
 events
Message-ID: <3630a545-2b2c-e17c-bbe5-98ae8508ae38@oracle.com>

Summary: remove unnecessary nmethodLocker

See bug for more details. Tested with tier2-8.

open webrev at http://cr.openjdk.java.net/~coleenp/2019/8235273.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8235273

(Note, this has a trivial merge with the change for JDK-8212160).

Thanks,
Coleen

From serguei.spitsyn at oracle.com  Tue Dec  3 19:29:16 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 3 Dec 2019 11:29:16 -0800
Subject: RFR (XXS): 8235280: UnProblemList
 vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java
Message-ID: <40efdaca-7402-a809-bd81-ae806a9b3f9b@oracle.com>

Please, review a trivial fix for sub-task:
  https://bugs.openjdk.java.net/browse/JDK-8235280

The fix is to remove the test from the ProblemList.txt:

diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt
+++ b/test/hotspot/jtreg/ProblemList.txt
@@ -182,7 +182,6 @@
 vmTestbase/nsk/jvmti/scenarios/jni_interception/JI05/ji05t001/TestDescription.java 8219652 aix-ppc64
 vmTestbase/nsk/jvmti/scenarios/jni_interception/JI06/ji06t001/TestDescription.java 8219652 aix-ppc64
 vmTestbase/nsk/jvmti/SetJNIFunctionTable/setjniftab001/TestDescription.java 8219652 aix-ppc64
-vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java 8221372 windows-x64

 vmTestbase/gc/lock/jni/jnilock002/TestDescription.java 8208243,8192647 generic-all


Thanks,
Serguei

From igor.ignatyev at oracle.com  Tue Dec  3 19:32:17 2019
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 3 Dec 2019 11:32:17 -0800
Subject: RFR (XXS): 8235280: UnProblemList
 vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java
In-Reply-To: <40efdaca-7402-a809-bd81-ae806a9b3f9b@oracle.com>
References: <40efdaca-7402-a809-bd81-ae806a9b3f9b@oracle.com>
Message-ID: <83A886BE-4F8A-4E4D-AD4A-7073AA7058A1@oracle.com>

LGTM
-- Igor

> On Dec 3, 2019, at 11:29 AM, serguei.spitsyn at oracle.com wrote:
> 
> Please, review a trivial fix for sub-task:
>   https://bugs.openjdk.java.net/browse/JDK-8235280
> 
> The fix is to remove the test from the ProblemList.txt:
> 
> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt
> +++ b/test/hotspot/jtreg/ProblemList.txt
> @@ -182,7 +182,6 @@
>  vmTestbase/nsk/jvmti/scenarios/jni_interception/JI05/ji05t001/TestDescription.java 8219652 aix-ppc64
>  vmTestbase/nsk/jvmti/scenarios/jni_interception/JI06/ji06t001/TestDescription.java 8219652 aix-ppc64
>  vmTestbase/nsk/jvmti/SetJNIFunctionTable/setjniftab001/TestDescription.java 8219652 aix-ppc64
> -vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java 8221372 windows-x64
> 
>  vmTestbase/gc/lock/jni/jnilock002/TestDescription.java 8208243,8192647 generic-all
> 
> 
> Thanks,
> Serguei


From serguei.spitsyn at oracle.com  Tue Dec  3 19:33:02 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 3 Dec 2019 11:33:02 -0800
Subject: RFR (XXS): 8235280: UnProblemList
 vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java
In-Reply-To: <83A886BE-4F8A-4E4D-AD4A-7073AA7058A1@oracle.com>
References: <40efdaca-7402-a809-bd81-ae806a9b3f9b@oracle.com>
 <83A886BE-4F8A-4E4D-AD4A-7073AA7058A1@oracle.com>
Message-ID: <0597c486-1905-97ba-8f81-81a9923db6e5@oracle.com>

Thanks, Igor!
Serguei

On 12/3/19 11:32 AM, Igor Ignatyev wrote:
> LGTM
> -- Igor
>
>> On Dec 3, 2019, at 11:29 AM, serguei.spitsyn at oracle.com wrote:
>>
>> Please, review a trivial fix for sub-task:
>>    https://bugs.openjdk.java.net/browse/JDK-8235280
>>
>> The fix is to remove the test from the ProblemList.txt:
>>
>> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
>> --- a/test/hotspot/jtreg/ProblemList.txt
>> +++ b/test/hotspot/jtreg/ProblemList.txt
>> @@ -182,7 +182,6 @@
>>   vmTestbase/nsk/jvmti/scenarios/jni_interception/JI05/ji05t001/TestDescription.java 8219652 aix-ppc64
>>   vmTestbase/nsk/jvmti/scenarios/jni_interception/JI06/ji06t001/TestDescription.java 8219652 aix-ppc64
>>   vmTestbase/nsk/jvmti/SetJNIFunctionTable/setjniftab001/TestDescription.java 8219652 aix-ppc64
>> -vmTestbase/nsk/jvmti/GetThreadState/thrstat001/TestDescription.java 8221372 windows-x64
>>
>>   vmTestbase/gc/lock/jni/jnilock002/TestDescription.java 8208243,8192647 generic-all
>>
>>
>> Thanks,
>> Serguei


From daniil.x.titov at oracle.com  Tue Dec  3 19:42:54 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 03 Dec 2019 11:42:54 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
Message-ID: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>

Please review the change that makes OperatingSystemMXBean methods return container specific information
rather than the host based data.

The webrev [1] is based on the code Andrew and Severin initially provided with some additional changes and combined
 with the spec update David made [3].

The webrev corrects the implementation for the free/total swap methods as Bob noted to subtract the total
and free memory from the returned values.

It also corrects getCpuLoad() implementation, as Bob advised, to cover the case when CPU quotas are not active.

The webrev also takes into account the case when a java.security.AccessControlException is thrown
during the initialization of the container subsystem (e.g. when java.policy doesn't grant "read" access to
the "/proc/self/mountinfo" file).

CSR for the spec changes [3] is approved.

Testing: Mach5 tier1, tier2, tier3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed.

[1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ 
[2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
[3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 

Thank you,
-Daniil


From chris.plummer at oracle.com  Tue Dec  3 20:45:55 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 3 Dec 2019 12:45:55 -0800
Subject: RFR(S): 8234277:ClhsdbLauncher should enable verbose exceptions and
 do a better job of detecting SA failures
Message-ID: <c54eaee2-1db3-f23c-7cea-5ef623bc0f71@oracle.com>

Hello,

Please review the following:

https://bugs.openjdk.java.net/browse/JDK-8234277
http://cr.openjdk.java.net/~cjplummer/8234277/webrev.00/

No longer redirect stderr for the jhsdb/clhsdb process. Redirecting it
results in not seeing attach failures in the output, so OutputAnalyzer
can't check for them.

Execute "verbose true" as the first clhsdb command after launching. This 
will result in verboseExceptions being true in CommandProcessor.java, so 
full exception traces will appear in the output. This will make 
debugging future SA test failures a lot easier.

Add an extra check for any DebuggerException. This is mainly for
detecting that the attach failed. This previously was going
unnoticed, and instead the test would later fail because it noticed
some other issue, like missing output, which isn't very informative.

Add checks for other unexpected SA exceptions that are caught and 
printed by CommandProcessor. These will always have an "Error: " prefix, 
making them easy to detect.

Problem list ClhsdbScanOops.java. With the new error checking, it will 
now always fail on windows due to JDK-8230731 and on macos and linux due 
to JDK-8235220. These failures are not "new" per se, but are just now 
being properly detected.

thanks,

Chris

From chris.plummer at oracle.com  Tue Dec  3 20:56:34 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 3 Dec 2019 12:56:34 -0800
Subject: RFR(XS): 8235221: Fix ProblemList.txt for
 sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java
Message-ID: <ca78804c-eda4-16e4-30c3-ef179299f15e@oracle.com>

Hello,

Please review the following:

https://bugs.openjdk.java.net/browse/JDK-8235221

diff --git a/test/jdk/ProblemList.txt b/test/jdk/ProblemList.txt
--- a/test/jdk/ProblemList.txt
+++ b/test/jdk/ProblemList.txt
@@ -914,8 +914,7 @@

 sun/tools/jhsdb/BasicLauncherTest.java 8193639,8211767 solaris-all,linux-ppc64,linux-ppc64le
 sun/tools/jhsdb/HeapDumpTest.java 8193639 solaris-all
-sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8230731,8001227 windows-all
-sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8231635,8231634 generic-all
+sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8231634,8230731,8001227 generic-all,windows-all


Listing the same test on multiple lines just results in the last entry
being used, so merge into one line. Also JDK-8231635 has been fixed.
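
A small sketch of why the two lines had to be merged, assuming only the behavior described above (a later line for the same test replaces the earlier one). This is a hypothetical model in C++, not jtreg's actual ProblemList parser; Entry and load_problem_list are illustrative names.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical parsed form of one ProblemList line: bug ids and platforms.
struct Entry {
    std::string bugs;
    std::string platforms;
};

// Model of the described behavior: entries are keyed by test name, so a
// later line for the same test overwrites the earlier one.
std::map<std::string, Entry> load_problem_list(const std::vector<std::string>& lines) {
    std::map<std::string, Entry> table;
    for (const std::string& line : lines) {
        std::istringstream fields(line);
        std::string test;
        Entry entry;
        if (fields >> test >> entry.bugs >> entry.platforms) {
            table[test] = entry;  // last entry wins
        }
    }
    return table;
}
```

Under this model, feeding in the two removed lines leaves only the generic-all entry, and the windows-all entry with its bug ids is silently dropped. The merged line keeps both sets of bugs and platforms.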

thanks,

Chris


From igor.ignatyev at oracle.com  Tue Dec  3 21:00:00 2019
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 3 Dec 2019 13:00:00 -0800
Subject: RFR(XS): 8235221: Fix ProblemList.txt for
 sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java
In-Reply-To: <ca78804c-eda4-16e4-30c3-ef179299f15e@oracle.com>
References: <ca78804c-eda4-16e4-30c3-ef179299f15e@oracle.com>
Message-ID: <C2152CBE-C604-441A-B49C-A35325D10BEB@oracle.com>

LGTM,

-- Igor

> On Dec 3, 2019, at 12:56 PM, Chris Plummer <chris.plummer at oracle.com> wrote:
> 
> Hello,
> 
> Please review the following:
> 
> https://bugs.openjdk.java.net/browse/JDK-8235221
> 
> diff --git a/test/jdk/ProblemList.txt b/test/jdk/ProblemList.txt
> --- a/test/jdk/ProblemList.txt
> +++ b/test/jdk/ProblemList.txt
> @@ -914,8 +914,7 @@
> 
>  sun/tools/jhsdb/BasicLauncherTest.java 8193639,8211767 solaris-all,linux-ppc64,linux-ppc64le
>  sun/tools/jhsdb/HeapDumpTest.java 8193639 solaris-all
> -sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8230731,8001227 windows-all
> -sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8231635,8231634 generic-all
> +sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8231634,8230731,8001227 generic-all,windows-all
> 
> 
> Listing the same test on multiple lines just results in the last entry being used, so merge into one line. Also JDK-8231635 has been fixed.
> 
> thanks,
> 
> Chris
> 


From serguei.spitsyn at oracle.com  Tue Dec  3 21:10:07 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 3 Dec 2019 13:10:07 -0800
Subject: RFR(S): 8234277:ClhsdbLauncher should enable verbose exceptions
 and do a better job of detecting SA failures
In-Reply-To: <c54eaee2-1db3-f23c-7cea-5ef623bc0f71@oracle.com>
References: <c54eaee2-1db3-f23c-7cea-5ef623bc0f71@oracle.com>
Message-ID: <8a972120-8ba3-a35e-b73f-e3d5faf68ce6@oracle.com>

Hi Chris,

It looks good.

Thanks,
Serguei

On 12/3/19 12:45 PM, Chris Plummer wrote:
> Hello,
>
> Please review the following:
>
> https://bugs.openjdk.java.net/browse/JDK-8234277
> http://cr.openjdk.java.net/~cjplummer/8234277/webrev.00/
>
> No longer redirect stderr for the jhsdb/clhsdb process. Redirecting it
> results in not seeing attach failures in the output, so OutputAnalyzer
> can't check for them.
>
> Execute "verbose true" as the first clhsdb command after launching. 
> This will result in verboseExceptions being true in 
> CommandProcessor.java, so full exception traces will appear in the 
> output. This will make debugging future SA test failures a lot easier.
>
> Add an extra check for any DebuggerException. This is mainly for
> detecting that the attach failed. This previously was going
> unnoticed, and instead the test would later fail because it noticed
> some other issue, like missing output, which isn't very informative.
>
> Add checks for other unexpected SA exceptions that are caught and 
> printed by CommandProcessor. These will always have an "Error: " 
> prefix, making them easy to detect.
>
> Problem list ClhsdbScanOops.java. With the new error checking, it will 
> now always fail on windows due to JDK-8230731 and on macos and linux 
> due to JDK-8235220. These failures are not "new" per se, but are just 
> now being properly detected.
>
> thanks,
>
> Chris


From serguei.spitsyn at oracle.com  Tue Dec  3 21:16:38 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 3 Dec 2019 13:16:38 -0800
Subject: RFR(S): 8234277:ClhsdbLauncher should enable verbose exceptions
 and do a better job of detecting SA failures
In-Reply-To: <c54eaee2-1db3-f23c-7cea-5ef623bc0f71@oracle.com>
References: <c54eaee2-1db3-f23c-7cea-5ef623bc0f71@oracle.com>
Message-ID: <7b22e878-241e-00b7-6434-ed2987497f2f@oracle.com>

Hi Chris,

It looks good.
I'm in favor of always running tests in verbose mode.
It is not a good idea in general to optimize on it.

Thanks,
Serguei


On 12/3/19 12:45 PM, Chris Plummer wrote:
> Hello,
>
> Please review the following:
>
> https://bugs.openjdk.java.net/browse/JDK-8234277
> http://cr.openjdk.java.net/~cjplummer/8234277/webrev.00/
>
> No longer redirect stderr for the jhsdb/clhsdb process. Redirecting it
> results in not seeing attach failures in the output, so OutputAnalyzer
> can't check for them.
>
> Execute "verbose true" as the first clhsdb command after launching. 
> This will result in verboseExceptions being true in 
> CommandProcessor.java, so full exception traces will appear in the 
> output. This will make debugging future SA test failures a lot easier.
>
> Add an extra check for any DebuggerException. This is mainly for
> detecting that the attach failed. This previously was going
> unnoticed, and instead the test would later fail because it noticed
> some other issue, like missing output, which isn't very informative.
>
> Add checks for other unexpected SA exceptions that are caught and 
> printed by CommandProcessor. These will always have an "Error: " 
> prefix, making them easy to detect.
>
> Problem list ClhsdbScanOops.java. With the new error checking, it will 
> now always fail on windows due to JDK-8230731 and on macos and linux 
> due to JDK-8235220. These failures are not "new" per se, but are just 
> now being properly detected.
>
> thanks,
>
> Chris
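
The two output checks described in the quoted proposal (any DebuggerException, plus lines carrying the "Error: " prefix) could look roughly like the sketch below. The class and method names are made up for illustration and are not the actual ClhsdbLauncher code; the sketch only assumes that the clhsdb output has already been captured as a String.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the webrev: scans captured clhsdb output.
final class SaOutputCheck {
    /** Returns every line that indicates an SA failure. */
    static List<String> findFailures(String output) {
        List<String> failures = new ArrayList<>();
        for (String line : output.split("\\R")) {
            // A DebuggerException usually means that the attach failed.
            // "Error: " is the prefix named in the proposal for exceptions
            // caught and printed by CommandProcessor.
            if (line.contains("DebuggerException") || line.startsWith("Error: ")) {
                failures.add(line);
            }
        }
        return failures;
    }
}
```

A test would then fail as soon as findFailures() returns a non-empty list, instead of failing later on missing output.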


From bob.vandette at oracle.com  Tue Dec  3 21:30:17 2019
From: bob.vandette at oracle.com (Bob Vandette)
Date: Tue, 3 Dec 2019 16:30:17 -0500
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
Message-ID: <48599FD5-583F-4CB7-80CA-09E91AA24D4A@oracle.com>

Daniil,

Looks good to me.

If there are any management jtreg tests, I'd run these since your changes to OperatingSystemMXBean will 
alter the behavior of these methods even for Linux hosts since cgroups is typically enabled causing the
container detection to report containerized.

It's too bad getCpuLoad can't detect that the cpuset is identical to the host's in order to allow you to fall back to
getSystemCpuLoad0. 


Bob.


> On Dec 3, 2019, at 2:42 PM, Daniil Titov <daniil.x.titov at oracle.com> wrote:
> 
> Please review the change that makes OperatingSystemMXBean methods return container specific information
> rather than the host based data.
> 
> The webrev [1] is based on the code Andrew and Severin initially provided with some additional changes and combined
> with the spec update David made [3].
> 
> The webrev corrects the implementation for the free/total swap methods as Bob noted to subtract the total
> and free memory from the returned values.
> 
> It also corrects getCpuLoad() implementation, as Bob advised, to cover the case when CPU quotas are not active.
> 
> The webrev also takes into account the case when java.security.AccessControlException exception is thrown
> during the initialization of the container subsystem (e.g. when java.policy doesn't grant "read" access to
> "/proc/self/mountinfo" file).
> 
> CSR for the spec changes [3] is approved.
> 
> Testing: Mach5 tier1, tier2, tier3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed.
> 
> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/
> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
> 
> Thank you,
> -Daniil
> 
> 


From mandy.chung at oracle.com  Wed Dec  4 00:10:13 2019
From: mandy.chung at oracle.com (Mandy Chung)
Date: Tue, 3 Dec 2019 16:10:13 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
Message-ID: <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>


On 12/3/19 11:42 AM, Daniil Titov wrote:
> Please review the change that makes OperatingSystemMXBean methods return container specific information rather than the host based data.
>
> The webrev also takes into account the case when java.security.AccessControlException exception is thrown
> during the initialization of the container subsystem (e.g. when java.policy doesn't grant "read" access to "/proc/self/mountinfo" file).

Instead of failing to access /proc/self/mountinfo, I expect this to wrap 
the call with doPrivileged so that it can report the metrics independent 
of the security policy.  The jdk default security policy should grant 
proper permission to do so.
>
> CSR for the spec changes [3] is approved.
>
> Testing: Mach5 tier1, tier2, tier3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed.
>
> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/
> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
>
>

src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java
    this should wrap the security-sensitive operations with 
doPrivileged.  jdk.management is trusted and it has all permissions.
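
A minimal sketch of the kind of wrapping meant here, assuming the sensitive operation is simply reading a file such as /proc/self/mountinfo. The class and method names are hypothetical and this is not the code in Metrics.java; AccessController.doPrivileged is the real java.security API (deprecated for removal in newer JDK releases, hence the SuppressWarnings).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.AccessController;
import java.security.PrivilegedAction;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only.
final class PrivilegedRead {
    /**
     * Reads all lines of the file with the privileges of this (trusted)
     * code, regardless of what the callers on the stack are permitted to do.
     * Returns an empty list if the file cannot be read.
     */
    @SuppressWarnings("removal") // AccessController is deprecated for removal
    static List<String> readLines(Path file) {
        return AccessController.doPrivileged((PrivilegedAction<List<String>>) () -> {
            try {
                return Files.readAllLines(file);
            } catch (IOException e) {
                return Collections.emptyList();
            }
        });
    }
}
```

With this shape the caller's security policy no longer decides whether the metrics can be read; only the permissions of the trusted module matter.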

src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
    Formatting nit: line 346-355: JDK native source uses 4-space 
indentation convention.  A space is missing between "if" and "(".

src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java

    if (limit >= 0 && memLimit >= 0) {
        return limit - memLimit;
    }


Under what circumstances is limit or memLimit < 0?  It falls back to 
returning the system's total swap space size - this is not really what it 
should report.  Is it worth specifying this case? Similarly for 
getFreeMemorySize, getTotalMemorySize and getCpuLoad.

getFreeSwapSpaceSize retries a few times.  What is special about this 
method, as opposed to others like getFreeMemorySize?

src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java
    There is no strong need to make the deprecated methods default 
methods.  If they were default methods, they would only need to be implemented 
once as opposed to in all OS-specific implementations.

CheckOperatingSystemMXBean.java
    System.out.println(String.format(...)) can simply be replaced with 
System.out.format.
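
To illustrate the suggestion with a small sketch (the class and method names are hypothetical, and a PrintStream over a byte buffer stands in for System.out so the result can be compared): the two forms produce the same text as long as the format string ends with %n, which supplies the line separator that println would have added.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical demo class, not part of CheckOperatingSystemMXBean.java.
final class FormatDemo {
    /** Old style: build an intermediate String, then print it. */
    static String viaPrintln(String name, long value) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buf, true);
        out.println(String.format("%s = %d", name, value));
        out.flush();
        return buf.toString();
    }

    /** Suggested style: format directly on the stream; %n ends the line. */
    static String viaFormat(String name, long value) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buf, true);
        out.format("%s = %d%n", name, value);
        out.flush();
        return buf.toString();
    }
}
```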

Mandy

From suenaga at oss.nttdata.com  Wed Dec  4 00:54:41 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Wed, 4 Dec 2019 09:54:41 +0900
Subject: PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
References: <eb284894-0801-d04d-c5c2-68ea3faaef98@oss.nttdata.com>
 <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
Message-ID: <de7b6cb5-88c3-dc0c-48d3-a3ecf66eff9f@oss.nttdata.com>

PING: Could you review it?

   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/

This bug is targeted to JDK 14.


Thanks,

Yasumasa


On 2019/11/28 21:39, Yasumasa Suenaga wrote:
> Hi,
> 
> I refactored LinuxAMD64CFrame.java . It works fine in serviceability/sa tests and
> all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
> Could you review new webrev?
> 
>    http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
> 
> The diff from previous webrev is here:
>    http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>> Hi all,
>>
>> Please review this change:
>>
>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>
>>
>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64
>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
>> for stack unwinding.
>>
>> As JDK-8022183 said, omit-frame-pointer has been enabled by default since GCC 4.6, so system
>> libraries (e.g. libc) might be compiled with this feature.
>>
>> However, `jhsdb jstack --mixed` does not do so; it uses the base pointer register (RBP).
>> So some stack frames might be missing.
>>
>> I guess JDK-8219201 is caused by the same issue.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf

From daniil.x.titov at oracle.com  Wed Dec  4 02:00:28 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 03 Dec 2019 18:00:28 -0800
Subject: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <48599FD5-583F-4CB7-80CA-09E91AA24D4A@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <48599FD5-583F-4CB7-80CA-09E91AA24D4A@oracle.com>
Message-ID: <CD4410BC-4F48-4A8D-AC53-E6D9B4786FC6@oracle.com>

Hi Bob,

>>    It's too bad getCpuLoad can't detect that the cpuset is identical to the hosts in order to allow you to fallback to
>>   getSystemCpuLoad0.

I think we can detect that the cpuset is identical to the host's by comparing the length of the array that containerMetrics.getEffectiveCpuSetCpus() returns
to the number of CPUs configured on the host, as returned by sysconf(_SC_NPROCESSORS_CONF). The latter could be retrieved by adding a new native
method, getConfiguredCpuCount0, to OperatingSystemImpl. If they match then we just fall back to getSystemCpuLoad0(). I did some testing on a Linux host and
inside a Docker container with different '--cpuset-cpus' settings and it seems to work as expected.

JNIEXPORT jint JNICALL
Java_com_sun_management_internal_OperatingSystemImpl_getConfiguredCpuCount0
(JNIEnv *env, jobject mbean)
{
    /* Report the configured CPU count, or -1 if perfInit() fails. */
    if (perfInit() == 0) {
        return counters.nProcs;
    } else {
        return -1;
    }
}
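
On the Java side the comparison described above might look roughly like the following sketch. The class and method names are hypothetical; the two parameters stand for what containerMetrics.getEffectiveCpuSetCpus() and the proposed getConfiguredCpuCount0() would return, and the -1 case mirrors the error value of the native method.

```java
// Hypothetical sketch, not the code that will be in the webrev.
final class CpuSetCheck {
    /**
     * @param effectiveCpus CPUs the container is allowed to run on
     * @param hostCpuCount  CPUs configured on the host, or -1 if unknown
     * @return true if the container cpuset covers all host CPUs, in which
     *         case falling back to the host-wide CPU load is reasonable
     */
    static boolean sameAsHost(int[] effectiveCpus, int hostCpuCount) {
        return effectiveCpus != null
                && hostCpuCount > 0
                && effectiveCpus.length == hostCpuCount;
    }
}
```

Comparing lengths is enough here only on the assumption that the effective cpuset is always a subset of the host's configured CPUs.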


If there is no objection I will include this change in the new webrev.

Thank you,
Daniil

On 12/3/19, 1:30 PM, "Bob Vandette" <bob.vandette at oracle.com> wrote:

    Daniil,
    
    Looks good to me.
    
    If there are any management jtreg tests, I'd run these since your changes to OperatingSystemMXBean will 
    alter the behavior of these methods even for Linux hosts since cgroups is typically enabled causing the
    container detection to report containerized.
    
    It's too bad getCpuLoad can't detect that the cpuset is identical to the host's in order to allow you to fall back to
    getSystemCpuLoad0. 
    
    
    Bob.
    
    
    > On Dec 3, 2019, at 2:42 PM, Daniil Titov <daniil.x.titov at oracle.com> wrote:
    > 
    > Please review the change that makes OperatingSystemMXBean methods return container specific information
    > rather than the host based data.
    > 
    > The webrev [1] is based on the code Andrew and Severin initially provided with some additional changes and combined
    > with the spec update David made [3].
    > 
    > The webrev corrects the implementation for the free/total swap methods as Bob noted to subtract the total
    > and free memory from the returned values.
    > 
    > It also corrects getCpuLoad() implementation, as Bob advised, to cover the case when CPU quotas are not active.
    > 
    > The webrev also takes into account the case when java.security.AccessControlException exception is thrown
    > during the initialization of the container subsystem (e.g. when java.policy doesn't grant "read" access to
    > "/proc/self/mountinfo" file).
    > 
    > CSR for the spec changes [3] is approved.
    > 
    > Testing: Mach5 tier1, tier2, tier3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed.
    > 
    > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/
    > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
    > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
    > 
    > Thank you,
    > -Daniil
    > 
    > 
    
    
From daniil.x.titov at oracle.com  Wed Dec  4 03:34:23 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 03 Dec 2019 19:34:23 -0800
Subject: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
Message-ID: <A30E72E0-3C91-4FAD-9508-037A2B37024C@oracle.com>

Hi Mandy,

Thank you for your comments, please find my answers below.

>> src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java
>>   this should wrap the security-sensitive operations with doPrivileged.  jdk.management is trusted and it has all permissions.

I will include this change in the next webrev, thank you.

>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
>>    Formatting nit:  line 346-355: JDK native source uses 4-space indentation convention.  A space is missing between "if" and "(".
I will correct this, thanks.

>> Under what circumstances is limit or memLimit < 0?
The memory limit metric is not available if the JVM runs on a Linux host (not in a Docker container) or if a Docker container was started without
specifying a memory limit (without the '--memory=' Docker option). In the latter case there is no limit on how much memory the container can use, and
it can use as much memory as the host's OS allows.

>> Is it worth  specifying this case?
I believe yes, since it covers the cases when the JVM runs on a Linux host or a Docker container was started without a memory limit.

>> It falls back to returning the system's total swap space size - this is not really what it should report.
For the case when the JVM runs on a Linux host it is exactly what we want. The only problematic case is when the JVM runs inside a Docker container without a memory limit set.
However, I am not sure how we could differentiate these two cases.

>> Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad.   
For getTotalMemorySize I think we are good here. If the limit is not set then all the memory the host's OS has is available.
For getFreeMemorySize the problematic case is when the memory limit is set but the memory usage for some reason is not available (containerMetrics.getMemoryUsage() returns 0).
Probably in this case we should just return -1, as getFreePhysicalMemorySize0() currently does if it cannot retrieve a valid result.

For getCpuLoad() the problematic case is when CPU quotas are active but CpuPeriod, CpuNumPeriods, or getCpuUsage are unavailable, or if a valid CPU load for some CPU was
not retrieved, or if all retrieved CPU load values happen to be zeros. Probably we should just return -1 in these cases rather than falling back to getSystemCpuLoad0().
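
To make the memory-related fallback rules discussed above concrete, here is a rough sketch with the container and host values passed in as plain parameters. The class, method, and parameter names are hypothetical, and the free-memory rule reflects only the "return -1" idea floated in this mail, not code from the webrev.

```java
// Hypothetical sketch of the fallback rules; a negative limit means "not set".
final class ContainerAwareSizes {
    /** Total swap: container value when both limits are known, else the host value. */
    static long totalSwapSpaceSize(long memAndSwapLimit, long memLimit, long hostSwapSize) {
        if (memAndSwapLimit >= 0 && memLimit >= 0) {
            return memAndSwapLimit - memLimit;
        }
        return hostSwapSize; // Linux host, or container without a memory limit
    }

    /** Free memory: -1 when a limit is set but the usage could not be read. */
    static long freeMemorySize(long memLimit, long memUsage, long hostFreeMemory) {
        if (memLimit < 0) {
            return hostFreeMemory; // no limit: use the host value
        }
        if (memUsage <= 0) {
            return -1; // limit set, usage unavailable
        }
        return Math.max(0, memLimit - memUsage);
    }
}
```

Note that the sketch cannot tell a plain Linux host apart from a container without a memory limit either; both take the host-value branch.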

>>src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java
 >>    There is no strong need to make the deprecated methods as default methods.  If they were default methods, they only need to be implemented once as opposed to in all OS-specific implementations.
I could make these methods defaults if you feel it is a better approach here.


>>CheckOperatingSystemMXBean.java
>>     System.out.println(String.format(...)) can simply be replaced with System.out.format.
I will include this change in the next webrev, thank you!

Best regards,
Daniil

From: Mandy Chung <mandy.chung at oracle.com>
Date: Tuesday, December 3, 2019 at 4:10 PM
To: Daniil Titov <daniil.x.titov at oracle.com>
Cc: OpenJDK Serviceability <serviceability-dev at openjdk.java.net>, "jmx-dev at openjdk.java.net" <jmx-dev at openjdk.java.net>, Bob Vandette <bob.vandette at oracle.com>
Subject: Re: RFR: 8226575: OperatingSystemMXBean should be made container aware


On 12/3/19 11:42 AM, Daniil Titov wrote:
Please review the change that makes OperatingSystemMXBean methods return container specific information rather than the host based data.

The webrev also takes into account the case when java.security.AccessControlException exception is thrown
during the initialization of the container subsystem (e.g. when java.policy doesn't grant "read" access to "/proc/self/mountinfo" file).

Instead of failing to access /proc/self/mountinfo, I expect this to wrap the call with doPrivileged so that it can report the metrics independent of the security policy.  The jdk default security policy should grant proper permission to do so.


CSR for the spec changes [3] is approved.

Testing: Mach5 tier1, tier2, tier3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker), and tier6 tests passed.

[1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/
[2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
[3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428


src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java
    this should wrap the security-sensitive operations with doPrivileged.  jdk.management is trusted and it has all permissions.

src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
    Formatting nit: line 346-355: JDK native source uses 4-space indentation convention.  A space is missing between "if" and "(".

src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java
    if (limit >= 0 && memLimit >= 0) {
        return limit - memLimit;
    }


Under what circumstances is limit or memLimit < 0?  It falls back to returning the system's total swap space size - this is not really what it should report.  Is it worth specifying this case? Similarly for getFreeMemorySize, getTotalMemorySize and getCpuLoad.

getFreeSwapSpaceSize retries a few times.  What is special about this method, as opposed to others like getFreeMemorySize?

src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java
    There is no strong need to make the deprecated methods default methods.  If they were default methods, they would only need to be implemented once as opposed to in all OS-specific implementations.

CheckOperatingSystemMXBean.java
    System.out.println(String.format(...)) can simply be replaced with System.out.format.

Mandy


From chris.plummer at oracle.com  Wed Dec  4 04:24:40 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 3 Dec 2019 20:24:40 -0800
Subject: RFR(XS): 8235221: Fix ProblemList.txt for
 sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java
In-Reply-To: <C2152CBE-C604-441A-B49C-A35325D10BEB@oracle.com>
References: <ca78804c-eda4-16e4-30c3-ef179299f15e@oracle.com>
 <C2152CBE-C604-441A-B49C-A35325D10BEB@oracle.com>
Message-ID: <ff588009-7646-752b-f43d-a1e22249d3a9@oracle.com>

Thanks Igor!

Chris

On 12/3/19 1:00 PM, Igor Ignatyev wrote:
> LGTM,
>
> -- Igor
>
>> On Dec 3, 2019, at 12:56 PM, Chris Plummer <chris.plummer at oracle.com> wrote:
>>
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8235221
>>
>> diff --git a/test/jdk/ProblemList.txt b/test/jdk/ProblemList.txt
>> --- a/test/jdk/ProblemList.txt
>> +++ b/test/jdk/ProblemList.txt
>> @@ -914,8 +914,7 @@
>>
>>   sun/tools/jhsdb/BasicLauncherTest.java 8193639,8211767 solaris-all,linux-ppc64,linux-ppc64le
>>   sun/tools/jhsdb/HeapDumpTest.java 8193639 solaris-all
>> -sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8230731,8001227 windows-all
>> -sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8231635,8231634 generic-all
>> +sun/tools/jhsdb/HeapDumpTestWithActiveProcess.java 8231634,8230731,8001227 generic-all,windows-all
>>
>>
>> Listing the same test on multiple lines just results in the last entry being used, so merge into one line. Also JDK-8231635 has been fixed.
>>
>> thanks,
>>
>> Chris
>>


From chris.plummer at oracle.com  Wed Dec  4 04:25:03 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 3 Dec 2019 20:25:03 -0800
Subject: RFR(S): 8234277:ClhsdbLauncher should enable verbose exceptions
 and do a better job of detecting SA failures
In-Reply-To: <8a972120-8ba3-a35e-b73f-e3d5faf68ce6@oracle.com>
References: <c54eaee2-1db3-f23c-7cea-5ef623bc0f71@oracle.com>
 <8a972120-8ba3-a35e-b73f-e3d5faf68ce6@oracle.com>
Message-ID: <b45f4070-f548-5ba1-1367-486312ba237f@oracle.com>

Thanks Serguei!

Chris

On 12/3/19 1:10 PM, serguei.spitsyn at oracle.com wrote:
> Hi Chris,
>
> It looks good.
>
> Thanks,
> Serguei
>
> On 12/3/19 12:45 PM, Chris Plummer wrote:
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8234277
>> http://cr.openjdk.java.net/~cjplummer/8234277/webrev.00/
>>
>> No longer redirect stderr for the jhsdb/clhsdb process. It results in 
>> not seeing attach failures in the output, so OutputAnalyzer can't 
>> check for them.
>>
>> Execute "verbose true" as the first clhsdb command after launching. 
>> This will result in verboseExceptions being true in 
>> CommandProcessor.java, so full exception traces will appear in the 
>> output. This will make debugging future SA test failures a lot easier.
>>
>> Add an extra check for any DebuggerException. This is mainly for 
>> detecting that the attach failed. This previously was going 
>> unnoticed, and instead the test would later fail because it noticed 
>> some other issue, like missing output, which isn't very informative.
>>
>> Add checks for other unexpected SA exceptions that are caught and 
>> printed by CommandProcessor. These will always have an "Error: " 
>> prefix, making them easy to detect.
>>
>> Problem list ClhsdbScanOops.java. With the new error checking, it 
>> will now always fail on windows due to JDK-8230731 and on macos and 
>> linux due to JDK-8235220. These failures are not "new" per se, but 
>> are just now being properly detected.
>>
>> thanks,
>>
>> Chris
>


From david.holmes at oracle.com  Wed Dec  4 04:39:20 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 4 Dec 2019 14:39:20 +1000
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <a867c025-673d-999c-4690-cdf7f3099c19@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com>
 <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com>
 <df7f9aeb-7b42-0fd5-375b-db7ade297bdb@oracle.com>
 <af198cb0-8c9e-0db7-76c2-cd9f06c78288@oracle.com>
 <b5634ad7-3e62-ad72-7458-0b73e26ba59a@oracle.com>
 <a867c025-673d-999c-4690-cdf7f3099c19@oracle.com>
Message-ID: <b1b0cd64-f5e4-8a7a-8e92-9b442339beee@oracle.com>


On 3/12/2019 11:35 pm, coleen.phillimore at oracle.com wrote:
> 
> 
> On 12/3/19 8:31 AM, David Holmes wrote:
>> On 3/12/2019 11:08 pm, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>> On 12/2/19 11:52 PM, David Holmes wrote:
>>>> Hi Coleen,
>>>>
>>>> On 3/12/2019 12:43 am, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>>
>>>>> On 11/26/19 7:03 PM, David Holmes wrote:
>>>>>> (adding runtime as well)
>>>>>>
>>>>>> Hi Coleen,
>>>>>>
>>>>>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote:
>>>>>>> Summary: Add local deferred event list to thread to post events 
>>>>>>> outside CodeCache_lock.
>>>>>>>
>>>>>>> This patch builds on the patch for JDK-8173361.  With this patch, 
>>>>>>> I made the JvmtiDeferredEventQueue an instance class (not 
>>>>>>> AllStatic) and have one per thread. The CodeBlob event that used 
>>>>>>> to drop the CodeCache_lock and raced with the sweeper thread, 
>>>>>>> adds the events it wants to post to its thread local list, and 
>>>>>>> processes it outside the lock.  The list is walked in GC and by 
>>>>>>> the sweeper to keep the nmethods from being unloaded and zombied, 
>>>>>>> respectively.
>>>>>>
>>>>>> Sorry I don't understand why we would want/need a deferred event 
>>>>>> queue for every JavaThread? Isn't this only relevant for 
>>>>>> non-JavaThreads that need to have the ServiceThread process the 
>>>>>> deferred event?
>>>>>
>>>>> I thought I'd written this in the bug but I had only discussed this 
>>>>> with Erik.  I've added a comment to the bug to explain why I added 
>>>>> the per-JavaThread queue.  In order to process these events after 
>>>>> the CodeCache_lock is dropped, I have to queue them somewhere safe. 
>>>>> The ServiceThread queue is safe, *but* the ServiceThread can't keep 
>>>>> up with the events, especially from this test case.  So the test 
>>>>> case gets a native OOM.
>>>>>
>>>>> So I've added the safe queue as a field to each JavaThread because 
>>>>> multiple JavaThreads could be posting these events at the same 
>>>>> time, and there didn't seem to be a better safe place to cache 
>>>>> them, without adding another layer of queuing code.
>>>>
>>>> I think I'm getting the picture now. At the time the events are 
>>>> generated we can't post them directly because the current thread is 
>>>> inside compiler code. Hence the events must be deferred. Using the 
>>>> ServiceThread to handle the deferred events is one way to deal with 
>>>> this - but it can't keep up in this scenario. So instead we store 
>>>> the events in the current thread and when the current thread returns 
>>>> to code where it is safe to post the events, it does so itself. Is 
>>>> that generally correct?
>>>
>>> Yes.
>>>>
>>>> I admit I'm not keen on adding this additional field per-thread just 
>>>> for a temporary usage. Some kind of stack allocated helper would be 
>>>> preferable, but would need to be passed through the call chain so 
>>>> that the events could be added to it.
>>>
>>> Right, and the GC and nmethods_do have to find it somehow.  It wasn't 
>>> my first choice of where to put it, also because there are too many 
>>> things in JavaThread.  Might be time for a future cleanup of Thread.
>>
>> I see.
>>
>>>>
>>>> Also I'm not clear why we aggressively delete the _jvmti_event_queue 
>>>> after posting the events. I'd be worried about the overhead we are 
>>>> introducing for creating and deleting this queue. When the 
>>>> JvmtiDeferredEventQueue data structure was intended only for use by 
>>>> the ServiceThread its dynamic node allocation may have made more 
>>>> sense. But now that seems like a liability to me - if 
>>>> JvmtiDeferredEvents could be linked directly we wouldn't need 
>>>> dynamic nodes, nor dynamic per-thread queues (just a per-thread 
>>>> pointer).
>>>
>>> I'm not following.  The queue is for multiple events that might be 
>>> posted while in the CodeCache_lock, so they need to be in order and 
>>> linked together.  While we post them and take them off, if the 
>>> callback safepoints (maybe calls back into the JVM), we don't want to 
>>> have GC or nmethods_do walk the one that's been posted already. So a 
>>> queue seems to make sense.
>>
>> Yes but you can make a queue just by having each event have a _next 
>> pointer, rather than dynamically creating nodes to hold the event. 
>> Each event is its own queue node implicitly.
>>
>>> One thing that I experimented with was to have the ServiceThread take 
>>> ownership of the queue in its local thread queue and post them all, 
>>> which could be a future enhancement.  It didn't help my OOM situation.
>>
>> Your OOM situation seems to be a basic case of overwhelming the 
>> ServiceThread. A single serviceThread will always have a limit on how 
>> many events it can handle. Maybe this test is being too unrealistic in 
>> its expectations of the current design?
> 
> I think the JVMTI API where you can generate a COMPILED_METHOD_LOAD for 
> all the events in the queue is going to be overwhelming unless it waits 
> for the events to be posted.

Taking things off the service thread would seem to be a good thing then :)

>>
>>> Deleting the queue after all the events are posted allows 
>>> JavaThread::oops_do and nmethods_do only a null check to deal with 
>>> this jvmti wart.
>>
>> If the nodes are not dynamically allocated you don't need to delete 
>> you just set the queue-head pointer to NULL - actually it will already 
>> be NULL once the last event has been processed.
> 
> I could revisit the data structure as a future RFE.  The goal was to 
> reuse code that's already there, and I don't think there's a significant 
> difference in performance.  I did some measurement of the stress case 
> and the times were equivalent, actually better in the new code.

Okay.
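
For reference, the intrusive-queue idea discussed above (each event carrying its own _next pointer, so no separate nodes are allocated) can be sketched as follows. This is only an illustration in Java with hypothetical class names; it is not the HotSpot C++ code and says nothing about how GC or the sweeper would walk the list.

```java
// Hypothetical event type: the link lives inside the event itself.
final class DeferredEvent {
    final String description;
    DeferredEvent next; // intrusive link: the event is its own queue node

    DeferredEvent(String description) {
        this.description = description;
    }
}

// FIFO queue that only keeps head and tail pointers into the events.
final class IntrusiveEventQueue {
    private DeferredEvent head;
    private DeferredEvent tail;

    boolean isEmpty() {
        return head == null;
    }

    void enqueue(DeferredEvent e) {
        e.next = null;
        if (tail == null) {
            head = e;
        } else {
            tail.next = e;
        }
        tail = e;
    }

    /** Removes and returns the oldest event, or null if the queue is empty. */
    DeferredEvent dequeue() {
        DeferredEvent e = head;
        if (e == null) {
            return null;
        }
        head = e.next;
        if (head == null) {
            tail = null; // head is null again once the last event is taken
        }
        e.next = null;
        return e;
    }
}
```

Once the last event has been dequeued the head is null again, so an owner that only checks the head pointer sees an empty queue without anything having to be deleted.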

Thanks,
David

> 
> Thanks,
> Coleen
>>
>> David
>> -----
>>
>>> Thanks,
>>> Coleen
>>>>
>>>> Just some thoughts.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> I did write comments to this effect here:
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html 
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>
>>>>>>
>>>>>> David
>>>>>>
>>>>>>> Also, the jmethod_id field in nmethod was only used as a boolean 
>>>>>>> so don't create a jmethod_id until needed for 
>>>>>>> post_compiled_method_unload.
>>>>>>>
>>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that 
>>>>>>> crashed in the original bug report.
>>>>>>>
>>>>>>> open webrev at 
>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>
>>>
> 

From david.holmes at oracle.com  Wed Dec  4 04:49:56 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 4 Dec 2019 14:49:56 +1000
Subject: RFR (XS) 8235273: nmethodLocker not needed for
 COMPILED_METHOD_UNLOAD events
In-Reply-To: <3630a545-2b2c-e17c-bbe5-98ae8508ae38@oracle.com>
References: <3630a545-2b2c-e17c-bbe5-98ae8508ae38@oracle.com>
Message-ID: <8530df0a-c67c-3159-4b96-8bde4e925434@oracle.com>

Hi Coleen,

That all seems fine to me.

Thanks,
David

On 4/12/2019 4:21 am, coleen.phillimore at oracle.com wrote:
> Summary: remove unnecessary nmethodLocker
> 
> See bug for more details.  Tested with tier2-8.
> 
> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8235273.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8235273
> 
> (Note, this has a trivial merge with the change for JDK-8212160).
> 
> Thanks,
> Coleen

From daniil.x.titov at oracle.com  Wed Dec  4 05:37:10 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 03 Dec 2019 21:37:10 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <0A51834F-622B-42DD-A0DA-AFAD59B23D29@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <48599FD5-583F-4CB7-80CA-09E91AA24D4A@oracle.com>
 <0A51834F-622B-42DD-A0DA-AFAD59B23D29@oracle.com>
Message-ID: <1EB171CE-2582-4A42-A7F6-3D37C33DFBDD@oracle.com>

Resending with the corrected title. "RFR" was somehow stripped from it, which breaks the sorting by subject...

    Hi Bob,
    
    >>    It's too bad getCpuLoad can't detect that the cpuset is identical to the hosts in order to allow you to fallback to
    >>   getSystemCpuLoad0.
    
    I think we can detect that the cpuset is identical to the host's by comparing the length of the array that containerMetrics.getEffectiveCpuSetCpus() returns
    to the number of CPUs configured on the host, as returned by sysconf(_SC_NPROCESSORS_CONF). The latter could be retrieved by adding a new native
    method, getConfiguredCpuCount0, to OperatingSystemImpl. If they match then we just fall back to getSystemCpuLoad0(). I did some testing on a Linux host and
    inside a Docker container with different '--cpuset-cpus' settings and it seems to work as expected.
    
    JNIEXPORT jint JNICALL
    Java_com_sun_management_internal_OperatingSystemImpl_getConfiguredCpuCount0
    (JNIEnv *env, jobject mbean)
    {
        /* Report the configured CPU count, or -1 if perfInit() fails. */
        if (perfInit() == 0) {
            return counters.nProcs;
        } else {
            return -1;
        }
    }
    
    
    If there is no objection I will include this change in the new webrev.
    
    Thank you,
    Daniil
    
    On 12/3/19, 1:30 PM, "Bob Vandette" <bob.vandette at oracle.com> wrote:
    
        Daniil,
        
        Looks good to me.
        
        If there are any management jtreg tests, I'd run these since your changes to OperatingSystemMXBean will 
        alter the behavior of these methods even for Linux hosts since cgroups is typically enabled causing the
        container detection to report containerized.
        
        It's too bad getCpuLoad can't detect that the cpuset is identical to the host's in order to allow you to fall back to
        getSystemCpuLoad0. 
        
        
        Bob.
        
        
        > On Dec 3, 2019, at 2:42 PM, Daniil Titov <daniil.x.titov at oracle.com> wrote:
        > 
        > Please review the change that makes OperatingSystemMXBean methods return container specific information
        > rather than the host based data.
        > 
        > The webrev [1] is based on the code Andrew and Severin initially provided with some additional changes and combined
        > with the spec update David made [3].
        > 
        > The webrev corrects the implementation for the free/total swap methods as Bob noted to subtract the total
        > and free memory from the returned values.
        > 
        > It also corrects getCpuLoad() implementation, as Bob advised, to cover the case when CPU quotas are not active.
        > 
        > The webrev also takes into account the case when java.security.AccessControlException exception is thrown
        > during the initialization of the container subsystem ( e.g.  when java.policy doesn't grant "read" access to
        > "/proc/self/mountinfo" file).
        > 
        > CSR for the spec changes [3] is approved.
        > 
        > Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker),  and tier6 tests passed .
        > 
        > [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ 
        > [2] Jira issue :https://bugs.openjdk.java.net/browse/JDK-8226575 
        > [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 
        > 
        > Thank you,
        > -Daniil
        > 
        > 
        
        
From daniil.x.titov at oracle.com  Wed Dec  4 05:40:32 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 03 Dec 2019 21:40:32 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
Message-ID: <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>

Resending with the corrected subject. "RFR" was somehow stripped from it and that  breaks the sorting by subject...

Hi Mandy,
    
Thank you for your comments, please find my answers below.
    
>> src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java
>>   this should wrap the security-sensitive operations with doPrivileged.  jdk.management is trusted and it has all permissions.
    
I will include this change in the next webrev, thank you.
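
For illustration only, here is a minimal sketch of what wrapping a security-sensitive file read in doPrivileged could look like. The class and method names (PrivilegedReadSketch, readAllLinesPrivileged) are hypothetical and are not taken from Metrics.java or the webrev.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.AccessController;
import java.security.PrivilegedAction;
import java.util.List;

// Hypothetical helper, not the actual Metrics.java code.
final class PrivilegedReadSketch {

    // Reads all lines of a file inside a privileged block, so the read is
    // checked against this code's own permissions rather than the caller's.
    @SuppressWarnings("removal") // AccessController is deprecated for removal in recent JDKs
    static List<String> readAllLinesPrivileged(Path file) {
        // A typed variable avoids overload ambiguity between
        // PrivilegedAction and PrivilegedExceptionAction.
        PrivilegedAction<List<String>> action = () -> {
            try {
                return Files.readAllLines(file);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
        return AccessController.doPrivileged(action);
    }
}
```

A caller would pass the path of the file in question, for example Path.of("/proc/self/mountinfo") on Linux.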
    
>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
>>    Formatting nit:  line 346-355: JDK native source uses 4-space indentation convention.  A space is missing between "if" and "(".
I will correct this, thanks.
    
>>Under what circumstance that limit or memLimit is < 0?   
The memory limit metric is not available if the JVM runs on a Linux host (not in a Docker container) or if a Docker container was started without
specifying a memory limit (without the '--memory=' Docker option). In the latter case there is no limit on how much memory the container can use, and
it can use as much memory as the host's OS allows.
    
>> Is it worth  specifying this case?
I believe yes, since it covers the cases when the JVM runs on a Linux host or in a Docker container that was started without a memory limit.
    
>> It fallbacks to return the system's total swap space size - this is not really what it should report.   
For the case when the JVM runs on a Linux host it is exactly what we want. The only problematic case is when the JVM runs inside a Docker container without a memory limit set.
However, I am not sure how we could differentiate these two cases.
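
To make the fallback being discussed concrete, here is a rough sketch of the logic. It assumes that "limit" in the quoted snippet is the container's combined memory-and-swap limit; the class, method, and parameter names are made up for this example and are not the webrev code.

```java
// Hypothetical sketch of the total swap computation and its fallback.
final class TotalSwapSketch {

    // memAndSwapLimit: container memory+swap limit in bytes, negative if unavailable
    // memLimit:        container memory limit in bytes, negative if unavailable
    // hostTotalSwap:   total swap space reported by the host OS, in bytes
    static long totalSwapSpaceSize(long memAndSwapLimit, long memLimit, long hostTotalSwap) {
        if (memAndSwapLimit >= 0 && memLimit >= 0) {
            // Both limits are known: swap is what remains after subtracting memory.
            return memAndSwapLimit - memLimit;
        }
        // No usable container limits: this covers both a plain Linux host and a
        // container started without a memory limit, which cannot be told apart here.
        return hostTotalSwap;
    }
}
```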
    
>> Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad.   
For getTotalMemorySize I think we are good here. If the limit is not set then all the memory the host's OS has is available.
For getFreeMemorySize the problematic case is when the memory limit is set but the memory usage for some reason is not available (containerMetrics.getMemoryUsage() returns 0).
Probably in this case we should just return -1, as getFreePhysicalMemorySize0() currently does if it cannot retrieve a valid result.
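
As a sketch of that idea only: the names below are hypothetical, and clamping the result at zero is an extra assumption of this example rather than something from the webrev.

```java
// Hypothetical sketch of getFreeMemorySize returning -1 when usage is unavailable.
final class FreeMemorySketch {

    // memLimit:       container memory limit in bytes, negative if not set
    // memUsage:       container memory usage in bytes, 0 (or less) if unavailable
    // hostFreeMemory: free physical memory reported by the host OS, in bytes
    static long freeMemorySize(long memLimit, long memUsage, long hostFreeMemory) {
        if (memLimit < 0) {
            // No container limit: report the host value.
            return hostFreeMemory;
        }
        if (memUsage <= 0) {
            // Limit is set but usage could not be read: signal "unknown".
            return -1;
        }
        // Assumption of this sketch: never report a negative amount of free memory.
        return Math.max(0, memLimit - memUsage);
    }
}
```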
    
For getCpuLoad() the problematic case is when CPU quotas are active but CpuPeriod, CpuNumPeriods, or getCpuUsage are unavailable, or if a valid CPU load for some CPU was
not retrieved, or if all retrieved CPU load values happen to be zeros. Probably we should just return -1 in these cases rather than falling back to getSystemCpuLoad0().
    
>>src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java
>>    There is no strong need to make the deprecated methods as default methods.  If they were default methods, they only need to be implemented once as opposed to in all OS-specific implementations.
I could make these methods default methods if you feel it is a better approach here.
    
    
>>CheckOperatingSystemMXBean.java
>>     System.out.println(String.format(...)) can simply be replaced with System.out.format.
I will include this change in the next webrev, thank you!
    
Best regards,
Daniil
    
    From: Mandy Chung <mandy.chung at oracle.com>
    Date: Tuesday, December 3, 2019 at 4:10 PM
    To: Daniil Titov <daniil.x.titov at oracle.com>
    Cc: OpenJDK Serviceability <serviceability-dev at openjdk.java.net>, "jmx-dev at openjdk.java.net" <jmx-dev at openjdk.java.net>, Bob Vandette <bob.vandette at oracle.com>
    Subject: Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
    
    
    On 12/3/19 11:42 AM, Daniil Titov wrote:
    Please review the change that makes OperatingSystemMXBean methods return container specific information rather than the host based data.
    
    The webrev also takes into account the case when java.security.AccessControlException exception is thrown
    during the initialization of the container subsystem ( e.g.  when java.policy doesn't grant "read" access to "/proc/self/mountinfo" file).
    
    Instead of failing to access /proc/self/mountinfo, I expect this to wrap the call with doPrivileged so that it can report the metrics independent of the security policy.  The jdk default security policy should grant proper permission to do so.
    
    
    CSR for the spec changes [3] is approved.
    
    Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker),  and tier6 tests passed .
    
    [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ 
    [2] Jira issue :https://bugs.openjdk.java.net/browse/JDK-8226575 
    [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 
    
    
    src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java
        this should wrap the security-sensitive operations with doPrivileged.  jdk.management is trusted and it has all permissions.
    
    src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
        Formatting nit:  line 346-355: JDK native source uses 4-space indentation convention.  A space is missing between "if" and "(".
    
    src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java
      59             if (limit >= 0 && memLimit >= 0) {
      60                 return limit - memLimit;
      61             }
    
    
    Under what circumstance that limit or memLimit is < 0?   It fallbacks to return the system's total swap space size - this is not really what it should report.   Is it worth  specifying this case? Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad.   
    
    getFreeSwapSpaceSize retry for a few times.  What special about this method but not others like getFreeMemorySize?
    
    src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java
         There is no strong need to make the deprecated methods as default methods.  If they were default methods, they only need to be implemented once as opposed to in all OS-specific implementations.
    
    CheckOperatingSystemMXBean.java
         System.out.println(String.format(...)) can simply be replaced with System.out.format.
    
    Mandy
    
    
From daniil.x.titov at oracle.com  Wed Dec  4 06:36:14 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 03 Dec 2019 22:36:14 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
Message-ID: <F495C679-866E-4401-85AD-3CDA9EBC4B18@oracle.com>

Hi Mandy,

I think in my previous reply I failed to answer one of the questions from your email.

>> getFreeSwapSpaceSize retry for a few times.  What special about this method but not others like getFreeMemorySize?

What is specific to getFreeSwapSpaceSize is that the MemoryAndSwapUsage and MemoryUsage metrics it reads are related
(MemoryAndSwapUsage includes MemoryUsage) and they are not constant. Since these metrics are not read atomically, their
values could change between the two reads. By contrast, some other metrics, such as MemoryLimit, are constant: they are set
when the container starts and are supposed to return the same value for as long as the JVM runs. The other methods don't use
more than one such non-constant metric, so the only place where this potential issue with non-atomic reads could arise is the getFreeSwapSpaceSize method.
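
A rough sketch of the retry idea, for illustration only: the class name, the MAX_ATTEMPTS value, and the use of LongSupplier to stand in for the two metric reads are all assumptions of this example, not the code in the webrev.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch of retrying two related, non-atomic metric reads.
final class FreeSwapRetrySketch {

    // Assumed retry bound, chosen only for this example.
    static final int MAX_ATTEMPTS = 5;

    static long freeSwapSpaceSize(long memAndSwapLimit, long memLimit,
                                  LongSupplier memAndSwapUsage, LongSupplier memUsage) {
        long totalSwap = memAndSwapLimit - memLimit;
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            // The two values are read one after the other, so they may be
            // mutually inconsistent if usage changed in between.
            long both = memAndSwapUsage.getAsLong();
            long mem = memUsage.getAsLong();
            long swapUsed = both - mem;
            if (swapUsed >= 0 && swapUsed <= totalSwap) {
                // The pair of reads is consistent: accept it.
                return totalSwap - swapUsed;
            }
            // Otherwise read both metrics again.
        }
        // No consistent pair of reads within the retry bound.
        return -1;
    }
}
```

Methods that read only one changing metric have no pair of values to reconcile, which is why they would not need such a loop.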

Best regards,
Daniil


    
From serguei.spitsyn at oracle.com  Wed Dec  4 10:14:26 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 4 Dec 2019 02:14:26 -0800
Subject: RFR (XS) 8235273: nmethodLocker not needed for
 COMPILED_METHOD_UNLOAD events
In-Reply-To: <8530df0a-c67c-3159-4b96-8bde4e925434@oracle.com>
References: <3630a545-2b2c-e17c-bbe5-98ae8508ae38@oracle.com>
 <8530df0a-c67c-3159-4b96-8bde4e925434@oracle.com>
Message-ID: <7cf9f5a3-f8c5-1981-b7b9-b83b5cdf7fe9@oracle.com>

Hi Coleen,

+1

Thanks,
Serguei


On 12/3/19 20:49, David Holmes wrote:
> Hi Coleen,
>
> That all seems fine to me.
>
> Thanks,
> David
>
> On 4/12/2019 4:21 am, coleen.phillimore at oracle.com wrote:
>> Summary: remove unnecessary nmethodLocker
>>
>> See bug for more details. Tested with tier2-8.
>>
>> open webrev at 
>> http://cr.openjdk.java.net/~coleenp/2019/8235273.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8235273
>>
>> (Note, this has a trivial merge with the change for JDK-8212160).
>>
>> Thanks,
>> Coleen


From coleen.phillimore at oracle.com  Wed Dec  4 12:21:49 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 4 Dec 2019 07:21:49 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <b1b0cd64-f5e4-8a7a-8e92-9b442339beee@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com>
 <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com>
 <df7f9aeb-7b42-0fd5-375b-db7ade297bdb@oracle.com>
 <af198cb0-8c9e-0db7-76c2-cd9f06c78288@oracle.com>
 <b5634ad7-3e62-ad72-7458-0b73e26ba59a@oracle.com>
 <a867c025-673d-999c-4690-cdf7f3099c19@oracle.com>
 <b1b0cd64-f5e4-8a7a-8e92-9b442339beee@oracle.com>
Message-ID: <00a78ad6-97ff-0465-a13c-0c51fff4764a@oracle.com>


On 12/3/19 11:39 PM, David Holmes wrote:
>
>
> On 3/12/2019 11:35 pm, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 12/3/19 8:31 AM, David Holmes wrote:
>>> On 3/12/2019 11:08 pm, coleen.phillimore at oracle.com wrote:
>>>>
>>>>
>>>> On 12/2/19 11:52 PM, David Holmes wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> On 3/12/2019 12:43 am, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>>
>>>>>> On 11/26/19 7:03 PM, David Holmes wrote:
>>>>>>> (adding runtime as well)
>>>>>>>
>>>>>>> Hi Coleen,
>>>>>>>
>>>>>>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote:
>>>>>>>> Summary: Add local deferred event list to thread to post events 
>>>>>>>> outside CodeCache_lock.
>>>>>>>>
>>>>>>>> This patch builds on the patch for JDK-8173361. With this 
>>>>>>>> patch, I made the JvmtiDeferredEventQueue an instance class 
>>>>>>>> (not AllStatic) and have one per thread. The CodeBlob event 
>>>>>>>> that used to drop the CodeCache_lock and raced with the sweeper 
>>>>>>>> thread, adds the events it wants to post to its thread local 
>>>>>>>> list, and processes it outside the lock. The list is walked in 
>>>>>>>> GC and by the sweeper to keep the nmethods from being unloaded 
>>>>>>>> and zombied, respectively.
>>>>>>>
>>>>>>> Sorry I don't understand why we would want/need a deferred event 
>>>>>>> queue for every JavaThread? Isn't this only relevant for 
>>>>>>> non-JavaThreads that need to have the ServiceThread process the 
>>>>>>> deferred event?
>>>>>>
>>>>>> I thought I'd written this in the bug but I had only discussed 
>>>>>> this with Erik. I've added a comment to the bug to explain why I 
>>>>>> added the per-JavaThread queue. In order to process these events 
>>>>>> after the CodeCache_lock is dropped, I have to queue them 
>>>>>> somewhere safe. The ServiceThread queue is safe, *but* the 
>>>>>> ServiceThread can't keep up with the events, especially from this 
>>>>>> test case. So the test case gets a native OOM.
>>>>>>
>>>>>> So I've added the safe queue as a field to each JavaThread 
>>>>>> because multiple JavaThreads could be posting these events at the 
>>>>>> same time, and there didn't seem to be a better safe place to 
>>>>>> cache them, without adding another layer of queuing code.
>>>>>
>>>>> I think I'm getting the picture now. At the time the events are 
>>>>> generated we can't post them directly because the current thread 
>>>>> is inside compiler code. Hence the events must be deferred. Using 
>>>>> the ServiceThread to handle the deferred events is one way to deal 
>>>>> with this - but it can't keep up in this scenario. So instead we 
>>>>> store the events in the current thread and when the current thread 
>>>>> returns to code where it is safe to post the events, it does so 
>>>>> itself. Is that generally correct?
>>>>
>>>> Yes.
>>>>>
>>>>> I admit I'm not keen on adding this additional field per-thread 
>>>>> just for a temporary usage. Some kind of stack allocated helper 
>>>>> would be preferable, but would need to be passed through the call 
>>>>> chain so that the events could be added to it.
>>>>
>>>> Right, and the GC and nmethods_do has to find it somehow. It wasn't 
>>>> my first choice of where to put it also because there are too many 
>>>> things in JavaThread. Might be time for a future cleanup of Thread.
>>>
>>> I see.
>>>
>>>>>
>>>>> Also I'm not clear why we aggressively delete the 
>>>>> _jvmti_event_queue after posting the events. I'd be worried about 
>>>>> the overhead we are introducing for creating and deleting this 
>>>>> queue. When the JvmtiDeferredEventQueue data structure was 
>>>>> intended only for use by the ServiceThread its dynamic node 
>>>>> allocation may have made more sense. But now that seems like a 
>>>>> liability to me - if JvmtiDeferredEvents could be linked directly 
>>>>> we wouldn't need dynamic nodes, nor dynamic per-thread queues 
>>>>> (just a per-thread pointer).
>>>>
>>>> I'm not following. The queue is for multiple events that might be 
>>>> posted while in the CodeCache_lock, so they need to be in order and 
>>>> linked together. While we post them and take them off, if the 
>>>> callback safepoints (maybe calls back into the JVM), we don't want 
>>>> to have GC or nmethods_do walk the one that's been posted already. 
>>>> So a queue seems to make sense.
>>>
>>> Yes but you can make a queue just by having each event have a _next 
>>> pointer, rather than dynamically creating nodes to hold the event. 
>>> Each event is its own queue node implicitly.
>>>
>>>> One thing that I experimented with was to have the ServiceThread 
>>>> take ownership of the queue in its local thread queue and post 
>>>> them all, which could be a future enhancement. It didn't help my 
>>>> OOM situation.
>>>
>>> Your OOM situation seems to be a basic case of overwhelming the 
>>> ServiceThread. A single serviceThread will always have a limit on 
>>> how many events it can handle. Maybe this test is being too 
>>> unrealistic in its expectations of the current design?
>>
>> I think the JVMTI API where you can generate a COMPILED_METHOD_LOAD 
>> for all the events in the queue is going to be overwhelming unless it 
>> waits for the events to be posted.
>
> Taking things off the service thread would seem to be a good thing 
> then :)
>
>>>
>>>> Deleting the queue after all the events are posted allows 
>>>> JavaThread::oops_do and nmethods_do only a null check to deal with 
>>>> this jvmti wart.
>>>
>>> If the nodes are not dynamically allocated you don't need to delete 
>>> you just set the queue-head pointer to NULL - actually it will 
>>> already be NULL once the last event has been processed.
>>
>> I could revisit the data structure as a future RFE. The goal was to 
>> reuse code that's already there, and I don't think there's a 
>> significant difference in performance. I did some measurement of the 
>> stress case and the times were equivalent, actually better in the new 
>> code.
>
> Okay.

Is this a code review then? I think Serguei promised to review the code 
too.

thanks,
Coleen
>
> Thanks,
> David
>
>>
>> Thanks,
>> Coleen
>>>
>>> David
>>> -----
>>>
>>>> Thanks,
>>>> Coleen
>>>>>
>>>>> Just some thoughts.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> I did write comments to this effect here:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html 
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>>
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>>> Also, the jmethod_id field in nmethod was only used as a 
>>>>>>>> boolean so don't create a jmethod_id until needed for 
>>>>>>>> post_compiled_method_unload.
>>>>>>>>
>>>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that 
>>>>>>>> crashed in the original bug report.
>>>>>>>>
>>>>>>>> open webrev at 
>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>
>>>>
>>


From coleen.phillimore at oracle.com  Wed Dec  4 12:24:36 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 4 Dec 2019 07:24:36 -0500
Subject: RFR (XS) 8235273: nmethodLocker not needed for
 COMPILED_METHOD_UNLOAD events
In-Reply-To: <7cf9f5a3-f8c5-1981-b7b9-b83b5cdf7fe9@oracle.com>
References: <3630a545-2b2c-e17c-bbe5-98ae8508ae38@oracle.com>
 <8530df0a-c67c-3159-4b96-8bde4e925434@oracle.com>
 <7cf9f5a3-f8c5-1981-b7b9-b83b5cdf7fe9@oracle.com>
Message-ID: <332991c8-acdb-9f7d-dc58-7f5e95b28770@oracle.com>

Thanks David and Serguei!
Coleen

On 12/4/19 5:14 AM, serguei.spitsyn at oracle.com wrote:
> Hi Coleen,
>
> +1
>
> Thanks,
> Serguei
>
>
> On 12/3/19 20:49, David Holmes wrote:
>> Hi Coleen,
>>
>> That all seems fine to me.
>>
>> Thanks,
>> David
>>
>> On 4/12/2019 4:21 am, coleen.phillimore at oracle.com wrote:
>>> Summary: remove unnecessary nmethodLocker
>>>
>>> See bug for more details. Tested with tier2-8.
>>>
>>> open webrev at 
>>> http://cr.openjdk.java.net/~coleenp/2019/8235273.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8235273
>>>
>>> (Note, this has a trivial merge with the change for JDK-8212160).
>>>
>>> Thanks,
>>> Coleen
>


From coleen.phillimore at oracle.com  Wed Dec  4 13:00:53 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 4 Dec 2019 08:00:53 -0500
Subject: RFR (T) 8234355: Buffer overflow in jcmd GC.class_stats due to too
 many classes
Message-ID: <082ecca7-39ff-e027-2090-c3331650cd07@oracle.com>

Summary: Remove use of GC.class_stats in testing and failure analysis 
(plan to deprecate)

See bug for more details.

open webrev at http://cr.openjdk.java.net/~coleenp/2019/8234355.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8234355

Ran tier8 overnight.

Thanks,
Coleen

From david.holmes at oracle.com  Wed Dec  4 13:06:16 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 4 Dec 2019 23:06:16 +1000
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <00a78ad6-97ff-0465-a13c-0c51fff4764a@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com>
 <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com>
 <df7f9aeb-7b42-0fd5-375b-db7ade297bdb@oracle.com>
 <af198cb0-8c9e-0db7-76c2-cd9f06c78288@oracle.com>
 <b5634ad7-3e62-ad72-7458-0b73e26ba59a@oracle.com>
 <a867c025-673d-999c-4690-cdf7f3099c19@oracle.com>
 <b1b0cd64-f5e4-8a7a-8e92-9b442339beee@oracle.com>
 <00a78ad6-97ff-0465-a13c-0c51fff4764a@oracle.com>
Message-ID: <7a6f73fd-ff7f-974d-d213-dc7ead799ead@oracle.com>


On 4/12/2019 10:21 pm, coleen.phillimore at oracle.com wrote:
> 
> 
> On 12/3/19 11:39 PM, David Holmes wrote:
>>
>>
>> On 3/12/2019 11:35 pm, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>> On 12/3/19 8:31 AM, David Holmes wrote:
>>>> On 3/12/2019 11:08 pm, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>>
>>>>> On 12/2/19 11:52 PM, David Holmes wrote:
>>>>>> Hi Coleen,
>>>>>>
>>>>>> On 3/12/2019 12:43 am, coleen.phillimore at oracle.com wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 11/26/19 7:03 PM, David Holmes wrote:
>>>>>>>> (adding runtime as well)
>>>>>>>>
>>>>>>>> Hi Coleen,
>>>>>>>>
>>>>>>>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote:
>>>>>>>>> Summary: Add local deferred event list to thread to post events 
>>>>>>>>> outside CodeCache_lock.
>>>>>>>>>
>>>>>>>>> This patch builds on the patch for JDK-8173361. With this 
>>>>>>>>> patch, I made the JvmtiDeferredEventQueue an instance class 
>>>>>>>>> (not AllStatic) and have one per thread. The CodeBlob event 
>>>>>>>>> that used to drop the CodeCache_lock and raced with the sweeper 
>>>>>>>>> thread, adds the events it wants to post to its thread local 
>>>>>>>>> list, and processes it outside the lock. The list is walked in 
>>>>>>>>> GC and by the sweeper to keep the nmethods from being unloaded 
>>>>>>>>> and zombied, respectively.
>>>>>>>>
>>>>>>>> Sorry I don't understand why we would want/need a deferred event 
>>>>>>>> queue for every JavaThread? Isn't this only relevant for 
>>>>>>>> non-JavaThreads that need to have the ServiceThread process the 
>>>>>>>> deferred event?
>>>>>>>
>>>>>>> I thought I'd written this in the bug but I had only discussed 
>>>>>>> this with Erik. I've added a comment to the bug to explain why I 
>>>>>>> added the per-JavaThread queue. In order to process these events 
>>>>>>> after the CodeCache_lock is dropped, I have to queue them 
>>>>>>> somewhere safe. The ServiceThread queue is safe, *but* the 
>>>>>>> ServiceThread can't keep up with the events, especially from this 
>>>>>>> test case. So the test case gets a native OOM.
>>>>>>>
>>>>>>> So I've added the safe queue as a field to each JavaThread 
>>>>>>> because multiple JavaThreads could be posting these events at the 
>>>>>>> same time, and there didn't seem to be a better safe place to 
>>>>>>> cache them, without adding another layer of queuing code.
>>>>>>
>>>>>> I think I'm getting the picture now. At the time the events are 
>>>>>> generated we can't post them directly because the current thread 
>>>>>> is inside compiler code. Hence the events must be deferred. Using 
>>>>>> the ServiceThread to handle the deferred events is one way to deal 
>>>>>> with this - but it can't keep up in this scenario. So instead we 
>>>>>> store the events in the current thread and when the current thread 
>>>>>> returns to code where it is safe to post the events, it does so 
>>>>>> itself. Is that generally correct?
>>>>>
>>>>> Yes.
>>>>>>
>>>>>> I admit I'm not keen on adding this additional field per-thread 
>>>>>> just for a temporary usage. Some kind of stack allocated helper 
>>>>>> would be preferable, but would need to be passed through the call 
>>>>>> chain so that the events could be added to it.
>>>>>
>>>>> Right, and the GC and nmethods_do has to find it somehow. It wasn't 
>>>>> my first choice of where to put it also because there are too many 
>>>>> things in JavaThread. Might be time for a future cleanup of Thread.
>>>>
>>>> I see.
>>>>
>>>>>>
>>>>>> Also I'm not clear why we aggressively delete the 
>>>>>> _jvmti_event_queue after posting the events. I'd be worried about 
>>>>>> the overhead we are introducing for creating and deleting this 
>>>>>> queue. When the JvmtiDeferredEventQueue data structure was 
>>>>>> intended only for use by the ServiceThread its dynamic node 
>>>>>> allocation may have made more sense. But now that seems like a 
>>>>>> liability to me - if JvmtiDeferredEvents could be linked directly 
>>>>>> we wouldn't need dynamic nodes, nor dynamic per-thread queues 
>>>>>> (just a per-thread pointer).
>>>>>
>>>>> I'm not following. The queue is for multiple events that might be 
>>>>> posted while in the CodeCache_lock, so they need to be in order and 
>>>>> linked together. While we post them and take them off, if the 
>>>>> callback safepoints (maybe calls back into the JVM), we don't want 
>>>>> to have GC or nmethods_do walk the one that's been posted already. 
>>>>> So a queue seems to make sense.
>>>>
>>>> Yes but you can make a queue just by having each event have a _next 
>>>> pointer, rather than dynamically creating nodes to hold the event. 
>>>> Each event is its own queue node implicitly.
>>>>
>>>>> One thing that I experimented with was to have the ServiceThread 
>>>>> take ownership of the queue in its local thread queue and post 
>>>>> them all, which could be a future enhancement. It didn't help my 
>>>>> OOM situation.
>>>>
>>>> Your OOM situation seems to be a basic case of overwhelming the 
>>>> ServiceThread. A single serviceThread will always have a limit on 
>>>> how many events it can handle. Maybe this test is being too 
>>>> unrealistic in its expectations of the current design?
>>>
>>> I think the JVMTI API where you can generate a COMPILED_METHOD_LOAD 
>>> for all the events in the queue is going to be overwhelming unless it 
>>> waits for the events to be posted.
>>
>> Taking things off the service thread would seem to be a good thing 
>> then :)
>>
>>>>
>>>>> Deleting the queue after all the events are posted allows 
>>>>> JavaThread::oops_do and nmethods_do only a null check to deal with 
>>>>> this jvmti wart.
>>>>
>>>> If the nodes are not dynamically allocated you don't need to delete 
>>>> you just set the queue-head pointer to NULL - actually it will 
>>>> already be NULL once the last event has been processed.
>>>
>>> I could revisit the data structure as a future RFE.  The goal was to 
>>> reuse code that's already there, and I don't think there's a 
>>> significant difference in performance.  I did some measurement of the 
>>> stress case and the times were equivalent, actually better in the new 
>>> code.
>>
>> Okay.
> 
> Is this a code review then?  I think Serguei promised to review the code 
> too.

Yes this is a review.

Thanks,
David

> thanks,
> Coleen
>>
>> Thanks,
>> David
>>
>>>
>>> Thanks,
>>> Coleen
>>>>
>>>> David
>>>> -----
>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>>
>>>>>> Just some thoughts.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>> I did write comments to this effect here:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html 
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>>>
>>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>>> Also, the jmethod_id field in nmethod was only used as a 
>>>>>>>>> boolean so don't create a jmethod_id until needed for 
>>>>>>>>> post_compiled_method_unload.
>>>>>>>>>
>>>>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that 
>>>>>>>>> crashed in the original bug report.
>>>>>>>>>
>>>>>>>>> open webrev at 
>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Coleen
>>>>>>>
>>>>>
>>>
> 

From bob.vandette at oracle.com  Wed Dec  4 13:32:05 2019
From: bob.vandette at oracle.com (Bob Vandette)
Date: Wed, 4 Dec 2019 08:32:05 -0500
Subject: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <CD4410BC-4F48-4A8D-AC53-E6D9B4786FC6@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <48599FD5-583F-4CB7-80CA-09E91AA24D4A@oracle.com>
 <CD4410BC-4F48-4A8D-AC53-E6D9B4786FC6@oracle.com>
Message-ID: <133A9A62-7D4E-4810-A9E4-B51A6D222585@oracle.com>


> On Dec 3, 2019, at 9:00 PM, Daniil Titov <daniil.x.titov at oracle.com> wrote:
> 
> Hi Bob,
> 
>>>   It's too bad getCpuLoad can't detect that the cpuset is identical to the host's in order to allow you to fall back to
>>>  getSystemCpuLoad0.
> 
> I think we can detect that the cpuset is identical to the host's by comparing the length of the array that containerMetrics.getEffectiveCpuSetCpus() returns
> to the number of CPUs configured on the host, as returned by sysconf(_SC_NPROCESSORS_CONF). The latter could be retrieved by adding a new native
> method, getConfiguredCpuCount0, to OperatingSystemImpl. If they match, we just fall back to getSystemCpuLoad0(). I did some testing on a Linux host and
> inside a Docker container with different '--cpuset-cpus' settings, and it seems to work as expected.
> 
> JNIEXPORT jint JNICALL
> Java_com_sun_management_internal_OperatingSystemImpl_getConfiguredCpuCount0
> (JNIEnv *env, jobject mbean)
> {
>     if (perfInit() == 0) {
>         return counters.nProcs;
>     } else {
>         return -1;
>     }
> }
> 
> 
> If there is no objection I will include this change in the new webrev.

I don't think this approach will work.  Both the array returned and sysconf(_SC_NPROCESSORS_CONF) 
report the container's cpuset value, so they will be equal, causing you to always fall back.

You can try to use containerMetrics.getPerCpuUsage() instead of containerMetrics.getEffectiveCpuSetCpus().
The length of the array returned is the number of host cpus.  Maybe Severin can confirm if this is true in cgroupv2 as
well.

Bob.


> 
> Thank you,
> Daniil
> 
> On 12/3/19, 1:30 PM, "Bob Vandette" <bob.vandette at oracle.com> wrote:
> 
>    Daniil,
> 
>    Looks good to me.
> 
>    If there are any management jtreg tests, I'd run these since your changes to OperatingSystemMXBean will 
>    alter the behavior of these methods even for Linux hosts since cgroups is typically enabled causing the
>    container detection to report containerized.
> 
>    It's too bad getCpuLoad can't detect that the cpuset is identical to the host's in order to allow you to fall back to
>    getSystemCpuLoad0. 
> 
> 
>    Bob.
> 
> 
>> On Dec 3, 2019, at 2:42 PM, Daniil Titov <daniil.x.titov at oracle.com> wrote:
>> 
>> Please review the change that makes OperatingSystemMXBean methods return container specific information
>> rather than the host based data.
>> 
>> The webrev [1] is based on the code Andrew and Severin initially provided with some additional changes and combined
>> with the spec update David made [3].
>> 
>> The webrev corrects the implementation for the free/total swap methods as Bob noted to subtract the total
>> and free memory from the returned values.
>> 
>> It also corrects getCpuLoad() implementation, as Bob advised, to cover the case when CPU quotas are not active.
>> 
>> The webrev also takes into account the case when java.security.AccessControlException exception is thrown
>> during the initialization of the container subsystem (e.g. when java.policy doesn't grant "read" access to
>> "/proc/self/mountinfo" file).
>> 
>> CSR for the spec changes [3] is approved.
>> 
>> Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker),  and tier6 tests passed .
>> 
>> [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ 
>> [2] Jira issue :https://bugs.openjdk.java.net/browse/JDK-8226575 
>> [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 
>> 
>> Thank you,
>> -Daniil
>> 
>> 
> 
> 
> 
> 


From coleen.phillimore at oracle.com  Wed Dec  4 13:45:12 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 4 Dec 2019 08:45:12 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <7a6f73fd-ff7f-974d-d213-dc7ead799ead@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com>
 <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com>
 <df7f9aeb-7b42-0fd5-375b-db7ade297bdb@oracle.com>
 <af198cb0-8c9e-0db7-76c2-cd9f06c78288@oracle.com>
 <b5634ad7-3e62-ad72-7458-0b73e26ba59a@oracle.com>
 <a867c025-673d-999c-4690-cdf7f3099c19@oracle.com>
 <b1b0cd64-f5e4-8a7a-8e92-9b442339beee@oracle.com>
 <00a78ad6-97ff-0465-a13c-0c51fff4764a@oracle.com>
 <7a6f73fd-ff7f-974d-d213-dc7ead799ead@oracle.com>
Message-ID: <2c8b8751-bea1-bfd6-10fa-9170adbb2dd3@oracle.com>

Thanks, David!
Coleen

On 12/4/19 8:06 AM, David Holmes wrote:
>
> On 4/12/2019 10:21 pm, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 12/3/19 11:39 PM, David Holmes wrote:
>>>
>>>
>>> On 3/12/2019 11:35 pm, coleen.phillimore at oracle.com wrote:
>>>>
>>>>
>>>> On 12/3/19 8:31 AM, David Holmes wrote:
>>>>> On 3/12/2019 11:08 pm, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>>
>>>>>> On 12/2/19 11:52 PM, David Holmes wrote:
>>>>>>> Hi Coleen,
>>>>>>>
>>>>>>> On 3/12/2019 12:43 am, coleen.phillimore at oracle.com wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/26/19 7:03 PM, David Holmes wrote:
>>>>>>>>> (adding runtime as well)
>>>>>>>>>
>>>>>>>>> Hi Coleen,
>>>>>>>>>
>>>>>>>>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote:
>>>>>>>>>> Summary: Add local deferred event list to thread to post 
>>>>>>>>>> events outside CodeCache_lock.
>>>>>>>>>>
>>>>>>>>>> This patch builds on the patch for JDK-8173361. With this 
>>>>>>>>>> patch, I made the JvmtiDeferredEventQueue an instance class 
>>>>>>>>>> (not AllStatic) and have one per thread. The CodeBlob event 
>>>>>>>>>> that used to drop the CodeCache_lock and raced with the 
>>>>>>>>>> sweeper thread, adds the events it wants to post to its 
>>>>>>>>>> thread local list, and processes it outside the lock.  The 
>>>>>>>>>> list is walked in GC and by the sweeper to keep the nmethods 
>>>>>>>>>> from being unloaded and zombied, respectively.
>>>>>>>>>
>>>>>>>>> Sorry I don't understand why we would want/need a deferred 
>>>>>>>>> event queue for every JavaThread? Isn't this only relevant for 
>>>>>>>>> non-JavaThreads that need to have the ServiceThread process 
>>>>>>>>> the deferred event?
>>>>>>>>
>>>>>>>> I thought I'd written this in the bug but I had only discussed 
>>>>>>>> this with Erik.  I've added a comment to the bug to explain why 
>>>>>>>> I added the per-JavaThread queue.  In order to process these 
>>>>>>>> events after the CodeCache_lock is dropped, I have to queue 
>>>>>>>> them somewhere safe. The ServiceThread queue is safe, *but* the 
>>>>>>>> ServiceThread can't keep up with the events, especially from 
>>>>>>>> this test case.  So the test case gets a native OOM.
>>>>>>>>
>>>>>>>> So I've added the safe queue as a field to each JavaThread 
>>>>>>>> because multiple JavaThreads could be posting these events at 
>>>>>>>> the same time, and there didn't seem to be a better safe place 
>>>>>>>> to cache them, without adding another layer of queuing code.
>>>>>>>
>>>>>>> I think I'm getting the picture now. At the time the events are 
>>>>>>> generated we can't post them directly because the current thread 
>>>>>>> is inside compiler code. Hence the events must be deferred. 
>>>>>>> Using the ServiceThread to handle the deferred events is one way 
>>>>>>> to deal with this - but it can't keep up in this scenario. So 
>>>>>>> instead we store the events in the current thread and when the 
>>>>>>> current thread returns to code where it is safe to post the 
>>>>>>> events, it does so itself. Is that generally correct?
>>>>>>
>>>>>> Yes.
>>>>>>>
>>>>>>> I admit I'm not keen on adding this additional field per-thread 
>>>>>>> just for a temporary usage. Some kind of stack allocated helper 
>>>>>>> would be preferable, but would need to be passed through the 
>>>>>>> call chain so that the events could be added to it.
>>>>>>
>>>>>> Right, and the GC and nmethods_do has to find it somehow. It 
>>>>>> wasn't my first choice of where to put it also because there are 
>>>>>> too many things in JavaThread. Might be time for a future cleanup 
>>>>>> of Thread.
>>>>>
>>>>> I see.
>>>>>
>>>>>>>
>>>>>>> Also I'm not clear why we aggressively delete the 
>>>>>>> _jvmti_event_queue after posting the events. I'd be worried 
>>>>>>> about the overhead we are introducing for creating and deleting 
>>>>>>> this queue. When the JvmtiDeferredEventQueue data structure was 
>>>>>>> intended only for use by the ServiceThread its dynamic node 
>>>>>>> allocation may have made more sense. But now that seems like a 
>>>>>>> liability to me - if JvmtiDeferredEvents could be linked 
>>>>>>> directly we wouldn't need dynamic nodes, nor dynamic per-thread 
>>>>>>> queues (just a per-thread pointer).
>>>>>>
>>>>>> I'm not following.  The queue is for multiple events that might 
>>>>>> be posted while in the CodeCache_lock, so they need to be in 
>>>>>> order and linked together.  While we post them and take them off, 
>>>>>> if the callback safepoints (maybe calls back into the JVM), we 
>>>>>> don't want to have GC or nmethods_do walk the one that's been 
>>>>>> posted already. So a queue seems to make sense.
>>>>>
>>>>> Yes but you can make a queue just by having each event have a 
>>>>> _next pointer, rather than dynamically creating nodes to hold the 
>>>>> event. Each event is its own queue node implicitly.
>>>>>
>>>>>> One thing that I experimented with was to have the ServiceThread 
>>>>>> take ownership of the queue in its local thread queue and post 
>>>>>> them all, which could be a future enhancement.  It didn't help my 
>>>>>> OOM situation.
>>>>>
>>>>> Your OOM situation seems to be a basic case of overwhelming the 
>>>>> ServiceThread. A single serviceThread will always have a limit on 
>>>>> how many events it can handle. Maybe this test is being too 
>>>>> unrealistic in its expectations of the current design?
>>>>
>>>> I think the JVMTI API where you can generate a 
>>>> COMPILED_METHOD_LOAD for all the events in the queue is going to be 
>>>> overwhelming unless it waits for the events to be posted.
>>>
>>> Taking things off the service thread would seem to be a good thing 
>>> then :)
>>>
>>>>>
>>>>>> Deleting the queue after all the events are posted allows 
>>>>>> JavaThread::oops_do and nmethods_do only a null check to deal 
>>>>>> with this jvmti wart.
>>>>>
>>>>> If the nodes are not dynamically allocated you don't need to 
>>>>> delete you just set the queue-head pointer to NULL - actually it 
>>>>> will already be NULL once the last event has been processed.
>>>>
>>>> I could revisit the data structure as a future RFE.  The goal was 
>>>> to reuse code that's already there, and I don't think there's a 
>>>> significant difference in performance.  I did some measurement of 
>>>> the stress case and the times were equivalent, actually better in 
>>>> the new code.
>>>
>>> Okay.
>>
>> Is this a code review then?  I think Serguei promised to review the 
>> code too.
>
> Yes this is a review.
>
> Thanks,
> David
>
>> thanks,
>> Coleen
>>>
>>> Thanks,
>>> David
>>>
>>>>
>>>> Thanks,
>>>> Coleen
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>>>
>>>>>>> Just some thoughts.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> I did write comments to this effect here:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html 
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>>>
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> Also, the jmethod_id field in nmethod was only used as a 
>>>>>>>>>> boolean so don't create a jmethod_id until needed for 
>>>>>>>>>> post_compiled_method_unload.
>>>>>>>>>>
>>>>>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that 
>>>>>>>>>> crashed in the original bug report.
>>>>>>>>>>
>>>>>>>>>> open webrev at 
>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Coleen
>>>>>>>>
>>>>>>
>>>>
>>


From bob.vandette at oracle.com  Wed Dec  4 14:13:51 2019
From: bob.vandette at oracle.com (Bob Vandette)
Date: Wed, 4 Dec 2019 09:13:51 -0500
Subject: jmx-dev 8226575: OperatingSystemMXBean should be made container
 aware
In-Reply-To: <133A9A62-7D4E-4810-A9E4-B51A6D222585@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <48599FD5-583F-4CB7-80CA-09E91AA24D4A@oracle.com>
 <CD4410BC-4F48-4A8D-AC53-E6D9B4786FC6@oracle.com>
 <133A9A62-7D4E-4810-A9E4-B51A6D222585@oracle.com>
Message-ID: <C9CFDCB8-B10D-4B44-A52E-6BFA2C1728BB@oracle.com>


> On Dec 4, 2019, at 8:32 AM, Bob Vandette <bob.vandette at oracle.com> wrote:
> 
> 
>> On Dec 3, 2019, at 9:00 PM, Daniil Titov <daniil.x.titov at oracle.com> wrote:
>> 
>> Hi Bob,
>> 
>>>>  It's too bad getCpuLoad can't detect that the cpuset is identical to the host's in order to allow you to fall back to
>>>> getSystemCpuLoad0.
>> 
>> I think we can detect that the cpuset is identical to the host's by comparing the length of the array that containerMetrics.getEffectiveCpuSetCpus() returns
>> to the number of CPUs configured on the host, as returned by sysconf(_SC_NPROCESSORS_CONF). The latter could be retrieved by adding a new native
>> method, getConfiguredCpuCount0, to OperatingSystemImpl. If they match, we just fall back to getSystemCpuLoad0(). I did some testing on a Linux host and
>> inside a Docker container with different '--cpuset-cpus' settings, and it seems to work as expected.
>> 
>> JNIEXPORT jint JNICALL
>> Java_com_sun_management_internal_OperatingSystemImpl_getConfiguredCpuCount0
>> (JNIEnv *env, jobject mbean)
>> {
>>     if (perfInit() == 0) {
>>         return counters.nProcs;
>>     } else {
>>         return -1;
>>     }
>> }
>> 
>> 
>> If there is no objection I will include this change in the new webrev.
> 
> I don't think this approach will work.  Both the array returned and sysconf(_SC_NPROCESSORS_CONF) 
> report the container's cpuset value, so they will be equal, causing you to always fall back.
> 
> You can try to use containerMetrics.getPerCpuUsage() instead of containerMetrics.getEffectiveCpuSetCpus().
> The length of the array returned is the number of host cpus.  Maybe Severin can confirm if this is true in cgroupv2 as
> well.

I just checked the webrev for the cgroupv2 implementation and getPerCpuUsage is not supported.
I still think it's worth implementing this optimization but it won't be used on cgroupv2 since the
array length (0) won't be equal to _SC_NPROCESSORS_CONF.
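
As a rough sketch of the check being discussed (this is not the webrev code; the class and method names below are made up for illustration, and the two parameters stand in for the result of containerMetrics.getPerCpuUsage() and for the value the proposed getConfiguredCpuCount0 would return), the comparison could look something like this:

```java
// Illustrative only: the class and method names are hypothetical, not from the webrev.
final class CpuSetFallbackSketch {

    /**
     * Decides whether the host-wide CPU load could be used in place of the
     * container-specific calculation.
     *
     * @param perCpuUsage        stand-in for containerMetrics.getPerCpuUsage();
     *                           assumed to hold one entry per host CPU on cgroupv1
     *                           and to be empty where the metric is unsupported
     * @param configuredCpuCount stand-in for the proposed getConfiguredCpuCount0();
     *                           -1 is assumed to mean the native lookup failed
     * @return true only if both counts are known and equal
     */
    static boolean cpusetMatchesHost(long[] perCpuUsage, int configuredCpuCount) {
        if (perCpuUsage == null || perCpuUsage.length == 0) {
            return false; // metric unsupported (e.g. cgroupv2): never fall back
        }
        if (configuredCpuCount <= 0) {
            return false; // native lookup failed
        }
        return perCpuUsage.length == configuredCpuCount;
    }

    public static void main(String[] args) {
        System.out.println(cpusetMatchesHost(new long[8], 8)); // true
        System.out.println(cpusetMatchesHost(new long[8], 2)); // false
        System.out.println(cpusetMatchesHost(new long[0], 4)); // false
    }
}
```

With an empty array the helper simply reports "no match", so the cgroupv2 case would keep using the container-aware path without any special handling.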

Here's the cgroupv2 implementation of this method.

    @Override
    public long[] getPerCpuUsage() {
        // Not supported
        return new long[0];
    }

Bob.

> 
> Bob.
> 
> 
>> 
>> Thank you,
>> Daniil
>> 
>> On 12/3/19, 1:30 PM, "Bob Vandette" <bob.vandette at oracle.com> wrote:
>> 
>>   Daniil,
>> 
>>   Looks good to me.
>> 
>>   If there are any management jtreg tests, I'd run these since your changes to OperatingSystemMXBean will 
>>   alter the behavior of these methods even for Linux hosts since cgroups is typically enabled causing the
>>   container detection to report containerized.
>> 
>>   It's too bad getCpuLoad can't detect that the cpuset is identical to the host's in order to allow you to fall back to
>>   getSystemCpuLoad0. 
>> 
>> 
>>   Bob.
>> 
>> 
>>> On Dec 3, 2019, at 2:42 PM, Daniil Titov <daniil.x.titov at oracle.com> wrote:
>>> 
>>> Please review the change that makes OperatingSystemMXBean methods return container specific information
>>> rather than the host based data.
>>> 
>>> The webrev [1] is based on the code Andrew and Severin initially provided with some additional changes and combined
>>> with the spec update David made [3].
>>> 
>>> The webrev corrects the implementation for the free/total swap methods as Bob noted to subtract the total
>>> and free memory from the returned values.
>>> 
>>> It also corrects getCpuLoad() implementation, as Bob advised, to cover the case when CPU quotas are not active.
>>> 
>>> The webrev also takes into account the case when java.security.AccessControlException exception is thrown
>>> during the initialization of the container subsystem (e.g. when java.policy doesn't grant "read" access to
>>> "/proc/self/mountinfo" file).
>>> 
>>> CSR for the spec changes [3] is approved.
>>> 
>>> Testing: Mach5 tiers1, tiers2, tiers3, tier4, tier5 (including open/test/hotspot/jtreg/containers/docker),  and tier6 tests passed .
>>> 
>>> [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.02/ 
>>> [2] Jira issue :https://bugs.openjdk.java.net/browse/JDK-8226575 
>>> [3] CSR https://bugs.openjdk.java.net/browse/JDK-8228428 
>>> 
>>> Thank you,
>>> -Daniil
>>> 
>>> 
>> 
>> 
>> 
>> 
> 


From daniel.daugherty at oracle.com  Wed Dec  4 15:46:49 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 4 Dec 2019 10:46:49 -0500
Subject: RFR (T) 8234355: Buffer overflow in jcmd GC.class_stats due to
 too many classes
In-Reply-To: <082ecca7-39ff-e027-2090-c3331650cd07@oracle.com>
References: <082ecca7-39ff-e027-2090-c3331650cd07@oracle.com>
Message-ID: <1a6df0bc-25fe-3895-2547-8a858fee18e4@oracle.com>

On 12/4/19 8:00 AM, coleen.phillimore at oracle.com wrote:
> Summary: Remove use of GC.class_stats in testing and failure analysis 
> (plan to deprecate)
>
> See bug for more details.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8234355.01/webrev

test/failure_handler/src/share/conf/common.properties
    No comments.

Thumbs up. I agree that this change is trivial.

Dan


> bug link https://bugs.openjdk.java.net/browse/JDK-8234355
>
> Ran tier8 overnight.
>
> Thanks,
> Coleen


From coleen.phillimore at oracle.com  Wed Dec  4 16:05:14 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 4 Dec 2019 11:05:14 -0500
Subject: RFR (T) 8234355: Buffer overflow in jcmd GC.class_stats due to
 too many classes
In-Reply-To: <1a6df0bc-25fe-3895-2547-8a858fee18e4@oracle.com>
References: <082ecca7-39ff-e027-2090-c3331650cd07@oracle.com>
 <1a6df0bc-25fe-3895-2547-8a858fee18e4@oracle.com>
Message-ID: <4ca6c9ce-9a3b-02ed-f2a9-bb37fcc562ee@oracle.com>

Thanks, Dan!
Coleen

On 12/4/19 10:46 AM, Daniel D. Daugherty wrote:
> On 12/4/19 8:00 AM, coleen.phillimore at oracle.com wrote:
>> Summary: Remove use of GC.class_stats in testing and failure analysis 
>> (plan to deprecate)
>>
>> See bug for more details.
>>
>> open webrev at 
>> http://cr.openjdk.java.net/~coleenp/2019/8234355.01/webrev
>
> test/failure_handler/src/share/conf/common.properties
>     No comments.
>
> Thumbs up. I agree that this change is trivial.
>
> Dan
>
>
>> bug link https://bugs.openjdk.java.net/browse/JDK-8234355
>>
>> Ran tier8 overnight.
>>
>> Thanks,
>> Coleen
>


From sgehwolf at redhat.com  Wed Dec  4 18:22:30 2019
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Wed, 04 Dec 2019 19:22:30 +0100
Subject: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <133A9A62-7D4E-4810-A9E4-B51A6D222585@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <48599FD5-583F-4CB7-80CA-09E91AA24D4A@oracle.com>
 <CD4410BC-4F48-4A8D-AC53-E6D9B4786FC6@oracle.com>
 <133A9A62-7D4E-4810-A9E4-B51A6D222585@oracle.com>
Message-ID: <2a47a93f7ee2f479a73b1e7a3dd829a244358095.camel@redhat.com>

On Wed, 2019-12-04 at 08:32 -0500, Bob Vandette wrote:
> You can try to use containerMetrics.getPerCpuUsage() instead of containerMetrics.getEffectiveCpuSetCpus().
> The length of the array returned is the number of host cpus.  Maybe Severin can confirm if this is true in cgroupv2 as
> well.

If I'm not mistaken getPerCpuUsage() is not supported in cgroupv2.

Thanks,
Severin


From serguei.spitsyn at oracle.com  Wed Dec  4 19:17:09 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 4 Dec 2019 11:17:09 -0800
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <00a78ad6-97ff-0465-a13c-0c51fff4764a@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <32889ba8-b9e1-6a38-deaf-a16cb6d2a9c6@oracle.com>
 <21fa558b-dd6d-e7a2-0633-625716b5f59e@oracle.com>
 <df7f9aeb-7b42-0fd5-375b-db7ade297bdb@oracle.com>
 <af198cb0-8c9e-0db7-76c2-cd9f06c78288@oracle.com>
 <b5634ad7-3e62-ad72-7458-0b73e26ba59a@oracle.com>
 <a867c025-673d-999c-4690-cdf7f3099c19@oracle.com>
 <b1b0cd64-f5e4-8a7a-8e92-9b442339beee@oracle.com>
 <00a78ad6-97ff-0465-a13c-0c51fff4764a@oracle.com>
Message-ID: <f260c6b7-ded8-8653-a11f-9035059b3392@oracle.com>

On 12/4/19 04:21, coleen.phillimore at oracle.com wrote:
>
>
> On 12/3/19 11:39 PM, David Holmes wrote:
>>
>>
>> On 3/12/2019 11:35 pm, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>> On 12/3/19 8:31 AM, David Holmes wrote:
>>>> On 3/12/2019 11:08 pm, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>>
>>>>> On 12/2/19 11:52 PM, David Holmes wrote:
>>>>>> Hi Coleen,
>>>>>>
>>>>>> On 3/12/2019 12:43 am, coleen.phillimore at oracle.com wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 11/26/19 7:03 PM, David Holmes wrote:
>>>>>>>> (adding runtime as well)
>>>>>>>>
>>>>>>>> Hi Coleen,
>>>>>>>>
>>>>>>>> On 27/11/2019 12:22 am, coleen.phillimore at oracle.com wrote:
>>>>>>>>> Summary: Add local deferred event list to thread to post 
>>>>>>>>> events outside CodeCache_lock.
>>>>>>>>>
>>>>>>>>> This patch builds on the patch for JDK-8173361. With this 
>>>>>>>>> patch, I made the JvmtiDeferredEventQueue an instance class 
>>>>>>>>> (not AllStatic) and have one per thread. The CodeBlob event 
>>>>>>>>> that used to drop the CodeCache_lock and raced with the 
>>>>>>>>> sweeper thread, adds the events it wants to post to its thread 
>>>>>>>>> local list, and processes it outside the lock.  The list is 
>>>>>>>>> walked in GC and by the sweeper to keep the nmethods from 
>>>>>>>>> being unloaded and zombied, respectively.
>>>>>>>>
>>>>>>>> Sorry I don't understand why we would want/need a deferred 
>>>>>>>> event queue for every JavaThread? Isn't this only relevant for 
>>>>>>>> non-JavaThreads that need to have the ServiceThread process the 
>>>>>>>> deferred event?
>>>>>>>
>>>>>>> I thought I'd written this in the bug but I had only discussed 
>>>>>>> this with Erik.  I've added a comment to the bug to explain why 
>>>>>>> I added the per-JavaThread queue. In order to process these 
>>>>>>> events after the CodeCache_lock is dropped, I have to queue them 
>>>>>>> somewhere safe. The ServiceThread queue is safe, *but* the 
>>>>>>> ServiceThread can't keep up with the events, especially from 
>>>>>>> this test case.  So the test case gets a native OOM.
>>>>>>>
>>>>>>> So I've added the safe queue as a field to each JavaThread 
>>>>>>> because multiple JavaThreads could be posting these events at 
>>>>>>> the same time, and there didn't seem to be a better safe place 
>>>>>>> to cache them, without adding another layer of queuing code.
>>>>>>
>>>>>> I think I'm getting the picture now. At the time the events are 
>>>>>> generated we can't post them directly because the current thread 
>>>>>> is inside compiler code. Hence the events must be deferred. Using 
>>>>>> the ServiceThread to handle the deferred events is one way to 
>>>>>> deal with this - but it can't keep up in this scenario. So 
>>>>>> instead we store the events in the current thread and when the 
>>>>>> current thread returns to code where it is safe to post the 
>>>>>> events, it does so itself. Is that generally correct?
>>>>>
>>>>> Yes.
>>>>>>
>>>>>> I admit I'm not keen on adding this additional field per-thread 
>>>>>> just for a temporary usage. Some kind of stack allocated helper 
>>>>>> would be preferable, but would need to be passed through the call 
>>>>>> chain so that the events could be added to it.
>>>>>
>>>>> Right, and the GC and nmethods_do has to find it somehow. It 
>>>>> wasn't my first choice of where to put it also because there are 
>>>>> too many things in JavaThread.  Might be time for a future cleanup 
>>>>> of Thread.
>>>>
>>>> I see.
>>>>
>>>>>>
>>>>>> Also I'm not clear why we aggressively delete the 
>>>>>> _jvmti_event_queue after posting the events. I'd be worried about 
>>>>>> the overhead we are introducing for creating and deleting this 
>>>>>> queue. When the JvmtiDeferredEventQueue data structure was 
>>>>>> intended only for use by the ServiceThread its dynamic node 
>>>>>> allocation may have made more sense. But now that seems like a 
>>>>>> liability to me - if JvmtiDeferredEvents could be linked directly 
>>>>>> we wouldn't need dynamic nodes, nor dynamic per-thread queues 
>>>>>> (just a per-thread pointer).
>>>>>
>>>>> I'm not following.  The queue is for multiple events that might be 
>>>>> posted while in the CodeCache_lock, so they need to be in order 
>>>>> and linked together.  While we post them and take them off, if the 
>>>>> callback safepoints (maybe calls back into the JVM), we don't want 
>>>>> to have GC or nmethods_do walk the one that's been posted already. 
>>>>> So a queue seems to make sense.
>>>>
>>>> Yes but you can make a queue just by having each event have a _next 
>>>> pointer, rather than dynamically creating nodes to hold the event. 
>>>> Each event is its own queue node implicitly.
>>>>
>>>>> One thing that I experimented with was to have the ServiceThread 
>>>>> take ownership of the queue in its local thread queue and post 
>>>>> them all, which could be a future enhancement.  It didn't help my 
>>>>> OOM situation.
>>>>
>>>> Your OOM situation seems to be a basic case of overwhelming the 
>>>> ServiceThread. A single serviceThread will always have a limit on 
>>>> how many events it can handle. Maybe this test is being too 
>>>> unrealistic in its expectations of the current design?
>>>
>>> I think the JVMTI API where you can generate a COMPILED_METHOD_LOAD 
>>> for all the events in the queue is going to be overwhelming unless 
>>> it waits for the events to be posted.
>>
>> Taking things off the service thread would seem to be a good thing 
>> then :)
>>
>>>>
>>>>> Deleting the queue after all the events are posted allows 
>>>>> JavaThread::oops_do and nmethods_do only a null check to deal with 
>>>>> this jvmti wart.
>>>>
>>>> If the nodes are not dynamically allocated you don't need to delete 
>>>> you just set the queue-head pointer to NULL - actually it will 
>>>> already be NULL once the last event has been processed.
>>>
>>> I could revisit the data structure as a future RFE.  The goal was to 
>>> reuse code that's already there, and I don't think there's a 
>>> significant difference in performance.  I did some measurement of 
>>> the stress case and the times were equivalent, actually better in 
>>> the new code.
>>
>> Okay.
>
> Is this a code review then?  I think Serguei promised to review the 
> code too.

Yes, I'm close to sending my review.
Sorry for the latency.

Thanks,
Serguei

>
> thanks,
> Coleen
>>
>> Thanks,
>> David
>>
>>>
>>> Thanks,
>>> Coleen
>>>>
>>>> David
>>>> -----
>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>>
>>>>>> Just some thoughts.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>> I did write comments to this effect here:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp.udiff.html 
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>>>
>>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>>> Also, the jmethod_id field in nmethod was only used as a 
>>>>>>>>> boolean so don't create a jmethod_id until needed for 
>>>>>>>>> post_compiled_method_unload.
>>>>>>>>>
>>>>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that 
>>>>>>>>> crashed in the original bug report.
>>>>>>>>>
>>>>>>>>> open webrev at 
>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Coleen
>>>>>>>
>>>>>
>>>
>


From igor.ignatyev at oracle.com  Wed Dec  4 19:52:27 2019
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 4 Dec 2019 11:52:27 -0800
Subject: RFR(T) : 8235353 : clean up hotspot problem lists
Message-ID: <F2800CA7-5FC0-4B32-A886-6F661694EA3F@oracle.com>

http://cr.openjdk.java.net/~iignatyev//8235353/webrev.00
> 9 lines changed: 0 ins; 0 del; 9 mod; 

Hi all,

Could you please review this small and trivial cleanup, which returns the serviceability/sa tests to execution on linux-ppc64? The tests were problem-listed due to 8211767[1], which is closed as a dup of the resolved 8228649[2].

[1] https://bugs.openjdk.java.net/browse/JDK-8211767
[2] https://bugs.openjdk.java.net/browse/JDK-8228649

Thanks,
-- Igor

From vladimir.kozlov at oracle.com  Wed Dec  4 20:01:21 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 4 Dec 2019 12:01:21 -0800
Subject: RFR(T) : 8235353 : clean up hotspot problem lists
In-Reply-To: <F2800CA7-5FC0-4B32-A886-6F661694EA3F@oracle.com>
References: <F2800CA7-5FC0-4B32-A886-6F661694EA3F@oracle.com>
Message-ID: <a70b3cea-f759-1c5d-e73d-464a7804cffc@oracle.com>

I am fine with the changes, but we need to ask a PPC64 supporter to verify that the tests pass now.

Thanks,
Vladimir K

On 12/4/19 11:52 AM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8235353/webrev.00
>> 9 lines changed: 0 ins; 0 del; 9 mod;
> 
> Hi all,
> 
> Could you please review this small and trivial cleanup, which returns the serviceability/sa tests to execution on linux-ppc64? The tests were problem-listed due to 8211767[1], which is closed as a dup of the resolved 8228649[2].
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8211767
> [2] https://bugs.openjdk.java.net/browse/JDK-8228649
> 
> Thanks,
> -- Igor
> 

From serguei.spitsyn at oracle.com  Wed Dec  4 22:15:04 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 4 Dec 2019 14:15:04 -0800
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
Message-ID: <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20191204/19a19155/attachment.html>

From serguei.spitsyn at oracle.com  Wed Dec  4 22:16:28 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 4 Dec 2019 14:16:28 -0800
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
Message-ID: <b34215d5-a5cb-e3e8-1d87-7b81e526d618@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20191204/c457d24e/attachment-0001.html>

From coleen.phillimore at oracle.com  Wed Dec  4 23:06:44 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 4 Dec 2019 18:06:44 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
Message-ID: <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>


Hi Serguei,

On 12/4/19 5:15 PM, serguei.spitsyn at oracle.com wrote:
> Hi Collen, (no problem)
>
> It looks good in general.
> Thank you a lot for sorting this out!
>
> Just a couple of comments.
>
>
> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/runtime/thread.hpp.frames.html
> 1993 protected:
> 1994 // Jvmti Events that cannot be posted in their current context.
> 1995 // ServiceThread uses this to collect deferred events from 
> NonJava threads
> 1996 // that cannot post events.
> 1997 JvmtiDeferredEventQueue* _jvmti_event_queue;
>
> Like David, I also have a concern about the footprint of having the 
> _jvmti_event_queue field in the Thread class.
> I'm wondering whether it would be better to move this field into the 
> JvmtiThreadState class.
> Please see jvmti_thread_state() and 
> JvmtiThreadState::state_for(JavaThread *thread).

The reason I have it directly in JavaThread is so that the GC oops_do 
and nmethods_do code can find it easily.  I like your idea of hiding it 
in jvmti, but it doesn't seem good to have this code know about 
jvmtiThreadState, which seems to be a queue of Jvmti states.  I also 
don't want jvmtiThreadState to have to add an oops_do() or 
nmethods_do() either.

>
>
> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html
> 973 void JvmtiDeferredEvent::post(JvmtiEnv* env) {
> 974 assert(_type == TYPE_COMPILED_METHOD_LOAD, "only user of this 
> method");
> 975 nmethod* nm = _event_data.compiled_method_load;
> 976 JvmtiExport::post_compiled_method_load(env, nm);
> 977 }
>
> The JvmtiDeferredEvent::post name looks too generic, as it posts 
> compiled method load events only.
> Do you expect this function to be extended in the future to support more 
> event types?
>

I don't envision an extension for this function but I do for 
JvmtiDeferredEventQueue::post().  I have a small enhancement that would 
hand off the entire queue to the ServiceThread and have it call post() to 
post all the events rather than one at a time.

So I'll rename this one post_compiled_method_load_event() and leave the 
other post() as is for now.
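
To make the deferral idea concrete, here is a minimal sketch in Java rather than HotSpot C++. It only illustrates the pattern discussed in this thread: record events while a lock is held, then post the whole queue after the lock is released. The class, the lock, and the String events are hypothetical stand-ins, not the code in the webrev.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; names do not correspond to HotSpot code.
final class DeferredEventSketch {
    // Stand-in for a lock under which events must not be posted.
    private final ReentrantLock lock = new ReentrantLock();
    // Per-owner list of events recorded while the lock is held.
    private final Queue<String> pending = new ArrayDeque<>();
    // Records what was "posted", so the behaviour can be observed.
    private final List<String> posted = new ArrayList<>();

    // Walk the items under the lock and only enqueue; nothing is posted here.
    public void collectUnderLock(List<String> items) {
        lock.lock();
        try {
            pending.addAll(items);
        } finally {
            lock.unlock();
        }
    }

    // Drain the whole queue in one call, after the lock has been released.
    public void postAll() {
        if (lock.isHeldByCurrentThread()) {
            throw new IllegalStateException("must not post while holding the lock");
        }
        String event;
        while ((event = pending.poll()) != null) {
            posted.add(event);
        }
    }

    public List<String> posted() {
        return posted;
    }
}
```

Calling collectUnderLock() and then postAll() posts the events in the order they were recorded, and nothing is posted while the lock is held.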

open webrev at 
http://cr.openjdk.java.net/~coleenp/2019/8212160.02.incr/webrev

Thanks,
Coleen


>
> Thanks,
> Serguei
>
>
> On 11/26/19 06:22, coleen.phillimore at oracle.com wrote:
>> Summary: Add local deferred event list to thread to post events 
>> outside CodeCache_lock.
>>
>> This patch builds on the patch for JDK-8173361.  With this patch, I 
>> made the JvmtiDeferredEventQueue an instance class (not AllStatic) 
>> and have one per thread.  The CodeBlob event that used to drop the 
>> CodeCache_lock and raced with the sweeper thread, adds the events it 
>> wants to post to its thread local list, and processes it outside the 
>> lock.  The list is walked in GC and by the sweeper to keep the 
>> nmethods from being unloaded and zombied, respectively.
>>
>> Also, the jmethod_id field in nmethod was only used as a boolean so 
>> don't create a jmethod_id until needed for post_compiled_method_unload.
>>
>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed in 
>> the original bug report.
>>
>> open webrev at 
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>
>> Thanks,
>> Coleen
>


From serguei.spitsyn at oracle.com  Wed Dec  4 23:27:51 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 4 Dec 2019 15:27:51 -0800
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
 <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
Message-ID: <7d51c3d1-a963-48b3-961f-d119ea9058d1@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20191204/b2c51512/attachment.html>

From mandy.chung at oracle.com  Thu Dec  5 00:09:45 2019
From: mandy.chung at oracle.com (Mandy Chung)
Date: Wed, 4 Dec 2019 16:09:45 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
Message-ID: <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>


On 12/3/19 9:40 PM, Daniil Titov wrote:
>      
>>> Under what circumstance that limit or memLimit is < 0?
> The memory limit metric is not available if the JVM runs on a Linux host (not in a Docker container) or if a Docker container was started without
> specifying a memory limit (without the '--memory=' Docker option). In the latter case there is no limit on how much memory the container can use, and
> it can use as much memory as the host's OS allows.
>      

OK.  Please add a comment to the code.

It may be worth considering adding Metrics::getSwapLimit and 
Metrics::getSwapUsage and moving the computation to the implementation of 
Metrics.  Bob may have an opinion.

Also it seems correct for the memory related methods to check if 
(containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).
BTW what does it mean if limit == 0?
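
As a rough illustration of that check, the sketch below falls back to the host value when no container limit is reported. ContainerMemorySketch and its parameters are hypothetical: a null limit stands in for containerMetrics == null, a negative limit stands in for "no limit configured", and this is not the OperatingSystemImpl code under review. It treats a limit of 0 as a real limit only because the condition is written as >= 0; whether that is right is the open question above.

```java
// Hypothetical helper; not the OperatingSystemImpl code under review.
final class ContainerMemorySketch {
    private ContainerMemorySketch() {}

    // containerLimit == null models "no container metrics available";
    // a negative value models "no limit configured for the container".
    public static long totalMemory(Long containerLimit, long hostTotal) {
        if (containerLimit != null && containerLimit >= 0) {
            // A limit is reported, so it bounds what the JVM can use.
            return containerLimit;
        }
        // No usable limit: report the host's total memory instead.
        return hostTotal;
    }
}
```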

>>> Is it worth  specifying this case?
> I believe yes, since it covers the cases when JVM runs  on a Linux host or a docker container was started without memory limitation.
>      

I was wondering if the javadoc should specify that.
>>> It falls back to returning the system's total swap space size - this is not really what it should report.
> For the case when JVM runs on a Linux host it is exactly what we want. The only problematic case is if JVM runs inside a docker container without a memory limit set.
> However, I am not sure how we could differentiate these 2 cases.

As this is the case when the limit is not set in the container, it 
returns the system metrics which sounds appropriate.

>      
>>> Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad.
> For getTotalMemorySize I think we are good here. If the limit is not set, then all the memory the host's OS has is available.
> For getFreeMemorySize the problematic case is when the memory limit is set but the memory usage is for some reason not available (containerMetrics.getMemoryUsage() returns 0).

Will zero memory usage happen?

> Probably in this case we should just return -1 as currently getFreePhysicalMemorySize0() does if it cannot retrieve a valid result.
>      

> For getCpuLoad() the problematic case is when CPU quotas are active but CpuPeriod, CpuNumPeriods, or getCpuUsage are unavailable, or if a valid CPU load for some CPU was
> not retrieved, or if all retrieved CPU load values happen to be zeros. Probably we should just return -1 in these cases rather than falling back to getSystemCpuLoad0().
>      

returning -1 sounds right.
>>> src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java
>>>     There is no strong need to make the deprecated methods default methods.  If they were default methods, they would only need to be implemented once as opposed to in all OS-specific implementations.
> I could make these methods defaults if you feel it is a better approach here.
>      
>      

It's not strictly needed but I can go either way.


>>> CheckOperatingSystemMXBean.java
>>>      System.out.println(String.format(...)) can simply be replaced with System.out.format.
> I will include this change in the next webrev, thank you!
>      
>

thanks
Mandy

From daniel.daugherty at oracle.com  Thu Dec  5 00:40:05 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 4 Dec 2019 19:40:05 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
 <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
Message-ID: <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com>

Generally speaking, JVM/TI related things should be in JvmtiThreadState
instead of directly in the Thread class. That way the extra space is only
consumed when JVM/TI is in use and only when a Thread does something that
requires a JvmtiThreadState to be created.

Please reconsider moving _jvmti_event_queue.

Dan


On 12/4/19 6:06 PM, coleen.phillimore at oracle.com wrote:
>
> Hi Serguei,
>
> On 12/4/19 5:15 PM, serguei.spitsyn at oracle.com wrote:
>> Hi Collen, (no problem)
>>
>> It looks good in general.
>> Thank you a lot for sorting this out!
>>
>> Just a couple of comments.
>>
>>
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/runtime/thread.hpp.frames.html
>> 1993 protected:
>> 1994 // Jvmti Events that cannot be posted in their current context.
>> 1995 // ServiceThread uses this to collect deferred events from 
>> NonJava threads
>> 1996 // that cannot post events.
>> 1997 JvmtiDeferredEventQueue* _jvmti_event_queue;
>>
>> As David I also have a concern about footprint of having the 
>> _jvmti_event_queue field in the Thread class.
>> I'm thinking if it'd be better to move this field into the 
>> JvmtiThreadState class.
>> Please, see jvmti_thread_state() and 
>> JvmtiThreadState::state_for(JavaThread *thread).
>
> The reason I have it directly in JavaThread is so that the GC oops_do 
> and nmethods_do code can find it easily.  I like your idea of hiding 
> it in jvmti but this doesn't seem good to have this code know about 
> jvmtiThreadState, which seems to be a queue of Jvmti states.  I also 
> don't want to have jvmtiThreadState to have to add an oops_do() or 
> nmethods_do() either.
>
>>
>>
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html
>> 973 void JvmtiDeferredEvent::post(JvmtiEnv* env) {
>> 974 assert(_type == TYPE_COMPILED_METHOD_LOAD, "only user of this 
>> method");
>> 975 nmethod* nm = _event_data.compiled_method_load;
>> 976 JvmtiExport::post_compiled_method_load(env, nm);
>> 977 }
>>
>> The JvmtiDeferredEvent::post name looks too generic as it posts 
>> compiled load events only.
>> Do you consider this function extended in the future to support more 
>> event types?
>>
>
> I don't envision an extension for this function but I do for 
> JvmtiDeferredEventQueue::post().  I have a small enhancement that 
> would handoff the entire queue to the ServiceThread and have it call 
> post() to post all the events rather than one at a time.
>
> So I'll rename this one post_compiled_method_load_event() and leave 
> the other post() as is for now.
>
> open webrev at 
> http://cr.openjdk.java.net/~coleenp/2019/8212160.02.incr/webrev
>
> Thanks,
> Coleen
>
>
>>
>> Thanks,
>> Serguei
>>
>>
>> On 11/26/19 06:22, coleen.phillimore at oracle.com wrote:
>>> Summary: Add local deferred event list to thread to post events 
>>> outside CodeCache_lock.
>>>
>>> This patch builds on the patch for JDK-8173361.  With this patch, I 
>>> made the JvmtiDeferredEventQueue an instance class (not AllStatic) 
>>> and have one per thread.  The CodeBlob event that used to drop the 
>>> CodeCache_lock and raced with the sweeper thread, adds the events it 
>>> wants to post to its thread local list, and processes it outside the 
>>> lock.  The list is walked in GC and by the sweeper to keep the 
>>> nmethods from being unloaded and zombied, respectively.
>>>
>>> Also, the jmethod_id field in nmethod was only used as a boolean so 
>>> don't create a jmethod_id until needed for post_compiled_method_unload.
>>>
>>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed 
>>> in the original bug report.
>>>
>>> open webrev at 
>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>
>>> Thanks,
>>> Coleen
>>
>


From chris.plummer at oracle.com  Thu Dec  5 01:39:14 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 4 Dec 2019 17:39:14 -0800
Subject: RFR(S): 8234277:ClhsdbLauncher should enable verbose exceptions
 and do a better job of detecting SA failures
In-Reply-To: <8a972120-8ba3-a35e-b73f-e3d5faf68ce6@oracle.com>
References: <c54eaee2-1db3-f23c-7cea-5ef623bc0f71@oracle.com>
 <8a972120-8ba3-a35e-b73f-e3d5faf68ce6@oracle.com>
Message-ID: <b67f2656-5fa9-b734-e3c4-2be38e4959d6@oracle.com>

Can I get one more review please?

thanks,

Chris

On 12/3/19 1:10 PM, serguei.spitsyn at oracle.com wrote:
> Hi Chris,
>
> It looks good.
>
> Thanks,
> Serguei
>
> On 12/3/19 12:45 PM, Chris Plummer wrote:
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8234277
>> http://cr.openjdk.java.net/~cjplummer/8234277/webrev.00/
>>
>> No longer redirect stderr for the jhsdb/clhsdb process. It results in 
>> not seeing attach failures in the output, so OutputAnalyzer can't 
>> check for them.
>>
>> Execute "verbose true" as the first clhsdb command after launching. 
>> This will result in verboseExceptions being true in 
>> CommandProcessor.java, so full exception traces will appear in the 
>> output. This will make debugging future SA test failures a lot easier.
>>
>> Add an extra check for any DebuggerException. This is mainly for 
>> detecting that the attach failed. This previously was going 
>> un-noticed, and instead the test would later fail because it noticed 
>> some other issue, like missing output, which isn't very informative.
>>
>> Add checks for other unexpected SA exceptions that are caught and 
>> printed by CommandProcessor. These will always have an "Error: " 
>> prefix, making them easy to detect.
>>
>> Problem list ClhsdbScanOops.java. With the new error checking, it 
>> will now always fail on windows due to JDK-8230731 and on macos and 
>> linux due to JDK-8235220. These failures are not "new" per se, but 
>> are just now being properly detected.
>>
>> thanks,
>>
>> Chris
>
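
The two output checks described in the quoted request can be sketched roughly as follows. This is an illustration only, not the ClhsdbLauncher change in the webrev: SaOutputCheck and findFailures are hypothetical names, and the sketch simply scans captured output for any line mentioning DebuggerException or starting with the "Error: " prefix.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not the ClhsdbLauncher code in the webrev.
final class SaOutputCheck {
    private SaOutputCheck() {}

    // Returns the lines of the captured output that look like SA failures.
    public static List<String> findFailures(String output) {
        List<String> failures = new ArrayList<>();
        // \R matches any line break, so this works for both \n and \r\n output.
        for (String line : output.split("\\R")) {
            // Attach failures surface as a DebuggerException; other exceptions
            // printed by the command processor carry an "Error: " prefix.
            if (line.contains("DebuggerException") || line.startsWith("Error: ")) {
                failures.add(line);
            }
        }
        return failures;
    }
}
```

A test would then fail as soon as findFailures() returns a non-empty list, instead of failing later on missing output.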


From igor.ignatyev at oracle.com  Thu Dec  5 02:08:23 2019
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 4 Dec 2019 18:08:23 -0800
Subject: RFR(T) : 8235353 : clean up hotspot problem lists
In-Reply-To: <a70b3cea-f759-1c5d-e73d-464a7804cffc@oracle.com>
References: <F2800CA7-5FC0-4B32-A886-6F661694EA3F@oracle.com>
 <a70b3cea-f759-1c5d-e73d-464a7804cffc@oracle.com>
Message-ID: <9405A3B3-8045-4863-9BD4-FD293E2C62F9@oracle.com>

Martin, Goetz,

Could you please check that these 9 tests still pass on PPC?

-- Igor

> On Dec 4, 2019, at 12:01 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> I am fine with changes but we need to ask PPC64 supporter to verify that tests passed now.
> 
> Thanks,
> Vladimir K
> 
> On 12/4/19 11:52 AM, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev//8235353/webrev.00
>>> 9 lines changed: 0 ins; 0 del; 9 mod;
>> Hi all,
>> could you please review this small and trivial cleanup which returns serviceability/sa tests back to execution on linux-ppc64. the tests were problem listed due to 8211767[1], which is closed as a dup of resolved 8228649[2].
>> [1] https://bugs.openjdk.java.net/browse/JDK-8211767
>> [2] https://bugs.openjdk.java.net/browse/JDK-8228649
>> Thanks,
>> -- Igor


From suenaga at oss.nttdata.com  Thu Dec  5 02:15:46 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Thu, 5 Dec 2019 11:15:46 +0900
Subject: RFR(S): 8234277:ClhsdbLauncher should enable verbose exceptions
 and do a better job of detecting SA failures
In-Reply-To: <b67f2656-5fa9-b734-e3c4-2be38e4959d6@oracle.com>
References: <c54eaee2-1db3-f23c-7cea-5ef623bc0f71@oracle.com>
 <8a972120-8ba3-a35e-b73f-e3d5faf68ce6@oracle.com>
 <b67f2656-5fa9-b734-e3c4-2be38e4959d6@oracle.com>
Message-ID: <862c4eab-86cf-5578-dcf8-a6e4b6a995b6@oss.nttdata.com>

Looks good.

Yasumasa

On 2019/12/05 10:39, Chris Plummer wrote:
> Can I get one more review please?
> 
> thanks,
> 
> Chris
> 
> On 12/3/19 1:10 PM, serguei.spitsyn at oracle.com wrote:
>> Hi Chris,
>>
>> It looks good.
>>
>> Thanks,
>> Serguei
>>
>> On 12/3/19 12:45 PM, Chris Plummer wrote:
>>> Hello,
>>>
>>> Please review the following:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8234277
>>> http://cr.openjdk.java.net/~cjplummer/8234277/webrev.00/
>>>
>>> No longer redirect stderr for the jhsdb/clhsdb process. It results in not seeing attach failures in the output, so OutputAnalyzer can't check for them.
>>>
>>> Execute "verbose true" as the first clhsdb command after launching. This will result in verboseExceptions being true in CommandProcessor.java, so full exception traces will appear in the output. This will make debugging future SA test failures a lot easier.
>>>
>>> Add an extra check for any DebuggerException. This is mainly for detecting that the attach failed. This previously was going un-noticed, and instead the test would later fail because it noticed some other issue, like missing output, which isn't very informative.
>>>
>>> Add checks for other unexpected SA exceptions that are caught and printed by CommandProcessor. These will always have an "Error: " prefix, making them easy to detect.
>>>
>>> Problem list ClhsdbScanOops.java. With the new error checking, it will now always fail on windows due to JDK-8230731 and on macos and linux due to JDK-8235220. These failures are not "new" per se, but are just now being properly detected.
>>>
>>> thanks,
>>>
>>> Chris
>>
> 

From chris.plummer at oracle.com  Thu Dec  5 03:10:27 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 4 Dec 2019 19:10:27 -0800
Subject: RFR(S): 8234277:ClhsdbLauncher should enable verbose exceptions
 and do a better job of detecting SA failures
In-Reply-To: <862c4eab-86cf-5578-dcf8-a6e4b6a995b6@oss.nttdata.com>
References: <c54eaee2-1db3-f23c-7cea-5ef623bc0f71@oracle.com>
 <8a972120-8ba3-a35e-b73f-e3d5faf68ce6@oracle.com>
 <b67f2656-5fa9-b734-e3c4-2be38e4959d6@oracle.com>
 <862c4eab-86cf-5578-dcf8-a6e4b6a995b6@oss.nttdata.com>
Message-ID: <045a11a4-c9d4-e55d-3102-781ed3523e81@oracle.com>

Thanks!

On 12/4/19 6:15 PM, Yasumasa Suenaga wrote:
> Looks good.
>
> Yasumasa
>
> On 2019/12/05 10:39, Chris Plummer wrote:
>> Can I get one more review please?
>>
>> thanks,
>>
>> Chris
>>
>> On 12/3/19 1:10 PM, serguei.spitsyn at oracle.com wrote:
>>> Hi Chris,
>>>
>>> It looks good.
>>>
>>> Thanks,
>>> Serguei
>>>
>>> On 12/3/19 12:45 PM, Chris Plummer wrote:
>>>> Hello,
>>>>
>>>> Please review the following:
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8234277
>>>> http://cr.openjdk.java.net/~cjplummer/8234277/webrev.00/
>>>>
>>>> No longer redirect stderr for the jhsdb/clhsdb process. It results 
>>>> in not seeing attach failures in the output, so OutputAnalyzer can't 
>>>> check for them.
>>>>
>>>> Execute "verbose true" as the first clhsdb command after launching. 
>>>> This will result in verboseExceptions being true in 
>>>> CommandProcessor.java, so full exception traces will appear in the 
>>>> output. This will make debugging future SA test failures a lot easier.
>>>>
>>>> Add an extra check for any DebuggerException. This is mainly for 
>>>> detecting that the attach failed. This previously was going 
>>>> un-noticed, and instead the test would later fail because it 
>>>> noticed some other issue, like missing output, which isn't very 
>>>> informative.
>>>>
>>>> Add checks for other unexpected SA exceptions that are caught and 
>>>> printed by CommandProcessor. These will always have an "Error: " 
>>>> prefix, making them easy to detect.
>>>>
>>>> Problem list ClhsdbScanOops.java. With the new error checking, it 
>>>> will now always fail on windows due to JDK-8230731 and on macos and 
>>>> linux due to JDK-8235220. These failures are not "new" per se, but 
>>>> are just now being properly detected.
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>
>>


From christoph.langer at sap.com  Thu Dec  5 11:24:14 2019
From: christoph.langer at sap.com (Langer, Christoph)
Date: Thu, 5 Dec 2019 11:24:14 +0000
Subject: RFR(T) : 8235353 : clean up hotspot problem lists
In-Reply-To: <9405A3B3-8045-4863-9BD4-FD293E2C62F9@oracle.com>
References: <F2800CA7-5FC0-4B32-A886-6F661694EA3F@oracle.com>
 <a70b3cea-f759-1c5d-e73d-464a7804cffc@oracle.com>
 <9405A3B3-8045-4863-9BD4-FD293E2C62F9@oracle.com>
Message-ID: <DB8PR02MB55472BC0ECAA8DA1FEE18A0A8A5C0@DB8PR02MB5547.eurprd02.prod.outlook.com>

Hi Igor,

I have added your update to our test system. I'll let you know the results by tomorrow.

Best regards
Christoph

> -----Original Message-----
> From: serviceability-dev <serviceability-dev-bounces at openjdk.java.net> On
> Behalf Of Igor Ignatyev
> Sent: Thursday, December 5, 2019 03:08
> To: Doerr, Martin <martin.doerr at sap.com>; Lindenmaier, Goetz
> <goetz.lindenmaier at sap.com>
> Cc: serviceability-dev <serviceability-dev at openjdk.java.net>; Vladimir
> Kozlov <vladimir.kozlov at oracle.com>; hotspot-dev Source Developers
> <hotspot-dev at openjdk.java.net>
> Subject: Re: RFR(T) : 8235353 : clean up hotspot problem lists
> 
> Martin, Goetz.
> 
> could you please check that these 9 tests still pass on PPC?
> 
> -- Igor
> 
> > On Dec 4, 2019, at 12:01 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com>
> wrote:
> >
> > I am fine with changes but we need to ask PPC64 supporter to verify that
> tests passed now.
> >
> > Thanks,
> > Vladimir K
> >
> > On 12/4/19 11:52 AM, Igor Ignatyev wrote:
> >> http://cr.openjdk.java.net/~iignatyev//8235353/webrev.00
> >>> 9 lines changed: 0 ins; 0 del; 9 mod;
> >> Hi all,
> >> could you please review this small and trivial cleanup which returns
> serviceability/sa tests back to execution on linux-ppc64. the tests were
> problem listed due to 8211767[1], which is closed as a dup of resolved
> 8228649[2].
> >> [1] https://bugs.openjdk.java.net/browse/JDK-8211767
> >> [2] https://bugs.openjdk.java.net/browse/JDK-8228649
> >> Thanks,
> >> -- Igor


From coleen.phillimore at oracle.com  Thu Dec  5 12:08:04 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 5 Dec 2019 07:08:04 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
 <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
 <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com>
Message-ID: <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com>


Thanks Dan.  I moved the field.  For some reason I thought that class 
did more/different things than hold per-thread information.

I've retested this version with tiers 2-6.

incr webrev at 
http://cr.openjdk.java.net/~coleenp/2019/8212160.03.incr/webrev
full webrev at http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev

Thanks to Serguei for offline discussion.

Coleen

On 12/4/19 7:40 PM, Daniel D. Daugherty wrote:
> Generally speaking, JVM/TI related things should be in JvmtiThreadState
> instead of directly in the Thread class. That way the extra space is only
> consumed when JVM/TI is in use and only when a Thread does something that
> requires a JvmtiThreadState to be created.
>
> Please reconsider moving _jvmti_event_queue.
>
> Dan
>
>
> On 12/4/19 6:06 PM, coleen.phillimore at oracle.com wrote:
>>
>> Hi Serguei,
>>
>> On 12/4/19 5:15 PM, serguei.spitsyn at oracle.com wrote:
>>> Hi Collen, (no problem)
>>>
>>> It looks good in general.
>>> Thank you a lot for sorting this out!
>>>
>>> Just a couple of comments.
>>>
>>>
>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/runtime/thread.hpp.frames.html
>>> 1993 protected:
>>> 1994 // Jvmti Events that cannot be posted in their current context.
>>> 1995 // ServiceThread uses this to collect deferred events from 
>>> NonJava threads
>>> 1996 // that cannot post events.
>>> 1997 JvmtiDeferredEventQueue* _jvmti_event_queue;
>>>
>>> As David I also have a concern about footprint of having the 
>>> _jvmti_event_queue field in the Thread class.
>>> I'm thinking if it'd be better to move this field into the 
>>> JvmtiThreadState class.
>>> Please, see jvmti_thread_state() and 
>>> JvmtiThreadState::state_for(JavaThread *thread).
>>
>> The reason I have it directly in JavaThread is so that the GC oops_do 
>> and nmethods_do code can find it easily.  I like your idea of hiding 
>> it in jvmti but this doesn't seem good to have this code know about 
>> jvmtiThreadState, which seems to be a queue of Jvmti states.  I also 
>> don't want to have jvmtiThreadState to have to add an oops_do() or 
>> nmethods_do() either.
>>
>>>
>>>
>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html
>>> 973 void JvmtiDeferredEvent::post(JvmtiEnv* env) {
>>> 974 assert(_type == TYPE_COMPILED_METHOD_LOAD, "only user of this 
>>> method");
>>> 975 nmethod* nm = _event_data.compiled_method_load;
>>> 976 JvmtiExport::post_compiled_method_load(env, nm);
>>> 977 }
>>>
>>> The JvmtiDeferredEvent::post name looks too generic as it posts 
>>> compiled load events only.
>>> Do you consider this function extended in the future to support more 
>>> event types?
>>>
>>
>> I don't envision an extension for this function but I do for 
>> JvmtiDeferredEventQueue::post().  I have a small enhancement that 
>> would handoff the entire queue to the ServiceThread and have it call 
>> post() to post all the events rather than one at a time.
>>
>> So I'll rename this one post_compiled_method_load_event() and leave 
>> the other post() as is for now.
>>
>> open webrev at 
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.02.incr/webrev
>>
>> Thanks,
>> Coleen
>>
>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 11/26/19 06:22, coleen.phillimore at oracle.com wrote:
>>>> Summary: Add local deferred event list to thread to post events 
>>>> outside CodeCache_lock.
>>>>
>>>> This patch builds on the patch for JDK-8173361.  With this patch, I 
>>>> made the JvmtiDeferredEventQueue an instance class (not AllStatic) 
>>>> and have one per thread.  The CodeBlob event that used to drop the 
>>>> CodeCache_lock and raced with the sweeper thread, adds the events 
>>>> it wants to post to its thread local list, and processes it outside 
>>>> the lock.  The list is walked in GC and by the sweeper to keep the 
>>>> nmethods from being unloaded and zombied, respectively.
>>>>
>>>> Also, the jmethod_id field in nmethod was only used as a boolean so 
>>>> don't create a jmethod_id until needed for 
>>>> post_compiled_method_unload.
>>>>
>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed 
>>>> in the original bug report.
>>>>
>>>> open webrev at 
>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>
>>>> Thanks,
>>>> Coleen
>>>
>>
>


From david.holmes at oracle.com  Thu Dec  5 13:05:00 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 5 Dec 2019 23:05:00 +1000
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
 <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
 <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com>
 <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com>
Message-ID: <d9d8050c-f597-bc66-9dc4-79a140ba3ab2@oracle.com>

Hi Coleen,

On 5/12/2019 10:08 pm, coleen.phillimore at oracle.com wrote:
> 
> Thanks Dan.  I moved the field.  For some reason I thought that class 
> did more/different things than hold per-thread information.
> 
> I've retested this version with tiers 2-6.
> 
> incr webrev at 
> http://cr.openjdk.java.net/~coleenp/2019/8212160.03.incr/webrev

That relocation looks good to me!

One minor nit:

src/hotspot/share/code/nmethod.hpp

+  void post_compiled_method_load_event(JvmtiThreadState* thread = NULL);

The parameter should be named state, not thread.

Thanks,
David
-----


> full webrev at http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev
> 
> Thanks to Serguei for offline discussion.
> 
> Coleen
> 
> On 12/4/19 7:40 PM, Daniel D. Daugherty wrote:
>> Generally speaking, JVM/TI related things should be in JvmtiThreadState
>> instead of directly in the Thread class. That way the extra space is only
>> consumed when JVM/TI is in use and only when a Thread does something that
>> requires a JvmtiThreadState to be created.
>>
>> Please reconsider moving _jvmti_event_queue.
>>
>> Dan
>>
>>
>> On 12/4/19 6:06 PM, coleen.phillimore at oracle.com wrote:
>>>
>>> Hi Serguei,
>>>
>>> On 12/4/19 5:15 PM, serguei.spitsyn at oracle.com wrote:
>>>> Hi Collen, (no problem)
>>>>
>>>> It looks good in general.
>>>> Thank you a lot for sorting this out!
>>>>
>>>> Just a couple of comments.
>>>>
>>>>
>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/runtime/thread.hpp.frames.html
>>>> 1993 protected:
>>>> 1994 // Jvmti Events that cannot be posted in their current context.
>>>> 1995 // ServiceThread uses this to collect deferred events from 
>>>> NonJava threads
>>>> 1996 // that cannot post events.
>>>> 1997 JvmtiDeferredEventQueue* _jvmti_event_queue;
>>>>
>>>> Like David, I also have a concern about the footprint of having the 
>>>> _jvmti_event_queue field in the Thread class.
>>>> I'm wondering whether it would be better to move this field into the 
>>>> JvmtiThreadState class.
>>>> Please, see jvmti_thread_state() and 
>>>> JvmtiThreadState::state_for(JavaThread *thread).
>>>
>>> The reason I have it directly in JavaThread is so that the GC oops_do 
>>> and nmethods_do code can find it easily. I like your idea of hiding 
>>> it in jvmti but this doesn't seem good to have this code know about 
>>> jvmtiThreadState, which seems to be a queue of Jvmti states. I also 
>>> don't want to have jvmtiThreadState to have to add an oops_do() or 
>>> nmethods_do() either.
>>>
>>>>
>>>>
>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html
>>>> 973 void JvmtiDeferredEvent::post(JvmtiEnv* env) {
>>>> 974 assert(_type == TYPE_COMPILED_METHOD_LOAD, "only user of this 
>>>> method");
>>>> 975 nmethod* nm = _event_data.compiled_method_load;
>>>> 976 JvmtiExport::post_compiled_method_load(env, nm);
>>>> 977 }
>>>>
>>>> The JvmtiDeferredEvent::post name looks too generic as it posts 
>>>> compiled load events only.
>>>> Do you expect this function to be extended in the future to support 
>>>> more event types?
>>>>
>>>
>>> I don't envision an extension for this function but I do for 
>>> JvmtiDeferredEventQueue::post(). I have a small enhancement that 
>>> would hand off the entire queue to the ServiceThread and have it call 
>>> post() to post all the events rather than one at a time.
>>>
>>> So I'll rename this one post_compiled_method_load_event() and leave 
>>> the other post() as is for now.
>>>
>>> open webrev at 
>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.02.incr/webrev
>>>
>>> Thanks,
>>> Coleen
>>>
>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 11/26/19 06:22, coleen.phillimore at oracle.com wrote:
>>>>> Summary: Add local deferred event list to thread to post events 
>>>>> outside CodeCache_lock.
>>>>>
>>>>> This patch builds on the patch for JDK-8173361. With this patch, I 
>>>>> made the JvmtiDeferredEventQueue an instance class (not AllStatic) 
>>>>> and have one per thread. The CodeBlob event that used to drop the 
>>>>> CodeCache_lock and raced with the sweeper thread, adds the events 
>>>>> it wants to post to its thread local list, and processes it outside 
>>>>> the lock. The list is walked in GC and by the sweeper to keep the 
>>>>> nmethods from being unloaded and zombied, respectively.
>>>>>
>>>>> Also, the jmethod_id field in nmethod was only used as a boolean so 
>>>>> don't create a jmethod_id until needed for 
>>>>> post_compiled_method_unload.
>>>>>
>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that crashed 
>>>>> in the original bug report.
>>>>>
>>>>> open webrev at 
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>
>>>
>>
> 
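
The approach in the summary quoted above, collecting events while the lock is held and posting them only after it is released, can be sketched roughly as follows. All names here (CompiledMethod, code_cache_lock, generate_events) are hypothetical stand-ins and not HotSpot code. The sketch copies the entries, so it sidesteps the question of keeping nmethods alive, which the patch handles by walking the list in GC and in the sweeper.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for a compiled method; not a HotSpot type.
struct CompiledMethod {
    std::string name;
};

std::mutex code_cache_lock;  // plays the role of CodeCache_lock
std::vector<CompiledMethod> code_cache = {{"a"}, {"b"}, {"c"}};

// Walks the cache under the lock, defers one event per entry into a list
// local to the caller, and posts the events only after the lock is released.
std::vector<std::string> generate_events() {
    std::vector<CompiledMethod> deferred;
    {
        std::lock_guard<std::mutex> guard(code_cache_lock);
        for (const CompiledMethod& m : code_cache) {
            deferred.push_back(m);  // enqueue only, no callback under the lock
        }
    }  // lock released here

    std::vector<std::string> posted;
    for (const CompiledMethod& m : deferred) {
        // Stand-in for invoking the agent callback; the lock is not held.
        posted.push_back(m.name);
    }
    return posted;
}
```

Because the callback stand-in runs after the guard goes out of scope, nothing in the posting loop can block other users of the lock.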

From coleen.phillimore at oracle.com  Thu Dec  5 13:10:32 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 5 Dec 2019 08:10:32 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <d9d8050c-f597-bc66-9dc4-79a140ba3ab2@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
 <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
 <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com>
 <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com>
 <d9d8050c-f597-bc66-9dc4-79a140ba3ab2@oracle.com>
Message-ID: <07b00937-b8a3-1c9c-0b7e-229ad0091ab1@oracle.com>


On 12/5/19 8:05 AM, David Holmes wrote:
> Hi Coleen,
>
> On 5/12/2019 10:08 pm, coleen.phillimore at oracle.com wrote:
>>
>> Thanks Dan. I moved the field. For some reason I thought that class 
>> did more/different things than hold per-thread information.
>>
>> I've retested this version with tiers 2-6.
>>
>> incr webrev at 
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03.incr/webrev
>
> That relocation looks good to me!
>
> One minor nit:
>
> src/hotspot/share/code/nmethod.hpp
>
> +  void post_compiled_method_load_event(JvmtiThreadState* thread = NULL);
>
> parameter should be state not thread.

Ok yes, I'll fix it. Thanks for the code review.
Coleen
>
> Thanks,
> David
> -----
>
>
>> full webrev at 
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev
>>
>> Thanks to Serguei for offline discussion.
>>
>> Coleen
>>
>> On 12/4/19 7:40 PM, Daniel D. Daugherty wrote:
>>> Generally speaking, JVM/TI related things should be in JvmtiThreadState
>>> instead of directly in the Thread class. That way the extra space is 
>>> only
>>> consumed when JVM/TI is in use and only when a Thread does something 
>>> that
>>> requires a JvmtiThreadState to be created.
>>>
>>> Please reconsider moving _jvmti_event_queue.
>>>
>>> Dan
>>>
>>>
>>> On 12/4/19 6:06 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Hi Serguei,
>>>>
>>>> On 12/4/19 5:15 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Collen, (no problem)
>>>>>
>>>>> It looks good in general.
>>>>> Thank you a lot for sorting this out!
>>>>>
>>>>> Just a couple of comments.
>>>>>
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/runtime/thread.hpp.frames.html 
>>>>>
>>>>> 1993 protected:
>>>>> 1994 // Jvmti Events that cannot be posted in their current context.
>>>>> 1995 // ServiceThread uses this to collect deferred events from 
>>>>> NonJava threads
>>>>> 1996 // that cannot post events.
>>>>> 1997 JvmtiDeferredEventQueue* _jvmti_event_queue;
>>>>>
>>>>> Like David, I also have a concern about the footprint of having the 
>>>>> _jvmti_event_queue field in the Thread class.
>>>>> I'm wondering whether it would be better to move this field into the 
>>>>> JvmtiThreadState class.
>>>>> Please, see jvmti_thread_state() and 
>>>>> JvmtiThreadState::state_for(JavaThread *thread).
>>>>
>>>> The reason I have it directly in JavaThread is so that the GC 
>>>> oops_do and nmethods_do code can find it easily. I like your idea 
>>>> of hiding it in jvmti but this doesn't seem good to have this code 
>>>> know about jvmtiThreadState, which seems to be a queue of Jvmti 
>>>> states. I also don't want to have jvmtiThreadState to have to add 
>>>> an oops_do() or nmethods_do() either.
>>>>
>>>>>
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html 
>>>>>
>>>>> 973 void JvmtiDeferredEvent::post(JvmtiEnv* env) {
>>>>> 974 assert(_type == TYPE_COMPILED_METHOD_LOAD, "only user of this 
>>>>> method");
>>>>> 975 nmethod* nm = _event_data.compiled_method_load;
>>>>> 976 JvmtiExport::post_compiled_method_load(env, nm);
>>>>> 977 }
>>>>>
>>>>> The JvmtiDeferredEvent::post name looks too generic as it posts 
>>>>> compiled load events only.
>>>>> Do you expect this function to be extended in the future to support 
>>>>> more event types?
>>>>>
>>>>
>>>> I don't envision an extension for this function but I do for 
>>>> JvmtiDeferredEventQueue::post(). I have a small enhancement that 
>>>> would hand off the entire queue to the ServiceThread and have it 
>>>> call post() to post all the events rather than one at a time.
>>>>
>>>> So I'll rename this one post_compiled_method_load_event() and leave 
>>>> the other post() as is for now.
>>>>
>>>> open webrev at 
>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.02.incr/webrev
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 11/26/19 06:22, coleen.phillimore at oracle.com wrote:
>>>>>> Summary: Add local deferred event list to thread to post events 
>>>>>> outside CodeCache_lock.
>>>>>>
>>>>>> This patch builds on the patch for JDK-8173361. With this patch, 
>>>>>> I made the JvmtiDeferredEventQueue an instance class (not 
>>>>>> AllStatic) and have one per thread. The CodeBlob event that used 
>>>>>> to drop the CodeCache_lock and raced with the sweeper thread, 
>>>>>> adds the events it wants to post to its thread local list, and 
>>>>>> processes it outside the lock. The list is walked in GC and by 
>>>>>> the sweeper to keep the nmethods from being unloaded and zombied, 
>>>>>> respectively.
>>>>>>
>>>>>> Also, the jmethod_id field in nmethod was only used as a boolean 
>>>>>> so don't create a jmethod_id until needed for 
>>>>>> post_compiled_method_unload.
>>>>>>
>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that 
>>>>>> crashed in the original bug report.
>>>>>>
>>>>>> open webrev at 
>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>
>>>>
>>>
>>


From thomas.stuefe at gmail.com  Thu Dec  5 13:32:26 2019
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Thu, 5 Dec 2019 14:32:26 +0100
Subject: RFR (M) 8234510: Remove file seeking requirement for writing a
 heap dump
In-Reply-To: <AM0PR02MB45008C66EC315E9836F7FF7A9F4A0@AM0PR02MB4500.eurprd02.prod.outlook.com>
References: <AM0PR02MB45008C66EC315E9836F7FF7A9F4A0@AM0PR02MB4500.eurprd02.prod.outlook.com>
Message-ID: <CAA-vtUxV40m+-qQ0N7SZmOe9FotFfyr1snQGe0T5pK50LKmJXg@mail.gmail.com>

Hi Ralf,

Not a complete review yet. But this looks good. The seeking before seemed
awkward.

Some remarks:

In DumpWriter, _current_entry_left and _entry_ended seem only to be needed
for asserting. Please enclose their definition in DEBUG_ONLY, and
initialize them in the ctor.

--

I like that DumperSupport::dump_object_array() does not write stuff
anymore. That felt surprising. Still feels awkward that it warns about
large arrays, seems more of a thing the caller should do. And that we have
to pass in header size for it to do that. Not a big deal though.

--

(not your patch): DumperSupport::dump_class_and_array_classes(Klass*)
should assert that Klass* is an InstanceKlass; or, even better, use
InstanceKlass* as parameter.

--

This was a bit of a brain teaser. More comments would be helpful. You wrote
a good abstract in the JBS issue, could you please copy the proposed
implementation as a comment into the class declaration of DumpWriter.

--

DumpWriter::start_dump_entry(): It took me a while to understand how the
segment size is updated if the entry is huge, since by the time we finish
the entry the segment header will already be flushed out. The answer is, I
think, that this is not needed since we only write one record so the
initial size we wrote into the segment header is still valid.

Proposed comment change:

-// Will be fixed up later if we can add more entries.
+// Seed segment size with size of its first record. Should we add more
+// records later, we will update the segment size (see finish_dump_segment())
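
The two cases discussed above (fixing up the segment size in the buffer, or giving an oversized entry its own segment) can be sketched as follows. This is a simplified illustration, not the patch: SegmentWriter, write_entry(), the in-memory output vector and the constants are hypothetical, entries are passed in whole rather than streamed, and the 9-byte header layout (tag, timestamp, length) is an assumption based on the HPROF record format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed header layout: u1 tag, u4 timestamp, u4 length (9 bytes total).
constexpr size_t kHeaderSize = 9;
constexpr uint8_t kSegmentTag = 0x1C;  // heap dump segment tag, for illustration

// Hypothetical, simplified writer. Output is append-only, so no seeking.
class SegmentWriter {
public:
    explicit SegmentWriter(size_t buffer_size) : _buffer_size(buffer_size) {}

    void write_entry(const std::vector<uint8_t>& entry) {
        const uint32_t size = static_cast<uint32_t>(entry.size());
        if (kHeaderSize + entry.size() > _buffer_size) {
            // Too big to buffer: the entry gets its own segment. Its size is
            // known up front, so the header never needs a fix-up.
            finish_segment();
            append_header(_out, size);
            _out.insert(_out.end(), entry.begin(), entry.end());
            ++_segments;
            return;
        }
        if (!_buf.empty() && _buf.size() + entry.size() > _buffer_size) {
            finish_segment();  // no room left: flush the current segment
        }
        if (_buf.empty()) {
            append_header(_buf, size);  // seed with size of the first entry
        }
        _buf.insert(_buf.end(), entry.begin(), entry.end());
    }

    // Patches the real length into the buffered header, then flushes.
    void finish_segment() {
        if (_buf.empty()) return;
        put_u4(_buf, 5, static_cast<uint32_t>(_buf.size() - kHeaderSize));
        _out.insert(_out.end(), _buf.begin(), _buf.end());
        _buf.clear();
        ++_segments;
    }

    const std::vector<uint8_t>& output() const { return _out; }
    int segments() const { return _segments; }

private:
    // Stores x big-endian at v[pos..pos+3].
    static void put_u4(std::vector<uint8_t>& v, size_t pos, uint32_t x) {
        for (int i = 0; i < 4; i++) {
            v[pos + i] = static_cast<uint8_t>(x >> (8 * (3 - i)));
        }
    }
    static void append_header(std::vector<uint8_t>& v, uint32_t length) {
        v.push_back(kSegmentTag);
        // Zeroed timestamp and length fields, then the length is filled in.
        v.insert(v.end(), static_cast<size_t>(8), static_cast<uint8_t>(0));
        put_u4(v, v.size() - 4, length);
    }

    size_t _buffer_size;
    std::vector<uint8_t> _buf;  // current segment, header included
    std::vector<uint8_t> _out;  // stands in for the file or socket
    int _segments = 0;
};
```

In this sketch, small entries share one buffered segment whose length is patched just before flushing, while an entry larger than the buffer is written straight through with its final length already in the header.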

--

That's all I have for now.  If there are still Reviewers missing after my
vacation, I'll take another look.

Cheers, Thomas


On Mon, Nov 25, 2019 at 3:41 PM Schmelter, Ralf <ralf.schmelter at sap.com>
wrote:

> Hello,
>
> this change removes the need to use seek on the hprof file when creating a
> heap dump, thus making it possible to stream the dump. This enables us to
> dump to a socket or directly gzip the dump.
>
> Instead of fixing up the heap dump segment size in the written file, the
> size of a heap dump segment is either fixed up in the buffer or, for
> entries too big to fit into the buffer fully, the entry gets its own
> segment with no need to fix up the segment size later.
>
> To do this, we now need to know how large a heap dump segment entry is
> when starting to write the entry. This is either trivial (for the roots) or
> already known (for the instance and array dump entries). Just the class
> entry needed a little more code to track the size.
>
> The change results in more heap dump segments in the written heap dump.
> But since the overhead per segment is 9 bytes, even for the smallest used
> buffer (64K) the overhead is less than 0.02%. Additionally the heap dump
> now expects to be able to allocate at least 64k for the buffer. The old
> code tried to run even with a buffer of 1 byte or no buffer at all.
>
> Bugreport: https://bugs.openjdk.java.net/browse/JDK-8234510
> Webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8234510/webrev.0/
>
> Best regards,
> Ralf
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20191205/f28b1ee9/attachment.html>

From harold.seigel at oracle.com  Thu Dec  5 14:28:04 2019
From: harold.seigel at oracle.com (Harold Seigel)
Date: Thu, 5 Dec 2019 09:28:04 -0500
Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for Record
 attribute
Message-ID: <d4ee7dcd-74a5-200e-9c92-260dea90c8b3@oracle.com>

Hi,

Please review this trivial change to add documentation about the Record 
attribute to the JDWP, JDI, and Instrumentation specs.

The changed .html pages (best viewed as 'raw') are included in the 
webrev but will not be pushed.

Open Webrev: 
http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html

JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360

The fix was regression tested by running Mach5 tiers 1 and 2 tests and 
builds on Linux-x64, Solaris, Windows, and Mac OS X.

Thanks, Harold


From lois.foltan at oracle.com  Thu Dec  5 14:59:22 2019
From: lois.foltan at oracle.com (Lois Foltan)
Date: Thu, 5 Dec 2019 09:59:22 -0500
Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for
 Record attribute
In-Reply-To: <d4ee7dcd-74a5-200e-9c92-260dea90c8b3@oracle.com>
References: <d4ee7dcd-74a5-200e-9c92-260dea90c8b3@oracle.com>
Message-ID: <7da43abb-cf32-a4f6-bdbe-57b525cf13f9@oracle.com>

On 12/5/2019 9:28 AM, Harold Seigel wrote:
> Hi,
>
> Please review this trivial change to add documentation about the 
> Record attribute to the JDWP, JDI, and Instrumentation specs.
>
> The changed .html pages (best viewed as 'raw') are included in the 
> webrev but will not be pushed.
>
> Open Webrev: 
> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html
>
> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360
>
> The fix was regression tested by running Mach5 tiers 1 and 2 tests and 
> builds on Linux-x64, Solaris, Windows, and Mac OS X.
>
> Thanks, Harold
>
Looks good & trivial. VirtualMachine.java needs a copyright update.
Lois

From harold.seigel at oracle.com  Thu Dec  5 15:13:15 2019
From: harold.seigel at oracle.com (Harold Seigel)
Date: Thu, 5 Dec 2019 10:13:15 -0500
Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for
 Record attribute
In-Reply-To: <7da43abb-cf32-a4f6-bdbe-57b525cf13f9@oracle.com>
References: <d4ee7dcd-74a5-200e-9c92-260dea90c8b3@oracle.com>
 <7da43abb-cf32-a4f6-bdbe-57b525cf13f9@oracle.com>
Message-ID: <6dd5a7ed-258c-0ccc-f438-af8c66e20eb1@oracle.com>

Thanks Lois!

I'll fix the copyright before pushing.

Harold

On 12/5/2019 9:59 AM, Lois Foltan wrote:
> On 12/5/2019 9:28 AM, Harold Seigel wrote:
>> Hi,
>>
>> Please review this trivial change to add documentation about the 
>> Record attribute to the JDWP, JDI, and Instrumentation specs.
>>
>> The changed .html pages (best viewed as 'raw') are included in the 
>> webrev but will not be pushed.
>>
>> Open Webrev: 
>> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360
>>
>> The fix was regression tested by running Mach5 tiers 1 and 2 tests 
>> and builds on Linux-x64, Solaris, Windows, and Mac OS X.
>>
>> Thanks, Harold
>>
> Looks good & trivial. VirtualMachine.java needs a copyright update.
> Lois

From serguei.spitsyn at oracle.com  Thu Dec  5 16:00:01 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 5 Dec 2019 08:00:01 -0800
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
 <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
 <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com>
 <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com>
Message-ID: <2385071e-6fd6-3da4-9ab3-93e950f69f77@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20191205/9e02b096/attachment-0001.html>

From daniel.daugherty at oracle.com  Thu Dec  5 16:15:54 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 5 Dec 2019 11:15:54 -0500
Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for
 Record attribute
In-Reply-To: <d4ee7dcd-74a5-200e-9c92-260dea90c8b3@oracle.com>
References: <d4ee7dcd-74a5-200e-9c92-260dea90c8b3@oracle.com>
Message-ID: <1a6af5af-e64f-39d5-9629-3ae80eec64a8@oracle.com>

Do you plan to make JVM/TI spec changes also?

Dan


On 12/5/19 9:28 AM, Harold Seigel wrote:
> Hi,
>
> Please review this trivial change to add documentation about the 
> Record attribute to the JDWP, JDI, and Instrumentation specs.
>
> The changed .html pages (best viewed as 'raw') are included in the 
> webrev but will not be pushed.
>
> Open Webrev: 
> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html
>
> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360
>
> The fix was regression tested by running Mach5 tiers 1 and 2 tests and 
> builds on Linux-x64, Solaris, Windows, and Mac OS X.
>
> Thanks, Harold
>
>


From coleen.phillimore at oracle.com  Thu Dec  5 18:24:04 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 5 Dec 2019 13:24:04 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <2385071e-6fd6-3da4-9ab3-93e950f69f77@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
 <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
 <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com>
 <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com>
 <2385071e-6fd6-3da4-9ab3-93e950f69f77@oracle.com>
Message-ID: <55f95141-42e1-2ee9-6f0c-fafcafc35356@oracle.com>


On 12/5/19 11:00 AM, serguei.spitsyn at oracle.com wrote:
> Hi Coleen,
>
> Thank you for making this update!
> It looks good to me.
>
> One nit:
>
> http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev/test/hotspot/jtreg/serviceability/jvmti/CompiledMethodLoad/libCompiledZombie.cpp.html
>
>   46 // Continuously generate CompiledMethodLoad events for all currently compiled methods
>   47 void JNICALL GenerateEventsThread(jvmtiEnv* jvmti, JNIEnv* jni, void* arg) {
>   48     jvmti->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_COMPILED_METHOD_LOAD, NULL);
>   49     int count = 0;
>   50
>   51     while (true) {
>   52         events = 0;
>   53         jvmti->GenerateEvents(JVMTI_EVENT_COMPILED_METHOD_LOAD);
>   54         if (events != 0 && ++count == 200) {
>   55             printf("Generated %d events\n", events);
>   56             count = 0;
>   57         }
>   58     }
>   59 }
>
>   The above can be simplified a little bit:
>           if (events % 200 == 199) {
>               printf("Generated %d events\n", events);
>           }
>
>   Then this line is not needed too:
>     49     int count = 0;

Ok, I'll make that change before I push.
Thanks for the review and your help!
Coleen

>
> Thanks,
> Serguei
>
>
> On 12/5/19 04:08, coleen.phillimore at oracle.com wrote:
>>
>> Thanks Dan. I moved the field. For some reason I thought that class 
>> did more/different things than hold per-thread information.
>>
>> I've retested this version with tiers 2-6.
>>
>> incr webrev at 
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03.incr/webrev
>> full webrev at 
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev
>>
>> Thanks to Serguei for offline discussion.
>>
>> Coleen
>>
>> On 12/4/19 7:40 PM, Daniel D. Daugherty wrote:
>>> Generally speaking, JVM/TI related things should be in JvmtiThreadState
>>> instead of directly in the Thread class. That way the extra space is 
>>> only
>>> consumed when JVM/TI is in use and only when a Thread does something 
>>> that
>>> requires a JvmtiThreadState to be created.
>>>
>>> Please reconsider moving _jvmti_event_queue.
>>>
>>> Dan
>>>
>>>
>>> On 12/4/19 6:06 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Hi Serguei,
>>>>
>>>> On 12/4/19 5:15 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Collen, (no problem)
>>>>>
>>>>> It looks good in general.
>>>>> Thank you a lot for sorting this out!
>>>>>
>>>>> Just a couple of comments.
>>>>>
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/runtime/thread.hpp.frames.html
>>>>> 1993 protected:
>>>>> 1994 // Jvmti Events that cannot be posted in their current context.
>>>>> 1995 // ServiceThread uses this to collect deferred events from 
>>>>> NonJava threads
>>>>> 1996 // that cannot post events.
>>>>> 1997 JvmtiDeferredEventQueue* _jvmti_event_queue;
>>>>>
>>>>> Like David, I also have a concern about the footprint of having the 
>>>>> _jvmti_event_queue field in the Thread class.
>>>>> I'm wondering whether it would be better to move this field into the 
>>>>> JvmtiThreadState class.
>>>>> Please, see jvmti_thread_state() and 
>>>>> JvmtiThreadState::state_for(JavaThread *thread).
>>>>
>>>> The reason I have it directly in JavaThread is so that the GC 
>>>> oops_do and nmethods_do code can find it easily. I like your idea 
>>>> of hiding it in jvmti but this doesn't seem good to have this code 
>>>> know about jvmtiThreadState, which seems to be a queue of Jvmti 
>>>> states. I also don't want to have jvmtiThreadState to have to add 
>>>> an oops_do() or nmethods_do() either.
>>>>
>>>>>
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html
>>>>> 973 void JvmtiDeferredEvent::post(JvmtiEnv* env) {
>>>>> 974 assert(_type == TYPE_COMPILED_METHOD_LOAD, "only user of this 
>>>>> method");
>>>>> 975 nmethod* nm = _event_data.compiled_method_load;
>>>>> 976 JvmtiExport::post_compiled_method_load(env, nm);
>>>>> 977 }
>>>>>
>>>>> The JvmtiDeferredEvent::post name looks too generic as it posts 
>>>>> compiled load events only.
>>>>> Do you expect this function to be extended in the future to support 
>>>>> more event types?
>>>>>
>>>>
>>>> I don't envision an extension for this function but I do for 
>>>> JvmtiDeferredEventQueue::post(). I have a small enhancement that 
>>>> would hand off the entire queue to the ServiceThread and have it 
>>>> call post() to post all the events rather than one at a time.
>>>>
>>>> So I'll rename this one post_compiled_method_load_event() and leave 
>>>> the other post() as is for now.
>>>>
>>>> open webrev at 
>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.02.incr/webrev
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 11/26/19 06:22, coleen.phillimore at oracle.com wrote:
>>>>>> Summary: Add local deferred event list to thread to post events 
>>>>>> outside CodeCache_lock.
>>>>>>
>>>>>> This patch builds on the patch for JDK-8173361. With this patch, 
>>>>>> I made the JvmtiDeferredEventQueue an instance class (not 
>>>>>> AllStatic) and have one per thread. The CodeBlob event that used 
>>>>>> to drop the CodeCache_lock and raced with the sweeper thread, 
>>>>>> adds the events it wants to post to its thread local list, and 
>>>>>> processes it outside the lock. The list is walked in GC and by 
>>>>>> the sweeper to keep the nmethods from being unloaded and zombied, 
>>>>>> respectively.
>>>>>>
>>>>>> Also, the jmethod_id field in nmethod was only used as a boolean 
>>>>>> so don't create a jmethod_id until needed for 
>>>>>> post_compiled_method_unload.
>>>>>>
>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that 
>>>>>> crashed in the original bug report.
>>>>>>
>>>>>> open webrev at 
>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>
>>>>
>>>
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20191205/5474d657/attachment.html>

From harold.seigel at oracle.com  Thu Dec  5 18:30:24 2019
From: harold.seigel at oracle.com (Harold Seigel)
Date: Thu, 5 Dec 2019 13:30:24 -0500
Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for
 Record attribute
In-Reply-To: <1a6af5af-e64f-39d5-9629-3ae80eec64a8@oracle.com>
References: <d4ee7dcd-74a5-200e-9c92-260dea90c8b3@oracle.com>
 <1a6af5af-e64f-39d5-9629-3ae80eec64a8@oracle.com>
Message-ID: <b2bef00f-42bd-e92a-1524-09c0f05cee81@oracle.com>

The JVM/TI change for Record attribute was in the big Records push. I 
missed the other three until Serguei pointed it out.

Thanks, Harold

On 12/5/2019 11:15 AM, Daniel D. Daugherty wrote:
> Do you plan to make JVM/TI spec changes also?
>
> Dan
>
>
> On 12/5/19 9:28 AM, Harold Seigel wrote:
>> Hi,
>>
>> Please review this trivial change to add documentation about the 
>> Record attribute to the JDWP, JDI, and Instrumentation specs.
>>
>> The changed .html pages (best viewed as 'raw') are included in the 
>> webrev but will not be pushed.
>>
>> Open Webrev: 
>> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360
>>
>> The fix was regression tested by running Mach5 tiers 1 and 2 tests 
>> and builds on Linux-x64, Solaris, Windows, and Mac OS X.
>>
>> Thanks, Harold
>>
>>
>

From daniel.daugherty at oracle.com  Thu Dec  5 18:34:02 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 5 Dec 2019 13:34:02 -0500
Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for
 Record attribute
In-Reply-To: <b2bef00f-42bd-e92a-1524-09c0f05cee81@oracle.com>
References: <d4ee7dcd-74a5-200e-9c92-260dea90c8b3@oracle.com>
 <1a6af5af-e64f-39d5-9629-3ae80eec64a8@oracle.com>
 <b2bef00f-42bd-e92a-1524-09c0f05cee81@oracle.com>
Message-ID: <410d99e1-a893-1b7d-8aa6-c93e47f12118@oracle.com>

Thanks for clarifying.

Dan


On 12/5/19 1:30 PM, Harold Seigel wrote:
> The JVM/TI change for Record attribute was in the big Records push. I 
> missed the other three until Serguei pointed it out.
>
> Thanks, Harold
>
> On 12/5/2019 11:15 AM, Daniel D. Daugherty wrote:
>> Do you plan to make JVM/TI spec changes also?
>>
>> Dan
>>
>>
>> On 12/5/19 9:28 AM, Harold Seigel wrote:
>>> Hi,
>>>
>>> Please review this trivial change to add documentation about the 
>>> Record attribute to the JDWP, JDI, and Instrumentation specs.
>>>
>>> The changed .html pages (best viewed as 'raw') are included in the 
>>> webrev but will not be pushed.
>>>
>>> Open Webrev: 
>>> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html
>>>
>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360
>>>
>>> The fix was regression tested by running Mach5 tiers 1 and 2 tests 
>>> and builds on Linux-x64, Solaris, Windows, and Mac OS X.
>>>
>>> Thanks, Harold
>>>
>>>
>>


From coleen.phillimore at oracle.com  Thu Dec  5 18:36:39 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 5 Dec 2019 13:36:39 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <2385071e-6fd6-3da4-9ab3-93e950f69f77@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
 <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
 <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com>
 <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com>
 <2385071e-6fd6-3da4-9ab3-93e950f69f77@oracle.com>
Message-ID: <a675ed22-91eb-4571-2578-8c5cd774d74f@oracle.com>


On 12/5/19 11:00 AM, serguei.spitsyn at oracle.com wrote:
> Hi Coleen,
>
> Thank you for making this update!
> It looks good to me.
>
> One nit:
>
> http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev/test/hotspot/jtreg/serviceability/jvmti/CompiledMethodLoad/libCompiledZombie.cpp.html
>
>   46 // Continuously generate CompiledMethodLoad events for all currently compiled methods
>   47 void JNICALL GenerateEventsThread(jvmtiEnv* jvmti, JNIEnv* jni, void* arg) {
>   48     jvmti->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_COMPILED_METHOD_LOAD, NULL);
>   49     int count = 0;
>   50
>   51     while (true) {
>   52         events = 0;
>   53         jvmti->GenerateEvents(JVMTI_EVENT_COMPILED_METHOD_LOAD);
>   54         if (events != 0 && ++count == 200) {
>   55             printf("Generated %d events\n", events);
>   56             count = 0;
>   57         }
>   58     }
>   59 }
>
>   The above can be simplified a little bit:
>           if (events % 200 == 199) {
>               printf("Generated %d events\n", events);
>           }
>
>   Then this line is not needed too:
>     49     int count = 0;

I answered this too fast. There are two conditions where I want this to 
not print: first, when events == 0, and second, on all but every 200th 
round in which events is non-zero.

I could use if (events != 0 && count++ % 200), but I thought what I had 
makes more sense and I don't have to worry about when ++ happens.
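
The difference between the two conditions can be checked with a small standalone sketch. The function names below are hypothetical and the per-round `events` values are supplied by the caller for illustration; only the two conditions themselves are taken from the code quoted above.

```cpp
#include <cassert>
#include <vector>

// Counts how often the condition from the test would print: only rounds
// with events != 0 are counted, and every 200th such round prints.
int prints_with_counter(const std::vector<int>& events_per_round) {
    int count = 0;
    int prints = 0;
    for (int events : events_per_round) {
        if (events != 0 && ++count == 200) {
            ++prints;
            count = 0;
        }
    }
    return prints;
}

// Counts how often the suggested simplification would print: it depends on
// the value of events in a round, not on how many rounds have run.
int prints_with_modulo(const std::vector<int>& events_per_round) {
    int prints = 0;
    for (int events : events_per_round) {
        if (events % 200 == 199) {
            ++prints;
        }
    }
    return prints;
}
```

For 400 rounds that each report 5 events, the counter version prints twice and the modulo version never prints, so the two are not interchangeable.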

Thanks,
Coleen

>
> Thanks,
> Serguei
>
>
> On 12/5/19 04:08, coleen.phillimore at oracle.com wrote:
>>
>> Thanks Dan. I moved the field. For some reason I thought that class 
>> did more/different things than hold per-thread information.
>>
>> I've retested this version with tiers 2-6.
>>
>> incr webrev at 
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03.incr/webrev
>> full webrev at 
>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev
>>
>> Thanks to Serguei for offline discussion.
>>
>> Coleen
>>
>> On 12/4/19 7:40 PM, Daniel D. Daugherty wrote:
>>> Generally speaking, JVM/TI related things should be in JvmtiThreadState
>>> instead of directly in the Thread class. That way the extra space is 
>>> only
>>> consumed when JVM/TI is in use and only when a Thread does something 
>>> that
>>> requires a JvmtiThreadState to be created.
>>>
>>> Please reconsider moving _jvmti_event_queue.
>>>
>>> Dan
>>>
>>>
>>> On 12/4/19 6:06 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Hi Serguei,
>>>>
>>>> On 12/4/19 5:15 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Collen, (no problem)
>>>>>
>>>>> It looks good in general.
>>>>> Thank you a lot for sorting this out!
>>>>>
>>>>> Just a couple of comments.
>>>>>
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/runtime/thread.hpp.frames.html
>>>>> 1993 protected:
>>>>> 1994 // Jvmti Events that cannot be posted in their current context.
>>>>> 1995 // ServiceThread uses this to collect deferred events from 
>>>>> NonJava threads
>>>>> 1996 // that cannot post events.
>>>>> 1997 JvmtiDeferredEventQueue* _jvmti_event_queue;
>>>>>
>>>>> Like David, I also have a concern about the footprint of having the 
>>>>> _jvmti_event_queue field in the Thread class.
>>>>> I'm wondering whether it would be better to move this field into the 
>>>>> JvmtiThreadState class.
>>>>> Please, see jvmti_thread_state() and 
>>>>> JvmtiThreadState::state_for(JavaThread *thread).
>>>>
>>>> The reason I have it directly in JavaThread is so that the GC 
>>>> oops_do and nmethods_do code can find it easily. I like your idea 
>>>> of hiding it in jvmti but this doesn't seem good to have this code 
>>>> know about jvmtiThreadState, which seems to be a queue of Jvmti 
>>>> states. I also don't want to have jvmtiThreadState to have to add 
>>>> an oops_do() or nmethods_do() either.
>>>>
>>>>>
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html
>>>>> 973 void JvmtiDeferredEvent::post(JvmtiEnv* env) {
>>>>> 974 assert(_type == TYPE_COMPILED_METHOD_LOAD, "only user of this 
>>>>> method");
>>>>> 975 nmethod* nm = _event_data.compiled_method_load;
>>>>> 976 JvmtiExport::post_compiled_method_load(env, nm);
>>>>> 977 }
>>>>>
>>>>> The JvmtiDeferredEvent::post name looks too generic as it posts 
>>>>> compiled load events only.
>>>>> Do you consider this function extended in the future to support 
>>>>> more event types?
>>>>>
>>>>
>>>> I don't envision an extension for this function but I do for 
>>>> JvmtiDeferredEventQueue::post().  I have a small enhancement that 
>>>> would handoff the entire queue to the ServiceThread and have it 
>>>> call post() to post all the events rather than one at a time.
>>>>
>>>> So I'll rename this one post_compiled_method_load_event() and leave 
>>>> the other post() as is for now.
>>>>
>>>> open webrev at 
>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.02.incr/webrev
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 11/26/19 06:22, coleen.phillimore at oracle.com wrote:
>>>>>> Summary: Add local deferred event list to thread to post events 
>>>>>> outside CodeCache_lock.
>>>>>>
>>>>>> This patch builds on the patch for JDK-8173361.  With this patch, 
>>>>>> I made the JvmtiDeferredEventQueue an instance class (not 
>>>>>> AllStatic) and have one per thread. The CodeBlob event that used 
>>>>>> to drop the CodeCache_lock and raced with the sweeper thread, 
>>>>>> adds the events it wants to post to its thread local list, and 
>>>>>> processes it outside the lock.  The list is walked in GC and by 
>>>>>> the sweeper to keep the nmethods from being unloaded and zombied, 
>>>>>> respectively.
>>>>>>
>>>>>> Also, the jmethod_id field in nmethod was only used as a boolean 
>>>>>> so don't create a jmethod_id until needed for 
>>>>>> post_compiled_method_unload.
>>>>>>
>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that 
>>>>>> crashed in the original bug report.
>>>>>>
>>>>>> open webrev at 
>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>
>>>>
>>>
>>
>


From serguei.spitsyn at oracle.com  Thu Dec  5 18:41:54 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 5 Dec 2019 10:41:54 -0800
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <a675ed22-91eb-4571-2578-8c5cd774d74f@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
 <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
 <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com>
 <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com>
 <2385071e-6fd6-3da4-9ab3-93e950f69f77@oracle.com>
 <a675ed22-91eb-4571-2578-8c5cd774d74f@oracle.com>
Message-ID: <b962601a-3bbe-3a9f-c96b-c35edcdb9c32@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20191205/0d529e14/attachment-0001.html>

From coleen.phillimore at oracle.com  Thu Dec  5 19:15:28 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 5 Dec 2019 14:15:28 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <b962601a-3bbe-3a9f-c96b-c35edcdb9c32@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
 <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
 <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com>
 <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com>
 <2385071e-6fd6-3da4-9ab3-93e950f69f77@oracle.com>
 <a675ed22-91eb-4571-2578-8c5cd774d74f@oracle.com>
 <b962601a-3bbe-3a9f-c96b-c35edcdb9c32@oracle.com>
Message-ID: <66e249d2-ea32-db69-5e1f-1a31fb19ecd5@oracle.com>


On 12/5/19 1:41 PM, serguei.spitsyn at oracle.com wrote:
> On 12/5/19 10:36, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 12/5/19 11:00 AM, serguei.spitsyn at oracle.com wrote:
>>> Hi Collen,
>>>
>>> Thank you for making this update!
>>> It looks good to me.
>>>
>>> One nit:
>>>
>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev/test/hotspot/jtreg/serviceability/jvmti/CompiledMethodLoad/libCompiledZombie.cpp.html
>>>
>>>   46 // Continuously generate CompiledMethodLoad events for all currently compiled methods
>>>   47 void JNICALL GenerateEventsThread(jvmtiEnv* jvmti, JNIEnv* jni, void* arg) {
>>>   48     jvmti->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_COMPILED_METHOD_LOAD, NULL);
>>>   49     int count = 0;
>>>   50
>>>   51     while (true) {
>>>   52         events = 0;
>>>   53         jvmti->GenerateEvents(JVMTI_EVENT_COMPILED_METHOD_LOAD);
>>>   54         if (events != 0 && ++count == 200) {
>>>   55             printf("Generated %d events\n", events);
>>>   56             count = 0;
>>>   57         }
>>>   58     }
>>>   59 }
>>>
>>>   The above can be simplified a little bit:
>>>           if (events % 200 == 199) {
>>>               printf("Generated %d events\n", events);
>>>           }
>>>
>>>   Then this line is not needed too:
>>>     49     int count = 0;
>>>
>>
>> I answered this too fast.  There are two conditions where I want this 
>> to not print.? First is where events == 0 and the other for every 200 
>> events that are non-zero.
>>
>> I could use if (events != 0 && count++ % 200), but I thought what I 
>> had makes more sense and I don't have to worry about when ++ happens.
>
> Then you could replace it with:
>   if (events % 200 == 0) {

But that would still print when events == 0, which I don't want. If I 
print them all for the little test case, it's ok, but when I run this 
with Swingset2, it's too much output.  I only want to see a few lines 
for this:

----------System.out:(3/113)----------
Test passes if it doesn't crash while posting compiled method events.
Generated 285 events
Generated 1002 events
----------System.err:(1/15)----------

The count is the number of passes through the GenerateEvents loop that 
produced events.  Each pass resets events to zero, and the number of 
events is printed on every 200th such pass.  So I need both count and 
events.
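
For readers following along, here is a minimal sketch of that reporting
logic, pulled out of the agent thread so it can be run on its own.  The
helper name report_lines and its report_every parameter are made up for
illustration and are not part of libCompiledZombie.cpp; the condition
itself is the one from the test.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the test): replays the reporting logic
// from GenerateEventsThread over a fixed list of per-pass event counts and
// returns the lines it would have printed.
std::vector<std::string> report_lines(const std::vector<int>& events_per_pass,
                                      int report_every = 200) {
    std::vector<std::string> lines;
    int count = 0;  // passes that produced events since the last report
    for (int events : events_per_pass) {
        // Passes with no events are skipped entirely; only every
        // report_every-th pass that did produce events is reported.
        if (events != 0 && ++count == report_every) {
            lines.push_back("Generated " + std::to_string(events) + " events");
            count = 0;
        }
    }
    return lines;
}
```

A check on events alone, such as events % 200 == 0, cannot express both
conditions, because it is true whenever a pass produces no events.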

Coleen
>
> But it is up to you. :)
>
> Thanks,
> Serguei
>
>>
>> Thanks,
>> Coleen
>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 12/5/19 04:08, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Thanks Dan.  I moved the field.  For some reason I thought that 
>>>> class did more/different things than hold per-thread information.
>>>>
>>>> I've retested this version with tiers 2-6.
>>>>
>>>> incr webrev at 
>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03.incr/webrev
>>>> full webrev at 
>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev
>>>>
>>>> Thanks to Serguei for offline discussion.
>>>>
>>>> Coleen
>>>>
>>>> On 12/4/19 7:40 PM, Daniel D. Daugherty wrote:
>>>>> Generally speaking, JVM/TI related things should be in 
>>>>> JvmtiThreadState
>>>>> instead of directly in the Thread class. That way the extra space 
>>>>> is only
>>>>> consumed when JVM/TI is in use and only when a Thread does 
>>>>> something that
>>>>> requires a JvmtiThreadState to be created.
>>>>>
>>>>> Please reconsider moving _jvmti_event_queue.
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>> On 12/4/19 6:06 PM, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>> Hi Serguei,
>>>>>>
>>>>>> On 12/4/19 5:15 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Collen, (no problem)
>>>>>>>
>>>>>>> It looks good in general.
>>>>>>> Thank you a lot for sorting this out!
>>>>>>>
>>>>>>> Just a couple of comments.
>>>>>>>
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/runtime/thread.hpp.frames.html
>>>>>>> 1993 protected:
>>>>>>> 1994 // Jvmti Events that cannot be posted in their current context.
>>>>>>> 1995 // ServiceThread uses this to collect deferred events from 
>>>>>>> NonJava threads
>>>>>>> 1996 // that cannot post events.
>>>>>>> 1997 JvmtiDeferredEventQueue* _jvmti_event_queue;
>>>>>>>
>>>>>>> As David I also have a concern about footprint of having the 
>>>>>>> _jvmti_event_queue field in the Thread class.
>>>>>>> I'm thinking if it'd be better to move this field into the 
>>>>>>> JvmtiThreadState class.
>>>>>>> Please, see jvmti_thread_state() and 
>>>>>>> JvmtiThreadState::state_for(JavaThread *thread).
>>>>>>
>>>>>> The reason I have it directly in JavaThread is so that the GC 
>>>>>> oops_do and nmethods_do code can find it easily. I like your idea 
>>>>>> of hiding it in jvmti but this doesn't seem good to have this 
>>>>>> code know about jvmtiThreadState, which seems to be a queue of 
>>>>>> Jvmti states.  I also don't want to have jvmtiThreadState to have 
>>>>>> to add an oops_do() or nmethods_do() either.
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html
>>>>>>> 973 void JvmtiDeferredEvent::post(JvmtiEnv* env) {
>>>>>>> 974 assert(_type == TYPE_COMPILED_METHOD_LOAD, "only user of 
>>>>>>> this method");
>>>>>>> 975 nmethod* nm = _event_data.compiled_method_load;
>>>>>>> 976 JvmtiExport::post_compiled_method_load(env, nm);
>>>>>>> 977 }
>>>>>>>
>>>>>>> The JvmtiDeferredEvent::post name looks too generic as it posts 
>>>>>>> compiled load events only.
>>>>>>> Do you consider this function extended in the future to support 
>>>>>>> more event types?
>>>>>>>
>>>>>>
>>>>>> I don't envision an extension for this function but I do for 
>>>>>> JvmtiDeferredEventQueue::post().  I have a small enhancement that 
>>>>>> would handoff the entire queue to the ServiceThread and have it 
>>>>>> call post() to post all the events rather than one at a time.
>>>>>>
>>>>>> So I'll rename this one post_compiled_method_load_event() and 
>>>>>> leave the other post() as is for now.
>>>>>>
>>>>>> open webrev at 
>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.02.incr/webrev
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 11/26/19 06:22, coleen.phillimore at oracle.com wrote:
>>>>>>>> Summary: Add local deferred event list to thread to post events 
>>>>>>>> outside CodeCache_lock.
>>>>>>>>
>>>>>>>> This patch builds on the patch for JDK-8173361. With this 
>>>>>>>> patch, I made the JvmtiDeferredEventQueue an instance class 
>>>>>>>> (not AllStatic) and have one per thread.  The CodeBlob event 
>>>>>>>> that used to drop the CodeCache_lock and raced with the sweeper 
>>>>>>>> thread, adds the events it wants to post to its thread local 
>>>>>>>> list, and processes it outside the lock.  The list is walked in 
>>>>>>>> GC and by the sweeper to keep the nmethods from being unloaded 
>>>>>>>> and zombied, respectively.
>>>>>>>>
>>>>>>>> Also, the jmethod_id field in nmethod was only used as a 
>>>>>>>> boolean so don't create a jmethod_id until needed for 
>>>>>>>> post_compiled_method_unload.
>>>>>>>>
>>>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that 
>>>>>>>> crashed in the original bug report.
>>>>>>>>
>>>>>>>> open webrev at 
>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


From daniil.x.titov at oracle.com  Thu Dec  5 19:31:02 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Thu, 05 Dec 2019 11:31:02 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
Message-ID: <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>

Hi Mandy and Bob,

Please review a new version of the webrev that addresses most of these issues:

1) The interface and spec [3] were updated to use default methods. CSR [3] was re-approved.

2) Security-sensitive operations in j.i.p.cgroupv1.Metrics and in j.i.p.cgroupv1.SubSystem
   were wrapped with doPrivileged.

3) The getCpuLoad() method was optimized to fall back to getSystemCpuLoad0 if the cpuset is identical to the host's.
   It uses sysconf(_SC_NPROCESSORS_CONF) to retrieve the number of CPUs configured on the host. Testing with
   different --cpuset-cpus settings inside a Docker container showed that it always returns the same number of CPUs
   configured on the host regardless of the --cpuset-cpus settings, while the same settings do affect the
   getEffectiveCpuSetCpus() and getCpuSetCpus() metrics.

   In addition, getCpuLoad() now returns -1 when quotas are active but the CPU usage and CPU period metrics are
   not available, and when for some reason it fails to retrieve a valid CPU load for one of the CPUs while iterating over them.
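
To make the cpuset comparison in 3) concrete, here is a rough native
sketch. It is an illustration only, not the code in the webrev: the helper
names host_configured_cpus and cpuset_matches_host are hypothetical, and
the only real API used is sysconf(_SC_NPROCESSORS_CONF), a widely
available extension on Linux/glibc.

```cpp
#include <unistd.h>   // sysconf, _SC_NPROCESSORS_CONF
#include <cstddef>

// Number of CPUs configured on the host, or -1 if it cannot be determined.
long host_configured_cpus() {
    return sysconf(_SC_NPROCESSORS_CONF);
}

// Hypothetical helper: true when the container's effective cpuset has as
// many entries as the host has configured CPUs, in which case the
// host-wide CPU load can be reported directly.
bool cpuset_matches_host(std::size_t effective_cpuset_len, long host_cpus) {
    return host_cpus > 0 &&
           effective_cpuset_len == static_cast<std::size_t>(host_cpus);
}
```

In the Java code the cpuset length would come from
getEffectiveCpuSetCpus(); here it is simply passed in as a number.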

>> CheckOperatingSystemMXBean.java
>>     System.out.println(String.format(...)) can simply be replaced with System.out.format.
I had to leave it unchanged, since replacing it with System.out.format makes the tests unstable: the trace output is
 occasionally interleaved with this message (the trace message is sometimes printed in the middle of it), and the tests
 then cannot find the expected pattern in the output.

 >> It may worth considering adding Metrics::getSwapLimit and 
 >> Metrics::getSwapUsage and move the computation to the implementation of 
 >> Metrics.  Bob may have an opinion.
    
There was no new input regarding this, so I decided to leave it unchanged.

>>Also it seems correct for the memory related methods to check if 
>>(containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).  
>> BTW what does it mean if limit == 0?

Per the Docker docs, the minimum allowed value for the memory limit (--memory option) is 4 megabytes.
If the memory limit is unset, the return value is -1.  Thus, in my understanding, the value 0 is only possible
if something went wrong while retrieving this metric.

Testing: Mach5 tier1-tier6 tests (which include open/test/hotspot/jtreg/containers/docker and :jdk_management tests) passed.

[1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/ 
[2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575    
[3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428

Thank you,
Daniil

On 12/4/19, 4:09 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:

    
    On 12/3/19 9:40 PM, Daniil Titov wrote:
    >      
    >>> Under what circumstance that limit or memLimit is < 0?
    > The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container) or if a docker container was started without
    > specifying a memory limit ( without '--memory='  Docker option) . In latter there is no limit on how much memory the container can use and
    > it can use as much memory as the host's OS allows.
    >      
    
    OK.  Please add a comment to the code.
    
    It may worth considering adding Metrics::getSwapLimit and 
    Metrics::getSwapUsage and move the computation to the implementation of 
    Metrics.  Bob may have an opinion.
    
    Also it seems correct for the memory related methods to check if 
    (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).  
    BTW what does it mean if limit == 0?
    
    >>> Is it worth  specifying this case?
    > I believe yes, since it covers the cases when JVM runs  on a Linux host or a docker container was started without memory limitation.
    >      
    
    I was wondering if the javadoc should specify that.
    >>> It fallbacks to return the system's total swap space size - this is not really what it should report.
    > For the case when JVM runs on a Linux host it is exactly what we want. The only problematic case is if JVM runs inside a docker container without a memory limit set.
    > However, I am not sure how we could differentiate these 2 cases.
    
    As this is the case when the limit is not set in the container, it 
    returns the system metrics which sounds appropriate.
    
    >      
    >>> Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad.
    > For  getTotalMemorySize I think we are good here. If limit is not set then all memory the host's OS have is available.
    > For getFreeMemorySize the problematic case is if is the memory limit is set but the memory usage for some reason is not available (containerMetrics.getMemoryUsage() returns 0).
    
    Will zero memory usage happen?
    
    > Probably in this case we should just return -1 as currently getFreePhysicalMemorySize0() does if it cannot retrieve a valid result.
    >      
    
    > For getCpuLoad() the problematic case if CPU quotas are active but CpuPeriod,  CpuNumPeriods , or getCpuUsage are unavailable or if a valid  CPU load for some CPU was
    > not retrieved, or if all retrieved CPU load values happen to be zeros. Probably we should just  return -1 in these cases rather then falling back to getSystemCpuLoad0()
    >      
    
    returning -1 sounds right.
    >>> src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java
    >>>     There is no strong need to make the deprecated methods as default methods.  If they were default methods, they only need to be implemented once as opposed to in all OS-specific implementations.
    > I could make these methods defaults if you feel it is a better approach here.
    >      
    >      
    
    It's not strictly needed but I can go either way.
    
    
    >>> CheckOperatingSystemMXBean.java
    >>>      System.out.println(String.format(...)) can simply be replaced with System.out.format.
    > I will include this change in the next webrev, thank you!
    >      
    >
    
    thanks
    Mandy
    

From alexey.menkov at oracle.com  Thu Dec  5 20:29:06 2019
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Thu, 5 Dec 2019 12:29:06 -0800
Subject: RFR(XS): JDK-8235433: Problem list JdwpListenTest.java and
 JdwpAttachTest.java on Windows
Message-ID: <85623de1-5915-522e-6db0-671b9fce4fba@oracle.com>

Hi all,

Recently JdwpListenTest.java and JdwpAttachTest.java have started to 
fail on Windows 2016 for a reason that is not yet clear:
https://bugs.openjdk.java.net/browse/JDK-8234935
Until the issue is resolved, the tests need to be problem-listed.

jira: https://bugs.openjdk.java.net/browse/JDK-8235433

the fix:

--- a/test/jdk/ProblemList.txt  Thu Dec 05 16:43:06 2019 +0000
+++ b/test/jdk/ProblemList.txt  Thu Dec 05 11:59:27 2019 -0800
@@ -904,6 +904,9 @@

 com/sun/jdi/NashornPopFrameTest.java                            8225620 generic-all

+com/sun/jdi/JdwpListenTest.java                                 8234935 windows-all
+com/sun/jdi/JdwpAttachTest.java                                 8234935 windows-all
+
 ############################################################################

 # jdk_time

--alex

From bob.vandette at oracle.com  Thu Dec  5 20:50:49 2019
From: bob.vandette at oracle.com (Bob Vandette)
Date: Thu, 5 Dec 2019 15:50:49 -0500
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
Message-ID: <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>

In http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java.sdiff.html

Shouldn't you keep the IOException catch clauses in case the file is not found?


> On Dec 5, 2019, at 2:31 PM, Daniil Titov <daniil.x.titov at oracle.com> wrote:
> 
> Hi Mandy and Bob,
> 
> Please review a new version of the webrev that addresses  the most of these issues:
> 
> 1) The interface and spec [3] were updated to use default methods. CSR [3] was re-approved.
> 
> 2) Security-sensitive operations in j.i.p.cgroupv1.Metrics and in j.i.p.cgroupv1. SubSystem
>   were wrapped with doPrivileged
> 
> 3) getCpuLoad () method was optimized to fallback to  getSystemCpuLoad0  if the cpuset is identical to the host's one. 
>   It uses sysconf(_SC_NPROCESSORS_CONF) to retrieve the number of CPUs configured on the host . Testing with
>   different  --cpuset-cpus settings inside a Docker container proved  that it always returns the same number of  hosts configured
>   CPUs regardless of --cpuset-cpus settings while the same settings affect getEffectiveCpuSetCpus and getCpuSetCpus() metrics.
> 
>    In addition, getCpuLoad () method  now returns -1 in the cases when quotas are active but cpu usage and cpu period metrics are not available and
>   in the case when  for some reason it fails to retrieve a valid CPU load for one of CPUs while iteration over them 

Shouldn't you do the same for getCpuLoad

149                     int[] cpuSet = containerMetrics.getEffectiveCpuSetCpus();
150                     if (cpuSet != null && cpuSet.length > 0) {

If cpuSet.length == 0?


> 
>>> CheckOperatingSystemMXBean.java
>>>    System.out.println(String.format(...)) can simply be replaced with System.out.format.
> I had to leave it unchanged since replacing  it with System.out.format results in the tests instability as  it makes the trace output
> occasionally Intervene  here (the trace message sometimes is printed inside this message)  and tests cannot find the expected
> pattern in the output.
> 
>>> It may worth considering adding Metrics::getSwapLimit and 
>>> Metrics::getSwapUsage and move the computation to the implementation of 
>>> Metrics.  Bob may have an opinion.
> 
> There was no any new input regarding this so I decided to leave it unchanged.

Sorry, I didn't respond to this.  Since the calculation required for getFreeSwapSpaceSize requires retries
due to the access of multiple changing values, I think it's best to leave things as they are so the caller of
these methods understands the limitations of the API.

Also, the fact that swap size metrics include memory sizes is fully documented in both the cgroup and docker
online documentation so it's probably best to be consistent.

> 
>>> Also it seems correct for the memory related methods to check if 
>>> (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).  
>>> BTW what does it mean if limit == 0?
> 
> Per Docker docs the minimum allowed value for  memory limit (--memory option) is 4 megabytes.
> And if memory limit is unset the return value is -1.  Thus, in my understanding the value 0 is only possible
> if something went wrong while retrieving this metric.

That is true, but shouldn't you return -1 in that case?

I originally thought it was OK to fall back to the host data for 0 values, but I think it's better to return unavailable (-1).
I think you might want to change all >= 0 to > 0 and return -1 if any of the values are 0.  This would be more consistent.

You should only fall back to the original logic (host values) if container values are set to unlimited.
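
A minimal sketch of that rule, for illustration only. It is not the JDK
implementation; the function name total_memory and its parameters are
hypothetical, and it assumes, as discussed in this thread, that a negative
container limit means "unlimited" and that 0 means the metric could not be
read.

```cpp
#include <cstdint>

// Value reported to the caller when a metric cannot be determined.
constexpr std::int64_t kUnavailable = -1;

// Hypothetical helper: picks the value a container-aware getter would
// report for total memory.
//   limit < 0  -> unlimited (or not in a container): use the host value
//   limit > 0  -> a real container limit: use it
//   limit == 0 -> something went wrong reading the metric: unavailable
std::int64_t total_memory(bool in_container,
                          std::int64_t container_limit,
                          std::int64_t host_total) {
    if (!in_container || container_limit < 0) {
        return host_total;
    }
    if (container_limit > 0) {
        return container_limit;
    }
    return kUnavailable;
}
```

The same three-way split would apply to the other memory and swap getters.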

Bob.

> 
> Testing: Mach5 tier1-tier6 tests (that include open/test/hotspot/jtreg/containers/docker  and : jdk_management  tests) passed. 
> 
> [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/ 
> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575    
> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
> 
> Thank you,
> Daniil
> 
> On 12/4/19, 4:09 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
> 
> 
> 
>    On 12/3/19 9:40 PM, Daniil Titov wrote:
>> 
>>>> Under what circumstance that limit or memLimit is < 0?
>> The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container) or if a docker container was started without
>> specifying a memory limit ( without '--memory='  Docker option) . In latter there is no limit on how much memory the container can use and
>> it can use as much memory as the host's OS allows.
>> 
> 
>    OK.  Please add a comment to the code.
> 
>    It may worth considering adding Metrics::getSwapLimit and 
>    Metrics::getSwapUsage and move the computation to the implementation of 
>    Metrics.  Bob may have an opinion.
> 
>    Also it seems correct for the memory related methods to check if 
>    (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).  
>    BTW what does it mean if limit == 0?
> 
>>>> Is it worth  specifying this case?
>> I believe yes, since it covers the cases when JVM runs  on a Linux host or a docker container was started without memory limitation.
>> 
> 
>    I was wondering if the javadoc should specify that.
>>>> It fallbacks to return the system's total swap space size - this is not really what it should report.
>> For the case when JVM runs on a Linux host it is exactly what we want. The only problematic case is if JVM runs inside a docker container without a memory limit set.
>> However, I am not sure how we could differentiate these 2 cases.
> 
>    As this is the case when the limit is not set in the container, it 
>    returns the system metrics which sounds appropriate.
> 
>> 
>>>> Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad.
>> For  getTotalMemorySize I think we are good here. If limit is not set then all memory the host's OS have is available.
>> For getFreeMemorySize the problematic case is if is the memory limit is set but the memory usage for some reason is not available (containerMetrics.getMemoryUsage() returns 0).
> 
>    Will zero memory usage happen?
> 
>> Probably in this case we should just return -1 as currently getFreePhysicalMemorySize0() does if it cannot retrieve a valid result.
>> 
> 
>> For getCpuLoad() the problematic case if CPU quotas are active but CpuPeriod,  CpuNumPeriods , or getCpuUsage are unavailable or if a valid  CPU load for some CPU was
>> not retrieved, or if all retrieved CPU load values happen to be zeros. Probably we should just  return -1 in these cases rather then falling back to getSystemCpuLoad0()
>> 
> 
>    returning -1 sounds right.
>>>> src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java
>>>>    There is no strong need to make the deprecated methods as default methods.  If they were default methods, they only need to be implemented once as opposed to in all OS-specific implementations.
>> I could make these methods defaults if you feel it is a better approach here.
>> 
>> 
> 
>    It's not strictly needed but I can go either way.
> 
> 
>>>> CheckOperatingSystemMXBean.java
>>>>     System.out.println(String.format(...)) can simply be replaced with System.out.format.
>> I will include this change in the next webrev, thank you!
>> 
>> 
> 
>    thanks
>    Mandy
> 
> 
> 


From mandy.chung at oracle.com  Thu Dec  5 20:59:16 2019
From: mandy.chung at oracle.com (Mandy Chung)
Date: Thu, 5 Dec 2019 12:59:16 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
Message-ID: <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>


On 12/5/19 12:50 PM, Bob Vandette wrote:
>
>>>> It may worth considering adding Metrics::getSwapLimit and
>>>> Metrics::getSwapUsage and move the computation to the implementation of
>>>> Metrics.  Bob may have an opinion.
>> There was no any new input regarding this so I decided to leave it unchanged.
> Sorry, I didn't respond to this.  Since the calculation required for getFreeSwapSpaceSize requires retries
> due to the access of multiple changing values, I think it's best to leave things as they are so the caller of
> these methods understands the limitations of the API.

OK with me.
> Also, the fact that swap size metrics include memory sizes is fully documented in both the cgroup and docker
> online documentation so it's probably best to be consistent.
>
>>>> Also it seems correct for the memory related methods to check if
>>>> (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).
>>>> BTW what does it mean if limit == 0?
>> Per Docker docs the minimum allowed value for  memory limit (--memory option) is 4 megabytes.
>> And if memory limit is unset the return value is -1.  Thus, in my understanding the value 0 is only possible
>> if something went wrong while retrieving this metric.
> That is true but shouldn't you return -1 in that case?
>
> I originally thought it was ok to fall back to the host data for 0 values but I think it's better to return unavailable (-1)
> I think you might want to change all >= 0 to > 0 and return -1 if any of the values are 0.  This would be more consistent.

+1

The javadoc should be changed to say that -1 is returned when the value 
is unavailable, and the CSR should also be updated to reflect this.  I'm 
sure Joe can re-approve the CSR quickly when the fix is reviewed and approved.

> You should only fall back to the original logic (host values) if container values are set to unlimited.
>
+1

Mandy

From serguei.spitsyn at oracle.com  Thu Dec  5 21:24:28 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 5 Dec 2019 13:24:28 -0800
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <66e249d2-ea32-db69-5e1f-1a31fb19ecd5@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
 <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
 <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com>
 <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com>
 <2385071e-6fd6-3da4-9ab3-93e950f69f77@oracle.com>
 <a675ed22-91eb-4571-2578-8c5cd774d74f@oracle.com>
 <b962601a-3bbe-3a9f-c96b-c35edcdb9c32@oracle.com>
 <66e249d2-ea32-db69-5e1f-1a31fb19ecd5@oracle.com>
Message-ID: <5b865d09-2ec4-8378-b438-264279a9a6fd@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20191205/7adc1e97/attachment-0001.html>

From daniel.daugherty at oracle.com  Thu Dec  5 21:35:42 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 5 Dec 2019 16:35:42 -0500
Subject: RFR(XS): JDK-8235433: Problem list JdwpListenTest.java and
 JdwpAttachTest.java on Windows
In-Reply-To: <85623de1-5915-522e-6db0-671b9fce4fba@oracle.com>
References: <85623de1-5915-522e-6db0-671b9fce4fba@oracle.com>
Message-ID: <4875b2c1-7a08-f60d-b645-c9bc766d3a78@oracle.com>

Thumbs up. This is a trivial change.

Dan


On 12/5/19 3:29 PM, Alex Menkov wrote:
> Hi all,
>
> Recently JdwpListenTest.java and JdwpAttachTest.java have started to 
> fail on Windows2016 for unclear (yet) reason:
> https://bugs.openjdk.java.net/browse/JDK-8234935
> Until the issue is resolved need to problem list the tests.
>
> jira: https://bugs.openjdk.java.net/browse/JDK-8235433
>
> the fix:
>
> --- a/test/jdk/ProblemList.txt  Thu Dec 05 16:43:06 2019 +0000
> +++ b/test/jdk/ProblemList.txt  Thu Dec 05 11:59:27 2019 -0800
> @@ -904,6 +904,9 @@
>
>  com/sun/jdi/NashornPopFrameTest.java 8225620 generic-all
>
> +com/sun/jdi/JdwpListenTest.java 8234935 windows-all
> +com/sun/jdi/JdwpAttachTest.java 8234935 windows-all
> +
>
> ############################################################################
>
>
>  # jdk_time
>
> --alex


From coleen.phillimore at oracle.com  Thu Dec  5 21:46:59 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 5 Dec 2019 16:46:59 -0500
Subject: RFR (M) 8212160: JVMTI agent crashes with "assert(_value != 0LL)
 failed: resolving NULL _value"
In-Reply-To: <5b865d09-2ec4-8378-b438-264279a9a6fd@oracle.com>
References: <886380dd-fa13-94e5-ba1d-fc4678a5f90c@oracle.com>
 <008cc38c-a407-3acf-3769-7b8b6f47fcb3@oracle.com>
 <3d6ace12-ef89-64e4-02fc-3fe8806772e1@oracle.com>
 <7d5393cb-3783-9e89-6de8-f3d2f87b017d@oracle.com>
 <9173a527-bbdc-e04d-3981-f1cc0efbdca0@oracle.com>
 <2385071e-6fd6-3da4-9ab3-93e950f69f77@oracle.com>
 <a675ed22-91eb-4571-2578-8c5cd774d74f@oracle.com>
 <b962601a-3bbe-3a9f-c96b-c35edcdb9c32@oracle.com>
 <66e249d2-ea32-db69-5e1f-1a31fb19ecd5@oracle.com>
 <5b865d09-2ec4-8378-b438-264279a9a6fd@oracle.com>
Message-ID: <0d9c92d5-a9af-102b-9f0c-ef26d6210574@oracle.com>

Thanks Serguei!
Coleen

On 12/5/19 4:24 PM, serguei.spitsyn at oracle.com wrote:
> Got it, thanks!
> Serguei
>
>
> On 12/5/19 11:15, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 12/5/19 1:41 PM, serguei.spitsyn at oracle.com wrote:
>>> On 12/5/19 10:36, coleen.phillimore at oracle.com wrote:
>>>>
>>>>
>>>> On 12/5/19 11:00 AM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> Thank you for making this update!
>>>>> It looks good to me.
>>>>>
>>>>> One nit:
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev/test/hotspot/jtreg/serviceability/jvmti/CompiledMethodLoad/libCompiledZombie.cpp.html
>>>>>
>>>>>   46 // Continuously generate CompiledMethodLoad events for all currently compiled methods
>>>>>   47 void JNICALL GenerateEventsThread(jvmtiEnv* jvmti, JNIEnv* jni, void* arg) {
>>>>>   48     jvmti->SetEventNotificationMode(JVMTI_ENABLE, JVMTI_EVENT_COMPILED_METHOD_LOAD, NULL);
>>>>>   49     int count = 0;
>>>>>   50
>>>>>   51     while (true) {
>>>>>   52         events = 0;
>>>>>   53         jvmti->GenerateEvents(JVMTI_EVENT_COMPILED_METHOD_LOAD);
>>>>>   54         if (events != 0 && ++count == 200) {
>>>>>   55             printf("Generated %d events\n", events);
>>>>>   56             count = 0;
>>>>>   57         }
>>>>>   58     }
>>>>>   59 }
>>>>>
>>>>>   The above can be simplified a little bit:
>>>>>           if (events % 200 == 199) {
>>>>>               printf("Generated %d events\n", events);
>>>>>           }
>>>>>
>>>>>   Then this line is not needed either:
>>>>>     49     int count = 0;
>>>>>
>>>>
>>>> I answered this too fast.  There are two conditions under which I
>>>> don't want this to print: when events == 0, and on all but every
>>>> 200th pass that has non-zero events.
>>>>
>>>> I could use if (events != 0 && count++ % 200), but I thought what I 
>>>> had makes more sense and I don't have to worry about when ++ happens.
>>>
>>> Then you could replace it with:
>>> ? if (events % 200 == 0) {
>>
>> But that would still print when events == 0, which I don't want.  If
>> I print them all for the little test case, it's ok, but when I run
>> this with Swingset2, it's too much output.  I only want to see a few
>> lines for this:
>>
>> ----------System.out:(3/113)----------
>> Test passes if it doesn't crash while posting compiled method events.
>> Generated 285 events
>> Generated 1002 events
>> ----------System.err:(1/15)----------
>>
>> The count is the number of times through the GenerateEvents loop,
>> which resets events to zero each time, then prints the number of
>> events for every 200 times through the GenerateEvents loop.  So I
>> need both count and events.
>>
>> Coleen
>>>
>>> But it is up to you. :)
>>>
>>> Thanks,
>>> Serguei
>>>
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 12/5/19 04:08, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>> Thanks Dan.  I moved the field.  For some reason I thought that
>>>>>> class did more/different things than hold per-thread information.
>>>>>>
>>>>>> I've retested this version with tiers 2-6.
>>>>>>
>>>>>> incr webrev at 
>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03.incr/webrev
>>>>>> full webrev at
>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.03/webrev
>>>>>>
>>>>>> Thanks to Serguei for offline discussion.
>>>>>>
>>>>>> Coleen
>>>>>>
>>>>>> On 12/4/19 7:40 PM, Daniel D. Daugherty wrote:
>>>>>>> Generally speaking, JVM/TI related things should be in 
>>>>>>> JvmtiThreadState
>>>>>>> instead of directly in the Thread class. That way the extra 
>>>>>>> space is only
>>>>>>> consumed when JVM/TI is in use and only when a Thread does 
>>>>>>> something that
>>>>>>> requires a JvmtiThreadState to be created.
>>>>>>>
>>>>>>> Please reconsider moving _jvmti_event_queue.
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>> On 12/4/19 6:06 PM, coleen.phillimore at oracle.com wrote:
>>>>>>>>
>>>>>>>> Hi Serguei,
>>>>>>>>
>>>>>>>> On 12/4/19 5:15 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Coleen, (no problem)
>>>>>>>>>
>>>>>>>>> It looks good in general.
>>>>>>>>> Thank you a lot for sorting this out!
>>>>>>>>>
>>>>>>>>> Just a couple of comments.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/runtime/thread.hpp.frames.html
>>>>>>>>> 1993 protected:
>>>>>>>>> 1994 // Jvmti Events that cannot be posted in their current 
>>>>>>>>> context.
>>>>>>>>> 1995 // ServiceThread uses this to collect deferred events 
>>>>>>>>> from NonJava threads
>>>>>>>>> 1996 // that cannot post events.
>>>>>>>>> 1997 JvmtiDeferredEventQueue* _jvmti_event_queue;
>>>>>>>>>
>>>>>>>>> Like David, I also have a concern about the footprint of having
>>>>>>>>> the _jvmti_event_queue field in the Thread class.
>>>>>>>>> I wonder whether it would be better to move this field into the
>>>>>>>>> JvmtiThreadState class.
>>>>>>>>> Please see jvmti_thread_state() and
>>>>>>>>> JvmtiThreadState::state_for(JavaThread *thread).
>>>>>>>>
>>>>>>>> The reason I have it directly in JavaThread is so that the GC
>>>>>>>> oops_do and nmethods_do code can find it easily.  I like your
>>>>>>>> idea of hiding it in jvmti, but it doesn't seem good to have
>>>>>>>> this code know about jvmtiThreadState, which seems to be a
>>>>>>>> queue of Jvmti states.  I also don't want jvmtiThreadState to
>>>>>>>> have to add an oops_do() or nmethods_do().
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html
>>>>>>>>> 973 void JvmtiDeferredEvent::post(JvmtiEnv* env) {
>>>>>>>>> 974 assert(_type == TYPE_COMPILED_METHOD_LOAD, "only user of 
>>>>>>>>> this method");
>>>>>>>>> 975 nmethod* nm = _event_data.compiled_method_load;
>>>>>>>>> 976 JvmtiExport::post_compiled_method_load(env, nm);
>>>>>>>>> 977 }
>>>>>>>>>
>>>>>>>>> The JvmtiDeferredEvent::post name looks too generic, as it
>>>>>>>>> posts compiled method load events only.
>>>>>>>>> Do you expect this function to be extended in the future to
>>>>>>>>> support more event types?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I don't envision an extension for this function but I do for
>>>>>>>> JvmtiDeferredEventQueue::post().  I have a small enhancement
>>>>>>>> that would hand off the entire queue to the ServiceThread and
>>>>>>>> have it call post() to post all the events rather than one at a
>>>>>>>> time.
>>>>>>>>
>>>>>>>> So I'll rename this one post_compiled_method_load_event() and 
>>>>>>>> leave the other post() as is for now.
>>>>>>>>
>>>>>>>> open webrev at 
>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.02.incr/webrev
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 11/26/19 06:22, coleen.phillimore at oracle.com wrote:
>>>>>>>>>> Summary: Add local deferred event list to thread to post 
>>>>>>>>>> events outside CodeCache_lock.
>>>>>>>>>>
>>>>>>>>>> This patch builds on the patch for JDK-8173361. With this
>>>>>>>>>> patch, I made the JvmtiDeferredEventQueue an instance class
>>>>>>>>>> (not AllStatic) and have one per thread.  The CodeBlob event
>>>>>>>>>> code, which used to drop the CodeCache_lock and race with the
>>>>>>>>>> sweeper thread, now adds the events it wants to post to its
>>>>>>>>>> thread-local list and processes the list outside the lock.  The
>>>>>>>>>> list is walked in GC and by the sweeper to keep the nmethods
>>>>>>>>>> from being unloaded and zombied, respectively.
>>>>>>>>>>
>>>>>>>>>> Also, the jmethod_id field in nmethod was only used as a
>>>>>>>>>> boolean, so a jmethod_id is no longer created until it is needed
>>>>>>>>>> for post_compiled_method_unload.
>>>>>>>>>>
>>>>>>>>>> Ran hs tier1-8 on linux-x64-debug and the stress test that 
>>>>>>>>>> crashed in the original bug report.
>>>>>>>>>>
>>>>>>>>>> open webrev at 
>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8212160.01/webrev
>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8212160
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Coleen
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
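
A note on the reporting logic debated in the quoted thread above: the loop keeps two values because they answer different questions. events is what was generated in the current pass, and count is how many non-empty passes have gone by since the last report. The following is a minimal Java model of that logic, not the test's C++ code; the class name ThrottledReporter, the "every" parameter, and the list of captured lines are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Java model of the reporting logic in the native test loop.
class ThrottledReporter {
    private int count = 0;                          // non-empty passes since the last report
    final List<String> lines = new ArrayList<>();   // captured instead of printed, for testing

    // Called once per pass with the number of events seen in that pass.
    void onPass(int events, int every) {
        // Short-circuit: count only advances when the pass produced events.
        if (events != 0 && ++count == every) {
            lines.add("Generated " + events + " events");
            count = 0;
        }
    }

    public static void main(String[] args) {
        ThrottledReporter r = new ThrottledReporter();
        for (int i = 0; i < 400; i++) {
            r.onPass(i % 2 == 0 ? 0 : 7, 100);      // every other pass is empty
        }
        r.lines.forEach(System.out::println);
    }
}
```

A single test such as events % 200 == 0 cannot replace this, because it is also true when events is 0, which is exactly the case the author wanted to keep quiet.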


From david.holmes at oracle.com  Thu Dec  5 22:52:31 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 6 Dec 2019 08:52:31 +1000
Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for
 Record attribute
In-Reply-To: <d4ee7dcd-74a5-200e-9c92-260dea90c8b3@oracle.com>
References: <d4ee7dcd-74a5-200e-9c92-260dea90c8b3@oracle.com>
Message-ID: <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com>

Looks good Harold!

If we get any more of these unmodifiable attributes we may have to look 
at a way to refer to them more abstractly and only define them in one place.

Thanks,
David

On 6/12/2019 12:28 am, Harold Seigel wrote:
> Hi,
> 
> Please review this trivial change to add documentation about the Record 
> attribute to the JDWP, JDI, and Instrumentation specs.
> 
> The changed .html pages (best viewed as 'raw') are included in the 
> webrev but will not be pushed.
> 
> Open Webrev: 
> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html
> 
> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360
> 
> The fix was regression tested by running Mach5 tiers 1 and 2 tests and 
> builds on Linux-x64, Solaris, Windows, and Mac OS X.
> 
> Thanks, Harold
> 

From serguei.spitsyn at oracle.com  Fri Dec  6 00:25:40 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 5 Dec 2019 16:25:40 -0800
Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for
 Record attribute
In-Reply-To: <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com>
References: <d4ee7dcd-74a5-200e-9c92-260dea90c8b3@oracle.com>
 <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com>
Message-ID: <abcd72c9-ac66-7641-f801-c12cb741564a@oracle.com>

Hi David,

Agreed. I was thinking about the same.

Thanks,
Serguei

On 12/5/19 2:52 PM, David Holmes wrote:
> Looks good Harold!
>
> If we get any more of these unmodifiable attributes we may have to 
> look at a way to refer to them more abstractly and only define them in 
> one place.
>
> Thanks,
> David
>
> On 6/12/2019 12:28 am, Harold Seigel wrote:
>> Hi,
>>
>> Please review this trivial change to add documentation about the 
>> Record attribute to the JDWP, JDI, and Instrumentation specs.
>>
>> The changed .html pages (best viewed as 'raw') are included in the 
>> webrev but will not be pushed.
>>
>> Open Webrev: 
>> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360
>>
>> The fix was regression tested by running Mach5 tiers 1 and 2 tests 
>> and builds on Linux-x64, Solaris, Windows, and Mac OS X.
>>
>> Thanks, Harold
>>


From daniil.x.titov at oracle.com  Fri Dec  6 01:03:21 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Thu, 05 Dec 2019 17:03:21 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
Message-ID: <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>

Hi Mandy and Bob,

Thank you for your comments. Please review a new version of the fix [1] that makes
OperatingSystemImpl methods return -1 if one of the metrics has the value 0.

As Mandy recommended, I also updated the Javadoc for OperatingSystemMXBean
to indicate that the methods can return -1 if the information is not available.
There are no changes to the CSR [3] yet; I plan to make them after the fix is
reviewed.

> In http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java.sdiff.html
> Shouldn't you keep the IOException catch clauses in case the file is not found?

There is no need to keep the IOException catch in the 2 places where it used to be (the getLongValueMatchingLine and getLongEntry methods).
As I understand it, the IOException catch was required only because Files.lines() and Files.readAllLines() can throw IOException.
Now these calls are performed inside AccessController.doPrivileged(PrivilegedExceptionAction), which wraps
all checked exceptions in PrivilegedActionException, which we now catch instead of IOException.

Here is a sample of the stack trace:
java.security.PrivilegedActionException: java.io.FileNotFoundException
	at java.base/java.security.AccessController.doPrivileged(AccessController.java:558)
	at java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValueMatchingLine(SubSystem.java:113)
	at java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:390)
	at jdk.management/com.sun.management.internal.OperatingSystemImpl.getTotalMemorySize(OperatingSystemImpl.java:109)
	at CheckOperatingSystemMXBean.main(CheckOperatingSystemMXBean.java:36)
Caused by: java.io.FileNotFoundException
	at java.base/jdk.internal.platform.cgroupv1.SubSystem.lambda$getLongValueMatchingLine$1(SubSystem.java:116)
	at java.base/java.security.AccessController.doPrivileged(AccessController.java:554)


In the getStringValue method the whole code block is now executed inside AccessController.doPrivileged(), so we still need to either catch
IOException inside this code block or convert the block to a PrivilegedExceptionAction and then put the AccessController.doPrivileged
call inside a new try/catch block to catch PrivilegedActionException. The former approach looked preferable.
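
The wrapping behaviour described above can be sketched in isolation as follows. This is only an illustration, not the SubSystem.java code from the webrev: the class name PrivilegedReadSketch, both method names, the path used in main, and the choice of -1 as the failure value are assumptions. AccessController is deprecated for removal in recent JDK releases, hence the @SuppressWarnings annotation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.AccessController;
import java.security.PrivilegedActionException;
import java.security.PrivilegedExceptionAction;
import java.util.List;

// Hypothetical sketch; not the code under review.
class PrivilegedReadSketch {
    // Reads all lines inside doPrivileged. A checked IOException thrown by the
    // action surfaces as the cause of a PrivilegedActionException.
    @SuppressWarnings({"removal", "deprecation"})
    static List<String> readLines(Path file) throws PrivilegedActionException {
        PrivilegedExceptionAction<List<String>> action = () -> Files.readAllLines(file);
        return AccessController.doPrivileged(action);
    }

    // Hypothetical caller that maps any read or parse failure to -1.
    static long firstLongOrMinusOne(Path file) {
        try {
            List<String> lines = readLines(file);
            return lines.isEmpty() ? -1L : Long.parseLong(lines.get(0).trim());
        } catch (PrivilegedActionException e) {
            // e.getCause() is the original IOException, e.g. for a missing file.
            return -1L;
        } catch (NumberFormatException e) {
            return -1L;
        }
    }

    public static void main(String[] args) {
        Path missing = Paths.get("no-such-dir", "memory.limit_in_bytes");
        System.out.println(firstLongOrMinusOne(missing));
        try {
            readLines(missing);
        } catch (PrivilegedActionException e) {
            System.out.println(e.getCause() instanceof IOException);
        }
    }
}
```

Assigning the lambda to a PrivilegedExceptionAction variable first avoids an ambiguous overload between the PrivilegedAction and PrivilegedExceptionAction variants of doPrivileged.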

Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.

[1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.04/ 
[2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575     
[3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 

Thanks,
Daniil

On 12/5/19, 12:59 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:

    
    On 12/5/19 12:50 PM, Bob Vandette wrote:
    >
    >>>> It may be worth considering adding Metrics::getSwapLimit and
    >>>> Metrics::getSwapUsage and moving the computation to the implementation of
    >>>> Metrics.  Bob may have an opinion.
    >> There was no new input regarding this so I decided to leave it unchanged.
    > Sorry, I didn't respond to this.  Since the calculation required for getFreeSwapSpaceSize requires retries
    > due to the access of multiple changing values, I think it's best to leave things as they are so the caller of
    > these methods understands the limitations of the API.
    
    OK with me.
    > Also, the fact that swap size metrics include memory sizes is fully documented in both the cgroup and docker
    > online documentation so it's probably best to be consistent.
    >
    >>>> Also it seems correct for the memory related methods to check if
    >>>> (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).
    >>>> BTW what does it mean if limit == 0?
    >> Per Docker docs the minimum allowed value for  memory limit (--memory option) is 4 megabytes.
    >> And if memory limit is unset the return value is -1.  Thus, in my understanding the value 0 is only possible
    >> if something went wrong while retrieving this metric.
    > That is true but shouldn't you return -1 in that case?
    >
    > I originally thought it was ok to fall back to the host data for 0 values but I think it's better to return unavailable (-1).
    > I think you might want to change all >= 0 to > 0 and return -1 if any of the values are 0.  This would be more consistent.
    
    +1
    
    The javadoc should be changed to say that -1 is returned when the information
    is unavailable, and the CSR should also be updated to reflect this.  I'm sure Joe can
    re-approve the CSR quickly when the fix is reviewed and approved.
    
    > You should only fall back to the original logic (host values) if container values are set to unlimited.
    >
    +1
    
    Mandy
    

From serguei.spitsyn at oracle.com  Fri Dec  6 01:31:42 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 5 Dec 2019 17:31:42 -0800
Subject: RFR: JDK-8215196: [Graal]
 vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with
 "changes for the arguments of the popped frame's method, did not remain
 current argument values"
In-Reply-To: <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com>
References: <dbcd93d3-5735-3cba-b6a5-6e2010e3809c@oracle.com>
 <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com>
Message-ID: <fcf3b6ca-c1d6-02dd-16a1-f270797a3945@oracle.com>

Hi Chris and Alex,

(I've also included Dan, David and Dean to the mailing list)

We have to reach a consensus about this.

We have 3 options:

Option #1:
   The JIT optimization that deletes code which "looks useless"
   has to be disabled if the can_pop_frame capability is enabled.
   Then this problem becomes a JIT compiler bug.

Option #2:
   Consider relaxing the JVMTI PopFrame spec by changing it to something
   like:
   "Note however, that the original argument values are not
    preserved and can be changed by the called method;"
   Then this problem becomes a JVM TI spec bug.

Option #3:
   Consider it okay for the compiler to eliminate useless code,
   so the argument values can be reinitialized by the PopFrame.
   Then this problem becomes just a test bug.


My preference is option #3.
The point is that if the arguments are not really used in
a method then restoring them to any values is a no-op.
It is really a meaningless use case, so why should we care about it?

Thanks,
Serguei


On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote:
> Hi Alex,
>
> The fix itself looks Okay.
> Minor: replace in the comment: "compiler don't drop" => "compiler 
> doesn't drop".
>
> However, we still have to reach a consensus on how we treat this issue 
> (as Chris already commented).
>
> Thanks,
> Serguei
>
>
> On 11/8/19 15:22, Alex Menkov wrote:
>> Hi all,
>>
>> Please review the fix for
>> https://bugs.openjdk.java.net/browse/JDK-8215196
>> webrev:
>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/
>>
>> Currently PopFrame is disabled with JVMCI by [1], so for testing I 
>> reverted [1] changes.
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025
>>
>> --alex
>


From david.holmes at oracle.com  Fri Dec  6 02:45:06 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 6 Dec 2019 12:45:06 +1000
Subject: RFR: JDK-8215196: [Graal]
 vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with
 "changes for the arguments of the popped frame's method, did not remain
 current argument values"
In-Reply-To: <fcf3b6ca-c1d6-02dd-16a1-f270797a3945@oracle.com>
References: <dbcd93d3-5735-3cba-b6a5-6e2010e3809c@oracle.com>
 <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com>
 <fcf3b6ca-c1d6-02dd-16a1-f270797a3945@oracle.com>
Message-ID: <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com>

Hi Serguei,

On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote:
> Hi Chris and Alex,
> 
> (I've also included Dan, David and Dean to the mailing list)
> 
> We have to reach a consensus about this.

This is just part of a much broader issue with JVM TI that I tried to
start a discussion about, based on Richard Reingruber's proposals around
Escape Analysis:

http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html

Unfortunately that discussion did not get much traction.

> We have 3 options:
> 
> Option #1:
>    The JIT optimization that deletes code which "looks useless"
>    has to be disabled if the can_pop_frame capability is enabled.
>    Then this problem becomes a JIT compiler bug.
>
> Option #2:
>    Consider relaxing the JVMTI PopFrame spec by changing it to something
>    like:
>    "Note however, that the original argument values are not
>     preserved and can be changed by the called method;"
>    Then this problem becomes a JVM TI spec bug.
>
> Option #3:
>    Consider it okay for the compiler to eliminate useless code,
>    so the argument values can be reinitialized by the PopFrame.
>    Then this problem becomes just a test bug.
>
>
> My preference is option #3.
> The point is that if the arguments are not really used in
> a method then restoring them to any values is a no-op.
> It is really a meaningless use case, so why should we care about it?

Thanks for setting that out clearly.

I'd like to agree that this particular case is a test bug. If we have a
method:

int incr(int val) {
   val++;
   popFrameHere();
   return val;
}

then the change to the argument is necessary and must be preserved. In 
contrast:

void incr(int val) {
   val++;
   popFrameHere();
}

the change to the argument is meaningless and I would hope any decent 
JIT would simply elide it.

But we must have a consistent approach to such things. What would happen 
if a breakpoint were to be placed on the instruction that uselessly 
modified the argument - would we still see the modification or would it 
be elided?

And how do C1 and C2 avoid this issue? Do they simply not optimise away 
the useless assignment? Or do they actively disable that optimization in 
this context?

We need, IMO, to establish the basic philosophy of how to manage JVM TI 
/ JIT interactions, so we know what things must remain visible and which 
can be optimised away.

That said, changing the test allows us to defer having to reach that 
consensus.

David
-----

> Thanks,
> Serguei
> 
> 
> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote:
>> Hi Alex,
>>
>> The fix itself looks Okay.
>> Minor: replace in the comment: "compiler don't drop" => "compiler 
>> doesn't drop".
>>
>> However, we still have to reach a consensus on how we treat this issue 
>> (as Chris already commented).
>>
>> Thanks,
>> Serguei
>>
>>
>> On 11/8/19 15:22, Alex Menkov wrote:
>>> Hi all,
>>>
>>> Please review the fix for
>>> https://bugs.openjdk.java.net/browse/JDK-8215196
>>> webrev:
>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/
>>>
>>> Currently PopFrame is disabled with JVMCI by [1], so for testing I 
>>> reverted [1] changes.
>>>
>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025
>>>
>>> --alex
>>
> 

From serguei.spitsyn at oracle.com  Fri Dec  6 03:00:32 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 5 Dec 2019 19:00:32 -0800
Subject: RFR: JDK-8215196: [Graal]
 vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with
 "changes for the arguments of the popped frame's method, did not remain
 current argument values"
In-Reply-To: <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com>
References: <dbcd93d3-5735-3cba-b6a5-6e2010e3809c@oracle.com>
 <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com>
 <fcf3b6ca-c1d6-02dd-16a1-f270797a3945@oracle.com>
 <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com>
Message-ID: <9efb702f-a46c-5c26-475d-34e7165fc32a@oracle.com>

Hi David,

Thank you for writing this down.
Totally agree with you here.


On 12/5/19 6:45 PM, David Holmes wrote:
> Hi Serguei,
>
> On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote:
>> Hi Chris and Alex,
>>
>> (I've also included Dan, David and Dean to the mailing list)
>>
>> We have to reach a consensus about this.
>
> This is just part of a much broader issue with JVM TI that I tried to
> start a discussion about, based on Richard Reingruber's proposals
> around Escape Analysis:
>
> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html 
>
>
> Unfortunately that discussion did not get much traction.

I mentioned the general discussion you started about JIT compiler
optimizations in one of my previous replies to this review thread.
Sorry, I was busy with other things and was not able to participate in
it properly.
But I'm looking forward to continuing this when there is a chance.

>> We have 3 options:
>>
>> Option #1:
>>    The JIT optimization that deletes code which "looks useless"
>>    has to be disabled if the can_pop_frame capability is enabled.
>>    Then this problem becomes a JIT compiler bug.
>>
>> Option #2:
>>    Consider relaxing the JVMTI PopFrame spec by changing it to
>>    something like:
>>    "Note however, that the original argument values are not
>>     preserved and can be changed by the called method;"
>>    Then this problem becomes a JVM TI spec bug.
>>
>> Option #3:
>>    Consider it okay for the compiler to eliminate useless code,
>>    so the argument values can be reinitialized by the PopFrame.
>>    Then this problem becomes just a test bug.
>>
>>
>> My preference is option #3.
>> The point is that if the arguments are not really used in
>> a method then restoring them to any values is a no-op.
>> It is really a meaningless use case, so why should we care about it?
>
> Thanks for setting that out clearly.
>
> I'd like to agree that this particular case is a test bug. If we have a
> method:
>
> int incr(int val) {
>    val++;
>    popFrameHere();
>    return val;
> }
>
> then the change to the argument is necessary and must be preserved. In
> contrast:
>
> void incr(int val) {
>    val++;
>    popFrameHere();
> }
>
> the change to the argument is meaningless and I would hope any decent 
> JIT would simply elide it.
>
> But we must have a consistent approach to such things. What would 
> happen if a breakpoint were to be placed on the instruction that 
> uselessly modified the argument - would we still see the modification 
> or would it be elided?
>
> And how do C1 and C2 avoid this issue? Do they simply not optimise 
> away the useless assignment? Or do they actively disable that 
> optimization in this context?
>
> We need, IMO, to establish the basic philosophy of how to manage JVM 
> TI / JIT interactions, so we know what things must remain visible and 
> which can be optimised away.

It is painful that we have not established it yet.

>
> That said, changing the test allows us to defer having to reach that 
> consensus.

Right.

Thanks,
Serguei

> David
> -----
>
>> Thanks,
>> Serguei
>>
>>
>> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote:
>>> Hi Alex,
>>>
>>> The fix itself looks Okay.
>>> Minor: replace in the comment: "compiler don't drop" => "compiler 
>>> doesn't drop".
>>>
>>> However, we still have to reach a consensus on how we treat this 
>>> issue (as Chris already commented).
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 11/8/19 15:22, Alex Menkov wrote:
>>>> Hi all,
>>>>
>>>> Please review the fix for
>>>> https://bugs.openjdk.java.net/browse/JDK-8215196
>>>> webrev:
>>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/
>>>>
>>>> Currently PopFrame is disabled with JVMCI by [1], so for testing I 
>>>> reverted [1] changes.
>>>>
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025
>>>>
>>>> --alex
>>>
>>


From david.holmes at oracle.com  Fri Dec  6 07:49:34 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 6 Dec 2019 17:49:34 +1000
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
Message-ID: <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>

Hi Daniil,

I'm not familiar with all the details of the various APIs involved here
so just a few general comments in places. I do have one major issue
flagged below.

---

src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c

!     static int initialized=1;

Am I reading this right that the code currently fails to actually do the 
initialization because of this ???

Style nit:      if(perfInit()

space after "if"

---

src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java

The changes to allow for a return of -1 are somewhat more extensive than
we have previously discussed. These methods previously were (per the
spec) guaranteed to return some (presumably) meaningful value but now
they are effectively allowed to fail by returning -1. No existing code
is expecting to have to handle a return of -1 so I see this as a
significant compatibility issue. Surely there must always be some
information available from the operating environment? I see from the
impl file:

     // the host data, value 0 indicates that something went wrong while the metric was read and
     // in this case we return "information unavailable" code -1.

I don't agree with this. If the container metrics are messed up somehow
we should either fall back to the host value or else abort with some kind
of exception. Returning -1 is not an option here IMO.
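
To make the two positions concrete, here is a small sketch of both policies for a single metric. It is not the OperatingSystemImpl code from the webrev; the class name ContainerMemoryPolicy, both method names, and the treatment of other negative values are assumptions. The only facts taken from the thread are that an unset container limit is reported as -1, that 0 indicates a failed read, and that host values are the fallback for an unlimited container.

```java
// Hypothetical sketch of the two policies discussed in this review.
class ContainerMemoryPolicy {
    static final long UNLIMITED = -1L;   // container limit not set, per the thread

    // Policy in webrev.04 as described: use the container limit when it is positive,
    // fall back to the host value only when the limit is unlimited, and report
    // "unavailable" (-1) when the metric reads as 0.
    static long totalMemory(long containerLimit, long hostTotal) {
        if (containerLimit > 0) {
            return containerLimit;
        }
        if (containerLimit == UNLIMITED) {
            return hostTotal;
        }
        return -1L;   // 0 (or any other unexpected value): something went wrong
    }

    // Alternative raised above: never return -1, always fall back to the host value.
    static long totalMemoryWithHostFallback(long containerLimit, long hostTotal) {
        return containerLimit > 0 ? containerLimit : hostTotal;
    }

    public static void main(String[] args) {
        long host = 16L * 1024 * 1024 * 1024;
        System.out.println(totalMemory(0L, host));
        System.out.println(totalMemoryWithHostFallback(0L, host));
    }
}
```

The two methods differ only for a limit of 0 (or another unexpected value), which is the case existing callers would have to start handling under the first policy.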

---

src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java
src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java

Can you please rename the legacy methods so that, for example, 
getTotalMemorySize() calls getTotalMemorySize0() rather than 
getTotalPhysicalMemorySize0(). That way we relegate the legacy names to 
the interface only.

---

test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java

System.out.println(String.format(...))

Why not simply

System.out.printf(...)

?

---

Thanks,
David
-----

On 6/12/2019 11:03 am, Daniil Titov wrote:
> Hi Mandy and Bob,
> 
> Thank you for your comments. Please review a new version of the fix [1] that makes
> OperatingSystemImpl methods return -1 if one of the metrics has the value 0.
>
> As Mandy recommended, I also updated the Javadoc for OperatingSystemMXBean
> to indicate that the methods can return -1 if the information is not available.
> There are no changes to the CSR [3] yet; I plan to make them after the fix is
> reviewed.
> 
>> In http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java.sdiff.html
>> Shouldn't you keep the IOException catch clauses in case the file is not found?
> 
> There is no need to keep the IOException catch in the 2 places where it used to be (the getLongValueMatchingLine and getLongEntry methods).
> As I understand it, the IOException catch was required only because Files.lines() and Files.readAllLines() can throw IOException.
> Now these calls are performed inside AccessController.doPrivileged(PrivilegedExceptionAction), which wraps
> all checked exceptions in a PrivilegedActionException, which we now catch instead of IOException.
> 
> Here is a sample stack trace:
> java.security.PrivilegedActionException: java.io.FileNotFoundException
> 	at java.base/java.security.AccessController.doPrivileged(AccessController.java:558)
> 	at java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValueMatchingLine(SubSystem.java:113)
> 	at java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:390)
> 	at jdk.management/com.sun.management.internal.OperatingSystemImpl.getTotalMemorySize(OperatingSystemImpl.java:109)
> 	at CheckOperatingSystemMXBean.main(CheckOperatingSystemMXBean.java:36)
> Caused by: java.io.FileNotFoundException
> 	at java.base/jdk.internal.platform.cgroupv1.SubSystem.lambda$getLongValueMatchingLine$1(SubSystem.java:116)
> 	at java.base/java.security.AccessController.doPrivileged(AccessController.java:554)
> 
> 
> In the getStringValue method the whole code block is now executed inside AccessController.doPrivileged(), so we still need to either catch
> IOException inside this code block or convert this block to a PrivilegedExceptionAction and then put the AccessController.doPrivileged
> call inside a new try/catch block to catch PrivilegedActionException. The former approach looked preferable.
> 
> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
> 
> [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.04/
> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
> 
> Thanks,
> Daniil
> 
> On 12/5/19, 12:59 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
> 
>      
>      
>      On 12/5/19 12:50 PM, Bob Vandette wrote:
>      >
>      >>>> It may be worth considering adding Metrics::getSwapLimit and
>      >>>> Metrics::getSwapUsage and moving the computation to the implementation of
>      >>>> Metrics.  Bob may have an opinion.
>      >> There was no new input regarding this so I decided to leave it unchanged.
>      > Sorry, I didn't respond to this.  Since the calculation required for getFreeSwapSpaceSize requires retries
>      > due to the access of multiple changing values, I think it's best to leave things as they are so the caller of
>      > these methods understands the limitations of the API.
>      
>      OK with me.
>      > Also, the fact that swap size metrics include memory sizes is fully documented in both the cgroup and docker
>      > online documentation so it's probably best to be consistent.
>      >
>      >>>> Also it seems correct for the memory related methods to check if
>      >>>> (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).
>      >>>> BTW what does it mean if limit == 0?
>      >> Per Docker docs the minimum allowed value for  memory limit (--memory option) is 4 megabytes.
>      >> And if memory limit is unset the return value is -1.  Thus, in my understanding the value 0 is only possible
>      >> if something went wrong while retrieving this metric.
>      > That is true but shouldn't you return -1 in that case?
>      >
>      > I originally thought it was ok to fall back to the host data for 0 values but I think it's better to return unavailable (-1).
>      > I think you might want to change all >= 0 to > 0 and return -1 if any of the values are 0.  This would be more consistent.
>      
>      +1
>      
>      The javadoc should be changed to say that it returns -1 when it's unavailable, and
>      the CSR should also be updated to reflect this.    I'm sure Joe can
>      re-approve the CSR quickly when the fix is reviewed and approved.
>      
>      > You should only fall back to the original logic (host values) if container values are set to unlimited.
>      >
>      +1
>      
>      Mandy
>      
> 
> 

From martin.doerr at sap.com  Fri Dec  6 10:51:11 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Fri, 6 Dec 2019 10:51:11 +0000
Subject: RFR(T) : 8235353 : clean up hotspot problem lists
In-Reply-To: <DB8PR02MB55472BC0ECAA8DA1FEE18A0A8A5C0@DB8PR02MB5547.eurprd02.prod.outlook.com>
References: <F2800CA7-5FC0-4B32-A886-6F661694EA3F@oracle.com>
 <a70b3cea-f759-1c5d-e73d-464a7804cffc@oracle.com>
 <9405A3B3-8045-4863-9BD4-FD293E2C62F9@oracle.com>
 <DB8PR02MB55472BC0ECAA8DA1FEE18A0A8A5C0@DB8PR02MB5547.eurprd02.prod.outlook.com>
Message-ID: <AM0PR0202MB32978B2022050232C96300399A5F0@AM0PR0202MB3297.eurprd02.prod.outlook.com>

Hi Igor and Vladimir,

the tests have passed on PPC64. Change is good. Thanks for checking with us.

Best regards,
Martin


> -----Original Message-----
> From: Langer, Christoph <christoph.langer at sap.com>
> Sent: Donnerstag, 5. Dezember 2019 12:24
> To: Igor Ignatyev <igor.ignatyev at oracle.com>; Doerr, Martin
> <martin.doerr at sap.com>; Lindenmaier, Goetz
> <goetz.lindenmaier at sap.com>
> Cc: serviceability-dev <serviceability-dev at openjdk.java.net>; Vladimir
> Kozlov <vladimir.kozlov at oracle.com>; hotspot-dev Source Developers
> <hotspot-dev at openjdk.java.net>
> Subject: RE: RFR(T) : 8235353 : clean up hotspot problem lists
> 
> Hi Igor,
> 
> I have added your update to our test system. I'll let you know the results by
> tomorrow.
> 
> Best regards
> Christoph
> 
> > -----Original Message-----
> > From: serviceability-dev <serviceability-dev-bounces at openjdk.java.net>
> On
> > Behalf Of Igor Ignatyev
> > Sent: Donnerstag, 5. Dezember 2019 03:08
> > To: Doerr, Martin <martin.doerr at sap.com>; Lindenmaier, Goetz
> > <goetz.lindenmaier at sap.com>
> > Cc: serviceability-dev <serviceability-dev at openjdk.java.net>; Vladimir
> > Kozlov <vladimir.kozlov at oracle.com>; hotspot-dev Source Developers
> > <hotspot-dev at openjdk.java.net>
> > Subject: Re: RFR(T) : 8235353 : clean up hotspot problem lists
> >
> > Martin, Goetz.
> >
> > could you please check that these 9 tests still pass on PPC?
> >
> > -- Igor
> >
> > > On Dec 4, 2019, at 12:01 PM, Vladimir Kozlov
> <vladimir.kozlov at oracle.com>
> > wrote:
> > >
> > > I am fine with changes but we need to ask PPC64 supporter to verify that
> > tests passed now.
> > >
> > > Thanks,
> > > Vladimir K
> > >
> > > On 12/4/19 11:52 AM, Igor Ignatyev wrote:
> > >> http://cr.openjdk.java.net/~iignatyev//8235353/webrev.00
> > >>> 9 lines changed: 0 ins; 0 del; 9 mod;
> > >> Hi all,
> > >> could you please review this small and trivial cleanup which returns
> > serviceability/sa tests back to execution on linux-ppc64. the tests were
> > problem listed due to 8211767[1], which is closed as a dup of resolved
> > 8228649[2].
> > >> [1] https://bugs.openjdk.java.net/browse/JDK-8211767
> > >> [2] https://bugs.openjdk.java.net/browse/JDK-8228649
> > >> Thanks,
> > >> -- Igor


From harold.seigel at oracle.com  Fri Dec  6 13:14:47 2019
From: harold.seigel at oracle.com (Harold Seigel)
Date: Fri, 6 Dec 2019 08:14:47 -0500
Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for
 Record attribute
In-Reply-To: <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com>
References: <d4ee7dcd-74a5-200e-9c92-260dea90c8b3@oracle.com>
 <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com>
Message-ID: <3f304c0e-00fd-988d-521a-1b17104bb6f4@oracle.com>

Thanks David!

Harold

On 12/5/2019 5:52 PM, David Holmes wrote:
> Looks good Harold!
>
> If we get any more of these unmodifiable attributes we may have to 
> look at a way to refer to them more abstractly and only define them in 
> one place.
>
> Thanks,
> David
>
> On 6/12/2019 12:28 am, Harold Seigel wrote:
>> Hi,
>>
>> Please review this trivial change to add documentation about the 
>> Record attribute to the JDWP, JDI, and Instrumentation specs.
>>
>> The changed .html pages (best viewed as 'raw') are included in the 
>> webrev but will not be pushed.
>>
>> Open Webrev: 
>> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360
>>
>> The fix was regression tested by running Mach5 tiers 1 and 2 tests 
>> and builds on Linux-x64, Solaris, Windows, and Mac OS X.
>>
>> Thanks, Harold
>>

From harold.seigel at oracle.com  Fri Dec  6 13:16:18 2019
From: harold.seigel at oracle.com (Harold Seigel)
Date: Fri, 6 Dec 2019 08:16:18 -0500
Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for
 Record attribute
In-Reply-To: <abcd72c9-ac66-7641-f801-c12cb741564a@oracle.com>
References: <d4ee7dcd-74a5-200e-9c92-260dea90c8b3@oracle.com>
 <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com>
 <abcd72c9-ac66-7641-f801-c12cb741564a@oracle.com>
Message-ID: <30ebf715-d189-3ff6-a024-3529b2caf626@oracle.com>

There will be another unmodifiable attribute with sealed types called 
PermittedSubtypes.

Harold

On 12/5/2019 7:25 PM, serguei.spitsyn at oracle.com wrote:
> Hi David,
>
> Agreed. I was thinking about the same.
>
> Thanks,
> Serguei
>
> On 12/5/19 2:52 PM, David Holmes wrote:
>> Looks good Harold!
>>
>> If we get any more of these unmodifiable attributes we may have to 
>> look at a way to refer to them more abstractly and only define them 
>> in one place.
>>
>> Thanks,
>> David
>>
>> On 6/12/2019 12:28 am, Harold Seigel wrote:
>>> Hi,
>>>
>>> Please review this trivial change to add documentation about the 
>>> Record attribute to the JDWP, JDI, and Instrumentation specs.
>>>
>>> The changed .html pages (best viewed as 'raw') are included in the 
>>> webrev but will not be pushed.
>>>
>>> Open Webrev: 
>>> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html
>>>
>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360
>>>
>>> The fix was regression tested by running Mach5 tiers 1 and 2 tests 
>>> and builds on Linux-x64, Solaris, Windows, and Mac OS X.
>>>
>>> Thanks, Harold
>>>
>

From bob.vandette at oracle.com  Fri Dec  6 13:59:10 2019
From: bob.vandette at oracle.com (Bob Vandette)
Date: Fri, 6 Dec 2019 08:59:10 -0500
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
Message-ID: <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>


> On Dec 6, 2019, at 2:49 AM, David Holmes <David.Holmes at oracle.com> wrote:
> 
> Hi Daniil,
> 
> I'm not familiar with all the details of the various API's involved here so just a few general comments in places. I do have one major issue flagged below.
> 
> ---
> 
> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
> 
> !     static int initialized=1;
> 
> Am I reading this right that the code currently fails to actually do the initialization because of this ???
> 
> Style nit:      if(perfInit()
> 
> space after "if"
> 
> ---
> 
> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
> 
> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods were previously guaranteed (per the spec) to return some (presumably) meaningful value, but now they are effectively allowed to fail by returning -1. No existing code expects to have to handle a return of -1, so I see this as a significant compatibility issue. Surely there must always be some information available from the operating environment? I see from the impl file:
> 
>    // the host data, value 0 indicates that something went wrong while the metric was read and
>   // in this case we return "information unavailable" code -1.
> 
> I don't agree with this. If the container metrics are messed up somehow we should either fall back to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.

I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
limits.
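
The fallback rule described above could be sketched roughly as follows. This is only an illustration under assumptions, not the code in the webrev: MemoryLimitFallback and totalMemorySize are hypothetical names, and the two parameters stand for the container memory limit reported by the Metrics API and the value read from the host.

```java
class MemoryLimitFallback {
    // containerLimit > 0 : a usable container memory limit, report it.
    // containerLimit == 0: the metric could not be read (for example the
    //                      cgroup subsystem is disabled); treated here as
    //                      "no limit", so the host value is reported.
    // containerLimit < 0 : the limit is unset/unlimited, report the host value.
    static long totalMemorySize(long containerLimit, long hostTotal) {
        return containerLimit > 0 ? containerLimit : hostTotal;
    }

    public static void main(String[] args) {
        long host = 8L << 30; // placeholder host value: 8 GiB
        System.out.println(totalMemorySize(512L << 20, host)); // container limit wins
        System.out.println(totalMemorySize(0, host));          // falls back to host
        System.out.println(totalMemorySize(-1, host));         // falls back to host
    }
}
```

With this shape the method never returns -1, so the existing specification of the getters would not need to change.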

Bob.

> 
> ---
> 
> src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java
> src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java
> 
> Can you please rename the legacy methods so that, for example, getTotalMemorySize() calls getTotalMemorySize0() rather than getTotalPhysicalMemorySize0(). That way we relegate the legacy names to the interface only.
> 
> ---
> 
> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
> 
> System.out.println(String.format(...)
> 
> Why not simply
> 
> System.out.printf(..)
> 
> ?
> 
> ---
> 
> Thanks,
> David
> -----
> 
> On 6/12/2019 11:03 am, Daniil Titov wrote:
>> Hi Mandy and Bob,
>> Thank you for your comments. Please review a new version of the fix [1] that makes
>> OperatingSystemImpl methods return -1 if one of the metrics has the value 0.
>> As Mandy recommended, I also updated the Javadoc for OperatingSystemMXBean
>> to indicate that the methods could return -1 if the information is not available.
>> There are no changes to the CSR [3] yet; I plan to proceed with them after the fix is
>> reviewed.
>>> In http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java.sdiff.html
>>> Shouldn't you keep the IOException catch clauses in case the file is not found?
>> There is no need to keep the IOException catch in the 2 places where it used to be (the getLongValueMatchingLine and getLongEntry methods).
>> As I understand it, the IOException catch was required only because Files.lines() and Files.readAllLines() can throw IOException.
>> Now these calls are performed inside AccessController.doPrivileged(PrivilegedExceptionAction), which wraps
>> all checked exceptions in a PrivilegedActionException, which we now catch instead of IOException.
>> Here is a sample stack trace:
>> java.security.PrivilegedActionException: java.io.FileNotFoundException
>> 	at java.base/java.security.AccessController.doPrivileged(AccessController.java:558)
>> 	at java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValueMatchingLine(SubSystem.java:113)
>> 	at java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:390)
>> 	at jdk.management/com.sun.management.internal.OperatingSystemImpl.getTotalMemorySize(OperatingSystemImpl.java:109)
>> 	at CheckOperatingSystemMXBean.main(CheckOperatingSystemMXBean.java:36)
>> Caused by: java.io.FileNotFoundException
>> 	at java.base/jdk.internal.platform.cgroupv1.SubSystem.lambda$getLongValueMatchingLine$1(SubSystem.java:116)
>> 	at java.base/java.security.AccessController.doPrivileged(AccessController.java:554)
>> In the getStringValue method the whole code block is now executed inside AccessController.doPrivileged(), so we still need to either catch
>> IOException inside this code block or convert this block to a PrivilegedExceptionAction and then put the AccessController.doPrivileged
>> call inside a new try/catch block to catch PrivilegedActionException. The former approach looked preferable.
>> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
>> [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.04/
>> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
>> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
>> Thanks,
>> Daniil
>> On 12/5/19, 12:59 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
>>               On 12/5/19 12:50 PM, Bob Vandette wrote:
>>     >
>>     >>>> It may be worth considering adding Metrics::getSwapLimit and
>>     >>>> Metrics::getSwapUsage and moving the computation to the implementation of
>>     >>>> Metrics.  Bob may have an opinion.
>>     >> There was no new input regarding this so I decided to leave it unchanged.
>>     > Sorry, I didn't respond to this.  Since the calculation required for getFreeSwapSpaceSize requires retries
>>     > due to the access of multiple changing values, I think it's best to leave things as they are so the caller of
>>     > these methods understands the limitations of the API.
>>          OK with me.
>>     > Also, the fact that swap size metrics include memory sizes is fully documented in both the cgroup and docker
>>     > online documentation so it's probably best to be consistent.
>>     >
>>     >>>> Also it seems correct for the memory related methods to check if
>>     >>>> (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).
>>     >>>> BTW what does it mean if limit == 0?
>>     >> Per Docker docs the minimum allowed value for  memory limit (--memory option) is 4 megabytes.
>>     >> And if memory limit is unset the return value is -1.  Thus, in my understanding the value 0 is only possible
>>     >> if something went wrong while retrieving this metric.
>>     > That is true but shouldn't you return -1 in that case?
>>     >
>>     > I originally thought it was ok to fall back to the host data for 0 values but I think it's better to return unavailable (-1).
>>     > I think you might want to change all >= 0 to > 0 and return -1 if any of the values are 0.  This would be more consistent.
>>          +1
>>          The javadoc should be changed to say that it returns -1 when it's unavailable, and
>>     the CSR should also be updated to reflect this.    I'm sure Joe can
>>     re-approve the CSR quickly when the fix is reviewed and approved.
>>          > You should only fall back to the original logic (host values) if container values are set to unlimited.
>>     >
>>     +1
>>          Mandy
>>     


From larry.cable at oracle.com  Fri Dec  6 17:09:42 2019
From: larry.cable at oracle.com (Laurence Cable)
Date: Fri, 6 Dec 2019 09:09:42 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
Message-ID: <6fbd0b6f-90bf-457c-88d8-754ce9ef45ec@oracle.com>

+1

On 12/6/19 5:59 AM, Bob Vandette wrote:
>> On Dec 6, 2019, at 2:49 AM, David Holmes <David.Holmes at oracle.com> wrote:
>>
>> Hi Daniil,
>>
>> I'm not familiar with all the details of the various API's involved here so just a few general comments in places. I do have one major issue flagged below.
>>
>> ---
>>
>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
>>
>> !     static int initialized=1;
>>
>> Am I reading this right that the code currently fails to actually do the initialization because of this ???
>>
>> Style nit:      if(perfInit()
>>
>> space after "if"
>>
>> ---
>>
>> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
>>
>> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods were previously guaranteed (per the spec) to return some (presumably) meaningful value, but now they are effectively allowed to fail by returning -1. No existing code expects to have to handle a return of -1, so I see this as a significant compatibility issue. Surely there must always be some information available from the operating environment? I see from the impl file:
>>
>>     // the host data, value 0 indicates that something went wrong while the metric was read and
>>    // in this case we return "information unavailable" code -1.
>>
>> I don't agree with this. If the container metrics are messed up somehow we should either fall back to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
> I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
> Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
> are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
> limits.
>
> Bob.
>
>> ---
>>
>> src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java
>> src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java
>>
>> Can you please rename the legacy methods so that, for example, getTotalMemorySize() calls getTotalMemorySize0() rather than getTotalPhysicalMemorySize0(). That way we relegate the legacy names to the interface only.
>>
>> ---
>>
>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>>
>> System.out.println(String.format(...)
>>
>> Why not simply
>>
>> System.out.printf(..)
>>
>> ?
>>
>> ---
>>
>> Thanks,
>> David
>> -----
>>
>> On 6/12/2019 11:03 am, Daniil Titov wrote:
>>> Hi Mandy and Bob,
>>> Thank you for your comments. Please review a new version of the fix [1] that makes
>>> OperatingSystemImpl methods return -1 if one of the metrics has the value 0.
>>> As Mandy recommended, I also updated the Javadoc for OperatingSystemMXBean
>>> to indicate that the methods could return -1 if the information is not available.
>>> There are no changes to the CSR [3] yet; I plan to proceed with them after the fix is
>>> reviewed.
>>>> In http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java.sdiff.html
>>>> Shouldn't you keep the IOException catch clauses in case the file is not found?
>>> There is no need to keep the IOException catch in the 2 places where it used to be (the getLongValueMatchingLine and getLongEntry methods).
>>> As I understand it, the IOException catch was required only because Files.lines() and Files.readAllLines() can throw IOException.
>>> Now these calls are performed inside AccessController.doPrivileged(PrivilegedExceptionAction), which wraps
>>> all checked exceptions in a PrivilegedActionException, which we now catch instead of IOException.
>>> Here is a sample stack trace:
>>> java.security.PrivilegedActionException: java.io.FileNotFoundException
>>> 	at java.base/java.security.AccessController.doPrivileged(AccessController.java:558)
>>> 	at java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValueMatchingLine(SubSystem.java:113)
>>> 	at java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:390)
>>> 	at jdk.management/com.sun.management.internal.OperatingSystemImpl.getTotalMemorySize(OperatingSystemImpl.java:109)
>>> 	at CheckOperatingSystemMXBean.main(CheckOperatingSystemMXBean.java:36)
>>> Caused by: java.io.FileNotFoundException
>>> 	at java.base/jdk.internal.platform.cgroupv1.SubSystem.lambda$getLongValueMatchingLine$1(SubSystem.java:116)
>>> 	at java.base/java.security.AccessController.doPrivileged(AccessController.java:554)
>>> In the getStringValue method the whole code block is now executed inside AccessController.doPrivileged(), so we still need to either catch
>>> IOException inside this code block or convert this block to a PrivilegedExceptionAction and then put the AccessController.doPrivileged
>>> call inside a new try/catch block to catch PrivilegedActionException. The former approach looked preferable.
>>> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
>>> [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.04/
>>> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
>>> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
>>> Thanks,
>>> Daniil
>>> On 12/5/19, 12:59 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
>>>                On 12/5/19 12:50 PM, Bob Vandette wrote:
>>>      >
>>>      >>>> It may be worth considering adding Metrics::getSwapLimit and
>>>      >>>> Metrics::getSwapUsage and moving the computation to the implementation of
>>>      >>>> Metrics.  Bob may have an opinion.
>>>      >> There was no new input regarding this so I decided to leave it unchanged.
>>>      > Sorry, I didn't respond to this.  Since the calculation required for getFreeSwapSpaceSize requires retries
>>>      > due to the access of multiple changing values, I think it's best to leave things as they are so the caller of
>>>      > these methods understands the limitations of the API.
>>>           OK with me.
>>>      > Also, the fact that swap size metrics include memory sizes is fully documented in both the cgroup and docker
>>>      > online documentation so it's probably best to be consistent.
>>>      >
>>>      >>>> Also it seems correct for the memory related methods to check if
>>>      >>>> (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).
>>>      >>>> BTW what does it mean if limit == 0?
>>>      >> Per Docker docs the minimum allowed value for  memory limit (--memory option) is 4 megabytes.
>>>      >> And if memory limit is unset the return value is -1.  Thus, in my understanding the value 0 is only possible
>>>      >> if something went wrong while retrieving this metric.
>>>      > That is true but shouldn't you return -1 in that case?
>>>      >
>>>      > I originally thought it was ok to fall back to the host data for 0 values but I think it's better to return unavailable (-1).
>>>      > I think you might want to change all >= 0 to > 0 and return -1 if any of the values are 0.  This would be more consistent.
>>>           +1
>>>           The javadoc should be changed to say that it returns -1 when it's unavailable, and
>>>      the CSR should also be updated to reflect this.    I'm sure Joe can
>>>      re-approve the CSR quickly when the fix is reviewed and approved.
>>>           > You should only fall back to the original logic (host values) if container values are set to unlimited.
>>>      >
>>>      +1
>>>           Mandy
>>>      


From igor.ignatyev at oracle.com  Fri Dec  6 17:18:16 2019
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 6 Dec 2019 09:18:16 -0800
Subject: RFR(T) : 8235353 : clean up hotspot problem lists
In-Reply-To: <AM0PR0202MB32978B2022050232C96300399A5F0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
References: <F2800CA7-5FC0-4B32-A886-6F661694EA3F@oracle.com>
 <a70b3cea-f759-1c5d-e73d-464a7804cffc@oracle.com>
 <9405A3B3-8045-4863-9BD4-FD293E2C62F9@oracle.com>
 <DB8PR02MB55472BC0ECAA8DA1FEE18A0A8A5C0@DB8PR02MB5547.eurprd02.prod.outlook.com>
 <AM0PR0202MB32978B2022050232C96300399A5F0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
Message-ID: <FAABB115-BE46-420B-A06B-0A8111CA19DC@oracle.com>

Martin, Christoph,

thanks for verifying this.

pushed.

-- Igor

> On Dec 6, 2019, at 2:51 AM, Doerr, Martin <martin.doerr at sap.com> wrote:
> 
> Hi Igor and Vladimir,
> 
> the tests have passed on PPC64. Change is good. Thanks for checking with us.
> 
> Best regards,
> Martin
> 
> 
>> -----Original Message-----
>> From: Langer, Christoph <christoph.langer at sap.com>
>> Sent: Donnerstag, 5. Dezember 2019 12:24
>> To: Igor Ignatyev <igor.ignatyev at oracle.com>; Doerr, Martin
>> <martin.doerr at sap.com>; Lindenmaier, Goetz
>> <goetz.lindenmaier at sap.com>
>> Cc: serviceability-dev <serviceability-dev at openjdk.java.net>; Vladimir
>> Kozlov <vladimir.kozlov at oracle.com>; hotspot-dev Source Developers
>> <hotspot-dev at openjdk.java.net>
>> Subject: RE: RFR(T) : 8235353 : clean up hotspot problem lists
>> 
>> Hi Igor,
>> 
>> I have added your update to our test system. I'll let you know the results by
>> tomorrow.
>> 
>> Best regards
>> Christoph
>> 
>>> -----Original Message-----
>>> From: serviceability-dev <serviceability-dev-bounces at openjdk.java.net>
>> On
>>> Behalf Of Igor Ignatyev
>>> Sent: Donnerstag, 5. Dezember 2019 03:08
>>> To: Doerr, Martin <martin.doerr at sap.com>; Lindenmaier, Goetz
>>> <goetz.lindenmaier at sap.com>
>>> Cc: serviceability-dev <serviceability-dev at openjdk.java.net>; Vladimir
>>> Kozlov <vladimir.kozlov at oracle.com>; hotspot-dev Source Developers
>>> <hotspot-dev at openjdk.java.net>
>>> Subject: Re: RFR(T) : 8235353 : clean up hotspot problem lists
>>> 
>>> Martin, Goetz.
>>> 
>>> could you please check that these 9 tests still pass on PPC?
>>> 
>>> -- Igor
>>> 
>>>> On Dec 4, 2019, at 12:01 PM, Vladimir Kozlov
>> <vladimir.kozlov at oracle.com>
>>> wrote:
>>>> 
>>>> I am fine with changes but we need to ask PPC64 supporter to verify that
>>> tests passed now.
>>>> 
>>>> Thanks,
>>>> Vladimir K
>>>> 
>>>> On 12/4/19 11:52 AM, Igor Ignatyev wrote:
>>>>> http://cr.openjdk.java.net/~iignatyev//8235353/webrev.00
>>>>>> 9 lines changed: 0 ins; 0 del; 9 mod;
>>>>> Hi all,
>>>>> could you please review this small and trivial cleanup which returns
>>> serviceability/sa tests back to execution on linux-ppc64. the tests were
>>> problem listed due to 8211767[1], which is closed as a dup of resolved
>>> 8228649[2].
>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8211767
>>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8228649
>>>>> Thanks,
>>>>> -- Igor
> 


From serguei.spitsyn at oracle.com  Fri Dec  6 18:21:54 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 6 Dec 2019 10:21:54 -0800
Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for
 Record attribute
In-Reply-To: <30ebf715-d189-3ff6-a024-3529b2caf626@oracle.com>
References: <d4ee7dcd-74a5-200e-9c92-260dea90c8b3@oracle.com>
 <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com>
 <abcd72c9-ac66-7641-f801-c12cb741564a@oracle.com>
 <30ebf715-d189-3ff6-a024-3529b2caf626@oracle.com>
Message-ID: <cce2a7b6-4d30-6916-86ab-d776e02a5f2c@oracle.com>

Hi Harold,

Okay, thanks!

Thanks,
Serguei


On 12/6/19 05:16, Harold Seigel wrote:
> There will be another unmodifiable attribute with sealed types called 
> PermittedSubtypes.
>
> Harold
>
> On 12/5/2019 7:25 PM, serguei.spitsyn at oracle.com wrote:
>> Hi David,
>>
>> Agreed. I was thinking about the same.
>>
>> Thanks,
>> Serguei
>>
>> On 12/5/19 2:52 PM, David Holmes wrote:
>>> Looks good Harold!
>>>
>>> If we get any more of these unmodifiable attributes we may have to 
>>> look at a way to refer to them more abstractly and only define them 
>>> in one place.
>>>
>>> Thanks,
>>> David
>>>
>>> On 6/12/2019 12:28 am, Harold Seigel wrote:
>>>> Hi,
>>>>
>>>> Please review this trivial change to add documentation about the 
>>>> Record attribute to the JDWP, JDI, and Instrumentation specs.
>>>>
>>>> The changed .html pages (best viewed as 'raw') are included in the 
>>>> webrev but will not be pushed.
>>>>
>>>> Open Webrev: 
>>>> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html
>>>>
>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360
>>>>
>>>> The fix was regression tested by running Mach5 tiers 1 and 2 tests 
>>>> and builds on Linux-x64, Solaris, Windows, and Mac OS X.
>>>>
>>>> Thanks, Harold
>>>>
>>


From serguei.spitsyn at oracle.com  Fri Dec  6 18:27:28 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 6 Dec 2019 10:27:28 -0800
Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for
 Record attribute
In-Reply-To: <cce2a7b6-4d30-6916-86ab-d776e02a5f2c@oracle.com>
References: <d4ee7dcd-74a5-200e-9c92-260dea90c8b3@oracle.com>
 <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com>
 <abcd72c9-ac66-7641-f801-c12cb741564a@oracle.com>
 <30ebf715-d189-3ff6-a024-3529b2caf626@oracle.com>
 <cce2a7b6-4d30-6916-86ab-d776e02a5f2c@oracle.com>
Message-ID: <4755bf60-4701-d1d2-9218-e72cd58ed9a6@oracle.com>

Forgot to ask.
Is this new attribute for 14?
Will it also come from Amber?

Thanks,
Serguei


On 12/6/19 10:21, serguei.spitsyn at oracle.com wrote:
> Hi Harold,
>
> Okay, thanks!
>
> Thanks,
> Serguei
>
>
> On 12/6/19 05:16, Harold Seigel wrote:
>> There will be another unmodifiable attribute with sealed types called 
>> PermittedSubtypes.
>>
>> Harold
>>
>> On 12/5/2019 7:25 PM, serguei.spitsyn at oracle.com wrote:
>>> Hi David,
>>>
>>> Agreed. I was thinking about the same.
>>>
>>> Thanks,
>>> Serguei
>>>
>>> On 12/5/19 2:52 PM, David Holmes wrote:
>>>> Looks good Harold!
>>>>
>>>> If we get any more of these unmodifiable attributes we may have to 
>>>> look at a way to refer to them more abstractly and only define them 
>>>> in one place.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>> On 6/12/2019 12:28 am, Harold Seigel wrote:
>>>>> Hi,
>>>>>
>>>>> Please review this trivial change to add documentation about the 
>>>>> Record attribute to the JDWP, JDI, and Instrumentation specs.
>>>>>
>>>>> The changed .html pages (best viewed as 'raw') are included in the 
>>>>> webrev but will not be pushed.
>>>>>
>>>>> Open Webrev: 
>>>>> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html
>>>>>
>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360
>>>>>
>>>>> The fix was regression tested by running Mach5 tiers 1 and 2 tests 
>>>>> and builds on Linux-x64, Solaris, Windows, and Mac OS X.
>>>>>
>>>>> Thanks, Harold
>>>>>
>>>
>


From harold.seigel at oracle.com  Fri Dec  6 18:29:12 2019
From: harold.seigel at oracle.com (Harold Seigel)
Date: Fri, 6 Dec 2019 13:29:12 -0500
Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for
 Record attribute
In-Reply-To: <4755bf60-4701-d1d2-9218-e72cd58ed9a6@oracle.com>
References: <d4ee7dcd-74a5-200e-9c92-260dea90c8b3@oracle.com>
 <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com>
 <abcd72c9-ac66-7641-f801-c12cb741564a@oracle.com>
 <30ebf715-d189-3ff6-a024-3529b2caf626@oracle.com>
 <cce2a7b6-4d30-6916-86ab-d776e02a5f2c@oracle.com>
 <4755bf60-4701-d1d2-9218-e72cd58ed9a6@oracle.com>
Message-ID: <58015dfe-a587-c1ab-c5bc-453ad2cd13c6@oracle.com>

Hi Serguei,

 >> Is this new attribute for 14?

No. 15, maybe?

 >>Will it also come from Amber?

Yes.

Harold

On 12/6/2019 1:27 PM, serguei.spitsyn at oracle.com wrote:
> Forgot to ask.
> Is this new attribute for 14?
> Will it also come from Amber?
>
> Thanks,
> Serguei
>
>
> On 12/6/19 10:21, serguei.spitsyn at oracle.com wrote:
>> Hi Harold,
>>
>> Okay, thanks!
>>
>> Thanks,
>> Serguei
>>
>>
>> On 12/6/19 05:16, Harold Seigel wrote:
>>> There will be another unmodifiable attribute with sealed types 
>>> called PermittedSubtypes.
>>>
>>> Harold
>>>
>>> On 12/5/2019 7:25 PM, serguei.spitsyn at oracle.com wrote:
>>>> Hi David,
>>>>
>>>> Agreed. I was thinking about the same.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>> On 12/5/19 2:52 PM, David Holmes wrote:
>>>>> Looks good Harold!
>>>>>
>>>>> If we get any more of these unmodifiable attributes we may have to 
>>>>> look at a way to refer to them more abstractly and only define 
>>>>> them in one place.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>> On 6/12/2019 12:28 am, Harold Seigel wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Please review this trivial change to add documentation about the 
>>>>>> Record attribute to the JDWP, JDI, and Instrumentation specs.
>>>>>>
>>>>>> The changed .html pages (best viewed as 'raw') are included in 
>>>>>> the webrev but will not be pushed.
>>>>>>
>>>>>> Open Webrev: 
>>>>>> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html
>>>>>>
>>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360
>>>>>>
>>>>>> The fix was regression tested by running Mach5 tiers 1 and 2 
>>>>>> tests and builds on Linux-x64, Solaris, Windows, and Mac OS X.
>>>>>>
>>>>>> Thanks, Harold
>>>>>>
>>>>
>>
>

From chris.plummer at oracle.com  Fri Dec  6 19:07:38 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 6 Dec 2019 11:07:38 -0800
Subject: RFR: JDK-8215196: [Graal]
 vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with
 "changes for the arguments of the popped frame's method, did not remain
 current argument values"
In-Reply-To: <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com>
References: <dbcd93d3-5735-3cba-b6a5-6e2010e3809c@oracle.com>
 <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com>
 <fcf3b6ca-c1d6-02dd-16a1-f270797a3945@oracle.com>
 <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com>
Message-ID: <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com>

On 12/5/19 6:45 PM, David Holmes wrote:
> Hi Serguei,
>
> On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote:
>> Hi Chris and Alex,
>>
>> (I've also included Dan, David and Dean to the mailing list)
>>
>> We have to reach a consensus about this.
>
> This is just part of a much broader issue with JVM TI that I tried to 
> have a discussion started based on Richard Reingruber's proposals 
> around Escape Analysis:
>
> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html 
>
>
> Unfortunately that discussion did not get much traction.
Hmm. I have the emails that precede yours above, but not that one. Not
sure what happened. I just read through it and it did give me one
thought. Consider a model where the program is designed to drive the
behavior of the agent, triggering the agent to do certain things by
having the program do certain things. Normally an agent monitors the
application, but in this case the application is purposefully
controlling actions performed by the agent. If code is elided from the
program, then the agent no longer performs as expected. It's a kind of
backwards jvmti programming model, and you may ask why anyone would do
this. I'm not sure if there's a good reason for it, but should it be
expected to work given how the spec is written?
>
>> We have 3 options:
>>
>> Option #1:
>>    The JIT optimization that deletes code which "looks useless"
>>    has to be disabled if the can_pop_frame capability is enabled.
>>    Then this problem becomes a JIT compiler bug.
>>
>> Option #2:
>>    Consider relaxing the JVMTI PopFrame spec by changing it to something like:
>>    "Note however, that the original argument values are not
>>     preserved and can be changed by the called method;"
>>    Then this problem becomes a JVM TI spec bug.
>>
>> Option #3:
>>    Consider it okay for the compiler to eliminate useless code,
>>    so the argument values can be reinitialized by the PopFrame.
>>    Then this problem becomes just a test bug.
>>
>>
>> My preference is option #3.
>> The point is that if the arguments are not really used in
>> a method then restoring them to any values is a no-op.
>> It is really a meaningless use case, so why should we care about it?
Is "restoring" the proper term here? I thought they were just left on 
the stack and reused on the subsequent invoke. In fact I figured the 
reason for the language in the spec in the first place is to alleviate 
JVMTI from having to restore them to their original values, which is 
probably not even possible.
>
> Thanks for setting that out clearly.
>
> I'd like to agree this particular case is a test bug. If we have a
> method:
>
> int incr(int val) {
>   val++;
>   popFrameHere();
>   return val;
> }
>
> then the change to the argument is necessary and must be preserved. In
> contrast:
>
> void incr(int val) {
>   val++;
>   popFrameHere();
> }
>
> the change to the argument is meaningless and I would hope any decent 
> JIT would simply elide it.
So, this goes back to my example above where the program is trying to 
elicit behavior from the agent. It's not meaningless in that case, but 
that doesn't mean I think we need to support it.
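
To make the two cases concrete, here is a minimal Java sketch of the two
incr variants quoted above. The class name DeadStoreSketch, the method
names incrUsed/incrUnused, and the no-op popFrameHere() stand-in are
assumptions for illustration only; in the real test the frame is popped
by a JVM TI agent, which this sketch does not attempt to do.

```java
// Illustrative sketch only; no JVM TI agent is involved here.
class DeadStoreSketch {
    // Hypothetical stand-in for the point where an agent would call PopFrame.
    static void popFrameHere() { }

    // The increment feeds the return value, so a compiler has to keep it.
    static int incrUsed(int val) {
        val++;
        popFrameHere();
        return val;
    }

    // The incremented value is never read again: a dead store that a JIT
    // is free to drop, since no Java code can observe the difference.
    static void incrUnused(int val) {
        val++;
        popFrameHere();
    }

    public static void main(String[] args) {
        System.out.println(incrUsed(41)); // result depends on val++
        incrUnused(41);                   // nothing observable from Java code
    }
}
```

Only an agent inspecting the frame's locals could tell whether the store
in incrUnused happened, which is the situation the test relies on.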
>
> But we must have a consistent approach to such things. What would 
> happen if a breakpoint were to be placed on the instruction that 
> uselessly modified the argument - would we still see the modification 
> or would it be elided?
Breakpoints force interpreted mode for the method, although I suppose 
that's a hotspot implementation detail and not something a VM would be 
required to do. A VM that allows breakpoints in compiled methods has the 
potential to miss the breakpoint if code is elided.

Also, what if you put a breakpoint in a method and the call to it is
elided? You would never hit the breakpoint. That could cause some
serious head scratching for a debugger user if they know the code doing
the method call is "executed".

>
> And how do C1 and C2 avoid this issue? Do they simply not optimise 
> away the useless assignment? Or do they actively disable that 
> optimization in this context?
>
> We need, IMO, to establish the basic philosophy of how to manage JVM 
> TI / JIT interactions, so we know what things must remain visible and 
> which can be optimised away.
>
> That said, changing the test allows us to defer having to reach that 
> consensus.
Agreed. I think it's ok to work around the test issue as long as we keep
this overall issue on the radar. Do we have a bug filed for that?

thanks,

Chris
>
> David
> -----
>
>> Thanks,
>> Serguei
>>
>>
>> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote:
>>> Hi Alex,
>>>
>>> The fix itself looks Okay.
>>> Minor: replace in the comment: "compiler don't drop" => "compiler 
>>> doesn't drop".
>>>
>>> However, we still have to reach a consensus on how we treat this 
>>> issue (as Chris already commented).
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 11/8/19 15:22, Alex Menkov wrote:
>>>> Hi all,
>>>>
>>>> Please review the fix for
>>>> https://bugs.openjdk.java.net/browse/JDK-8215196
>>>> webrev:
>>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/
>>>>
>>>> Currently PopFrame is disabled with JVMCI by [1], so for testing I 
>>>> reverted [1] changes.
>>>>
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025
>>>>
>>>> --alex
>>>
>>


From serguei.spitsyn at oracle.com  Fri Dec  6 20:33:17 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 6 Dec 2019 12:33:17 -0800
Subject: RFR 8235360: Update JDWP, JDI and Instrumentation specs for
 Record attribute
In-Reply-To: <58015dfe-a587-c1ab-c5bc-453ad2cd13c6@oracle.com>
References: <d4ee7dcd-74a5-200e-9c92-260dea90c8b3@oracle.com>
 <656ff89c-6834-0984-4c21-9ca889a44015@oracle.com>
 <abcd72c9-ac66-7641-f801-c12cb741564a@oracle.com>
 <30ebf715-d189-3ff6-a024-3529b2caf626@oracle.com>
 <cce2a7b6-4d30-6916-86ab-d776e02a5f2c@oracle.com>
 <4755bf60-4701-d1d2-9218-e72cd58ed9a6@oracle.com>
 <58015dfe-a587-c1ab-c5bc-453ad2cd13c6@oracle.com>
Message-ID: <a18e0ad7-eeaa-5a03-ffa4-8867ecbb3d21@oracle.com>

Thanks, Harold.
Serguei

On 12/6/19 10:29, Harold Seigel wrote:
> Hi Serguei,
>
> >> Is this new attribute for 14?
>
> No. 15, maybe?
>
> >>Will it also come from Amber?
>
> Yes.
>
> Harold
>
> On 12/6/2019 1:27 PM, serguei.spitsyn at oracle.com wrote:
>> Forgot to ask.
>> Is this new attribute for 14?
>> Will it also come from Amber?
>>
>> Thanks,
>> Serguei
>>
>>
>> On 12/6/19 10:21, serguei.spitsyn at oracle.com wrote:
>>> Hi Harold,
>>>
>>> Okay, thanks!
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 12/6/19 05:16, Harold Seigel wrote:
>>>> There will be another unmodifiable attribute with sealed types 
>>>> called PermittedSubtypes.
>>>>
>>>> Harold
>>>>
>>>> On 12/5/2019 7:25 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi David,
>>>>>
>>>>> Agreed. I was thinking about the same.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>> On 12/5/19 2:52 PM, David Holmes wrote:
>>>>>> Looks good Harold!
>>>>>>
>>>>>> If we get any more of these unmodifiable attributes we may have 
>>>>>> to look at a way to refer to them more abstractly and only define 
>>>>>> them in one place.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>> On 6/12/2019 12:28 am, Harold Seigel wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Please review this trivial change to add documentation about the 
>>>>>>> Record attribute to the JDWP, JDI, and Instrumentation specs.
>>>>>>>
>>>>>>> The changed .html pages (best viewed as 'raw') are included in 
>>>>>>> the webrev but will not be pushed.
>>>>>>>
>>>>>>> Open Webrev: 
>>>>>>> http://cr.openjdk.java.net/~hseigel/bug_8235360/webrev/index.html
>>>>>>>
>>>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235360
>>>>>>>
>>>>>>> The fix was regression tested by running Mach5 tiers 1 and 2 
>>>>>>> tests and builds on Linux-x64, Solaris, Windows, and Mac OS X.
>>>>>>>
>>>>>>> Thanks, Harold
>>>>>>>
>>>>>
>>>
>>


From serguei.spitsyn at oracle.com  Fri Dec  6 21:18:30 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 6 Dec 2019 13:18:30 -0800
Subject: RFR: JDK-8215196: [Graal]
 vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with
 "changes for the arguments of the popped frame's method, did not remain
 current argument values"
In-Reply-To: <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com>
References: <dbcd93d3-5735-3cba-b6a5-6e2010e3809c@oracle.com>
 <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com>
 <fcf3b6ca-c1d6-02dd-16a1-f270797a3945@oracle.com>
 <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com>
 <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com>
Message-ID: <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com>

On 12/6/19 11:07, Chris Plummer wrote:
> On 12/5/19 6:45 PM, David Holmes wrote:
>> Hi Serguei,
>>
>> On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote:
>>> Hi Chris and Alex,
>>>
>>> (I've also included Dan, David and Dean to the mailing list)
>>>
>>> We have to reach a consensus about this.
>>
>> This is just part of a much broader issue with JVM TI that I tried to 
>> have a discussion started based on Richard Reingruber's proposals 
>> around Escape Analysis:
>>
>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html 
>>
>>
>> Unfortunately that discussion did not get much traction.
> Hmm. I have the emails that precede yours above, but not that one. Not
> sure what happened. Just read through it and it did give me one
> thought.

> Consider a model where the program is designed to drive the behavior of the
> agent, triggering the agent to do certain things by having the program
> do certain things. Normally an agent monitors the application, but in 
> this case the application is purposefully controlling actions 
> performed by the agent. If code is elided from the program, then the 
> agent no longer performs as expected. It's a kind of backwards jvmti 
> programming model, and you may ask why would anyone do this. I'm not 
> sure if there's a good reason for it, but should it be expected to 
> work given how the spec is written?

My interpretation is that the current JVM TI PopFrame behavior does not
break this model.
The spec says: "any changes to the arguments, which occurred in the
called method, remain;"
Since the code was eliminated by the compiler, no changes to this
argument occurred.
So the PopFrame behavior follows the spec, and I think option #2
is not right. But it depends on our basic philosophy.
If the developer wants to control the agent, then the program has to be
designed to do something meaningful that is not going to be optimized
out by the JIT compiler.

>>
>>> We have 3 options:
>>>
>>> Option #1:
>>>    The JIT optimization that deletes code which "looks useless"
>>>    has to be disabled if the can_pop_frame capability is enabled.
>>>    Then this problem becomes a JIT compiler bug.
>>>
>>> Option #2:
>>>    Consider relaxing the JVMTI PopFrame spec by changing it to something like:
>>>    "Note however, that the original argument values are not
>>>     preserved and can be changed by the called method;"
>>>    Then this problem becomes a JVM TI spec bug.
>>>
>>> Option #3:
>>>    Consider it okay for the compiler to eliminate useless code,
>>>    so the argument values can be reinitialized by the PopFrame.
>>>    Then this problem becomes just a test bug.
>>>
>>>
>>> My preference is option #3.
>>> The point is that if the arguments are not really used in
>>> a method then restoring them to any values is a no-op.
>>> It is really a meaningless use case, so why should we care about it?
> Is "restoring" the proper term here? I thought they were just left on 
> the stack and reused on the subsequent invoke.

Agreed. The term "restoring" is not accurate here.

> In fact I figured the reason for the language in the spec in the first 
> place is to alleviate JVMTI from having to restore them to their 
> original values, which is probably not even possible.

Right.

>>
>> Thanks for setting that out clearly.
>>
>> I'd like to agree this particular case is a test bug. If we have a
>> method:
>>
>> int incr(int val) {
>>   val++;
>>   popFrameHere();
>>   return val;
>> }
>>
>> then the change to the argument is necessary and must be preserved.
>> In contrast:
>>
>> void incr(int val) {
>>   val++;
>>   popFrameHere();
>> }
>>
>> the change to the argument is meaningless and I would hope any decent 
>> JIT would simply elide it.
> So, this goes back to my example above where the program is trying to 
> elicit behavior from the agent. It's not meaningless in that case, but 
> that doesn't mean I think we need to support it.

Even with this model it is possible, and better, to do something
meaningful to control the agent.
This model is a very rare use case.
It is hard to justify a need to support it. :)

>>
>> But we must have a consistent approach to such things. What would 
>> happen if a breakpoint were to be placed on the instruction that 
>> uselessly modified the argument - would we still see the modification 
>> or would it be elided?
> Breakpoints force interpreted mode for the method, although I suppose 
> that's a hotspot implementation detail and not something a VM would be 
> required to do. A VM that allows breakpoints in compiled methods has 
> the potential to miss the breakpoint if code is elided.
>
> Also, what if you put a breakpoint in a method and the call to it is
> elided? You would never hit the breakpoint. That could cause some
> serious head scratching for a debugger user if they know the code
> doing the method call is "executed".

If the method is not actually being called, then missing breakpoints
there give a clue about what is going on.
Otherwise, it will cause some serious head scratching for a
debugger user.
In general, my preference would be to debug actual behavior.
It is not good that we have no support for breakpoints in compiled methods.


>>
>> And how do C1 and C2 avoid this issue? Do they simply not optimise 
>> away the useless assignment? Or do they actively disable that 
>> optimization in this context?
>>
>> We need, IMO, to establish the basic philosophy of how to manage JVM 
>> TI / JIT interactions, so we know what things must remain visible and 
>> which can be optimised away.
>>
>> That said, changing the test allows us to defer having to reach that 
>> consensus.
> Agreed. I think it's ok to work around the test issue as long as we
> keep this overall issue on the radar. Do we have a bug filed for that?

I thought it was a little bit early to file a bug for it.
Also, it could probably be an umbrella enhancement or task.

Thanks,
Serguei

>
> thanks,
>
> Chris
>>
>> David
>> -----
>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote:
>>>> Hi Alex,
>>>>
>>>> The fix itself looks Okay.
>>>> Minor: replace in the comment: "compiler don't drop" => "compiler 
>>>> doesn't drop".
>>>>
>>>> However, we still have to reach a consensus on how we treat this 
>>>> issue (as Chris already commented).
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 11/8/19 15:22, Alex Menkov wrote:
>>>>> Hi all,
>>>>>
>>>>> Please review the fix for
>>>>> https://bugs.openjdk.java.net/browse/JDK-8215196
>>>>> webrev:
>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/
>>>>>
>>>>> Currently PopFrame is disabled with JVMCI by [1], so for testing I 
>>>>> reverted [1] changes.
>>>>>
>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025
>>>>>
>>>>> --alex
>>>>
>>>
>
>


From mandy.chung at oracle.com  Fri Dec  6 21:38:22 2019
From: mandy.chung at oracle.com (Mandy Chung)
Date: Fri, 6 Dec 2019 13:38:22 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
Message-ID: <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>


On 12/6/19 5:59 AM, Bob Vandette wrote:
>> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes at oracle.com>  wrote:
>>
>>
>> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
>>
>> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.

I thought that the error case we are referring to is limit == 0, which
indicates that something unexpected went wrong. So the compatibility concern
should be low. This is very specific to the Metrics implementation for
cgroup v1; let me know if I'm wrong.

>> Surely there must always be some information available from the operating environment? I see from the impl file:
>>
>>     // the host data, value 0 indicates that something went wrong while the metric was read and
>>    // in this case we return "information unavailable" code -1.
>>
>> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
> I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
> Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
> are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
> limits.
>

It's important to consider carefully whether the monitoring API indicates an
error vs. unavailable, and whether an application should continue to run when
the monitoring system fails to get the metrics.

There are several choices for reporting "something goes wrong" scenarios
(which should be unlikely to happen?):
1. fall back to a random positive value (e.g. host value)
2. return a negative value
3. throw an exception

#3 is not an option as the application is not expecting this. For #2,
the application can filter bad values if desirable.
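
As a rough illustration of what "filter bad values" could look like on the
application side, here is a minimal sketch. It assumes the convention under
discussion in this thread (a negative reading means the metric is
unavailable), which is a proposal here and not confirmed behavior. The class
MetricFilterSketch, the method usable(), and the sample readings are
hypothetical and not part of any JDK API.

```java
import java.util.OptionalLong;

// Hypothetical helper, not part of the JDK.
class MetricFilterSketch {
    // Treat any negative reading as "unavailable"; keep everything else.
    static OptionalLong usable(long raw) {
        return raw < 0 ? OptionalLong.empty() : OptionalLong.of(raw);
    }

    public static void main(String[] args) {
        long[] samples = { 8_589_934_592L, -1L }; // made-up sample readings
        for (long sample : samples) {
            OptionalLong value = usable(sample);
            System.out.println(value.isPresent()
                    ? "memory bytes: " + value.getAsLong()
                    : "memory metric unavailable");
        }
    }
}
```

An application written this way keeps running when a reading is missing,
whereas code that was never written to expect a negative value would
silently use it in its arithmetic.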

I'm okay if you want to file a JBS issue to follow up and thoroughly
look at the cases where the metrics are unavailable and the cases where
obtaining them fails.

>> ---
>>
>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>>
>> System.out.println(String.format(...)
>>
>> Why not simply
>>
>> System.out.printf(..)
>>
>> ?

or simply (as I commented [1])
     System.out.format
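
For reference, a small sketch of the three spellings side by side. The class
name FormatSketch, the render() helper, and the format string are made up
for illustration; the test under review uses its own format strings.

```java
// Illustrative only; names and format string are hypothetical.
class FormatSketch {
    // Builds the text once so the three call styles can be compared.
    static String render(String name, long value) {
        return String.format("%s: %d", name, value);
    }

    public static void main(String[] args) {
        // Style in the webrev: format to a String, then print it.
        System.out.println(render("metric", 42L));
        // Shorter equivalents: PrintStream formats and prints in one call.
        // %n supplies the line separator that println added above.
        System.out.printf("%s: %d%n", "metric", 42L);
        System.out.format("%s: %d%n", "metric", 42L);
    }
}
```

PrintStream.printf and PrintStream.format behave the same way, so the
choice between them is a matter of taste.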

Mandy
[1] 
https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html


From chris.plummer at oracle.com  Fri Dec  6 21:52:32 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 6 Dec 2019 13:52:32 -0800
Subject: RFR: JDK-8215196: [Graal]
 vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with
 "changes for the arguments of the popped frame's method, did not remain
 current argument values"
In-Reply-To: <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com>
References: <dbcd93d3-5735-3cba-b6a5-6e2010e3809c@oracle.com>
 <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com>
 <fcf3b6ca-c1d6-02dd-16a1-f270797a3945@oracle.com>
 <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com>
 <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com>
 <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com>
Message-ID: <444ed938-d873-12fc-a55e-2d645a099260@oracle.com>

On 12/6/19 1:18 PM, serguei.spitsyn at oracle.com wrote:
> On 12/6/19 11:07, Chris Plummer wrote:
>> On 12/5/19 6:45 PM, David Holmes wrote:
>>> Hi Serguei,
>>>
>>> On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote:
>>>> Hi Chris and Alex,
>>>>
>>>> (I've also included Dan, David and Dean to the mailing list)
>>>>
>>>> We have to reach a consensus about this.
>>>
>>> This is just part of a much broader issue with JVM TI that I tried 
>>> to have a discussion started based on Richard Reingruber's proposals 
>>> around Escape Analysis:
>>>
>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html 
>>>
>>>
>>> Unfortunately that discussion did not get much traction.
>> Hmm. I have the emails that precede yours above, but not that one.
>> Not sure what happened. Just read through it and it did give me
>> one thought.
>
>> Consider a model where the program is designed to drive the behavior of the
>> agent, triggering the agent to do certain things by having the
>> program do certain things. Normally an agent monitors the 
>> application, but in this case the application is purposefully 
>> controlling actions performed by the agent. If code is elided from 
>> the program, then the agent no longer performs as expected. It's a 
>> kind of backwards jvmti programming model, and you may ask why would 
>> anyone do this. I'm not sure if there's a good reason for it, but 
>> should it be expected to work given how the spec is written?
>
> My interpretation is that the current JVM TI PopFrame behavior does
> not break this model.
> The spec says: "any changes to the arguments, which occurred in the
> called method, remain;"
> Since the code was eliminated by the compiler, no changes to this
> argument occurred.
> So the PopFrame behavior follows the spec, and I think option #2
> is not right. But it depends on our basic philosophy.
> If the developer wants to control the agent, then the program has to be
> designed to do something meaningful that is not going to be optimized
> out by the JIT compiler.
You misunderstood my point. What I'm saying is that someone might do
something like assign to a local with the specific intent of having that
trigger a jvmti event, with the specific intent of having the agent
perform some expected action as a result. Think of it as being a trigger
for the agent, not as the agent monitoring the app. For example, you
could write a program + agent, where setting a specific local in the
program triggers the agent to turn on a light, and setting some other
local turns it off. Absurd, but possible, and maybe there are less
absurd applications.

Chris
>
>>>
>>>> We have 3 options:
>>>>
>>>> Option #1:
>>>>    The JIT optimization that deletes code which "looks useless"
>>>>    has to be disabled if the can_pop_frame capability is enabled.
>>>>    Then this problem becomes a JIT compiler bug.
>>>>
>>>> Option #2:
>>>>    Consider relaxing the JVMTI PopFrame spec by changing it to something like:
>>>>    "Note however, that the original argument values are not
>>>>     preserved and can be changed by the called method;"
>>>>    Then this problem becomes a JVM TI spec bug.
>>>>
>>>> Option #3:
>>>>    Consider it okay for the compiler to eliminate useless code,
>>>>    so the argument values can be reinitialized by the PopFrame.
>>>>    Then this problem becomes just a test bug.
>>>>
>>>>
>>>> My preference is option #3.
>>>> The point is that if the arguments are not really used in
>>>> a method then restoring them to any values is a no-op.
>>>> It is really a meaningless use case, so why should we care about it?
>> Is "restoring" the proper term here? I thought they were just left on 
>> the stack and reused on the subsequent invoke.
>
> Agreed. The term "restoring" is not accurate here.
>
>> In fact I figured the reason for the language in the spec in the 
>> first place is to alleviate JVMTI from having to restore them to 
>> their original values, which is probably not even possible.
>
> Right.
>
>>>
>>> Thanks for setting that out clearly.
>>>
>>> I'd like to agree this particular case is a test bug. If we have
>>> a method:
>>>
>>> int incr(int val) {
>>>   val++;
>>>   popFrameHere();
>>>   return val;
>>> }
>>>
>>> then the change to the argument is necessary and must be preserved.
>>> In contrast:
>>>
>>> void incr(int val) {
>>>   val++;
>>>   popFrameHere();
>>> }
>>>
>>> the change to the argument is meaningless and I would hope any 
>>> decent JIT would simply elide it.
>> So, this goes back to my example above where the program is trying to 
>> elicit behavior from the agent. It's not meaningless in that case, 
>> but that doesn't mean I think we need to support it.
>
> Even with this model it is possible, and better, to do something
> meaningful to control the agent.
> This model is a very rare use case.
> It is hard to justify a need to support it. :)
>
>>>
>>> But we must have a consistent approach to such things. What would 
>>> happen if a breakpoint were to be placed on the instruction that 
>>> uselessly modified the argument - would we still see the 
>>> modification or would it be elided?
>> Breakpoints force interpreted mode for the method, although I suppose 
>> that's a hotspot implementation detail and not something a VM would 
>> be required to do. A VM that allows breakpoints in compiled methods 
>> has the potential to miss the breakpoint if code is elided.
>>
>> Also, what if you put a breakpoint in a method and the call to it is
>> elided? You would never hit the breakpoint. That could cause some
>> serious head scratching for a debugger user if they know the code
>> doing the method call is "executed".
>
> If the method is not actually being called, then missing breakpoints
> there give a clue about what is going on.
> Otherwise, it will cause some serious head scratching for a
> debugger user.
> In general, my preference would be to debug actual behavior.
> It is not good that we have no support for breakpoints in compiled methods.
>
>
>>>
>>> And how do C1 and C2 avoid this issue? Do they simply not optimise 
>>> away the useless assignment? Or do they actively disable that 
>>> optimization in this context?
>>>
>>> We need, IMO, to establish the basic philosophy of how to manage JVM 
>>> TI / JIT interactions, so we know what things must remain visible 
>>> and which can be optimised away.
>>>
>>> That said, changing the test allows us to defer having to reach that 
>>> consensus.
>> Agreed. I think it's ok to work around the test issue as long as we 
>> keep this overall issue on the radar. Do we have a bug filed for that?
>
> I thought, it is a little bit early to file a bug for it.
> Also, probably, it can be an umbrella enhancement or task.
>
> Thanks,
> Serguei
>
>>
>> thanks,
>>
>> Chris
>>>
>>> David
>>> -----
>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Alex,
>>>>>
>>>>> The fix itself looks Okay.
>>>>> Minor: replace in the comment: "compiler don't drop" => "compiler 
>>>>> doesn't drop".
>>>>>
>>>>> However, we still have to reach a consensus on how we treat this 
>>>>> issue (as Chris already commented).
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 11/8/19 15:22, Alex Menkov wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Please review the fix for
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8215196
>>>>>> webrev:
>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/
>>>>>>
>>>>>> Currently PopFrame is disabled with JVMCI by [1], so for testing 
>>>>>> I reverted [1] changes.
>>>>>>
>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025
>>>>>>
>>>>>> --alex
>>>>>
>>>>
>>
>>
>


From dean.long at oracle.com  Fri Dec  6 22:59:05 2019
From: dean.long at oracle.com (Dean Long)
Date: Fri, 6 Dec 2019 14:59:05 -0800
Subject: RFR: JDK-8215196: [Graal]
 vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with
 "changes for the arguments of the popped frame's method, did not remain
 current argument values"
In-Reply-To: <444ed938-d873-12fc-a55e-2d645a099260@oracle.com>
References: <dbcd93d3-5735-3cba-b6a5-6e2010e3809c@oracle.com>
 <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com>
 <fcf3b6ca-c1d6-02dd-16a1-f270797a3945@oracle.com>
 <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com>
 <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com>
 <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com>
 <444ed938-d873-12fc-a55e-2d645a099260@oracle.com>
Message-ID: <d21b432e-eef4-22a8-e2a3-cda2cdc4afaa@oracle.com>

This might be a dumb question, but how is PopFrame used in practice?
Re-invoking the method, especially with modified argument values, seems
dangerous.

dl

From serguei.spitsyn at oracle.com  Fri Dec  6 23:26:34 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 6 Dec 2019 15:26:34 -0800
Subject: RFR: JDK-8215196: [Graal]
 vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with
 "changes for the arguments of the popped frame's method, did not remain
 current argument values"
In-Reply-To: <444ed938-d873-12fc-a55e-2d645a099260@oracle.com>
References: <dbcd93d3-5735-3cba-b6a5-6e2010e3809c@oracle.com>
 <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com>
 <fcf3b6ca-c1d6-02dd-16a1-f270797a3945@oracle.com>
 <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com>
 <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com>
 <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com>
 <444ed938-d873-12fc-a55e-2d645a099260@oracle.com>
Message-ID: <3be1587b-c0cd-5f1e-dee2-072c341e02e6@oracle.com>

On 12/6/19 13:52, Chris Plummer wrote:
> On 12/6/19 1:18 PM, serguei.spitsyn at oracle.com wrote:
>> On 12/6/19 11:07, Chris Plummer wrote:
>>> On 12/5/19 6:45 PM, David Holmes wrote:
>>>> Hi Serguei,
>>>>
>>>> On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Chris and Alex,
>>>>>
>>>>> (I've also included Dan, David and Dean to the mailing list)
>>>>>
>>>>> We have to reach a consensus about this.
>>>>
>>>> This is just part of a much broader issue with JVM TI that I tried 
>>>> to have a discussion started based on Richard Reingruber's 
>>>> proposals around Escape Analysis:
>>>>
>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html 
>>>>
>>>>
>>>> Unfortunately that discussion did not get much traction.
>>> Hmm. I have the emails that precede yours above, but not that one. 
>>> Not sure what happened. Just read through it and it did give me
>>> one thought.
>>
>>> Consider a model where the program is designed to drive behavior of the
>>> agent, triggering the agent to do certain things by having the 
>>> program do certain things. Normally an agent monitors the 
>>> application, but in this case the application is purposefully 
>>> controlling actions performed by the agent. If code is elided from 
>>> the program, then the agent no longer performs as expected. It's a 
>>> kind of backwards jvmti programming model, and you may ask why would 
>>> anyone do this. I'm not sure if there's a good reason for it, but 
>>> should it be expected to work given how the spec is written?
>>
>> My interpretation is that the current JVM TI PopFrame behavior does 
>> not break this model.
>> The spec says: "any changes to the arguments, which occurred in the 
>> called method, remain;"
>> Since the code was eliminated by the compiler, no changes to this
>> argument occurred.
>> So, the PopFrame behavior follows the spec. So, I think, the option 
>> #2 is not right. But it depends on our basic philosophy.
>> If the developer wants to control the agent then the program has to 
>> be designed to do something meaningful that is not going to be 
>> optimized out by the JIT compiler.
> You misunderstood my point. What I'm saying is that someone might do 
> something like assign to a local with the specific intent of having 
> that trigger a JVMTI event, with the specific intent of having the
> agent perform some expected action as a result. Think of it as being a 
> trigger for the agent, not as the agent monitoring the app. For 
> example, you could write a program + agent, and setting a specific
> local in the program triggers the agent to turn on a light, and 
> setting some other local turns it off. Absurd, but possible, and maybe 
> there are less absurd applications.

I think I understood your point correctly.
Your point is that code that can be eliminated (e.g. local++) is not
as meaningless as it seems to be.
My point is that there are other, more reliable ways to trigger the
agent.
So relying on something that can be eliminated by JIT compilers is
not important to support.
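
As a rough illustration of a "more reliable way" (a sketch under assumptions,
not something proposed in this thread): the program can signal through a
write to a static volatile field instead of a dead local. Such a write is
visible to other threads, so a compiler is not expected to drop it. The class
and field names are hypothetical, and how an agent would observe the write
(for example with a JVM TI field modification watch) is assumed and not shown.

```java
public class AgentTrigger {
    // Hypothetical signal field. A volatile write is an observable side
    // effect, unlike an assignment to a local that is never read again.
    static volatile int signal;

    // Dead store: the incremented local is never used, so a JIT compiler
    // may legally remove the increment.
    static void triggerViaLocal(int value) {
        value++;
    }

    // Observable store: the incremented value is published to the field.
    static void triggerViaField(int value) {
        signal = value + 1;
    }

    public static void main(String[] args) {
        triggerViaLocal(41);
        triggerViaField(41);
        System.out.println(signal);
    }
}
```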

Thanks,
Serguei

> Chris
>>
>>>>
>>>>> We have 3 options:
>>>>>
>>>>> Option #1:
>>>>>    The JIT optimization to delete code which "looks useless"
>>>>>    has to be disabled if the can_pop_frame capability is enabled.
>>>>>    Then this problem becomes a JIT compiler bug.
>>>>>
>>>>> Option #2:
>>>>>    Consider relaxing the JVMTI PopFrame spec by changing it to
>>>>> something like:
>>>>>    "Note however, that the original argument values are not
>>>>>     preserved and can be changed by the called method;"
>>>>>    Then this problem becomes a JVM TI spec bug.
>>>>>
>>>>> Option #3:
>>>>>    Consider it Okay for the compiler to eliminate useless code,
>>>>>    so the argument values can be reinitialized by the PopFrame.
>>>>>    Then this problem becomes just a test bug.
>>>>>
>>>>>
>>>>> My preference is option #3.
>>>>> The point is that if the arguments are not really used in
>>>>> a method then restoring them to any values is a no-op.
>>>>> It is really a meaningless use case, so why should we care about it?
>>> Is "restoring" the proper term here? I thought they were just left 
>>> on the stack and reused on the subsequent invoke.
>>
>> Agreed. The term "restoring" is not accurate here.
>>
>>> In fact I figured the reason for the language in the spec in the 
>>> first place is to alleviate JVMTI from having to restore them to 
>>> their original values, which is probably not even possible.
>>
>> Right.
>>
>>>>
>>>> Thanks for setting that out clearly.
>>>>
>>>> I'd like to agree this particular case is a test bug. If we have
>>>> a method:
>>>>
>>>> int incr(int val) {
>>>>   val++;
>>>>   popFrameHere();
>>>>   return val;
>>>> }
>>>>
>>>> then the change to the argument is necessary and must be preserved. 
>>>> In contrast:
>>>>
>>>> void incr(int val) {
>>>>   val++;
>>>>   popFrameHere();
>>>> }
>>>>
>>>> the change to the argument is meaningless and I would hope any 
>>>> decent JIT would simply elide it.
>>> So, this goes back to my example above where the program is trying 
>>> to elicit behavior from the agent. It's not meaningless in that 
>>> case, but that doesn't mean I think we need to support it.
>>
>> Even with this model it is possible and better to do something 
>> meaningful to control the agent.
>> This model is a very rare use case.
>> It is hard to justify a need to support it. :)
>>
>>>>
>>>> But we must have a consistent approach to such things. What would 
>>>> happen if a breakpoint were to be placed on the instruction that 
>>>> uselessly modified the argument - would we still see the 
>>>> modification or would it be elided?
>>> Breakpoints force interpreted mode for the method, although I 
>>> suppose that's a hotspot implementation detail and not something a 
>>> VM would be required to do. A VM that allows breakpoints in compiled 
>>> methods has the potential to miss the breakpoint if code is elided.
>>>
>>> Also, what if you put a breakpoint in a method and the call to it is
>>> elided? You would never hit the breakpoint. That could cause some
>>> serious head scratching for a debugger user if they know the code 
>>> doing the method call is "executed".
>>
>> If the method is not actually being called, then the missing breakpoint
>> gives a clue about what is going on.
>> Otherwise, it will cause some serious head scratching for a
>> debugger user.
>> In general, my preference would be to debug actual behavior.
>> It is not good that we have no support for breakpoints in compiled methods.
>>
>>
>>>>
>>>> And how do C1 and C2 avoid this issue? Do they simply not optimise 
>>>> away the useless assignment? Or do they actively disable that 
>>>> optimization in this context?
>>>>
>>>> We need, IMO, to establish the basic philosophy of how to manage 
>>>> JVM TI / JIT interactions, so we know what things must remain 
>>>> visible and which can be optimised away.
>>>>
>>>> That said, changing the test allows us to defer having to reach 
>>>> that consensus.
>>> Agreed. I think it's ok to work around the test issue as long as we 
>>> keep this overall issue on the radar. Do we have a bug filed for that?
>>
>> I thought, it is a little bit early to file a bug for it.
>> Also, probably, it can be an umbrella enhancement or task.
>>
>> Thanks,
>> Serguei
>>
>>>
>>> thanks,
>>>
>>> Chris
>>>>
>>>> David
>>>> -----
>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Alex,
>>>>>>
>>>>>> The fix itself looks Okay.
>>>>>> Minor: replace in the comment: "compiler don't drop" => "compiler 
>>>>>> doesn't drop".
>>>>>>
>>>>>> However, we still have to reach a consensus on how we treat this 
>>>>>> issue (as Chris already commented).
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 11/8/19 15:22, Alex Menkov wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Please review the fix for
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8215196
>>>>>>> webrev:
>>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/
>>>>>>>
>>>>>>> Currently PopFrame is disabled with JVMCI by [1], so for testing 
>>>>>>> I reverted [1] changes.
>>>>>>>
>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025
>>>>>>>
>>>>>>> --alex
>>>>>>
>>>>>
>>>
>>>
>>
>
>


From serguei.spitsyn at oracle.com  Fri Dec  6 23:39:26 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 6 Dec 2019 15:39:26 -0800
Subject: RFR: JDK-8215196: [Graal]
 vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with
 "changes for the arguments of the popped frame's method, did not remain
 current argument values"
In-Reply-To: <d21b432e-eef4-22a8-e2a3-cda2cdc4afaa@oracle.com>
References: <dbcd93d3-5735-3cba-b6a5-6e2010e3809c@oracle.com>
 <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com>
 <fcf3b6ca-c1d6-02dd-16a1-f270797a3945@oracle.com>
 <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com>
 <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com>
 <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com>
 <444ed938-d873-12fc-a55e-2d645a099260@oracle.com>
 <d21b432e-eef4-22a8-e2a3-cda2cdc4afaa@oracle.com>
Message-ID: <5c3e704e-7b2c-f72c-1342-2af7d16af53c@oracle.com>

PopFrame, together with RedefineClasses, is part of the JVM TI
HotSwap feature.
The use case is to hot patch methods.
If some frames of a redefined method are still on the stack after class
redefinition, then PopFrame is an option to "refresh" such frames.
I agree, this is unreliable and dangerous.
But the whole class redefinition feature is somewhat dangerous. :)
That is because the responsibility is on the agents.
And there are many ways for agents to break method execution
semantics with redefinition.

Thanks,
Serguei


On 12/6/19 14:59, Dean Long wrote:
> This might be a dumb question, but how is PopFrame used in practice? 
> Re-invoking the method, especially with modified argument values seems 
> dangerous.
>
> dl


From daniel.daugherty at oracle.com  Sat Dec  7 01:24:11 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 6 Dec 2019 20:24:11 -0500
Subject: RFR: JDK-8215196: [Graal]
 vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with
 "changes for the arguments of the popped frame's method, did not remain
 current argument values"
In-Reply-To: <3be1587b-c0cd-5f1e-dee2-072c341e02e6@oracle.com>
References: <dbcd93d3-5735-3cba-b6a5-6e2010e3809c@oracle.com>
 <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com>
 <fcf3b6ca-c1d6-02dd-16a1-f270797a3945@oracle.com>
 <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com>
 <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com>
 <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com>
 <444ed938-d873-12fc-a55e-2d645a099260@oracle.com>
 <3be1587b-c0cd-5f1e-dee2-072c341e02e6@oracle.com>
Message-ID: <1994d5d6-383b-0abb-3203-ab42da9ab81c@oracle.com>

On 12/6/19 6:26 PM, serguei.spitsyn at oracle.com wrote:
> On 12/6/19 13:52, Chris Plummer wrote:
>> On 12/6/19 1:18 PM, serguei.spitsyn at oracle.com wrote:
>>> On 12/6/19 11:07, Chris Plummer wrote:
>>>> On 12/5/19 6:45 PM, David Holmes wrote:
>>>>> Hi Serguei,
>>>>>
>>>>> On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Chris and Alex,
>>>>>>
>>>>>> (I've also included Dan, David and Dean to the mailing list)
>>>>>>
>>>>>> We have to reach a consensus about this.
>>>>>
>>>>> This is just part of a much broader issue with JVM TI that I tried 
>>>>> to have a discussion started based on Richard Reingruber's 
>>>>> proposals around Escape Analysis:
>>>>>
>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html 
>>>>>
>>>>>
>>>>> Unfortunately that discussion did not get much traction.
>>>> Hmm. I have the emails that precede yours above, but not that one. 
>>>> Not sure what happened. Just read through it and it did give me
>>>> one thought.
>>>
>>>> Consider a model where the program is designed to drive behavior of
>>>> the agent, triggering the agent to do certain things by having the 
>>>> program do certain things. Normally an agent monitors the 
>>>> application, but in this case the application is purposefully 
>>>> controlling actions performed by the agent. If code is elided from 
>>>> the program, then the agent no longer performs as expected. It's a 
>>>> kind of backwards jvmti programming model, and you may ask why 
>>>> would anyone do this. I'm not sure if there's a good reason for it, 
>>>> but should it be expected to work given how the spec is written?
>>>
>>> My interpretation is that the current JVM TI PopFrame behavior does 
>>> not break this model.
>>> The spec says: "any changes to the arguments, which occurred in the 
>>> called method, remain;"
>>> Since the code was eliminated by the compiler, no changes to this
>>> argument occurred.
>>> So, the PopFrame behavior follows the spec. So, I think, the option 
>>> #2 is not right. But it depends on our basic philosophy.
>>> If the developer wants to control the agent then the program has to 
>>> be designed to do something meaningful that is not going to be 
>>> optimized out by the JIT compiler.
>> You misunderstood my point. What I'm saying is that someone might do 
>> something like assign to a local with the specific intent of having 
>> that trigger a JVMTI event, with the specific intent of having the
>> agent perform some expected action as a result. Think of it as being 
>> a trigger for the agent, not as the agent monitoring the app. For 
>> example, you could write a program + agent, and setting a specific
>> local in the program triggers the agent to turn on a light, and 
>> setting some other local turns it off. Absurd, but possible, and 
>> maybe there are less absurd applications.
>
> I think, I understood your point correctly.
> Your point is that the code that can be eliminated (e.g. local++) is 
> not that meaningless as it seems to be.
> My point is that there are still other more reliable ways to trigger 
> the agent.
> So that relying on something that can be eliminated by JIT compilers 
> is not important to support.

You are making the assumption that the agent author understands what
Java code/variables *might* be eliminated by the JIT compiler. I don't
think that's a good assumption. I might have code that does a really
complicated thing in a local variable that is only useful to the
agent itself. The JIT will see that the local variable cannot escape
the function and is not used outside the function (as far as it can
see) so it will elide the local variable and the code that was used
to generate the result in that variable.

If that local result happens to be some computation that the agent
needed to see to do its next operation...
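
A minimal sketch of that scenario, with hypothetical class and method names
chosen only for illustration: in the first method the result lives in a local
that no Java code reads afterwards, so from the compiler's point of view the
whole computation is dead; in the second method the result is returned and
therefore has to be computed.

```java
public class ElidableLocal {
    // Stand-in for "a really complicated thing".
    static int checksum(int[] data) {
        int sum = 0;
        for (int v : data) {
            sum = sum * 31 + v;
        }
        return sum;
    }

    // The result is stored only in a local that is never read again.
    // A JIT compiler may remove both the local and the call that fills it,
    // even if an agent was meant to inspect that local.
    static void computeForAgentOnly(int[] data) {
        int result = checksum(data);
    }

    // The result escapes through the return value, so it must be produced.
    static int computeAndReturn(int[] data) {
        return checksum(data);
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        computeForAgentOnly(data);
        System.out.println(computeAndReturn(data));
    }
}
```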

Dan


>
> Thanks,
> Serguei
>
>> Chris
>>>
>>>>>
>>>>>> We have 3 options:
>>>>>>
>>>>>> Option #1:
>>>>>>    The JIT optimization to delete code which "looks useless"
>>>>>>    has to be disabled if the can_pop_frame capability is enabled.
>>>>>>    Then this problem becomes a JIT compiler bug.
>>>>>>
>>>>>> Option #2:
>>>>>>    Consider relaxing the JVMTI PopFrame spec by changing it to
>>>>>> something like:
>>>>>>    "Note however, that the original argument values are not
>>>>>>     preserved and can be changed by the called method;"
>>>>>>    Then this problem becomes a JVM TI spec bug.
>>>>>>
>>>>>> Option #3:
>>>>>>    Consider it Okay for the compiler to eliminate useless code,
>>>>>>    so the argument values can be reinitialized by the PopFrame.
>>>>>>    Then this problem becomes just a test bug.
>>>>>>
>>>>>>
>>>>>> My preference is option #3.
>>>>>> The point is that if the arguments are not really used in
>>>>>> a method then restoring them to any values is a no-op.
>>>>>> It is really a meaningless use case, so why should we care about it?
>>>> Is "restoring" the proper term here? I thought they were just left 
>>>> on the stack and reused on the subsequent invoke.
>>>
>>> Agreed. The term "restoring" is not accurate here.
>>>
>>>> In fact I figured the reason for the language in the spec in the 
>>>> first place is to alleviate JVMTI from having to restore them to 
>>>> their original values, which is probably not even possible.
>>>
>>> Right.
>>>
>>>>>
>>>>> Thanks for setting that out clearly.
>>>>>
>>>>> I'd like to agree this particular case is a test bug. If we
>>>>> have a method:
>>>>>
>>>>> int incr(int val) {
>>>>>   val++;
>>>>>   popFrameHere();
>>>>>   return val;
>>>>> }
>>>>>
>>>>> then the change to the argument is necessary and must be 
>>>>> preserved. In contrast:
>>>>>
>>>>> void incr(int val) {
>>>>>   val++;
>>>>>   popFrameHere();
>>>>> }
>>>>>
>>>>> the change to the argument is meaningless and I would hope any 
>>>>> decent JIT would simply elide it.
>>>> So, this goes back to my example above where the program is trying 
>>>> to elicit behavior from the agent. It's not meaningless in that 
>>>> case, but that doesn't mean I think we need to support it.
>>>
>>> Even with this model it is possible and better to do something 
>>> meaningful to control the agent.
>>> This model is a very rare use case.
>>> It is hard to justify a need to support it. :)
>>>
>>>>>
>>>>> But we must have a consistent approach to such things. What would 
>>>>> happen if a breakpoint were to be placed on the instruction that 
>>>>> uselessly modified the argument - would we still see the 
>>>>> modification or would it be elided?
>>>> Breakpoints force interpreted mode for the method, although I 
>>>> suppose that's a hotspot implementation detail and not something a 
>>>> VM would be required to do. A VM that allows breakpoints in 
>>>> compiled methods has the potential to miss the breakpoint if code 
>>>> is elided.
>>>>
>>>> Also, what if you put a breakpoint in a method and the call to it is
>>>> elided? You would never hit the breakpoint. That could cause some
>>>> serious head scratching for a debugger user if they know the code 
>>>> doing the method call is "executed".
>>>
>>> If the method is not actually being called, then the missing breakpoint
>>> gives a clue about what is going on.
>>> Otherwise, it will cause some serious head scratching for a
>>> debugger user.
>>> In general, my preference would be to debug actual behavior.
>>> It is not good that we have no support for breakpoints in compiled methods.
>>>
>>>
>>>>>
>>>>> And how do C1 and C2 avoid this issue? Do they simply not optimise 
>>>>> away the useless assignment? Or do they actively disable that 
>>>>> optimization in this context?
>>>>>
>>>>> We need, IMO, to establish the basic philosophy of how to manage 
>>>>> JVM TI / JIT interactions, so we know what things must remain 
>>>>> visible and which can be optimised away.
>>>>>
>>>>> That said, changing the test allows us to defer having to reach 
>>>>> that consensus.
>>>> Agreed. I think it's ok to work around the test issue as long as we 
>>>> keep this overall issue on the radar. Do we have a bug filed for that?
>>>
>>> I thought, it is a little bit early to file a bug for it.
>>> Also, probably, it can be an umbrella enhancement or task.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Alex,
>>>>>>>
>>>>>>> The fix itself looks Okay.
>>>>>>> Minor: replace in the comment: "compiler don't drop" => 
>>>>>>> "compiler doesn't drop".
>>>>>>>
>>>>>>> However, we still have to reach a consensus on how we treat this 
>>>>>>> issue (as Chris already commented).
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 11/8/19 15:22, Alex Menkov wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> Please review the fix for
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8215196
>>>>>>>> webrev:
>>>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/
>>>>>>>>
>>>>>>>> Currently PopFrame is disabled with JVMCI by [1], so for 
>>>>>>>> testing I reverted [1] changes.
>>>>>>>>
>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025
>>>>>>>>
>>>>>>>> --alex
>>>>>>>
>>>>>>
>>>>
>>>>
>>>
>>
>>
>


From daniil.x.titov at oracle.com  Sat Dec  7 01:41:13 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Fri, 06 Dec 2019 17:41:13 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
Message-ID: <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>

Hi David, Mandy, and Bob,

Thank you for reviewing this fix.

Please review a new version of the fix [1] that includes the following changes compared to the previous version of the webrev (webrev.04):
1. The changes in Javadoc made in webrev.04 compared to webrev.03 and to the CSR [3] were discarded.
2. The implementation of the methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
     was also reverted to the webrev.03 version, which returns the host's values if the metrics are unavailable or cannot be properly read.
     I would like to mention that currently the native implementation of these methods may de facto return -1 in some circumstances,
     but I agree that the changes proposed in the previous version of the webrev increase that probability.
     I filed the follow-up issue [4] as Mandy suggested.
3. The legacy methods were renamed as David suggested.


> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
> !     static int initialized=1;
>
>  Am I reading this right that the code currently fails to actually do the
> initialization because of this ???

Yes, currently the code fails to do the initialization, but this went unnoticed since the method
get_cpuload_internal(...) was never called for a specific CPU: the first parameter "which"
was always -1.

>  test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>
> System.out.println(String.format(...)
>
> Why not simply
>
> System.out.printf(..)

As I tried to explain earlier, it would make the tests unstable.
System.out.printf(...) just delegates the call to System.out.format(...), which doesn't emit the string atomically.
Instead it parses the format string into a list of FormatString objects and then iterates over the list.
As a result, other trace output occasionally got printed between these iterations and broke the pattern the test expects to find
in the output.

For example, here is a sample of such output, where a trace message is printed between "OperatingSystemMXBean.getFreePhysicalMemorySize:"
and "1030762496".

<skipped>
[0.304s][trace][os,container] Memory Usage is: 42983424
OperatingSystemMXBean.getFreeMemorySize: 1030758400
[0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
[0.305s][trace][os,container] Memory Usage is: 42979328
[0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
1030762496
OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176

<skipped>
java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr 

	at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
	at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
	at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
	at java.base/java.lang.Thread.run(Thread.java:832)
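
The approach the test keeps (format the whole line first, then hand it to the
stream in one println call) can be sketched roughly as below. The class and
helper names are hypothetical and are not taken from the webrev; the point is
only that the complete line exists as a single String before anything is
written to stdout.

```java
public class AtomicLine {
    // Hypothetical helper: builds the complete output line up front so that
    // the label and the value are passed to the stream as one String.
    static String metricLine(String name, long value) {
        return String.format("%s: %d", name, value);
    }

    public static void main(String[] args) {
        // One println call with a pre-built String, instead of printf, which
        // writes the pieces of the format string one after another.
        System.out.println(
            metricLine("OperatingSystemMXBean.getFreePhysicalMemorySize", 1030762496L));
    }
}
```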

Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.

[1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 
[2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
[3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
[4] https://bugs.openjdk.java.net/browse/JDK-8235522 

Thank you,
Daniil

On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:

    
    On 12/6/19 5:59 AM, Bob Vandette wrote:
    >> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes at oracle.com>  wrote:
    >>
    >>
    >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
    >>
    >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
    
    I thought that the error case we are referring to is limit == 0 which 
    indicates something unexpected goes wrong.  So the compatibility concern 
    should be low.  This is very specific to Metrics implementation for 
    cgroup v1 and let me know if I'm wrong.
    
    >> Surely there must always be some information available from the operating environment? I see from the impl file:
    >>
    >>     // the host data, value 0 indicates that something went wrong while the metric was read and
    >>    // in this case we return "information unavailable" code -1.
    >>
    >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
    > I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
    > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
    > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
    > limits.
    >
    
    It's important to consider carefully if the monitoring API indicates an 
    error vs unavailable and an application should continue to run when the 
    monitoring system fails to get the metrics.
    
    There are several choices to report "something goes wrong" scenarios 
    (should unlikely happen???):
    1. fall back to a random positive value  (e.g. host value)
    2. return a negative value
    3. throw an exception
    
    #3 is not an option as the application is not expecting this.  For #2, 
    the application can filter bad values if desirable.
    
    I'm okay if you want to file a JBS issue to follow up and thoroughly
    look at the cases where the metrics are unavailable and the cases where
    we fail to obtain them.
    
    >> ---
    >>
    >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
    >>
    >> System.out.println(String.format(...)
    >>
    >> Why not simply
    >>
    >> System.out.printf(..)
    >>
    >> ?
    
    or simply (as I commented [1])
         System.out.format
    
    Mandy
    [1] 
    https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
    
    
From serguei.spitsyn at oracle.com  Sat Dec  7 02:12:07 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 6 Dec 2019 18:12:07 -0800
Subject: RFR: JDK-8215196: [Graal]
 vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with
 "changes for the arguments of the popped frame's method, did not remain
 current argument values"
In-Reply-To: <1994d5d6-383b-0abb-3203-ab42da9ab81c@oracle.com>
References: <dbcd93d3-5735-3cba-b6a5-6e2010e3809c@oracle.com>
 <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com>
 <fcf3b6ca-c1d6-02dd-16a1-f270797a3945@oracle.com>
 <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com>
 <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com>
 <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com>
 <444ed938-d873-12fc-a55e-2d645a099260@oracle.com>
 <3be1587b-c0cd-5f1e-dee2-072c341e02e6@oracle.com>
 <1994d5d6-383b-0abb-3203-ab42da9ab81c@oracle.com>
Message-ID: <6987c267-970c-d2e1-ce70-b7f18e9ff329@oracle.com>

On 12/6/19 17:24, Daniel D. Daugherty wrote:
> On 12/6/19 6:26 PM, serguei.spitsyn at oracle.com wrote:
>> On 12/6/19 13:52, Chris Plummer wrote:
>>> On 12/6/19 1:18 PM, serguei.spitsyn at oracle.com wrote:
>>>> On 12/6/19 11:07, Chris Plummer wrote:
>>>>> On 12/5/19 6:45 PM, David Holmes wrote:
>>>>>> Hi Serguei,
>>>>>>
>>>>>> On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Chris and Alex,
>>>>>>>
>>>>>>> (I've also included Dan, David and Dean to the mailing list)
>>>>>>>
>>>>>>> We have to reach a consensus about this.
>>>>>>
>>>>>> This is just part of a much broader issue with JVM TI that I 
>>>>>> tried to have a discussion started based on Richard Reingruber's 
>>>>>> proposals around Escape Analysis:
>>>>>>
>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html 
>>>>>>
>>>>>>
>>>>>> Unfortunately that discussion did not get much traction.
>>>>> Hmm. I have the emails that precede yours above, but not that one. 
>>>>> Not sure what happened. Just read through it and it did give
>>>>> me one thought.
>>>>
>>>>> Consider a model where the program is designed to drive behavior of
>>>>> the agent, triggering the agent to do certain things by having the 
>>>>> program do certain things. Normally an agent monitors the 
>>>>> application, but in this case the application is purposefully 
>>>>> controlling actions performed by the agent. If code is elided from 
>>>>> the program, then the agent no longer performs as expected. It's a 
>>>>> kind of backwards jvmti programming model, and you may ask why 
>>>>> would anyone do this. I'm not sure if there's a good reason for 
>>>>> it, but should it be expected to work given how the spec is written?
>>>>
>>>> My interpretation is that the current JVM TI PopFrame behavior does 
>>>> not break this model.
>>>> The spec says: "any changes to the arguments, which occurred in the 
>>>> called method, remain;"
>>>> As the code was eliminated by the compiler then no changes to this 
>>>> argument occurred.
>>>> So, the PopFrame behavior follows the spec. So, I think, the option 
>>>> #2 is not right. But it depends on our basic philosophy.
>>>> If the developer wants to control the agent then the program has to 
>>>> be designed to do something meaningful that is not going to be 
>>>> optimized out by the JIT compiler.
>>> You misunderstood my point. What I'm saying is that someone might do 
>>> something like assign to a local with the specific intent of having 
>>> that trigger a jvmti event, with the specific intent of having the
>>> agent perform some expected action as a result. Think of it as being
>>> a trigger for the agent, not as the agent monitoring the app. For
>>> example, you could write a program + agent, and setting a specific
>>> local in the program triggers the agent to turn on a light, and 
>>> setting some other local turns it off. Absurd, but possible, and 
>>> maybe there are less absurd applications.
>>
>> I think, I understood your point correctly.
>> Your point is that the code that can be eliminated (e.g. local++) is
>> not as meaningless as it seems to be.
>> My point is that there are still other, more reliable ways to trigger
>> the agent.
>> So relying on something that can be eliminated by JIT compilers
>> is not important to support.
>
> You are making the assumption that the agent author understands what
> Java code/variables *might* be eliminated by the JIT compiler. I don't
> think that's a good assumption. I might have code that does a really
> complicated thing in a local variable that is only useful to the
> agent itself. The JIT will see that the local variable cannot escape
> the function and is not used outside the function (as far as it can
> see) so it will elide the local variable and the code that was used
> to generate the local result in the variable.
>
> If that local result happens to be some computation that the agent
> needed to see to do its next operation...

Thank you for sharing your point.
I'm not insisting on my assumptions here, just not sure this is more 
important than allowing optimizations.
Do you actually think this use case needs to be supported?
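
To make the two positions concrete, here is a minimal Java sketch. The class, method and field names are made up for illustration and do not come from the test or the webrev. Whether a given JIT actually removes the first computation, and whether an agent would observe either one, is an assumption that depends on the VM and on the JVM TI capabilities in use.

```java
public class ElisionSketch {
    // Hypothetical "trigger" field: a side effect visible outside the method.
    public static volatile int agentSignal;

    // The local 'scratch' never escapes and is never read afterwards,
    // so a JIT compiler may treat the whole computation as dead code.
    public static void deadLocal(int input) {
        int scratch = input * 31 + 7;
        scratch++;
    }

    // Same computation, but the result is published to a static volatile
    // field, which the compiler generally has to preserve.
    public static void publishedResult(int input) {
        int scratch = input * 31 + 7;
        scratch++;
        agentSignal = scratch;
    }

    public static void main(String[] args) {
        deadLocal(1);
        publishedResult(1);
        System.out.println(agentSignal);
    }
}
```

The first method is the kind of code Dan describes; the second is one example of the "more reliable" trigger I have in mind.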

In general, to identify our philosophy about the interaction between JIT
compiler code elimination and JVM TI we need to make some assumptions.


Let's temporarily put JVM TI out of scope.
Are there any assumptions when JIT compilers eliminate some code?
Is it based on some notion of what code is observable?
If it was decided some code can be eliminated then is it JVM TI only 
that breaks such assumptions about observability?

If so, then such optimizations can be disabled at some level.
Then we end up debugging/profiling/monitoring, and finally, observing a 
slightly different application.
Are we Okay with this? Do we need any compromises here?
Maybe we need more flags to control the JIT compiler behavior.

Thanks,
Serguei

>
> Dan
>
>
>>
>> Thanks,
>> Serguei
>>
>>> Chris
>>>>
>>>>>>
>>>>>>> We have 3 options:
>>>>>>>
>>>>>>> Option #1:
>>>>>>>    The JIT optimization to delete code which "looks useless"
>>>>>>>    has to be disabled if can_pop_frame capability is enabled.
>>>>>>>    Then this problem becomes a JIT compiler bug.
>>>>>>>
>>>>>>> Option #2:
>>>>>>>    Consider relaxing the JVMTI PopFrame spec by changing it to
>>>>>>> something like:
>>>>>>>    "Note however, that the original argument values are not
>>>>>>>     preserved and can be changed by the called method;"
>>>>>>>    Then this problem becomes a JVM TI spec bug.
>>>>>>>
>>>>>>> Option #3:
>>>>>>>    Consider it Okay for the compiler to eliminate useless code,
>>>>>>>    so the argument values can be reinitialized by the PopFrame.
>>>>>>>    Then this problem becomes just a test bug.
>>>>>>>
>>>>>>>
>>>>>>> My preference is option #3.
>>>>>>> The point is that if the arguments are not really used in
>>>>>>> a method then restoring them to any values is a no-op.
>>>>>>> It is a really meaningless use case, so why should we care about it?
>>>>> Is "restoring" the proper term here? I thought they were just left 
>>>>> on the stack and reused on the subsequent invoke.
>>>>
>>>> Agreed. The term "restoring" is not accurate here.
>>>>
>>>>> In fact I figured the reason for the language in the spec in the 
>>>>> first place is to alleviate JVMTI from having to restore them to 
>>>>> their original values, which is probably not even possible.
>>>>
>>>> Right.
>>>>
>>>>>>
>>>>>> Thanks for setting that out clearly.
>>>>>>
>>>>>> I'd like to agree this particular case is a test bug. If we
>>>>>> have a method:
>>>>>>
>>>>>> int incr(int val) {
>>>>>>   val++;
>>>>>>   popFrameHere();
>>>>>>   return val;
>>>>>> }
>>>>>>
>>>>>> then the change to the argument is necessary and must be 
>>>>>> preserved. In contrast:
>>>>>>
>>>>>> void incr(int val) {
>>>>>>   val++;
>>>>>>   popFrameHere();
>>>>>> }
>>>>>>
>>>>>> the change to the argument is meaningless and I would hope any 
>>>>>> decent JIT would simply elide it.
>>>>> So, this goes back to my example above where the program is trying 
>>>>> to elicit behavior from the agent. It's not meaningless in that 
>>>>> case, but that doesn't mean I think we need to support it.
>>>>
>>>> Even with this model it is possible and better to do something 
>>>> meaningful to control the agent.
>>>> This model is a very rare use case.
>>>> It is hard to justify a need to support it. :)
>>>>
>>>>>>
>>>>>> But we must have a consistent approach to such things. What would 
>>>>>> happen if a breakpoint were to be placed on the instruction that 
>>>>>> uselessly modified the argument - would we still see the 
>>>>>> modification or would it be elided?
>>>>> Breakpoints force interpreted mode for the method, although I 
>>>>> suppose that's a hotspot implementation detail and not something a 
>>>>> VM would be required to do. A VM that allows breakpoints in 
>>>>> compiled methods has the potential to miss the breakpoint if code 
>>>>> is elided.
>>>>>
>>>>> Also, what if you put a breakpoint in a method, the call to it is 
>>>>> elided. You would never hit the breakpoint. That could cause some 
>>>>> serious head scratching for a debugger user if they know the code 
>>>>> doing the method call is "executed".
>>>>
>>>> If the method is not actually being called then missing breakpoints 
>>>> there gives a clue what is going on.
>>>> Otherwise, it will cause some serious head scratching for a
>>>> debugger user.
>>>> In general, my preference would be to debug actual behavior.
>>>> It is not good that we have no support for breakpoints in compiled methods.
>>>>
>>>>
>>>>>>
>>>>>> And how do C1 and C2 avoid this issue? Do they simply not 
>>>>>> optimise away the useless assignment? Or do they actively disable 
>>>>>> that optimization in this context?
>>>>>>
>>>>>> We need, IMO, to establish the basic philosophy of how to manage 
>>>>>> JVM TI / JIT interactions, so we know what things must remain 
>>>>>> visible and which can be optimised away.
>>>>>>
>>>>>> That said, changing the test allows us to defer having to reach 
>>>>>> that consensus.
>>>>> Agreed. I think it's ok to work around the test issue as long as 
>>>>> we keep this overall issue on the radar. Do we have a bug field 
>>>>> for that?
>>>>
>>>> I thought, it is a little bit early to file a bug for it.
>>>> Also, probably, it can be an umbrella enhancement or task.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Hi Alex,
>>>>>>>>
>>>>>>>> The fix itself looks Okay.
>>>>>>>> Minor: replace in the comment: "compiler don't drop" => 
>>>>>>>> "compiler doesn't drop".
>>>>>>>>
>>>>>>>> However, we still have to reach a consensus on how we treat 
>>>>>>>> this issue (as Chris already commented).
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/8/19 15:22, Alex Menkov wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Please review the fix for
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8215196
>>>>>>>>> webrev:
>>>>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/
>>>>>>>>>
>>>>>>>>> Currently PopFrame is disabled with JVMCI by [1], so for 
>>>>>>>>> testing I reverted [1] changes.
>>>>>>>>>
>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025
>>>>>>>>>
>>>>>>>>> --alex
>>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>


From chris.plummer at oracle.com  Sat Dec  7 05:28:42 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 6 Dec 2019 21:28:42 -0800
Subject: RFR: JDK-8215196: [Graal]
 vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with
 "changes for the arguments of the popped frame's method, did not remain
 current argument values"
In-Reply-To: <3be1587b-c0cd-5f1e-dee2-072c341e02e6@oracle.com>
References: <dbcd93d3-5735-3cba-b6a5-6e2010e3809c@oracle.com>
 <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com>
 <fcf3b6ca-c1d6-02dd-16a1-f270797a3945@oracle.com>
 <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com>
 <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com>
 <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com>
 <444ed938-d873-12fc-a55e-2d645a099260@oracle.com>
 <3be1587b-c0cd-5f1e-dee2-072c341e02e6@oracle.com>
Message-ID: <84d715ed-6b54-1456-0b2b-0b7291e01698@oracle.com>

On 12/6/19 3:26 PM, serguei.spitsyn at oracle.com wrote:
> On 12/6/19 13:52, Chris Plummer wrote:
>> On 12/6/19 1:18 PM, serguei.spitsyn at oracle.com wrote:
>>> On 12/6/19 11:07, Chris Plummer wrote:
>>>> On 12/5/19 6:45 PM, David Holmes wrote:
>>>>> Hi Serguei,
>>>>>
>>>>> On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Chris and Alex,
>>>>>>
>>>>>> (I've also included Dan, David and Dean to the mailing list)
>>>>>>
>>>>>> We have to reach a consensus about this.
>>>>>
>>>>> This is just part of a much broader issue with JVM TI that I tried 
>>>>> to have a discussion started based on Richard Reingruber's 
>>>>> proposals around Escape Analysis:
>>>>>
>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html 
>>>>>
>>>>>
>>>>> Unfortunately that discussion did not get much traction.
>>>> Hmm. I have the emails that precede yours above, but not that one. 
>>>> Not sure what happened. Just read through it and it did give me
>>>> one thought.
>>>
>>>> Consider a model where the program is designed to drive behavior of
>>>> the agent, triggering the agent to do certain things by having the 
>>>> program do certain things. Normally an agent monitors the 
>>>> application, but in this case the application is purposefully 
>>>> controlling actions performed by the agent. If code is elided from 
>>>> the program, then the agent no longer performs as expected. It's a 
>>>> kind of backwards jvmti programming model, and you may ask why 
>>>> would anyone do this. I'm not sure if there's a good reason for it, 
>>>> but should it be expected to work given how the spec is written?
>>>
>>> My interpretation is that the current JVM TI PopFrame behavior does 
>>> not break this model.
>>> The spec says: "any changes to the arguments, which occurred in the 
>>> called method, remain;"
>>> As the code was eliminated by the compiler then no changes to this 
>>> argument occurred.
>>> So, the PopFrame behavior follows the spec. So, I think, the option 
>>> #2 is not right. But it depends on our basic philosophy.
>>> If the developer wants to control the agent then the program has to 
>>> be designed to do something meaningful that is not going to be 
>>> optimized out by the JIT compiler.
>> You misunderstood my point. What I'm saying is that someone might do 
>> something like assign to a local with the specific intent of having 
>> that trigger a jvmti event, with the specific intent of having the
>> agent perform some expected action as a result. Think of it as being
>> a trigger for the agent, not as the agent monitoring the app. For
>> example, you could write a program + agent, and setting a specific
>> local in the program triggers the agent to turn on a light, and 
>> setting some other local turns it off. Absurd, but possible, and 
>> maybe there are less absurd applications.
>
> I think, I understood your point correctly.
> Your point is that the code that can be eliminated (e.g. local++) is
> not as meaningless as it seems to be.
> My point is that there are still other, more reliable ways to trigger
> the agent.
> So relying on something that can be eliminated by JIT compilers
> is not important to support.
>
Yes, I wasn't trying to imply that it is important. However, the spec 
should be clear about it.

Chris
> Thanks,
> Serguei
>
>> Chris
>>>
>>>>>
>>>>>> We have 3 options:
>>>>>>
>>>>>> Option #1:
>>>>>>    The JIT optimization to delete code which "looks useless"
>>>>>>    has to be disabled if can_pop_frame capability is enabled.
>>>>>>    Then this problem becomes a JIT compiler bug.
>>>>>>
>>>>>> Option #2:
>>>>>>    Consider relaxing the JVMTI PopFrame spec by changing it to
>>>>>> something like:
>>>>>>    "Note however, that the original argument values are not
>>>>>>     preserved and can be changed by the called method;"
>>>>>>    Then this problem becomes a JVM TI spec bug.
>>>>>>
>>>>>> Option #3:
>>>>>>    Consider it Okay for the compiler to eliminate useless code,
>>>>>>    so the argument values can be reinitialized by the PopFrame.
>>>>>>    Then this problem becomes just a test bug.
>>>>>>
>>>>>>
>>>>>> My preference is option #3.
>>>>>> The point is that if the arguments are not really used in
>>>>>> a method then restoring them to any values is a no-op.
>>>>>> It is a really meaningless use case, so why should we care about it?
>>>> Is "restoring" the proper term here? I thought they were just left 
>>>> on the stack and reused on the subsequent invoke.
>>>
>>> Agreed. The term "restoring" is not accurate here.
>>>
>>>> In fact I figured the reason for the language in the spec in the 
>>>> first place is to alleviate JVMTI from having to restore them to 
>>>> their original values, which is probably not even possible.
>>>
>>> Right.
>>>
>>>>>
>>>>> Thanks for setting that out clearly.
>>>>>
>>>>> I'd like to agree this particular case is a test bug. If we
>>>>> have a method:
>>>>>
>>>>> int incr(int val) {
>>>>>   val++;
>>>>>   popFrameHere();
>>>>>   return val;
>>>>> }
>>>>>
>>>>> then the change to the argument is necessary and must be 
>>>>> preserved. In contrast:
>>>>>
>>>>> void incr(int val) {
>>>>>   val++;
>>>>>   popFrameHere();
>>>>> }
>>>>>
>>>>> the change to the argument is meaningless and I would hope any 
>>>>> decent JIT would simply elide it.
>>>> So, this goes back to my example above where the program is trying 
>>>> to elicit behavior from the agent. It's not meaningless in that 
>>>> case, but that doesn't mean I think we need to support it.
>>>
>>> Even with this model it is possible and better to do something 
>>> meaningful to control the agent.
>>> This model is a very rare use case.
>>> It is hard to justify a need to support it. :)
>>>
>>>>>
>>>>> But we must have a consistent approach to such things. What would 
>>>>> happen if a breakpoint were to be placed on the instruction that 
>>>>> uselessly modified the argument - would we still see the 
>>>>> modification or would it be elided?
>>>> Breakpoints force interpreted mode for the method, although I 
>>>> suppose that's a hotspot implementation detail and not something a 
>>>> VM would be required to do. A VM that allows breakpoints in 
>>>> compiled methods has the potential to miss the breakpoint if code 
>>>> is elided.
>>>>
>>>> Also, what if you put a breakpoint in a method, the call to it is 
>>>> elided. You would never hit the breakpoint. That could cause some 
>>>> serious head scratching for a debugger user if they know the code 
>>>> doing the method call is "executed".
>>>
>>> If the method is not actually being called then missing breakpoints 
>>> there gives a clue what is going on.
>>> Otherwise, it will cause some serious head scratching for a
>>> debugger user.
>>> In general, my preference would be to debug actual behavior.
>>> It is not good that we have no support for breakpoints in compiled methods.
>>>
>>>
>>>>>
>>>>> And how do C1 and C2 avoid this issue? Do they simply not optimise 
>>>>> away the useless assignment? Or do they actively disable that 
>>>>> optimization in this context?
>>>>>
>>>>> We need, IMO, to establish the basic philosophy of how to manage 
>>>>> JVM TI / JIT interactions, so we know what things must remain 
>>>>> visible and which can be optimised away.
>>>>>
>>>>> That said, changing the test allows us to defer having to reach 
>>>>> that consensus.
>>>> Agreed. I think it's ok to work around the test issue as long as we 
>>>> keep this overall issue on the radar. Do we have a bug field for that?
>>>
>>> I thought, it is a little bit early to file a bug for it.
>>> Also, probably, it can be an umbrella enhancement or task.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Alex,
>>>>>>>
>>>>>>> The fix itself looks Okay.
>>>>>>> Minor: replace in the comment: "compiler don't drop" => 
>>>>>>> "compiler doesn't drop".
>>>>>>>
>>>>>>> However, we still have to reach a consensus on how we treat this 
>>>>>>> issue (as Chris already commented).
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 11/8/19 15:22, Alex Menkov wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> Please review the fix for
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8215196
>>>>>>>> webrev:
>>>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/
>>>>>>>>
>>>>>>>> Currently PopFrame is disabled with JVMCI by [1], so for 
>>>>>>>> testing I reverted [1] changes.
>>>>>>>>
>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025
>>>>>>>>
>>>>>>>> --alex
>>>>>>>
>>>>>>
>>>>
>>>>
>>>
>>
>>
>


From chris.plummer at oracle.com  Sat Dec  7 05:31:57 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 6 Dec 2019 21:31:57 -0800
Subject: RFR: JDK-8215196: [Graal]
 vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with
 "changes for the arguments of the popped frame's method, did not remain
 current argument values"
In-Reply-To: <6987c267-970c-d2e1-ce70-b7f18e9ff329@oracle.com>
References: <dbcd93d3-5735-3cba-b6a5-6e2010e3809c@oracle.com>
 <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com>
 <fcf3b6ca-c1d6-02dd-16a1-f270797a3945@oracle.com>
 <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com>
 <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com>
 <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com>
 <444ed938-d873-12fc-a55e-2d645a099260@oracle.com>
 <3be1587b-c0cd-5f1e-dee2-072c341e02e6@oracle.com>
 <1994d5d6-383b-0abb-3203-ab42da9ab81c@oracle.com>
 <6987c267-970c-d2e1-ce70-b7f18e9ff329@oracle.com>
Message-ID: <c5ff7df9-9fc5-15ea-3f6a-720eb44e7c75@oracle.com>

On 12/6/19 6:12 PM, serguei.spitsyn at oracle.com wrote:
> On 12/6/19 17:24, Daniel D. Daugherty wrote:
>> On 12/6/19 6:26 PM, serguei.spitsyn at oracle.com wrote:
>>> On 12/6/19 13:52, Chris Plummer wrote:
>>>> On 12/6/19 1:18 PM, serguei.spitsyn at oracle.com wrote:
>>>>> On 12/6/19 11:07, Chris Plummer wrote:
>>>>>> On 12/5/19 6:45 PM, David Holmes wrote:
>>>>>>> Hi Serguei,
>>>>>>>
>>>>>>> On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Hi Chris and Alex,
>>>>>>>>
>>>>>>>> (I've also included Dan, David and Dean to the mailing list)
>>>>>>>>
>>>>>>>> We have to reach a consensus about this.
>>>>>>>
>>>>>>> This is just part of a much broader issue with JVM TI that I 
>>>>>>> tried to have a discussion started based on Richard Reingruber's 
>>>>>>> proposals around Escape Analysis:
>>>>>>>
>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html 
>>>>>>>
>>>>>>>
>>>>>>> Unfortunately that discussion did not get much traction.
>>>>>> Hmm. I have the emails that precede yours above, but not that 
>>>>>> one. Not sure what happened. Just read through it and it did
>>>>>> give me one thought.
>>>>>
>>>>>> Consider a model where the program is designed to drive behavior of
>>>>>> the agent, triggering the agent to do certain things by having 
>>>>>> the program do certain things. Normally an agent monitors the 
>>>>>> application, but in this case the application is purposefully 
>>>>>> controlling actions performed by the agent. If code is elided 
>>>>>> from the program, then the agent no longer performs as expected. 
>>>>>> It's a kind of backwards jvmti programming model, and you may ask 
>>>>>> why would anyone do this. I'm not sure if there's a good reason 
>>>>>> for it, but should it be expected to work given how the spec is 
>>>>>> written?
>>>>>
>>>>> My interpretation is that the current JVM TI PopFrame behavior 
>>>>> does not break this model.
>>>>> The spec says: "any changes to the arguments, which occurred in 
>>>>> the called method, remain;"
>>>>> As the code was eliminated by the compiler then no changes to this 
>>>>> argument occurred.
>>>>> So, the PopFrame behavior follows the spec. So, I think, the 
>>>>> option #2 is not right. But it depends on our basic philosophy.
>>>>> If the developer wants to control the agent then the program has 
>>>>> to be designed to do something meaningful that is not going to be 
>>>>> optimized out by the JIT compiler.
>>>> You misunderstood my point. What I'm saying is that someone might 
>>>> do something like assign to a local with the specific intent of 
>>>> having that trigger a jvmti event, with the specific intent of
>>>> having the agent perform some expected action as a result. Think of
>>>> it as being a trigger for the agent, not as the agent monitoring
>>>> the app. For example, you could write a program + agent, and
>>>> setting a specific local in the program triggers the agent to turn 
>>>> on a light, and setting some other local turns it off. Absurd, but 
>>>> possible, and maybe there are less absurd applications.
>>>
>>> I think, I understood your point correctly.
>>> Your point is that the code that can be eliminated (e.g. local++) is
>>> not as meaningless as it seems to be.
>>> My point is that there are still other, more reliable ways to trigger
>>> the agent.
>>> So relying on something that can be eliminated by JIT compilers
>>> is not important to support.
>>
>> You are making the assumption that the agent author understands what
>> Java code/variables *might* be eliminated by the JIT compiler. I don't
>> think that's a good assumption. I might have code that does a really
>> complicated thing in a local variable that is only useful to the
>> agent itself. The JIT will see that the local variable cannot escape
>> the function and is not used outside the function (as far as it can
>> see) so it will elide the local variable and the code that was used
>> to generate the local result in the variable.
>>
>> If that local result happens to be some computation that the agent
>> needed to see to do its next operation...
>
> Thank you for sharing your point.
> I'm not insisting on my assumptions here, just not sure this is more 
> important than allowing optimizations.
> Do you actually think this use case needs to be supported?
>
> In general, to identify our philosophy about the interaction between JIT
> compiler code elimination and JVM TI we need to make some assumptions.
>
>
> Let's temporarily put JVM TI out of scope.
> Are there any assumptions when JIT compilers eliminate some code?
> Is it based on some notion of what code is observable?
> If it was decided some code can be eliminated then is it JVM TI only 
> that breaks such assumptions about observability?
>
> If so, then such optimizations can be disabled at some level.
> Then we end up debugging/profiling/monitoring, and finally, observing 
> a slightly different application.
> Are we Okay with this? Do we need any compromises here?
> Maybe we need more flags to control the JIT compiler behavior.
>
Dan is restating the point I was making, but I also agree that unless 
someone can show us a useful application of that kind of use of jvmti 
events, I don't think we need to support it. We do need to clarify it in 
the spec however.

Chris
> Thanks,
> Serguei
>
>>
>> Dan
>>
>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>> Chris
>>>>>
>>>>>>>
>>>>>>>> We have 3 options:
>>>>>>>>
>>>>>>>> Option #1:
>>>>>>>>    The JIT optimization to delete code which "looks useless"
>>>>>>>>    has to be disabled if can_pop_frame capability is enabled.
>>>>>>>>    Then this problem becomes a JIT compiler bug.
>>>>>>>>
>>>>>>>> Option #2:
>>>>>>>>    Consider relaxing the JVMTI PopFrame spec by changing it to
>>>>>>>> something like:
>>>>>>>>    "Note however, that the original argument values are not
>>>>>>>>     preserved and can be changed by the called method;"
>>>>>>>>    Then this problem becomes a JVM TI spec bug.
>>>>>>>>
>>>>>>>> Option #3:
>>>>>>>>    Consider it Okay for the compiler to eliminate useless code,
>>>>>>>>    so the argument values can be reinitialized by the PopFrame.
>>>>>>>>    Then this problem becomes just a test bug.
>>>>>>>>
>>>>>>>>
>>>>>>>> My preference is option #3.
>>>>>>>> The point is that if the arguments are not really used in
>>>>>>>> a method then restoring them to any values is a no-op.
>>>>>>>> It is a really meaningless use case, so why should we care about it?
>>>>>> Is "restoring" the proper term here? I thought they were just 
>>>>>> left on the stack and reused on the subsequent invoke.
>>>>>
>>>>> Agreed. The term "restoring" is not accurate here.
>>>>>
>>>>>> In fact I figured the reason for the language in the spec in the 
>>>>>> first place is to alleviate JVMTI from having to restore them to 
>>>>>> their original values, which is probably not even possible.
>>>>>
>>>>> Right.
>>>>>
>>>>>>>
>>>>>>> Thanks for setting that out clearly.
>>>>>>>
>>>>>>> I'd like to agree this particular case is a test bug. If we
>>>>>>> have a method:
>>>>>>>
>>>>>>> int incr(int val) {
>>>>>>>   val++;
>>>>>>>   popFrameHere();
>>>>>>>   return val;
>>>>>>> }
>>>>>>>
>>>>>>> then the change to the argument is necessary and must be 
>>>>>>> preserved. In contrast:
>>>>>>>
>>>>>>> void incr(int val) {
>>>>>>>   val++;
>>>>>>>   popFrameHere();
>>>>>>> }
>>>>>>>
>>>>>>> the change to the argument is meaningless and I would hope any 
>>>>>>> decent JIT would simply elide it.
>>>>>> So, this goes back to my example above where the program is 
>>>>>> trying to elicit behavior from the agent. It's not meaningless in 
>>>>>> that case, but that doesn't mean I think we need to support it.
>>>>>
>>>>> Even with this model it is possible and better to do something 
>>>>> meaningful to control the agent.
>>>>> This model is a very rare use case.
>>>>> It is hard to justify a need to support it. :)
>>>>>
>>>>>>>
>>>>>>> But we must have a consistent approach to such things. What 
>>>>>>> would happen if a breakpoint were to be placed on the 
>>>>>>> instruction that uselessly modified the argument - would we 
>>>>>>> still see the modification or would it be elided?
>>>>>> Breakpoints force interpreted mode for the method, although I 
>>>>>> suppose that's a hotspot implementation detail and not something 
>>>>>> a VM would be required to do. A VM that allows breakpoints in 
>>>>>> compiled methods has the potential to miss the breakpoint if code 
>>>>>> is elided.
>>>>>>
>>>>>> Also, what if you put a breakpoint in a method, the call to it is 
>>>>>> elided. You would never hit the breakpoint. That could cause some 
>>>>>> serious head scratching for a debugger user if they know the code 
>>>>>> doing the method call is "executed".
>>>>>
>>>>> If the method is not actually being called then missing 
>>>>> breakpoints there gives a clue what is going on.
>>>>> Otherwise, it will cause some serious head scratching for a
>>>>> debugger user.
>>>>> In general, my preference would be to debug actual behavior.
>>>>> It is not good that we have no support for breakpoints in compiled methods.
>>>>>
>>>>>
>>>>>>>
>>>>>>> And how do C1 and C2 avoid this issue? Do they simply not 
>>>>>>> optimise away the useless assignment? Or do they actively 
>>>>>>> disable that optimization in this context?
>>>>>>>
>>>>>>> We need, IMO, to establish the basic philosophy of how to manage 
>>>>>>> JVM TI / JIT interactions, so we know what things must remain 
>>>>>>> visible and which can be optimised away.
>>>>>>>
>>>>>>> That said, changing the test allows us to defer having to reach 
>>>>>>> that consensus.
>>>>>> Agreed. I think it's ok to work around the test issue as long as 
>>>>>> we keep this overall issue on the radar. Do we have a bug field 
>>>>>> for that?
>>>>>
>>>>> I thought, it is a little bit early to file a bug for it.
>>>>> Also, probably, it can be an umbrella enhancement or task.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Alex,
>>>>>>>>>
>>>>>>>>> The fix itself looks Okay.
>>>>>>>>> Minor: replace in the comment: "compiler don't drop" => 
>>>>>>>>> "compiler doesn't drop".
>>>>>>>>>
>>>>>>>>> However, we still have to reach a consensus on how we treat 
>>>>>>>>> this issue (as Chris already commented).
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 11/8/19 15:22, Alex Menkov wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Please review the fix for
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8215196
>>>>>>>>>> webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/
>>>>>>>>>>
>>>>>>>>>> Currently PopFrame is disabled with JVMCI by [1], so for 
>>>>>>>>>> testing I reverted [1] changes.
>>>>>>>>>>
>>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025
>>>>>>>>>>
>>>>>>>>>> --alex
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>


From leonid.mesnik at oracle.com  Sun Dec  8 02:17:00 2019
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Sat, 7 Dec 2019 18:17:00 -0800
Subject: RFR: 8235530: Removed duplicated threadByName methods in nsk/jdi tests
Message-ID: <5ac9ca0e-d756-3340-597e-2a03e6e6fa24@oracle.com>

Hi

Could you please review the following fix, which just removes duplicated 
threadByName methods and JDITestRuntimeException exceptions in nsk/jdi 
tests. I don't see any reason to have so many copies of them.

The method threadByName is added to the nsk.share.jdi.Debugee class as 
'threadByNameOrThrow' because a slightly different 'threadByName' already 
exists there. I filed another sub-task, 
https://bugs.openjdk.java.net/browse/JDK-8235544, to review usage and 
merge these 2 methods later.

This fix affects about 4000 lines and I want to keep it as 
straightforward as possible.

webrev: http://cr.openjdk.java.net/~lmesnik/8235530/webrev.00/

bug: https://bugs.openjdk.java.net/browse/JDK-8235530

The next planned steps are in:

https://bugs.openjdk.java.net/browse/JDK-8233830

Leonid


From serguei.spitsyn at oracle.com  Sun Dec  8 04:30:33 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Sat, 7 Dec 2019 20:30:33 -0800
Subject: RFR: 8235530: Removed duplicated threadByName methods in nsk/jdi
 tests
In-Reply-To: <5ac9ca0e-d756-3340-597e-2a03e6e6fa24@oracle.com>
References: <5ac9ca0e-d756-3340-597e-2a03e6e6fa24@oracle.com>
Message-ID: <f2de7bfa-e885-1357-ca89-0be7add5a0e1@oracle.com>

Hi Leonid,

The fix looks good.

Thank you for taking care of it!
I agree, it is an awful duplication.

Thanks,
Serguei


On 12/7/19 18:17, Leonid Mesnik wrote:
> Hi
>
> Could you please review following fix which just remove duplicated 
> threadByName methods and JDITestRuntimeException exceptions in nsk/jdi 
> tests. I don't see any reason to have so many copies of them.
>
> The method threadByName is added nsk.share.jdi.Debugee class as 
> 'threadByNameOrThrow' because slightly different 'threadByName' 
> already exist there. I filed another sub-task 
> https://bugs.openjdk.java.net/browse/JDK-8235544 to review usage and 
> merge these 2 methods later.
>
> This fix affects about ~4000 lines and I want to keep it as 
> straight-forward as possible.
>
> webrev: http://cr.openjdk.java.net/~lmesnik/8235530/webrev.00/
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8235530
>
> The next planned steps are in:
>
> https://bugs.openjdk.java.net/browse/JDK-8233830
>
> Leonid
>


From david.holmes at oracle.com  Sun Dec  8 05:19:36 2019
From: david.holmes at oracle.com (David Holmes)
Date: Sun, 8 Dec 2019 15:19:36 +1000
Subject: RFR: 8235530: Removed duplicated threadByName methods in nsk/jdi
 tests
In-Reply-To: <f2de7bfa-e885-1357-ca89-0be7add5a0e1@oracle.com>
References: <5ac9ca0e-d756-3340-597e-2a03e6e6fa24@oracle.com>
 <f2de7bfa-e885-1357-ca89-0be7add5a0e1@oracle.com>
Message-ID: <1a9d76e5-3a73-9804-be3f-93d6864465be@oracle.com>

+1 on both counts

Not sure JDITestRuntimeException is really necessary/useful versus just 
using RuntimeException, but that's a different issue.

Thanks,
David

On 8/12/2019 2:30 pm, serguei.spitsyn at oracle.com wrote:
> Hi Leonid,
> 
> The fix looks good.
> 
> Thank you for taking care about it!
> I agree, it is an awful duplication.
> 
> Thanks,
> Serguei
> 
> 
> On 12/7/19 18:17, Leonid Mesnik wrote:
>> Hi
>>
>> Could you please review following fix which just remove duplicated 
>> threadByName methods and JDITestRuntimeException exceptions in nsk/jdi 
>> tests. I don't see any reason to have so many copies of them.
>>
>> The method threadByName is added nsk.share.jdi.Debugee class as 
>> 'threadByNameOrThrow' because slightly different 'threadByName' 
>> already exist there. I filed another sub-task 
>> https://bugs.openjdk.java.net/browse/JDK-8235544 to review usage and 
>> merge these 2 methods later.
>>
>> This fix affects about ~4000 lines and I want to keep it as 
>> straight-forward as possible.
>>
>> webrev: http://cr.openjdk.java.net/~lmesnik/8235530/webrev.00/
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8235530
>>
>> The next planned steps are in:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8233830
>>
>> Leonid
>>
> 

From david.holmes at oracle.com  Mon Dec  9 04:49:11 2019
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 9 Dec 2019 14:49:11 +1000
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
Message-ID: <0a412f63-1583-6789-9afa-cebbc968e7c4@oracle.com>

Hi Daniil,

On 7/12/2019 11:41 am, Daniil Titov wrote:
> Hi David, Mandy, and Bob,
> 
> Thank you for reviewing this fix.
> 
> Please review a new version of the fix [1] that includes the following changes compared to the previous version of the webrev (webrev.04):
> 1. The changes in Javadoc made in webrev.04 compared to webrev.03 and to CSR [3] were discarded.

Okay.

> 2.  The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
>       was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read.

Okay.

>       I would like to mention that currently the native implementation of these methods may de facto return -1 in some circumstances,
>       but I agree that the changes proposed in the previous version of the webrev increase that probability.
>       I filed the follow-up issue [4] as Mandy suggested.

I added a comment to the bug. This is potentially a difficult problem to 
resolve - it all depends on the likelihood of any errors and what they 
really indicate.

> 3.  The legacy methods were renamed as David suggested.

Thanks!
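
The fallback behaviour restored in item 2 above can be sketched roughly as follows. This is only an illustration, not the code from the webrev: the class and method names are hypothetical, and treating any value less than or equal to zero as "unavailable or not properly read" is an assumption based on the discussion of the 0 return quoted later in this message.

```java
class HostFallbackSketch {
    // Hypothetical helper: prefer the container metric, but fall back to the
    // host value when the container metric is unusable.
    // Assumption: a value <= 0 means the metric is unavailable or could not
    // be read properly.
    static long containerOrHost(long containerValue, long hostValue) {
        return containerValue > 0 ? containerValue : hostValue;
    }

    public static void main(String[] args) {
        // The container metric could not be read, so the host value is reported.
        System.out.println(containerOrHost(0L, 1030762496L));
        // The container metric is usable, so it wins over the host value.
        System.out.println(containerOrHost(499122176L, 1030762496L));
    }
}
```

With this shape a caller never sees -1 from the container path, which is the compatibility property the reviewers asked for; whether the host value is the right answer in every failure case is what the follow-up issue [4] is meant to examine.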

> 
>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
>> !     static int initialized=1;
>>
>>   Am I reading this right that the code currently fails to actually do the
>> initialization because of this ???
> 
> Yes, currently the code fails to do the initialization, but it went unnoticed since the method
> get_cpuload_internal(...) was never called for a specific CPU; the first parameter "which"
> was always -1.

So we never try to access the uninitialized counters.cpus array which is 
good but we still return garbage for counters.jvmTicks and 
counters.cpuTicks - surely that should have been noticeable?

>>   test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>>
>> System.out.println(String.format(...)
>>
>> Why not simply
>>
>> System.out.printf(..)
> 
> As I tried to explain earlier, it would make the tests unstable.
> System.out.printf(...) just delegates the call to System.out.format(...), which doesn't emit the string atomically.
> Instead it parses the format string into a list of FormatString objects and then iterates over the list.
> As a result, other traces occasionally got printed between these iterations and broke the pattern the test expects to find
> in the output.

Sorry I missed the earlier explanation. I find it somewhat surprising 
that format() works that way, but without unlimited buffering there will 
always be a need to flush the outputstream at some point.
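
The workaround kept in the test (String.format followed by a single println) can be sketched as below. This is a minimal illustration; the class name and helper methods are hypothetical and are not taken from CheckOperatingSystemMXBean.java. It only shows the idea of building the complete line before handing it to the stream, which, as described in this thread, narrows the window in which other output can land in the middle of the line.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

class AtomicLineSketch {
    // Hypothetical helper: format the complete line first, then pass it to
    // the stream in a single println call.
    static void printLine(PrintStream out, String format, Object... args) {
        out.println(String.format(format, args));
    }

    // Captures what printLine emits so that the result can be inspected.
    static String capture(String format, Object... args) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(bytes, true, StandardCharsets.UTF_8);
        printLine(out, format, args);
        return bytes.toString(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        printLine(System.out,
                  "OperatingSystemMXBean.getFreePhysicalMemorySize: %d",
                  1030762496L);
    }
}
```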

Thanks,
David
-----

> For example, here is a sample of such output, with the trace message printed between "OperatingSystemMXBean.getFreePhysicalMemorySize:"
> and "1030762496".
> 
> <skipped>
> [0.304s][trace][os,container] Memory Usage is: 42983424
> OperatingSystemMXBean.getFreeMemorySize: 1030758400
> [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
> [0.305s][trace][os,container] Memory Usage is: 42979328
> [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
> OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
> 1030762496
> OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
> 
> <skipped>
> java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr
> 
> 	at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
> 	at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
> 	at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
> 	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
> 	at java.base/java.lang.Thread.run(Thread.java:832)
> 
> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
> 
> [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
> [4] https://bugs.openjdk.java.net/browse/JDK-8235522
> 
> Thank you,
> Daniil
> 
> On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
> 
>      
>      
>      On 12/6/19 5:59 AM, Bob Vandette wrote:
>      >> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes at oracle.com>  wrote:
>      >>
>      >>
>      >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
>      >>
>      >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
>      
>      I thought that the error case we are referring to is limit == 0 which
>      indicates something unexpected goes wrong.  So the compatibility concern
>      should be low.  This is very specific to Metrics implementation for
>      cgroup v1 and let me know if I'm wrong.
>      
>      >> Surely there must always be some information available from the operating environment? I see from the impl file:
>      >>
>      >>     // the host data, value 0 indicates that something went wrong while the metric was read and
>      >>    // in this case we return "information unavailable" code -1.
>      >>
>      >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
>      > I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
>      > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
>      > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
>      > limits.
>      >
>      
>      It's important to consider carefully if the monitoring API indicates an
>      error vs unavailable and an application should continue to run when the
>      monitoring system fails to get the metrics.
>      
>      There are several choices to report "something goes wrong" scenarios
>      (should unlikely happen???):
>      1. fall back to a random positive value  (e.g. host value)
>      2. return a negative value
>      3. throw an exception
>      
>      #3 is not an option as the application is not expecting this.  For #2,
>      the application can filter bad values if desirable.
>      
>      I'm okay if you want to file a JBS issue to follow up and thoroughly
>      look at the cases that the metrics are unavailable and the cases when
>      fails to obtain.
>      
>      >> ---
>      >>
>      >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>      >>
>      >> System.out.println(String.format(...)
>      >>
>      >> Why not simply
>      >>
>      >> System.out.printf(..)
>      >>
>      >> ?
>      
>      or simply (as I commented [1])
>           System.out.format
>      
>      Mandy
>      [1]
>      https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
>      
>      
> 
> 

From ralf.schmelter at sap.com  Mon Dec  9 09:01:08 2019
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Mon, 9 Dec 2019 09:01:08 +0000
Subject: RFR (M) 8234510: Remove file seeking requirement for writing a
 heap dump
In-Reply-To: <CAA-vtUxV40m+-qQ0N7SZmOe9FotFfyr1snQGe0T5pK50LKmJXg@mail.gmail.com>
References: <AM0PR02MB45008C66EC315E9836F7FF7A9F4A0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <CAA-vtUxV40m+-qQ0N7SZmOe9FotFfyr1snQGe0T5pK50LKmJXg@mail.gmail.com>
Message-ID: <AM0PR02MB4500B68C26D844003956EC339F580@AM0PR02MB4500.eurprd02.prod.outlook.com>

Hi Thomas,

thanks for the feedback.

> In DumpWriter, _current_entry_left and _entry_ended seem only to be needed for
> asserting. Please enclose their definition in DEBUG_ONLY, and initialize them in the ctor.

Good catch. I made them debug only.

> (not your patch): since DumperSupport::dump_class_and_array_classes(Klass*)
> should assert that Klass* is an InstanceKlass; or, even better, use InstanceKlass*
> as parameter.

A former version of the patch had a lot of the Klass* types replaced by InstanceKlass* where appropriate. I ultimately removed those changes because they did not have much to do with making the heap dump streamable. But a later patch could change this.

> DumpWriter::start_dump_entry(): It took me a while to understand how the
> segment size is updated if the entry is huge, since by the time we finish the
> entry the segment header will already be flushed out. The answer is, I think,
> that this is not needed since we only write one record so the initial size we wrote
> into the segment header is still valid.

Correct.
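
The point confirmed above can be illustrated with a small sketch. It is not the HotSpot DumpWriter: the class, the simplified 5-byte header (1-byte tag plus 4-byte length), and the tag value are assumptions made for illustration. The segment length is seeded with the size of the first record. If more records fit in the buffer, the length is patched in the buffer before the segment is flushed. If a record is too large for the buffer, the header is flushed first and the seeded length is already final, because that segment holds only the one record. Nothing ever seeks in the output.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

class SegmentWriterSketch {
    static final byte SEGMENT_TAG = 0x1C; // illustrative tag value
    static final int HEADER_SIZE = 5;     // simplified: 1-byte tag + 4-byte length

    private final OutputStream out;
    private final ByteBuffer buf;         // holds the open segment, header included
    private boolean segmentOpen;

    SegmentWriterSketch(OutputStream out, int capacity) {
        this.out = out;
        this.buf = ByteBuffer.allocate(capacity); // capacity must exceed HEADER_SIZE
    }

    void writeRecord(byte[] record) throws IOException {
        if (segmentOpen && buf.remaining() < record.length) {
            finishSegment();               // no room left: close the current segment
        }
        if (!segmentOpen) {
            buf.put(SEGMENT_TAG);
            buf.putInt(record.length);     // seed the length with the first record's size
            segmentOpen = true;
        }
        if (record.length > buf.remaining()) {
            // Huge record: the header leaves the buffer before the body is
            // written. The seeded length stays valid because this segment
            // contains only this one record.
            out.write(buf.array(), 0, buf.position());
            out.write(record);
            buf.clear();
            segmentOpen = false;
            return;
        }
        buf.put(record);
    }

    void finishSegment() throws IOException {
        if (!segmentOpen) {
            return;
        }
        // Patch the length while the header is still in the buffer.
        buf.putInt(1, buf.position() - HEADER_SIZE);
        out.write(buf.array(), 0, buf.position());
        buf.clear();
        segmentOpen = false;
    }

    // Writes two small records and one huge record with a 16-byte buffer.
    static byte[] demo() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        SegmentWriterSketch writer = new SegmentWriterSketch(sink, 16);
        try {
            writer.writeRecord(new byte[3]);
            writer.writeRecord(new byte[4]);
            writer.writeRecord(new byte[20]);
            writer.finishSegment();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }
}
```

In demo(), the first segment carries the two small records with its length patched to their combined size, and the second segment carries the 20-byte record with the length that was seeded when the header was written.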

> Proposed comment change:
>
> -// Will be fixed up later if we can add more entries.
> +// Seed segment size with size of its first record. Should we add more records later,
>    we will update the segment size (see >finish_dump_segment())

I've changed the comment to make it clearer.

Best regards,
Ralf

From bob.vandette at oracle.com  Mon Dec  9 15:17:33 2019
From: bob.vandette at oracle.com (Bob Vandette)
Date: Mon, 9 Dec 2019 10:17:33 -0500
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
Message-ID: <EE35A423-7893-44B5-A0B6-F45A0E0A3AE1@oracle.com>

Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException
so it's consistent with the other get functions?

Bob.


> On Dec 6, 2019, at 8:41 PM, Daniil Titov <daniil.x.titov at oracle.com> wrote:
> 
> Hi David, Mandy, and Bob,
> 
> Thank you for reviewing this fix.
> 
> Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04)
> 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded.
> 2.  The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
>     was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read.
>     I would like to mention that  currently the native implementation of these methods de-facto may return -1 at some circumstances,
>     but I agree that the changes proposed in the previous version of the webrev increase such probability.
>     I filed the follow-up issue [4] as Mandy suggested.
> 3.  The legacy methods were renamed as David suggested.
> 
> 
>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
>> !     static int initialized=1;
>> 
>> Am I reading this right that the code currently fails to actually do the
>> initialization because of this ???
> 
> Yes, currently the code fails to do the initialization but it was unnoticed since method 
> get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which"
> was always -1.
> 
>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>> 
>> System.out.println(String.format(...)
>> 
>> Why not simply
>> 
>> System.out.printf(..)
> 
> As I tried explain it earlier it would make the tests unstable.
> System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically.
> Instead it parses the format string into a list of FormatString objects and then iterates over the list.
> As a result, the other traces occasionally got printed between these iterations  and break the pattern the test is expected to find
> in the output.
> 
> For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:"
> and "1030762496".
> 
> <skipped>
> [0.304s][trace][os,container] Memory Usage is: 42983424
> OperatingSystemMXBean.getFreeMemorySize: 1030758400
> [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
> [0.305s][trace][os,container] Memory Usage is: 42979328
> [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
> OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
> 1030762496
> OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
> 
> <skipped>
> java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr 
> 
> 	at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
> 	at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
> 	at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
> 	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
> 	at java.base/java.lang.Thread.run(Thread.java:832)
> 
> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
> 
> [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05 
> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
> [4] https://bugs.openjdk.java.net/browse/JDK-8235522 
> 
> Thank you,
> Daniil
> 
> On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
> 
> 
> 
>    On 12/6/19 5:59 AM, Bob Vandette wrote:
>>> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes at oracle.com>  wrote:
>>> 
>>> 
>>> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
>>> 
>>> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
> 
>    I thought that the error case we are referring to is limit == 0 which 
>    indicates something unexpected goes wrong.  So the compatibility concern 
>    should be low.  This is very specific to Metrics implementation for 
>    cgroup v1 and let me know if I'm wrong.
> 
>>> Surely there must always be some information available from the operating environment? I see from the impl file:
>>> 
>>>    // the host data, value 0 indicates that something went wrong while the metric was read and
>>>   // in this case we return "information unavailable" code -1.
>>> 
>>> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
>> I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
>> Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
>> are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
>> limits.
>> 
> 
>    It's important to consider carefully if the monitoring API indicates an 
>    error vs unavailable and an application should continue to run when the 
>    monitoring system fails to get the metrics.
> 
>    There are several choices to report "something goes wrong" scenarios 
>    (should unlikely happen???):
>    1. fall back to a random positive value  (e.g. host value)
>    2. return a negative value
>    3. throw an exception
> 
>    #3 is not an option as the application is not expecting this.  For #2, 
>    the application can filter bad values if desirable.
> 
>    I'm okay if you want to file a JBS issue to follow up and thoroughly 
>    look at the cases that the metrics are unavailable and the cases when 
>    fails to obtain.
> 
>>> ---
>>> 
>>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>>> 
>>> System.out.println(String.format(...)
>>> 
>>> Why not simply
>>> 
>>> System.out.printf(..)
>>> 
>>> ?
> 
>    or simply (as I commented [1])
>         System.out.format
> 
>    Mandy
>    [1] 
>    https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
> 
> 
> 
> 


From mandy.chung at oracle.com  Mon Dec  9 17:48:51 2019
From: mandy.chung at oracle.com (Mandy Chung)
Date: Mon, 9 Dec 2019 09:48:51 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <EE35A423-7893-44B5-A0B6-F45A0E0A3AE1@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
 <EE35A423-7893-44B5-A0B6-F45A0E0A3AE1@oracle.com>
Message-ID: <95ab70c9-8be4-13cf-ef90-f36b2e993450@oracle.com>

Files.lines requires a FilePermission check, so it needs to be wrapped 
with doPrivileged. readFilePrivileged can unwrap and throw the 
cause instead, like this:


    static Stream<String> readFilePrivileged(Path path) throws IOException {
        try {
            return AccessController.doPrivileged(
                (PrivilegedExceptionAction<Stream<String>>) () -> Files.lines(path));
        } catch (PrivilegedActionException e) {
            Throwable x = e.getCause();
            if (x instanceof IOException)
                throw (IOException)x;
            if (x instanceof RuntimeException)
                throw (RuntimeException)x;
            if (x instanceof Error)
                throw (Error)x;

            throw new InternalError(x);
        }
    }

On 12/9/19 7:17 AM, Bob Vandette wrote:
> Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException
> so it's consistent with the other get functions?
>
> Bob.
>
>
>> On Dec 6, 2019, at 8:41 PM, Daniil Titov <daniil.x.titov at oracle.com> wrote:
>>
>> Hi David, Mandy, and Bob,
>>
>> Thank you for reviewing this fix.
>>
>> Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04)
>> 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded.
>> 2.  The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
>>      was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read.
>>      I would like to mention that  currently the native implementation of these methods de-facto may return -1 at some circumstances,
>>      but I agree that the changes proposed in the previous version of the webrev increase such probability.
>>      I filed the follow-up issue [4] as Mandy suggested.
>> 3.  The legacy methods were renamed as David suggested.
>>
>>
>>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
>>> !     static int initialized=1;
>>>
>>> Am I reading this right that the code currently fails to actually do the
>>> initialization because of this ???
>> Yes, currently the code fails to do the initialization but it was unnoticed since method
>> get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which"
>> was always -1.
>>
>>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>>>
>>> System.out.println(String.format(...)
>>>
>>> Why not simply
>>>
>>> System.out.printf(..)
>> As I tried explain it earlier it would make the tests unstable.
>> System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically.
>> Instead it parses the format string into a list of FormatString objects and then iterates over the list.
>> As a result, the other traces occasionally got printed between these iterations  and break the pattern the test is expected to find
>> in the output.
>>
>> For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:"
>> and "1030762496".
>>
>> <skipped>
>> [0.304s][trace][os,container] Memory Usage is: 42983424
>> OperatingSystemMXBean.getFreeMemorySize: 1030758400
>> [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
>> [0.305s][trace][os,container] Memory Usage is: 42979328
>> [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
>> OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
>> 1030762496
>> OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
>>
>> <skipped>
>> java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr
>>
>> 	at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
>> 	at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
>> 	at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
>> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>> 	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
>> 	at java.base/java.lang.Thread.run(Thread.java:832)
>>
>> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
>>
>> [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
>> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
>> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
>> [4] https://bugs.openjdk.java.net/browse/JDK-8235522
>>
>> Thank you,
>> Daniil
>>
>> On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
>>
>>
>>
>>     On 12/6/19 5:59 AM, Bob Vandette wrote:
>>>> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes at oracle.com>  wrote:
>>>>
>>>>
>>>> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
>>>>
>>>> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
>>     I thought that the error case we are referring to is limit == 0 which
>>     indicates something unexpected goes wrong.  So the compatibility concern
>>     should be low.  This is very specific to Metrics implementation for
>>     cgroup v1 and let me know if I'm wrong.
>>
>>>> Surely there must always be some information available from the operating environment? I see from the impl file:
>>>>
>>>>     // the host data, value 0 indicates that something went wrong while the metric was read and
>>>>    // in this case we return "information unavailable" code -1.
>>>>
>>>> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
>>> I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
>>> Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
>>> are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
>>> limits.
>>>
>>     It's important to consider carefully if the monitoring API indicates an
>>     error vs unavailable and an application should continue to run when the
>>     monitoring system fails to get the metrics.
>>
>>     There are several choices to report "something goes wrong" scenarios
>>     (should unlikely happen???):
>>     1. fall back to a random positive value  (e.g. host value)
>>     2. return a negative value
>>     3. throw an exception
>>
>>     #3 is not an option as the application is not expecting this.  For #2,
>>     the application can filter bad values if desirable.
>>
>>     I'm okay if you want to file a JBS issue to follow up and thoroughly
>>     look at the cases that the metrics are unavailable and the cases when
>>     fails to obtain.
>>
>>>> ---
>>>>
>>>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>>>>
>>>> System.out.println(String.format(...)
>>>>
>>>> Why not simply
>>>>
>>>> System.out.printf(..)
>>>>
>>>> ?
>>     or simply (as I commented [1])
>>          System.out.format
>>
>>     Mandy
>>     [1]
>>     https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
>>
>>
>>
>>


From larry.cable at oracle.com  Mon Dec  9 18:21:44 2019
From: larry.cable at oracle.com (Laurence Cable)
Date: Mon, 9 Dec 2019 10:21:44 -0800
Subject: RFR: JDK-8215196: [Graal]
 vmTestbase/nsk/jvmti/PopFrame/popframe003/TestDescription.java fails with
 "changes for the arguments of the popped frame's method, did not remain
 current argument values"
In-Reply-To: <6987c267-970c-d2e1-ce70-b7f18e9ff329@oracle.com>
References: <dbcd93d3-5735-3cba-b6a5-6e2010e3809c@oracle.com>
 <9b65ade5-4f37-bbb5-7a28-3ad82bc32d6e@oracle.com>
 <fcf3b6ca-c1d6-02dd-16a1-f270797a3945@oracle.com>
 <259cfdde-f790-1350-af23-df513ccab4eb@oracle.com>
 <08f043d9-4dfc-b1f4-d11e-0ecd3eb35b36@oracle.com>
 <37e588b5-b6f1-e41b-5e42-708fb44f8bc2@oracle.com>
 <444ed938-d873-12fc-a55e-2d645a099260@oracle.com>
 <3be1587b-c0cd-5f1e-dee2-072c341e02e6@oracle.com>
 <1994d5d6-383b-0abb-3203-ab42da9ab81c@oracle.com>
 <6987c267-970c-d2e1-ce70-b7f18e9ff329@oracle.com>
Message-ID: <7f5bda1a-5741-e670-86fe-57767f0afa94@oracle.com>

inline...

On 12/6/19 6:12 PM, serguei.spitsyn at oracle.com wrote:
> On 12/6/19 17:24, Daniel D. Daugherty wrote:
>> On 12/6/19 6:26 PM, serguei.spitsyn at oracle.com wrote:
>>> On 12/6/19 13:52, Chris Plummer wrote:
>>>> On 12/6/19 1:18 PM, serguei.spitsyn at oracle.com wrote:
>>>>> On 12/6/19 11:07, Chris Plummer wrote:
>>>>>> On 12/5/19 6:45 PM, David Holmes wrote:
>>>>>>> Hi Serguei,
>>>>>>>
>>>>>>> On 6/12/2019 11:31 am, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Hi Chris and Alex,
>>>>>>>>
>>>>>>>> (I've also included Dan, David and Dean to the mailing list)
>>>>>>>>
>>>>>>>> We have to reach a consensus about this.
>>>>>>>
>>>>>>> This is just part of a much broader issue with JVM TI that I 
>>>>>>> tried to have a discussion started based on Richard Reingruber's 
>>>>>>> proposals around Escape Analysis:
>>>>>>>
>>>>>>> http://mail.openjdk.java.net/pipermail/serviceability-dev/2019-September/029285.html 
>>>>>>>
>>>>>>>
>>>>>>> Unfortunately that discussion did not get much traction.
>>>>>> Hmm. I have the emails that precede yours above, but not that 
>>>>>> one. Not sure what happened. Just read through it and it did 
>>>>>> give me one thought.
>>>>>
>>>>>> Consider a model where the program is designed to drive the behavior of 
>>>>>> the agent, triggering the agent to do certain things by having 
>>>>>> the program do certain things. Normally an agent monitors the 
>>>>>> application, but in this case the application is purposefully 
>>>>>> controlling actions performed by the agent. If code is elided 
>>>>>> from the program, then the agent no longer performs as expected. 
>>>>>> It's a kind of backwards jvmti programming model, and you may ask 
>>>>>> why would anyone do this. I'm not sure if there's a good reason 
>>>>>> for it, but should it be expected to work given how the spec is 
>>>>>> written?
>>>>>
>>>>> My interpretation is that the current JVM TI PopFrame behavior 
>>>>> does not break this model.
>>>>> The spec says: "any changes to the arguments, which occurred in 
>>>>> the called method, remain;"
>>>>> As the code was eliminated by the compiler then no changes to this 
>>>>> argument occurred.
>>>>> So, the PopFrame behavior follows the spec. So, I think, the 
>>>>> option #2 is not right. But it depends on our basic philosophy.
>>>>> If the developer wants to control the agent then the program has 
>>>>> to be designed to do something meaningful that is not going to be 
>>>>> optimized out by the JIT compiler.
>>>> You misunderstood my point. What I'm saying is that someone might 
>>>> do something like assign to a local with the specific intent of 
>>>> having that trigger a jvmti event, with the specific intent of 
>>>> having the agent perform some expected action as a result. Think of 
>>>> it as being a trigger for the agent, not as the agent monitoring 
>>>> the app. For example, you could write a program + agent, and 
>>>> setting a specific local in the program triggers the agent to turn 
>>>> on a light, and setting some other local turns it off. Absurd, but 
>>>> possible, and maybe there are less absurd applications.
>>>
>>> I think, I understood your point correctly.
>>> Your point is that the code that can be eliminated (e.g. local++) is 
>>> not that meaningless as it seems to be.
>>> My point is that there are still other more reliable ways to trigger 
>>> the agent.
>>> So that relying on something that can be eliminated by JIT compilers 
>>> is not important to support.
>>
>> You are making the assumption that the agent author understands what
>> Java code/variables *might* be eliminated by the JIT compiler. I don't
>> think that's a good assumption. I might have code that does a really
>> complicated thing in a local variable that is only useful to the
>> agent itself. The JIT will see that the local variable cannot escape
>> the function and is not used outside the function (as far as it can
>> see) so it will elide the local variable and the code that was used
>> to generate the result in that variable.
>>
>> If that local result happens to be some computation that the agent
>> needed to see to do its next operation...
>
> Thank you for sharing your point.
> I'm not insisting on my assumptions here, just not sure this is more 
> important than allowing optimizations.
> Do you actually think this use case needs to be supported?
>
> In general, to identify our philosophy about interaction between JIT 
> compiler
> code elimination and JVM TI we need to make some assumptions.
>
>
> Let's temporarily put JVM TI out of scope.
> Are there any assumptions when JIT compilers eliminate some code?
> Is it based on some vision of what code is observable?
> If it was decided some code can be eliminated then is it JVM TI only 
> that breaks such assumptions about observability?
>
> If so, then such optimizations can be disabled at some level.
> Then we end up debugging/profiling/monitoring, and finally, observing 
> a slightly different application.
> Are we Okay with this? Do we need any compromises here?
> Maybe we need more flags to control the JIT compiler behavior.
I would say that "in general" (not Java specific) there is an implicit 
assumption that compiler optimization and "debugging" are diametrically 
opposed to each other, and thus one cannot assume/expect that the two can 
transparently co-exist: you either optimize or you debug. In practice 
there is a sliding scale between the two extremes, fully optimized 
and no optimization (and hence fully debug-able).

The question is: where does "observe-ability" (as distinct from 
debugging) lie on that continuum?

In most "ahead of time" language compilers, where all the code 
generation occurs during the compilation phase, the developer can choose to
inform the compiler if their intent is to debug the resulting code (no 
mutating optimizations, and full metadata retention), or to optimize for
"production" execution (maximal optimization, and no retention of debug 
metadata).

The JVM moves some of this activity to run time, which I think is an 
implementation detail; there is still a "contract" between the activity 
of the "compilation" component and the "debug/observe-ability" component 
of the runtime environment.

The debug/observe-ability component can only interact with the code that 
the "compiler" generates (AOT & JIT etc).

In other language toolchains, the specification of the intent to "debug" 
typically constrains the "compiler" from making mutating optimizations that
would result in an execution behavior that is not broadly equivalent to 
that expressed at source level.

In short, I think we have two options: either define the behavior of the 
execution environment as "undefined", that is, the compilers and runtime are
permitted to mutate the generated code away from a form equivalent to that 
expressed in source; or add the ability to express the intent of the
generated code, such that when that intent is to debug it, mutating 
optimizations are suppressed by the compiler and runtime.
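
As a concrete illustration of a "mutating optimization" in the sense used 
above, here is a minimal Java sketch modelled on the incr example quoted 
further down in this thread. The class and method names are hypothetical, 
and whether any particular JIT actually removes the dead increment is an 
assumption, not something this code demonstrates.

```java
// Sketch only: contrasts a store the program itself observes with one that
// only a debugger or JVM TI agent inspecting the frame could observe.
final class DeadStoreSketch {

    // The incremented argument is returned, so the store is visible to the
    // caller and must be preserved by any optimizing compiler.
    static int incrUsed(int val) {
        val++;
        return val;
    }

    // The incremented argument is never read again by the program. A compiler
    // may treat the store as dead; only a tool looking at the frame's locals
    // (for example around a PopFrame) could tell the difference.
    static void incrUnused(int val) {
        val++;
    }

    public static void main(String[] args) {
        System.out.println(incrUsed(41)); // prints 42
        incrUnused(41);                   // no effect visible to the program
    }
}
```

Under the first option the behavior of incrUnused as seen by an agent is 
simply unspecified; under the second, a declared "debug" intent would have 
to keep the increment alive.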

- Larry

> Thanks,
> Serguei
>
>>
>> Dan
>>
>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>> Chris
>>>>>
>>>>>>>
>>>>>>>> We have 3 options:
>>>>>>>>
>>>>>>>> Option #1:
>>>>>>>>    The JIT optimization that deletes code which "looks useless"
>>>>>>>>    has to be disabled if the can_pop_frame capability is enabled.
>>>>>>>>    Then this problem becomes a JIT compiler bug.
>>>>>>>>
>>>>>>>> Option #2:
>>>>>>>>    Consider relaxing the JVMTI PopFrame spec by changing it to 
>>>>>>>> something like:
>>>>>>>>    "Note however, that the original argument values are not
>>>>>>>>     preserved and can be changed by the called method;"
>>>>>>>>    Then this problem becomes a JVM TI spec bug.
>>>>>>>>
>>>>>>>> Option #3:
>>>>>>>>    Consider it Okay for the compiler to eliminate useless code,
>>>>>>>>    so the argument values can be reinitialized by the PopFrame.
>>>>>>>>    Then this problem becomes just a test bug.
>>>>>>>>
>>>>>>>>
>>>>>>>> My preference is option #3.
>>>>>>>> The point is that if the arguments are not really used in
>>>>>>>> a method then restoring them to any values is a no-op.
>>>>>>>> It is really a meaningless use case, so why should we care about it?
>>>>>> Is "restoring" the proper term here? I thought they were just 
>>>>>> left on the stack and reused on the subsequent invoke.
>>>>>
>>>>> Agreed. The term "restoring" is not accurate here.
>>>>>
>>>>>> In fact I figured the reason for the language in the spec in the 
>>>>>> first place is to alleviate JVMTI from having to restore them to 
>>>>>> their original values, which is probably not even possible.
>>>>>
>>>>> Right.
>>>>>
>>>>>>>
>>>>>>> Thanks for setting that out clearly.
>>>>>>>
>>>>>>> I'd like to agree that this particular case is a test bug. If we 
>>>>>>> have a method:
>>>>>>>
>>>>>>> int incr(int val) {
>>>>>>>   val++;
>>>>>>>   popFrameHere();
>>>>>>>   return val;
>>>>>>> }
>>>>>>>
>>>>>>> then the change to the argument is necessary and must be 
>>>>>>> preserved. In contrast:
>>>>>>>
>>>>>>> void incr(int val) {
>>>>>>>   val++;
>>>>>>>   popFrameHere();
>>>>>>> }
>>>>>>>
>>>>>>> the change to the argument is meaningless and I would hope any 
>>>>>>> decent JIT would simply elide it.
>>>>>> So, this goes back to my example above where the program is 
>>>>>> trying to elicit behavior from the agent. It's not meaningless in 
>>>>>> that case, but that doesn't mean I think we need to support it.
>>>>>
>>>>> Even with this model it is possible and better to do something 
>>>>> meaningful to control the agent.
>>>>> This model is a very rare use case.
>>>>> It is hard to justify a need to support it. :)
>>>>>
>>>>>>>
>>>>>>> But we must have a consistent approach to such things. What 
>>>>>>> would happen if a breakpoint were to be placed on the 
>>>>>>> instruction that uselessly modified the argument - would we 
>>>>>>> still see the modification or would it be elided?
>>>>>> Breakpoints force interpreted mode for the method, although I 
>>>>>> suppose that's a hotspot implementation detail and not something 
>>>>>> a VM would be required to do. A VM that allows breakpoints in 
>>>>>> compiled methods has the potential to miss the breakpoint if code 
>>>>>> is elided.
>>>>>>
>>>>>> Also, what if you put a breakpoint in a method, the call to it is 
>>>>>> elided. You would never hit the breakpoint. That could cause some 
>>>>>> serious head scratching for a debugger user if they know the code 
>>>>>> doing the method call is "executed".
>>>>>
>>>>> If the method is not actually being called then missing 
>>>>> breakpoints there gives a clue what is going on.
>>>>> Otherwise, it will cause some serious head scratching for a 
>>>>> debugger user.
>>>>> In general, my preference would be to debug actual behavior.
>>>>> It is not good that we have no support for breakpoints in compiled methods.
>>>>>
>>>>>
>>>>>>>
>>>>>>> And how do C1 and C2 avoid this issue? Do they simply not 
>>>>>>> optimise away the useless assignment? Or do they actively 
>>>>>>> disable that optimization in this context?
>>>>>>>
>>>>>>> We need, IMO, to establish the basic philosophy of how to manage 
>>>>>>> JVM TI / JIT interactions, so we know what things must remain 
>>>>>>> visible and which can be optimised away.
>>>>>>>
>>>>>>> That said, changing the test allows us to defer having to reach 
>>>>>>> that consensus.
>>>>>> Agreed. I think it's ok to work around the test issue as long as 
>>>>>> we keep this overall issue on the radar. Do we have a bug field 
>>>>>> for that?
>>>>>
>>>>> I thought, it is a little bit early to file a bug for it.
>>>>> Also, probably, it can be an umbrella enhancement or task.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/11/19 3:17 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Alex,
>>>>>>>>>
>>>>>>>>> The fix itself looks Okay.
>>>>>>>>> Minor: replace in the comment: "compiler don't drop" => 
>>>>>>>>> "compiler doesn't drop".
>>>>>>>>>
>>>>>>>>> However, we still have to reach a consensus on how we treat 
>>>>>>>>> this issue (as Chris already commented).
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 11/8/19 15:22, Alex Menkov wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Please review the fix for
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8215196
>>>>>>>>>> webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~amenkov/jdk14/popframe_args/webrev/
>>>>>>>>>>
>>>>>>>>>> Currently PopFrame is disabled with JVMCI by [1], so for 
>>>>>>>>>> testing I reverted [1] changes.
>>>>>>>>>>
>>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8218025
>>>>>>>>>>
>>>>>>>>>> --alex
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>


From daniil.x.titov at oracle.com  Mon Dec  9 18:51:15 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 09 Dec 2019 10:51:15 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <95ab70c9-8be4-13cf-ef90-f36b2e993450@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
 <EE35A423-7893-44B5-A0B6-F45A0E0A3AE1@oracle.com>
 <95ab70c9-8be4-13cf-ef90-f36b2e993450@oracle.com>
Message-ID: <E19C2394-51F4-4AAC-BB25-FD01319445A8@oracle.com>

Hi Mandy and Bob,

> Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException
> so it's consistent with the other get functions?

In this method both Files.newBufferedReader and bufferedReader.readLine could throw IOException, so for simplicity I just put
the whole code block in doPrivileged. On the other hand, I don't believe that BufferedReader.readLine() requires FilePermission checks (and tests confirmed that),
so we could change this implementation to the following:

    public static String getStringValue(SubSystem subsystem, String parm) {
        if (subsystem == null) return null;

        try (BufferedReader bufferedReader =
                     AccessController.doPrivileged((PrivilegedExceptionAction<BufferedReader>) () -> {
                         return Files.newBufferedReader(Paths.get(subsystem.path(), parm));
                     })) {
            return bufferedReader.readLine();
        } catch (PrivilegedActionException | IOException e) {
            return null;
        }
    }

Could you please advise whether you are OK with it, or would you prefer to proceed with the approach Mandy suggested,
unwrapping the PrivilegedActionException and throwing the cause instead?

Thank you,
Daniil

On 12/9/19, 9:48 AM, "Mandy Chung" <mandy.chung at oracle.com> wrote:

    Files.lines requires a FilePermission check, so it needs to be wrapped 
    with doPrivileged.  readFilePrivileged can unwrap and throw the 
    cause instead, like this:
    
    
         static Stream<String> readFilePrivileged(Path path) throws IOException {
              try {
                  return AccessController.doPrivileged(
                      (PrivilegedExceptionAction<Stream<String>>) () -> Files.lines(path));
              } catch (PrivilegedActionException e) {
                  Throwable x = e.getCause();
                  if (x instanceof IOException)
                       throw (IOException)x;
                  if (x instanceof RuntimeException)
                       throw (RuntimeException)x;
                  if (x instanceof Error)
                       throw (Error)x;
    
                  throw new InternalError(x);
              }
         }
    
    On 12/9/19 7:17 AM, Bob Vandette wrote:
    > Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException
    > so it's consistent with the other get functions?
    >
    > Bob.
    >
    >
    >> On Dec 6, 2019, at 8:41 PM, Daniil Titov <daniil.x.titov at oracle.com> wrote:
    >>
    >> Hi David, Mandy, and Bob,
    >>
    >> Thank you for reviewing this fix.
    >>
    >> Please review a new version of the fix [1] that includes the following changes compared to the previous version of the webrev (webrev.04):
    >> 1. The Javadoc changes made in webrev.04 relative to webrev.03 and to CSR [3] were discarded.
    >> 2.  The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
    >>      was also reverted to the webrev.03 version, which returns the host's values if the metrics are unavailable or cannot be properly read.
    >>      I would like to mention that currently the native implementation of these methods may de facto return -1 under some circumstances,
    >>      but I agree that the changes proposed in the previous version of the webrev increase that probability.
    >>      I filed the follow-up issue [4] as Mandy suggested.
    >> 3.  The legacy methods were renamed as David suggested.
    >>
    >>
    >>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
    >>> !     static int initialized=1;
    >>>
    >>> Am I reading this right that the code currently fails to actually do the
    >>> initialization because of this ???
    >> Yes, currently the code fails to do the initialization but it was unnoticed since method
    >> get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which"
    >> was always -1.
    >>
    >>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
    >>>
    >>> System.out.println(String.format(...)
    >>>
    >>> Why not simply
    >>>
    >>> System.out.printf(..)
    >> As I tried to explain earlier, it would make the tests unstable.
    >> System.out.printf(...) just delegates the call to System.out.format(...), which doesn't emit the string atomically.
    >> Instead it parses the format string into a list of FormatString objects and then iterates over the list.
    >> As a result, other trace output occasionally gets printed between these iterations and breaks the pattern the test expects to find
    >> in the output.
    >>
    >> For example, here is a sample of such output, with a trace message printed between "OperatingSystemMXBean.getFreePhysicalMemorySize:"
    >> and "1030762496".
    >>
    >> <skipped>
    >> [0.304s][trace][os,container] Memory Usage is: 42983424
    >> OperatingSystemMXBean.getFreeMemorySize: 1030758400
    >> [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
    >> [0.305s][trace][os,container] Memory Usage is: 42979328
    >> [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
    >> OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
    >> 1030762496
    >> OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
    >>
    >> <skipped>
    >> java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr
    >>
    >> 	at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
    >> 	at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
    >> 	at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
    >> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    >> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    >> 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    >> 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
    >> 	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
    >> 	at java.base/java.lang.Thread.run(Thread.java:832)
    >>
    >> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
    >>
    >> [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
    >> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
    >> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
    >> [4] https://bugs.openjdk.java.net/browse/JDK-8235522
    >>
    >> Thank you,
    >> Daniil
    >>
    >> On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
    >>
    >>
    >>
    >>     On 12/6/19 5:59 AM, Bob Vandette wrote:
    >>>> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes at oracle.com>  wrote:
    >>>>
    >>>>
    >>>> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
    >>>>
    >>>> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
    >>     I thought that the error case we are referring to is limit == 0 which
    >>     indicates something unexpected goes wrong.  So the compatibility concern
    >>     should be low.  This is very specific to Metrics implementation for
    >>     cgroup v1 and let me know if I'm wrong.
    >>
    >>>> Surely there must always be some information available from the operating environment? I see from the impl file:
    >>>>
    >>>>     // the host data, value 0 indicates that something went wrong while the metric was read and
    >>>>    // in this case we return "information unavailable" code -1.
    >>>>
    >>>> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
    >>> I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
    >>> Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
    >>> are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
    >>> limits.
    >>>
    >>     It's important to consider carefully if the monitoring API indicates an
    >>     error vs unavailable and an application should continue to run when the
    >>     monitoring system fails to get the metrics.
    >>
    >>     There are several choices to report "something goes wrong" scenarios
    >>     (should unlikely happen???):
    >>     1. fall back to a random positive value  (e.g. host value)
    >>     2. return a negative value
    >>     3. throw an exception
    >>
    >>     #3 is not an option as the application is not expecting this.  For #2,
    >>     the application can filter bad values if desirable.
    >>
    >>     I'm okay if you want to file a JBS issue to follow up and thoroughly
    >>     look at the cases that the metrics are unavailable and the cases when
    >>     fails to obtain.
    >>
    >>>> ---
    >>>>
    >>>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
    >>>>
    >>>> System.out.println(String.format(...)
    >>>>
    >>>> Why not simply
    >>>>
    >>>> System.out.printf(..)
    >>>>
    >>>> ?
    >>     or simply (as I commented [1])
    >>          System.out.format
    >>
    >>     Mandy
    >>     [1]
    >>     https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
    >>
    >>
    >>
    >>
    
    
From daniil.x.titov at oracle.com  Mon Dec  9 18:59:21 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 09 Dec 2019 10:59:21 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <89E9C74A-7962-4408-93C5-1AA947FD973D@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
 <EE35A423-7893-44B5-A0B6-F45A0E0A3AE1@oracle.com>
 <95ab70c9-8be4-13cf-ef90-f36b2e993450@oracle.com>
 <89E9C74A-7962-4408-93C5-1AA947FD973D@oracle.com>
Message-ID: <E842776C-D0F6-4CC8-B915-FF86B9AC0BE9@oracle.com>

A correction...

We could simplify it even further, as follows:

    public static String getStringValue(SubSystem subsystem, String parm) {
        if (subsystem == null) return null;

        try (BufferedReader bufferedReader = AccessController.doPrivileged((PrivilegedExceptionAction<BufferedReader>)
                () -> Files.newBufferedReader(Paths.get(subsystem.path(), parm)))) {
            return bufferedReader.readLine();
        } catch (PrivilegedActionException | IOException e) {
            return null;
        }
    }
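
For anyone who wants to try the pattern outside the JDK sources, below is a
minimal self-contained sketch. It assumes a plain directory string in place of
subsystem.path(), and the class name PrivilegedReadSketch is hypothetical; it
illustrates the pattern and is not the proposed patch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.AccessController;
import java.security.PrivilegedActionException;
import java.security.PrivilegedExceptionAction;

// Hypothetical stand-in for SubSystem.getStringValue: the directory string
// plays the role of subsystem.path().
@SuppressWarnings("removal") // AccessController is deprecated on newer JDKs
final class PrivilegedReadSketch {

    static String getStringValue(String dir, String parm) {
        if (dir == null) return null;

        // Only opening the file runs inside doPrivileged; readLine() runs outside.
        try (BufferedReader reader = AccessController.doPrivileged(
                (PrivilegedExceptionAction<BufferedReader>)
                        () -> Files.newBufferedReader(Paths.get(dir, parm)))) {
            return reader.readLine();
        } catch (PrivilegedActionException | IOException e) {
            // A missing or unreadable file is reported as "no value".
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("cgroup-sketch");
        Files.write(dir.resolve("memory.limit_in_bytes"), "1073741824\n".getBytes());
        System.out.println(getStringValue(dir.toString(), "memory.limit_in_bytes"));
        System.out.println(getStringValue(dir.toString(), "no.such.file"));
    }
}
```

The single multi-catch keeps the caller-visible behavior the same as before:
any failure to open or read the file yields null.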

Best regards,
Daniil

On 12/9/19, 10:51 AM, "Daniil Titov" <daniil.x.titov at oracle.com> wrote:

    Hi Mandy and Bob,
    
    > Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException
    > so it's consistent with the other get functions?
    
    In this method both Files.newBufferedReader and bufferedReader.readLine could throw IOException, so for simplicity I just put
    the whole code block in doPrivileged. On the other hand, I don't believe that BufferedReader.readLine() requires FilePermission checks (and tests confirmed that),
    so we could change this implementation to the following:
    
        public static String getStringValue(SubSystem subsystem, String parm) {
            if (subsystem == null) return null;
    
            try (BufferedReader bufferedReader =
                         AccessController.doPrivileged((PrivilegedExceptionAction<BufferedReader>) () -> {
                             return Files.newBufferedReader(Paths.get(subsystem.path(), parm));
                         })) {
                return bufferedReader.readLine();
            } catch (PrivilegedActionException | IOException e) {
                return null;
            }
        }
    
    Could you please advise whether you are OK with it, or would you prefer to proceed with the approach Mandy suggested,
    unwrapping the PrivilegedActionException and throwing the cause instead?
    
    Thank you,
    Daniil
    
    On 12/9/19, 9:48 AM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
    
        Files.lines requires a FilePermission check, so it needs to be wrapped 
        with doPrivileged.  readFilePrivileged can unwrap and throw the 
        cause instead, like this:
        
        
             static Stream<String> readFilePrivileged(Path path) throws IOException {
                  try {
                      return AccessController.doPrivileged(
                          (PrivilegedExceptionAction<Stream<String>>) () -> Files.lines(path));
                  } catch (PrivilegedActionException e) {
                      Throwable x = e.getCause();
                      if (x instanceof IOException)
                           throw (IOException)x;
                      if (x instanceof RuntimeException)
                           throw (RuntimeException)x;
                      if (x instanceof Error)
                           throw (Error)x;
        
                      throw new InternalError(x);
                  }
             }
        
        On 12/9/19 7:17 AM, Bob Vandette wrote:
        > Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException
        > so it's consistent with the other get functions?
        >
        > Bob.
        >
        >
        >> On Dec 6, 2019, at 8:41 PM, Daniil Titov <daniil.x.titov at oracle.com> wrote:
        >>
        >> Hi David, Mandy, and Bob,
        >>
        >> Thank you for reviewing this fix.
        >>
        >> Please review a new version of the fix [1] that includes the following changes compared to the previous version of the webrev (webrev.04):
        >> 1. The Javadoc changes made in webrev.04 relative to webrev.03 and to CSR [3] were discarded.
        >> 2.  The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
        >>      was also reverted to the webrev.03 version, which returns the host's values if the metrics are unavailable or cannot be properly read.
        >>      I would like to mention that currently the native implementation of these methods may de facto return -1 under some circumstances,
        >>      but I agree that the changes proposed in the previous version of the webrev increase that probability.
        >>      I filed the follow-up issue [4] as Mandy suggested.
        >> 3.  The legacy methods were renamed as David suggested.
        >>
        >>
        >>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
        >>> !     static int initialized=1;
        >>>
        >>> Am I reading this right that the code currently fails to actually do the
        >>> initialization because of this ???
        >> Yes, currently the code fails to do the initialization but it was unnoticed since method
        >> get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which"
        >> was always -1.
        >>
        >>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
        >>>
        >>> System.out.println(String.format(...)
        >>>
        >>> Why not simply
        >>>
        >>> System.out.printf(..)
        >> As I tried to explain earlier, it would make the tests unstable.
        >> System.out.printf(...) just delegates the call to System.out.format(...), which doesn't emit the string atomically.
        >> Instead it parses the format string into a list of FormatString objects and then iterates over the list.
        >> As a result, other traces occasionally get printed between these iterations and break the pattern the test expects to find
        >> in the output.
        >>
        >> For example, here is a sample of such output that has the trace message printed between "OperatingSystemMXBean.getFreePhysicalMemorySize:"
        >> and "1030762496".
        >>
        >> <skipped>
        >> [0.304s][trace][os,container] Memory Usage is: 42983424
        >> OperatingSystemMXBean.getFreeMemorySize: 1030758400
        >> [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
        >> [0.305s][trace][os,container] Memory Usage is: 42979328
        >> [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
        >> OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
        >> 1030762496
        >> OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
        >>
        >> <skipped>
        >> java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr
        >>
        >> 	at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
        >> 	at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
        >> 	at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
        >> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        >> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        >> 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        >> 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
        >> 	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
        >> 	at java.base/java.lang.Thread.run(Thread.java:832)
        >>
        >> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
        >>
        >> [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
        >> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
        >> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
        >> [4] https://bugs.openjdk.java.net/browse/JDK-8235522
        >>
        >> Thank you,
        >> Daniil
        >>
        >> On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
        >>
        >>
        >>
        >>     On 12/6/19 5:59 AM, Bob Vandette wrote:
        >>>> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes at oracle.com>  wrote:
        >>>>
        >>>>
        >>>> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
        >>>>
        >>>> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
        >>     I thought that the error case we are referring to is limit == 0 which
        >>     indicates something unexpected goes wrong.  So the compatibility concern
        >>     should be low.  This is very specific to Metrics implementation for
        >>     cgroup v1 and let me know if I'm wrong.
        >>
        >>>> Surely there must always be some information available from the operating environment? I see from the impl file:
        >>>>
        >>>>     // the host data, value 0 indicates that something went wrong while the metric was read and
        >>>>    // in this case we return "information unavailable" code -1.
        >>>>
        >>>> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
        >>> I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
        >>> Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
        >>> are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
        >>> limits.
        >>>
        >>     It's important to consider carefully if the monitoring API indicates an
        >>     error vs unavailable and an application should continue to run when the
        >>     monitoring system fails to get the metrics.
        >>
        >>     There are several choices to report "something goes wrong" scenarios
        >>     (should unlikely happen???):
        >>     1. fall back to a random positive value  (e.g. host value)
        >>     2. return a negative value
        >>     3. throw an exception
        >>
        >>     #3 is not an option as the application is not expecting this.  For #2,
        >>     the application can filter bad values if desirable.
        >>
        >>     I'm okay if you want to file a JBS issue to follow up and thoroughly
        >>     look at the cases that the metrics are unavailable and the cases when
        >>     fails to obtain.
        >>
        >>>> ---
        >>>>
        >>>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
        >>>>
        >>>> System.out.println(String.format(...)
        >>>>
        >>>> Why not simply
        >>>>
        >>>> System.out.printf(..)
        >>>>
        >>>> ?
        >>     or simply (as I commented [1])
        >>          System.out.format
        >>
        >>     Mandy
        >>     [1]
        >>     https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
        >>
        >>
        >>
        >>
        
        
From daniil.x.titov at oracle.com  Mon Dec  9 19:31:20 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 09 Dec 2019 11:31:20 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <0a412f63-1583-6789-9afa-cebbc968e7c4@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
 <0a412f63-1583-6789-9afa-cebbc968e7c4@oracle.com>
Message-ID: <447D72BA-1046-4131-9B13-C1F7244D87EA@oracle.com>

Hi David,

> So we never try to access the uninitialized counters.cpus array which is
> good but we still return garbage for counters.jvmTicks and
> counters.cpuTicks - surely that should have been noticeable?

It only affected the first time the CPU load was requested. The function get_cpuload_internal(...)
calls the get_totalticks() and get_jvmticks() functions, which update these counters.
But on the first call, yes, it compares the newly received counters with garbage.

It has code that seems to be written to mitigate this and, in the worst case, just return
0 or 1.0. But it could also be that there is some other problem this code tries to solve, so I'm not sure
we should remove these workarounds as part of the current fix.

            // seems like we sometimes end up with less kernel ticks when
            // reading /proc/self/stat a second time, timing issue between cpus?
            if (pticks->usedKernel < tmp.usedKernel) {
                kdiff = 0;
            } else {
                kdiff = pticks->usedKernel - tmp.usedKernel;
            }
            tdiff = pticks->total - tmp.total;
            udiff = pticks->used - tmp.used;

            if (tdiff == 0) {
                user_load = 0;
            } else {
                if (tdiff < (udiff + kdiff)) {
                    tdiff = udiff + kdiff;
                }
                *pkernelLoad = (kdiff / (double)tdiff);
                // BUG9044876, normalize return values to sane values
                *pkernelLoad = MAX(*pkernelLoad, 0.0);
                *pkernelLoad = MIN(*pkernelLoad, 1.0);

                user_load = (udiff / (double)tdiff);
                user_load = MAX(user_load, 0.0);
                user_load = MIN(user_load, 1.0);
            }
        }
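
To make the effect of these workarounds easier to see, here is a small Java
sketch of the same delta-and-clamp arithmetic. It is only an illustration
under assumptions: the names CpuLoadSketch, Ticks and load are hypothetical
and not part of UnixOperatingSystem.c, Java longs are signed (the native
counter types may differ), and the sketch reports 0 for both loads whenever
the total delta is not positive.

```java
final class CpuLoadSketch {
    /** One snapshot of tick counters (hypothetical holder for this sketch). */
    static final class Ticks {
        final long used;        // user ticks
        final long usedKernel;  // kernel ticks
        final long total;       // all ticks

        Ticks(long used, long usedKernel, long total) {
            this.used = used;
            this.usedKernel = usedKernel;
            this.total = total;
        }
    }

    /** Returns {userLoad, kernelLoad}, each normalized into [0.0, 1.0]. */
    static double[] load(Ticks prev, Ticks cur) {
        // Kernel ticks may appear to go backwards between two reads; treat that as 0.
        long kdiff = cur.usedKernel < prev.usedKernel ? 0 : cur.usedKernel - prev.usedKernel;
        long tdiff = cur.total - prev.total;
        long udiff = cur.used - prev.used;

        // Assumption of this sketch: no usable interval means "no load".
        if (tdiff <= 0) {
            return new double[] {0.0, 0.0};
        }
        // The total can never be smaller than its parts.
        if (tdiff < udiff + kdiff) {
            tdiff = udiff + kdiff;
        }
        double kernelLoad = clamp(kdiff / (double) tdiff);
        double userLoad = clamp(udiff / (double) tdiff);
        return new double[] {userLoad, kernelLoad};
    }

    /** Normalizes a ratio to a sane value between 0.0 and 1.0. */
    static double clamp(double v) {
        return Math.max(0.0, Math.min(1.0, v));
    }
}
```

With a bogus previous snapshot (as on the very first call), the deltas can be
negative or far too large, and the clamping is what keeps the reported load
inside the 0 to 1.0 range.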
 
Best regards,
Daniil

On 12/8/19, 8:49 PM, "David Holmes" <david.holmes at oracle.com> wrote:

    Hi Daniil,
    
    On 7/12/2019 11:41 am, Daniil Titov wrote:
    > Hi David, Mandy, and Bob,
    > 
    > Thank you for reviewing this fix.
    > 
    > Please review a new version of the fix [1] that includes the following changes compared to the previous version of the webrev (webrev.04):
    > 1. The Javadoc changes made in webrev.04, relative to webrev.03 and to the CSR [3], were discarded.
    
    Okay.
    
    > 2.  The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
    >       was also reverted to the webrev.03 version, which returns the host's values if the metrics are unavailable or cannot be properly read.
    
    Okay.
    
    >       I would like to mention that currently the native implementation of these methods may de facto return -1 in some circumstances,
    >       but I agree that the changes proposed in the previous version of the webrev increase that probability.
    >       I filed the follow-up issue [4] as Mandy suggested.
    
    I added a comment to the bug. This is potentially a difficult problem to 
    resolve - it all depends on the likelihood of any errors and what they 
    really indicate.
    
    > 3.  The legacy methods were renamed as David suggested.
    
    Thanks!
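
The fallback behaviour described in item 2 above (report the host's value when
the container metric is unavailable or cannot be read properly) can be
sketched roughly as follows. This is a minimal illustration, not the code in
the webrev: the class and method names are hypothetical, and treating any
value of 0 or below as "unusable" is an assumption made for the sketch.

```java
final class HostFallbackSketch {
    /**
     * Picks the value to report: the container metric when it looks usable,
     * otherwise the host value.
     */
    static long valueOrHost(long containerValue, long hostValue) {
        // Assumption for this sketch: 0 or a negative number means the metric
        // was unavailable or could not be read properly.
        return containerValue > 0 ? containerValue : hostValue;
    }
}
```

The point of this choice is that callers keep getting a plausible positive
number, so existing code that never expected -1 continues to work.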
    
    > 
    >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
    >> !     static int initialized=1;
    >>
    >>   Am I reading this right that the code currently fails to actually do the
    >> initialization because of this ???
    > 
    > Yes, currently the code fails to do the initialization but it was unnoticed since method
    > get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which"
    > was always -1.
    
    So we never try to access the uninitialized counters.cpus array which is 
    good but we still return garbage for counters.jvmTicks and 
    counters.cpuTicks - surely that should have been noticeable?
    
    >>   test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
    >>
    >> System.out.println(String.format(...)
    >>
    >> Why not simply
    >>
    >> System.out.printf(..)
    > 
    > As I tried to explain earlier, it would make the tests unstable.
    > System.out.printf(...) just delegates the call to System.out.format(...), which doesn't emit the string atomically.
    > Instead it parses the format string into a list of FormatString objects and then iterates over the list.
    > As a result, other traces occasionally get printed between these iterations and break the pattern the test expects to find
    > in the output.
    
    Sorry I missed the earlier explanation. I find it somewhat surprising 
    that format() works that way, but without unlimited buffering there will 
    always be a need to flush the outputstream at some point.
    
    Thanks,
    David
    -----
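
For readers following the printf discussion, the approach the test takes can
be sketched like this: build the complete line first, then hand it to the
stream in a single println call. The class and method names below are
hypothetical. Note the hedge: this narrows the window in which other output
can land in the middle of the line, but the sketch does not claim a strict
atomicity guarantee against writers outside the Java stream.

```java
import java.io.PrintStream;
import java.util.Locale;

final class SingleWriteLineSketch {
    /** Formats the whole line up front so it exists as one String. */
    static String formatLine(String name, long value) {
        return String.format(Locale.ROOT, "%s: %d", name, value);
    }

    /** Emits the already formatted line with one println call. */
    static void printLine(PrintStream out, String name, long value) {
        out.println(formatLine(name, value));
    }

    public static void main(String[] args) {
        printLine(System.out, "OperatingSystemMXBean.getFreePhysicalMemorySize", 1030762496L);
    }
}
```

By contrast, a direct out.printf("%s: %d%n", name, value) lets the formatter
append the pieces to the stream one after another, which is where the trace
line in the sample output managed to slip in.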
    
    > For example, here is a sample of such output that has the trace message printed between "OperatingSystemMXBean.getFreePhysicalMemorySize:"
    > and "1030762496".
    > 
    > <skipped>
    > [0.304s][trace][os,container] Memory Usage is: 42983424
    > OperatingSystemMXBean.getFreeMemorySize: 1030758400
    > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
    > [0.305s][trace][os,container] Memory Usage is: 42979328
    > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
    > OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
    > 1030762496
    > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
    > 
    > <skipped>
    > java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr
    > 
    > 	at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
    > 	at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
    > 	at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
    > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    > 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    > 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
    > 	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
    > 	at java.base/java.lang.Thread.run(Thread.java:832)
    > 
    > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
    > 
    > [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
    > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
    > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
    > [4] https://bugs.openjdk.java.net/browse/JDK-8235522
    > 
    > Thank you,
    > Daniil
    > 
    > On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
    > 
    >      
    >      
    >      On 12/6/19 5:59 AM, Bob Vandette wrote:
    >      >> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes at oracle.com>  wrote:
    >      >>
    >      >>
    >      >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
    >      >>
    >      >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
    >      
    >      I thought that the error case we are referring to is limit == 0 which
    >      indicates something unexpected goes wrong.  So the compatibility concern
    >      should be low.  This is very specific to Metrics implementation for
    >      cgroup v1 and let me know if I'm wrong.
    >      
    >      >> Surely there must always be some information available from the operating environment? I see from the impl file:
    >      >>
    >      >>     // the host data, value 0 indicates that something went wrong while the metric was read and
    >      >>    // in this case we return "information unavailable" code -1.
    >      >>
    >      >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
    >      > I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
    >      > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
    >      > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
    >      > limits.
    >      >
    >      
    >      It's important to consider carefully if the monitoring API indicates an
    >      error vs unavailable and an application should continue to run when the
    >      monitoring system fails to get the metrics.
    >      
    >      There are several choices to report "something goes wrong" scenarios
    >      (should unlikely happen???):
    >      1. fall back to a random positive value  (e.g. host value)
    >      2. return a negative value
    >      3. throw an exception
    >      
    >      #3 is not an option as the application is not expecting this.  For #2,
    >      the application can filter bad values if desirable.
    >      
    >      I'm okay if you want to file a JBS issue to follow up and thoroughly
    >      look at the cases that the metrics are unavailable and the cases when
    >      fails to obtain.
    >      
    >      >> ---
    >      >>
    >      >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
    >      >>
    >      >> System.out.println(String.format(...)
    >      >>
    >      >> Why not simply
    >      >>
    >      >> System.out.printf(..)
    >      >>
    >      >> ?
    >      
    >      or simply (as I commented [1])
    >           System.out.format
    >      
    >      Mandy
    >      [1]
    >      https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
    >      
    >      
    > 
    > 
    

From leonid.mesnik at oracle.com  Mon Dec  9 20:56:34 2019
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Mon, 9 Dec 2019 12:56:34 -0800
Subject: RFR: 8235530: Removed duplicated threadByName methods in nsk/jdi
 tests
In-Reply-To: <1a9d76e5-3a73-9804-be3f-93d6864465be@oracle.com>
References: <5ac9ca0e-d756-3340-597e-2a03e6e6fa24@oracle.com>
 <f2de7bfa-e885-1357-ca89-0be7add5a0e1@oracle.com>
 <1a9d76e5-3a73-9804-be3f-93d6864465be@oracle.com>
Message-ID: <2218fd4b-5070-aa83-2584-82f26187cd23@oracle.com>

David, Serguei

Thank you for the review. I added a comment about JDITestRuntimeException in 
https://bugs.openjdk.java.net/browse/JDK-8235544

Leonid

On 12/7/19 9:19 PM, David Holmes wrote:
> +1 on both counts
>
> Not sure JDITestRuntimeException is really necessary/useful versus 
> just using RuntimeException, but that's a different issue.
>
> Thanks,
> David
>
> On 8/12/2019 2:30 pm, serguei.spitsyn at oracle.com wrote:
>> Hi Leonid,
>>
>> The fix looks good.
>>
>> Thank you for taking care about it!
>> I agree, it is an awful duplication.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 12/7/19 18:17, Leonid Mesnik wrote:
>>> Hi
>>>
>>> Could you please review the following fix, which just removes duplicated 
>>> threadByName methods and JDITestRuntimeException exceptions in 
>>> nsk/jdi tests. I don't see any reason to have so many copies of them.
>>>
>>> The method threadByName is added to the nsk.share.jdi.Debugee class as 
>>> 'threadByNameOrThrow' because a slightly different 'threadByName' 
>>> already exists there. I filed another sub-task 
>>> https://bugs.openjdk.java.net/browse/JDK-8235544 to review usage and 
>>> merge these 2 methods later.
>>>
>>> This fix affects about 4000 lines and I want to keep it as 
>>> straightforward as possible.
>>>
>>> webrev: http://cr.openjdk.java.net/~lmesnik/8235530/webrev.00/
>>>
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8235530
>>>
>>> The next planned steps are in:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8233830
>>>
>>> Leonid
>>>
>>

From coleen.phillimore at oracle.com  Mon Dec  9 21:04:33 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 9 Dec 2019 16:04:33 -0500
Subject: RFR: 8235530: Removed duplicated threadByName methods in nsk/jdi
 tests
In-Reply-To: <2218fd4b-5070-aa83-2584-82f26187cd23@oracle.com>
References: <5ac9ca0e-d756-3340-597e-2a03e6e6fa24@oracle.com>
 <f2de7bfa-e885-1357-ca89-0be7add5a0e1@oracle.com>
 <1a9d76e5-3a73-9804-be3f-93d6864465be@oracle.com>
 <2218fd4b-5070-aa83-2584-82f26187cd23@oracle.com>
Message-ID: <ca238477-d069-6601-45e5-9521e706b8e0@oracle.com>


Very nice!

4153 lines changed: 21 ins; 3841 del; 291 mod; 99816 unchg


Coleen

On 12/9/19 3:56 PM, Leonid Mesnik wrote:
> David, Serguei
>
> Thank you for the review. I added a comment about JDITestRuntimeException in 
> https://bugs.openjdk.java.net/browse/JDK-8235544
>
> Leonid
>
> On 12/7/19 9:19 PM, David Holmes wrote:
>> +1 on both counts
>>
>> Not sure JDITestRuntimeException is really necessary/useful versus 
>> just using RuntimeException, but that's a different issue.
>>
>> Thanks,
>> David
>>
>> On 8/12/2019 2:30 pm, serguei.spitsyn at oracle.com wrote:
>>> Hi Leonid,
>>>
>>> The fix looks good.
>>>
>>> Thank you for taking care about it!
>>> I agree, it is an awful duplication.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 12/7/19 18:17, Leonid Mesnik wrote:
>>>> Hi
>>>>
>>>> Could you please review the following fix, which just removes duplicated 
>>>> threadByName methods and JDITestRuntimeException exceptions in 
>>>> nsk/jdi tests. I don't see any reason to have so many copies of them.
>>>>
>>>> The method threadByName is added to the nsk.share.jdi.Debugee class as 
>>>> 'threadByNameOrThrow' because a slightly different 'threadByName' 
>>>> already exists there. I filed another sub-task 
>>>> https://bugs.openjdk.java.net/browse/JDK-8235544 to review usage 
>>>> and merge these 2 methods later.
>>>>
>>>> This fix affects about 4000 lines and I want to keep it as 
>>>> straightforward as possible.
>>>>
>>>> webrev: http://cr.openjdk.java.net/~lmesnik/8235530/webrev.00/
>>>>
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8235530
>>>>
>>>> The next planned steps are in:
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8233830
>>>>
>>>> Leonid
>>>>
>>>


From mandy.chung at oracle.com  Mon Dec  9 21:54:58 2019
From: mandy.chung at oracle.com (Mandy Chung)
Date: Mon, 9 Dec 2019 13:54:58 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <E19C2394-51F4-4AAC-BB25-FD01319445A8@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
 <EE35A423-7893-44B5-A0B6-F45A0E0A3AE1@oracle.com>
 <95ab70c9-8be4-13cf-ef90-f36b2e993450@oracle.com>
 <E19C2394-51F4-4AAC-BB25-FD01319445A8@oracle.com>
Message-ID: <2b476e9f-0266-eba7-3629-114296a55f29@oracle.com>


On 12/9/19 10:51 AM, Daniil Titov wrote:
> Hi Mandy and Bob,
>
>> Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException
>> so it's consistent with the other get functions?
> In this method both Files.newBufferedReader and bufferedReader.readLine could throw IOException, so for simplicity I just put
> the whole code block in doPrivileged. On the other hand, I don't believe that BufferedReader.readLine() requires FilePermission checks (and tests confirmed that),
> so we could change this implementation to the following:
>
>      public static String getStringValue(SubSystem subsystem, String parm) {
>          if (subsystem == null) return null;
>
>         try (BufferedReader bufferedReader =
>                       AccessController.doPrivileged((PrivilegedExceptionAction<BufferedReader>) () -> {
>                           return Files.newBufferedReader(Paths.get(subsystem.path(), parm));
>                       })) {
>              return bufferedReader.readLine();
>          } catch (PrivilegedActionException | IOException  e) {
>              return null;
>          }
>      }
>
> Could you please advise whether you are OK with this, or whether you would like to proceed with the approach Mandy suggested:
> unwrap the PrivilegedActionException and throw the cause instead?
>

I think it's simpler to read and understand if the doPrivileged call is 
moved out into a separate method that throws IOException, which is the 
expected behavior, as suggested above.

For SubSystem::getStringValue, one suggestion would be:

diff --git a/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java b/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java
--- a/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java
+++ b/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java
@@ -29,7 +29,11 @@
 import java.io.IOException;
 import java.math.BigInteger;
 import java.nio.file.Files;
+import java.nio.file.Path;
 import java.nio.file.Paths;
+import java.security.AccessController;
+import java.security.PrivilegedActionException;
+import java.security.PrivilegedExceptionAction;
 import java.util.ArrayList;
 import java.util.List;
 import java.util.Optional;
@@ -90,9 +94,8 @@
     public static String getStringValue(SubSystem subsystem, String parm) {
         if (subsystem == null) return null;

-        try(BufferedReader bufferedReader = Files.newBufferedReader(Paths.get(subsystem.path(), parm))) {
-            String line = bufferedReader.readLine();
-            return line;
+        try {
+            return subsystem.readStringValue(parm);
         }
         catch (IOException e) {
             return null;
@@ -100,6 +103,24 @@

     }

+    private String readStringValue(String param) throws IOException {
+        PrivilegedExceptionAction<BufferedReader> pea = () -> Files.newBufferedReader(Paths.get(path(), param));
+        try (BufferedReader bufferedReader = AccessController.doPrivileged(pea)) {
+            String line = bufferedReader.readLine();
+            return line;
+        } catch (PrivilegedActionException e) {
+            Throwable x = e.getCause();
+            if (x instanceof IOException)
+                throw (IOException)x;
+            if (x instanceof RuntimeException)
+                throw (RuntimeException)x;
+            if (x instanceof Error)
+                throw (Error)x;
+
+            throw new InternalError(x);
+        }
+    }
+
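
For anyone who wants to try the pattern outside the JDK sources, here is a
self-contained sketch of the same idea: the privileged open lives in its own
method that throws IOException, and the caller maps a failure to null. The
class and method names (PrivilegedReadSketch, readFirstLine,
readFirstLineOrNull) are hypothetical and only stand in for the SubSystem
methods in the diff above. AccessController is deprecated for removal in newer
JDK releases, hence the SuppressWarnings annotation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.AccessController;
import java.security.PrivilegedActionException;
import java.security.PrivilegedExceptionAction;

final class PrivilegedReadSketch {
    /** Opens the file inside doPrivileged and rethrows the original cause. */
    @SuppressWarnings("removal")
    static String readFirstLine(Path file) throws IOException {
        PrivilegedExceptionAction<BufferedReader> pea = () -> Files.newBufferedReader(file);
        try (BufferedReader reader = AccessController.doPrivileged(pea)) {
            return reader.readLine();
        } catch (PrivilegedActionException e) {
            // Unwrap so callers see the exception they would expect from plain file I/O.
            Throwable x = e.getCause();
            if (x instanceof IOException)
                throw (IOException) x;
            if (x instanceof RuntimeException)
                throw (RuntimeException) x;
            if (x instanceof Error)
                throw (Error) x;
            throw new InternalError(x);
        }
    }

    /** Caller-side wrapper: any I/O failure is reported as null. */
    static String readFirstLineOrNull(Path file) {
        try {
            return readFirstLine(file);
        } catch (IOException e) {
            return null;
        }
    }
}
```

Keeping the unwrapping in one helper means the public method only has to deal
with IOException, which is what it caught before doPrivileged was introduced.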

From daniil.x.titov at oracle.com  Mon Dec  9 23:47:02 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Mon, 09 Dec 2019 15:47:02 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <2b476e9f-0266-eba7-3629-114296a55f29@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
 <EE35A423-7893-44B5-A0B6-F45A0E0A3AE1@oracle.com>
 <95ab70c9-8be4-13cf-ef90-f36b2e993450@oracle.com>
 <E19C2394-51F4-4AAC-BB25-FD01319445A8@oracle.com>
 <2b476e9f-0266-eba7-3629-114296a55f29@oracle.com>
Message-ID: <48F75B58-0529-4B43-A355-A12EB0D09598@oracle.com>

Hi Mandy and Bob,

Please review a new version of the webrev [1] that moves doPrivileged calls in 
jdk.internal.platform.cgroupv1.SubSystem to separate methods that throw
 IOException, as Mandy suggested.

Mach5 tests are still running.


[1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.06/ 
[2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575      
[3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 

Thank you,
Daniil

On 12/9/19, 1:55 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:

    
    On 12/9/19 10:51 AM, Daniil Titov wrote:
    > Hi Mandy and Bob,
    >
    >> Why did you not change the exception caught in SubSystem.java:getStringValue to PrivilegedActionException from IOException
    >> so it's consistent with the other get functions?
    > In this method both Files.newBufferedReader and bufferedReader.readLine could throw IOException, so for simplicity I just put
    > the whole code block in doPrivileged. On the other hand, I don't believe that BufferedReader.readLine() requires FilePermission checks (and tests confirmed that),
    > so we could change this implementation to the following:
    >
    >      public static String getStringValue(SubSystem subsystem, String parm) {
    >          if (subsystem == null) return null;
    >
    >         try (BufferedReader bufferedReader =
    >                       AccessController.doPrivileged((PrivilegedExceptionAction<BufferedReader>) () -> {
    >                           return Files.newBufferedReader(Paths.get(subsystem.path(), parm));
    >                       })) {
    >              return bufferedReader.readLine();
    >          } catch (PrivilegedActionException | IOException  e) {
    >              return null;
    >          }
    >      }
    >
    > Could you please advise whether you are OK with this, or whether you would like to proceed with the approach Mandy suggested:
    > unwrap the PrivilegedActionException and throw the cause instead?
    >
    
    I think it's simpler to read and understand if the doPrivileged call is 
    moved out into a separate method that throws IOException, which is the 
    expected behavior, as suggested above.
    
    For SubSystem::getStringValue, one suggestion would be:
    
    diff --git 
    a/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java 
    b/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java
    --- 
    a/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java
    +++ 
    b/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java
    @@ -29,7 +29,11 @@
      import java.io.IOException;
      import java.math.BigInteger;
      import java.nio.file.Files;
    +import java.nio.file.Path;
      import java.nio.file.Paths;
    +import java.security.AccessController;
    +import java.security.PrivilegedActionException;
    +import java.security.PrivilegedExceptionAction;
      import java.util.ArrayList;
      import java.util.List;
      import java.util.Optional;
    @@ -90,9 +94,8 @@
          public static String getStringValue(SubSystem subsystem, String 
    parm) {
              if (subsystem == null) return null;
    
    -        try(BufferedReader bufferedReader = 
    Files.newBufferedReader(Paths.get(subsystem.path(), parm))) {
    -            String line = bufferedReader.readLine();
    -            return line;
    +        try {
    +            return subsystem.readStringValue(parm);
              }
              catch (IOException e) {
                  return null;
    @@ -100,6 +103,24 @@
    
          }
    
    +    private String readStringValue(String param) throws IOException {
    +        PrivilegedExceptionAction<BufferedReader> pea = () -> 
    Files.newBufferedReader(Paths.get(path(), param));
    +        try (BufferedReader bufferedReader = 
    AccessController.doPrivileged(pea)) {
    +            String line = bufferedReader.readLine();
    +            return line;
    +        } catch (PrivilegedActionException e) {
    +            Throwable x = e.getCause();
    +            if (x instanceof IOException)
    +                throw (IOException)x;
    +            if (x instanceof RuntimeException)
    +                throw (RuntimeException)x;
    +            if (x instanceof Error)
    +                throw (Error)x;
    +
    +            throw new InternalError(x);
    +        }
    +    }
    +
    

From serguei.spitsyn at oracle.com  Tue Dec 10 02:02:21 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 9 Dec 2019 18:02:21 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
Message-ID: <072d6861-1374-8190-135d-e30ece2ee380@oracle.com>

Hi Daniil,

It is not a full review, just some minor comments.
In fact, I do not see real problems yet.

http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html

   55     public long getTotalSwapSpaceSize() {
   56         if (containerMetrics != null) {
   57             long limit = containerMetrics.getMemoryAndSwapLimit();
   58             // The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container)
   59             // or if a docker container was started without specifying a memory limit ( without '--memory='
   60             // Docker option). In latter case there is no limit on how much memory the container can use and
   61             // it can use as much memory as the host's OS allows.
   62             long memLimit = containerMetrics.getMemoryLimit();
   63             if (limit >= 0 && memLimit >= 0) {
   64                 return limit - memLimit;
   65             }
   66         }
   67         return getTotalSwapSpaceSize0();
   68     }

   Unneeded space after the opening bracket '('.
   Do we need to check whether the (limit - memLimit) value is negative?
   The same question applies to getFreeSwapSpaceSize():
     memSwapLimit - memLimit - (memSwapUsage - memUsage)

   and getFreeMemorySize():
     101 return limit - usage;
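
   To illustrate the question above, here is a minimal sketch of what such a
   guard could look like. It is only an illustration, not the code in the
   webrev: the class name, the method signature, and the choice to fall back
   to the host value when the difference is negative are all assumptions.

```java
public class SwapSizeSketch {
    /**
     * Hypothetical helper: returns (memAndSwapLimit - memLimit) when both
     * metrics are available and the difference is non-negative; otherwise
     * returns the host value supplied by the caller.
     */
    static long totalSwapSpaceSize(long memAndSwapLimit, long memLimit, long hostValue) {
        if (memAndSwapLimit >= 0 && memLimit >= 0) {
            long swap = memAndSwapLimit - memLimit;
            if (swap >= 0) {
                return swap;
            }
        }
        // Metrics unavailable or inconsistent: assume falling back to the host value.
        return hostValue;
    }

    public static void main(String[] args) {
        System.out.println(totalSwapSpaceSize(2048, 1024, 4096)); // consistent limits
        System.out.println(totalSwapSpaceSize(1024, 2048, 4096)); // negative difference
        System.out.println(totalSwapSpaceSize(-1, 1024, 4096));   // metric unavailable
    }
}
```

   The same shape would apply to the getFreeSwapSpaceSize() and
   getFreeMemorySize() expressions if a negative result is considered possible.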

   81                         // If this happens just retry the loop for a few iterations

   The period is missing at the end of the comment.


http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html

   34 System.out.println(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors()));
   35 System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors()));
   36 System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize()));
   37 System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize()));
   38 System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize()));
   39 System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize()));
   40 System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize()));
   41 System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize()));
   42 System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad()));
   43 System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad()));


   To make the above lines a little shorter, I'd suggest defining a log() method like this:
     private static void log(String msg) { System.out.println(msg); }

   34 log(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors()));
   35 log(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors()));
   36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize()));
   37 log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize()));
   38 log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize()));
   39 log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize()));
   40 log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize()));
   41 log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize()));
   42 log(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad()));
   43 log(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad()));


Thanks,
Serguei


On 12/6/19 17:41, Daniil Titov wrote:
> Hi David, Mandy, and Bob,
>
> Thank you for reviewing this fix.
>
> Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04)
> 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded.
> 2.  The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
>       was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read.
>       I would like to mention that  currently the native implementation of these methods de-facto may return -1 at some circumstances,
>       but I agree that the changes proposed in the previous version of the webrev increase such probability.
>       I filed the follow-up issue [4] as Mandy suggested.
> 3.  The legacy methods were renamed as David suggested.
>
>
>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
>> !     static int initialized=1;
>>
>>   Am I reading this right that the code currently fails to actually do the
>> initialization because of this ???
> Yes, currently the code fails to do the initialization but it was unnoticed since method
> get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which"
> was always -1.
>
>>   test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>>
>> System.out.println(String.format(...)
>>
>> Why not simply
>>
>> System.out.printf(..)
> As I tried to explain earlier, it would make the tests unstable.
> System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically.
> Instead it parses the format string into a list of FormatString objects and then iterates over the list.
> As a result, the other traces occasionally got printed between these iterations  and break the pattern the test is expected to find
> in the output.
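
The behaviour described above can be observed with a small sketch that counts
how many separate writes reach the underlying stream. The class and method
names here are hypothetical, and the exact number of writes is a JDK
implementation detail, so the sketch only compares the two approaches rather
than asserting specific counts.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;

public class FormatWritesSketch {
    /** Records the bytes written and counts the write calls that reach it. */
    static final class CountingStream extends OutputStream {
        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int writes;

        @Override
        public void write(int b) {
            writes++;
            bytes.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) {
            writes++;
            bytes.write(b, off, len);
        }
    }

    /** Formats directly on the stream; the pieces may be written one by one. */
    static CountingStream viaPrintf(long value) {
        CountingStream sink = new CountingStream();
        PrintStream out = new PrintStream(sink, true);
        out.printf("OperatingSystemMXBean.getFreePhysicalMemorySize: %d%n", value);
        out.flush();
        return sink;
    }

    /** Builds the whole string first, then hands it to println in one call. */
    static CountingStream viaPrintln(long value) {
        CountingStream sink = new CountingStream();
        PrintStream out = new PrintStream(sink, true);
        out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", value));
        out.flush();
        return sink;
    }

    public static void main(String[] args) {
        System.out.println("printf writes:  " + viaPrintf(1030762496L).writes);
        System.out.println("println writes: " + viaPrintln(1030762496L).writes);
    }
}
```

Both variants produce the same text; the difference is only in how many
chunks it is split into, which is where another thread's trace output can
slip in.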
>
> For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:"
> and "1030762496".
>
> <skipped>
> [0.304s][trace][os,container] Memory Usage is: 42983424
> OperatingSystemMXBean.getFreeMemorySize: 1030758400
> [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
> [0.305s][trace][os,container] Memory Usage is: 42979328
> [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
> OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
> 1030762496
> OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
>
> <skipped>
> java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr
>
> 	at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
> 	at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
> 	at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
> 	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
> 	at java.base/java.lang.Thread.run(Thread.java:832)
>
> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
>
> [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
> [4] https://bugs.openjdk.java.net/browse/JDK-8235522
>
> Thank you,
> Daniil
>
> On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
>
>      
>      
>      On 12/6/19 5:59 AM, Bob Vandette wrote:
>      >> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes at oracle.com>  wrote:
>      >>
>      >>
>      >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
>      >>
>      >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
>      
>      I thought that the error case we are referring to is limit == 0 which
>      indicates something unexpected goes wrong.  So the compatibility concern
>      should be low.  This is very specific to Metrics implementation for
>      cgroup v1 and let me know if I'm wrong.
>      
>      >> Surely there must always be some information available from the operating environment? I see from the impl file:
>      >>
>      >>     // the host data, value 0 indicates that something went wrong while the metric was read and
>      >>    // in this case we return "information unavailable" code -1.
>      >>
>      >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
>      > I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
>      > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
>      > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
>      > limits.
>      >
>      
>      It's important to consider carefully if the monitoring API indicates an
>      error vs unavailable and an application should continue to run when the
>      monitoring system fails to get the metrics.
>      
>      There are several choices to report "something goes wrong" scenarios
>      (should unlikely happen???):
>      1. fall back to a random positive value  (e.g. host value)
>      2. return a negative value
>      3. throw an exception
>      
>      #3 is not an option as the application is not expecting this.  For #2,
>      the application can filter bad values if desirable.
>      
>      I'm okay if you want to file a JBS issue to follow up and thoroughly
>      look at the cases that the metrics are unavailable and the cases when
>      fails to obtain.
>      
>      >> ---
>      >>
>      >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>      >>
>      >> System.out.println(String.format(...)
>      >>
>      >> Why not simply
>      >>
>      >> System.out.printf(..)
>      >>
>      >> ?
>      
>      or simply (as I commented [1])
>           System.out.format
>      
>      Mandy
>      [1]
>      https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
>      
>      
>
>


From mandy.chung at oracle.com  Tue Dec 10 06:11:59 2019
From: mandy.chung at oracle.com (Mandy Chung)
Date: Mon, 9 Dec 2019 22:11:59 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <48F75B58-0529-4B43-A355-A12EB0D09598@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
 <EE35A423-7893-44B5-A0B6-F45A0E0A3AE1@oracle.com>
 <95ab70c9-8be4-13cf-ef90-f36b2e993450@oracle.com>
 <E19C2394-51F4-4AAC-BB25-FD01319445A8@oracle.com>
 <2b476e9f-0266-eba7-3629-114296a55f29@oracle.com>
 <48F75B58-0529-4B43-A355-A12EB0D09598@oracle.com>
Message-ID: <1dc773d4-14e6-9209-2a4e-190699d88a47@oracle.com>


On 12/9/19 3:47 PM, Daniil Titov wrote:
> Hi Mandy and Bob,
>
> Please review a new version of the webrev [1] that moves doPrivileged calls in
> jdk.internal.platform.cgroupv1.SubSystem to separate methods that throw
>   IOException, as Mandy suggested.
>
> Mach5 tests are still running.
>
>
> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.06/

I reviewed Metrics and Subsystem in this version.

I think it's simpler to have unwrapIOExceptionAndRethrow handle the
InternalError case.
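
A minimal sketch of that shape, for illustration only. The helper name comes
from the webrev, but its signature, the enclosing class UnwrapSketch, and the
readFirstLine caller are assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.AccessController;
import java.security.PrivilegedActionException;
import java.security.PrivilegedExceptionAction;

public class UnwrapSketch {
    /**
     * Assumed signature. Always throws: IOException, RuntimeException and
     * Error causes are rethrown as-is; anything else becomes an InternalError.
     */
    static void unwrapIOExceptionAndRethrow(PrivilegedActionException pae) throws IOException {
        Throwable x = pae.getCause();
        if (x instanceof IOException) throw (IOException) x;
        if (x instanceof RuntimeException) throw (RuntimeException) x;
        if (x instanceof Error) throw (Error) x;
        throw new InternalError(x);
    }

    /** Hypothetical caller: only opening the file runs inside doPrivileged. */
    @SuppressWarnings("removal")
    static String readFirstLine(Path file) throws IOException {
        PrivilegedExceptionAction<BufferedReader> pea = () -> Files.newBufferedReader(file);
        try (BufferedReader reader = AccessController.doPrivileged(pea)) {
            return reader.readLine();
        } catch (PrivilegedActionException e) {
            unwrapIOExceptionAndRethrow(e);
            // Not reached: the helper always throws. Needed to satisfy the compiler.
            throw new InternalError(e);
        }
    }
}
```

With the InternalError case inside the helper, each caller's catch block is
reduced to a single call.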


+ List<String> lines = subsystem.readMatchingLines(param);
+ for (String line : lines) {
                  if (line.startsWith(match)) {
                      retval = conversion.apply(line);
                      break;
                  }
              }

This can simply call Metrics::readFilePrivileged and process the
Stream<String>:

return Metrics.readFilePrivileged(Paths.get(subsystem.path(), param))
             .filter(line -> line.startsWith(match))
             .map(conversion::apply)
             .findFirst().orElseGet(() -> retval);
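
As a self-contained illustration of the stream form, here is a sketch that
uses Files.lines as a stand-in for Metrics::readFilePrivileged (whose exact
signature is not shown in this thread). The class name, method name, and
parameters are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Function;
import java.util.stream.Stream;

public class MatchingLineSketch {
    /**
     * Returns the conversion of the first line that starts with match,
     * or defaultValue if no line matches or the file cannot be read.
     */
    static long firstMatchingValue(Path file, String match,
                                   Function<String, Long> conversion,
                                   long defaultValue) {
        // Files.lines is a stand-in here; the stream is closed by try-with-resources.
        try (Stream<String> lines = Files.lines(file)) {
            return lines.filter(line -> line.startsWith(match))
                        .map(conversion)
                        .findFirst()
                        .orElse(defaultValue);
        } catch (IOException | UncheckedIOException e) {
            return defaultValue;
        }
    }
}
```

Because findFirst short-circuits, the pipeline stops reading at the first
matching line, just as the loop with break does.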


I don't need to see a new webrev.

Mandy


From david.holmes at oracle.com  Tue Dec 10 10:11:23 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 10 Dec 2019 20:11:23 +1000
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <447D72BA-1046-4131-9B13-C1F7244D87EA@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
 <0a412f63-1583-6789-9afa-cebbc968e7c4@oracle.com>
 <447D72BA-1046-4131-9B13-C1F7244D87EA@oracle.com>
Message-ID: <1996939d-54af-c34c-c796-cab3ec92b445@oracle.com>

On 10/12/2019 5:31 am, Daniil Titov wrote:
> Hi David,
> 
>>      So we never try to access the uninitialized counters.cpus array which is
>>     good but we still return garbage for counters.jvmTicks and
>>    counters.cpuTicks - surely that should have been noticeable?
> 
> It only affected the first time the CPU load was requested. Function get_cpuload_internal(...)
> calls get_totalticks() and get_jvmticks() functions that update these counters.
> But on the first call, yes, it compares the newly received counters with the garbage.
> 
> 
> It has the code that seems to be written to somehow mitigate it and in worse case just return
> 0 or 1.0. But it also could be  that there is some other problem this code tries to solve so I'm not sure
> we should remove these workarounds as a part of the current fix.

Please file a follow up RFE to look into this.

Thanks,
David

>   274             // seems like we sometimes end up with less kernel ticks when
>   275             // reading /proc/self/stat a second time, timing issue between cpus?
>   276             if (pticks->usedKernel < tmp.usedKernel) {
>   277                 kdiff = 0;
>   278             } else {
>   279                 kdiff = pticks->usedKernel - tmp.usedKernel;
>   280             }
>   281             tdiff = pticks->total - tmp.total;
>   282             udiff = pticks->used - tmp.used;
>   283
>   284             if (tdiff == 0) {
>   285                 user_load = 0;
>   286             } else {
>   287                 if (tdiff < (udiff + kdiff)) {
>   288                     tdiff = udiff + kdiff;
>   289                 }
>   290                 *pkernelLoad = (kdiff / (double)tdiff);
>   291                 // BUG9044876, normalize return values to sane values
>   292                 *pkernelLoad = MAX(*pkernelLoad, 0.0);
>   293                 *pkernelLoad = MIN(*pkernelLoad, 1.0);
>   294
>   295                 user_load = (udiff / (double)tdiff);
>   296                 user_load = MAX(user_load, 0.0);
>   297                 user_load = MIN(user_load, 1.0);
>   298             }
>   299         }
>   
> Best regards,
> Daniil
> 
> On 12/8/19, 8:49 PM, "David Holmes" <david.holmes at oracle.com> wrote:
> 
>      Hi Daniil,
>      
>      On 7/12/2019 11:41 am, Daniil Titov wrote:
>      > Hi David, Mandy, and Bob,
>      >
>      > Thank you for reviewing this fix.
>      >
>      > Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04)
>      > 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded.
>      
>      Okay.
>      
>      > 2.  The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
>      >       was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read.
>      
>      Okay.
>      
>      >       I would like to mention that  currently the native implementation of these methods de-facto may return -1 at some circumstances,
>      >       but I agree that the changes proposed in the previous version of the webrev increase such probability.
>      >       I filed the follow-up issue [4] as Mandy suggested.
>      
>      I added a comment to the bug. This is potentially a difficult problem to
>      resolve - it all depends on the likelihood of any errors and what they
>      really indicate.
>      
>      > 3.  The legacy methods were renamed as David suggested.
>      
>      Thanks!
>      
>      >
>      >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
>      >> !     static int initialized=1;
>      >>
>      >>   Am I reading this right that the code currently fails to actually do the
>      >> initialization because of this ???
>      >
>      > Yes, currently the code fails to do the initialization but it was unnoticed since method
>      > get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which"
>      > was always -1.
>      
>      So we never try to access the uninitialized counters.cpus array which is
>      good but we still return garbage for counters.jvmTicks and
>      counters.cpuTicks - surely that should have been noticeable?
>      
>      >>   test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>      >>
>      >> System.out.println(String.format(...)
>      >>
>      >> Why not simply
>      >>
>      >> System.out.printf(..)
>      >
>      > As I tried to explain earlier, it would make the tests unstable.
>      > System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically.
>      > Instead it parses the format string into a list of FormatString objects and then iterates over the list.
>      > As a result, the other traces occasionally got printed between these iterations  and break the pattern the test is expected to find
>      > in the output.
>      
>      Sorry I missed the earlier explanation. I find it somewhat surprising
>      that format() works that way, but without unlimited buffering there will
>      always be a need to flush the outputstream at some point.
>      
>      Thanks,
>      David
>      -----
>      
>      > For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:"
>      > and "1030762496".
>      >
>      > <skipped>
>      > [0.304s][trace][os,container] Memory Usage is: 42983424
>      > OperatingSystemMXBean.getFreeMemorySize: 1030758400
>      > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
>      > [0.305s][trace][os,container] Memory Usage is: 42979328
>      > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
>      > OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
>      > 1030762496
>      > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
>      >
>      > <skipped>
>      > java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr
>      >
>      > 	at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
>      > 	at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
>      > 	at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
>      > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>      > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>      > 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>      > 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>      > 	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
>      > 	at java.base/java.lang.Thread.run(Thread.java:832)
>      >
>      > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
>      >
>      > [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
>      > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
>      > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
>      > [4] https://bugs.openjdk.java.net/browse/JDK-8235522
>      >
>      > Thank you,
>      > Daniil
>      >
>      > On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
>      >
>      >
>      >
>      >      On 12/6/19 5:59 AM, Bob Vandette wrote:
>      >      >> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes at oracle.com>  wrote:
>      >      >>
>      >      >>
>      >      >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
>      >      >>
>      >      >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
>      >
>      >      I thought that the error case we are referring to is limit == 0 which
>      >      indicates something unexpected goes wrong.  So the compatibility concern
>      >      should be low.  This is very specific to Metrics implementation for
>      >      cgroup v1 and let me know if I'm wrong.
>      >
>      >      >> Surely there must always be some information available from the operating environment? I see from the impl file:
>      >      >>
>      >      >>     // the host data, value 0 indicates that something went wrong while the metric was read and
>      >      >>    // in this case we return "information unavailable" code -1.
>      >      >>
>      >      >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
>      >      > I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
>      >      > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
>      >      > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
>      >      > limits.
>      >      >
>      >
>      >      It's important to consider carefully if the monitoring API indicates an
>      >      error vs unavailable and an application should continue to run when the
>      >      monitoring system fails to get the metrics.
>      >
>      >      There are several choices to report "something goes wrong" scenarios
>      >      (should unlikely happen???):
>      >      1. fall back to a random positive value  (e.g. host value)
>      >      2. return a negative value
>      >      3. throw an exception
>      >
>      >      #3 is not an option as the application is not expecting this.  For #2,
>      >      the application can filter bad values if desirable.
>      >
>      >      I'm okay if you want to file a JBS issue to follow up and thoroughly
>      >      look at the cases that the metrics are unavailable and the cases when
>      >      fails to obtain.
>      >
>      >      >> ---
>      >      >>
>      >      >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>      >      >>
>      >      >> System.out.println(String.format(...)
>      >      >>
>      >      >> Why not simply
>      >      >>
>      >      >> System.out.printf(..)
>      >      >>
>      >      >> ?
>      >
>      >      or simply (as I commented [1])
>      >           System.out.format
>      >
>      >      Mandy
>      >      [1]
>      >      https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
>      >
>      >
>      >
>      >
>      
> 
> 

From daniil.x.titov at oracle.com  Tue Dec 10 17:49:51 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 10 Dec 2019 09:49:51 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <1996939d-54af-c34c-c796-cab3ec92b445@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
 <0a412f63-1583-6789-9afa-cebbc968e7c4@oracle.com>
 <447D72BA-1046-4131-9B13-C1F7244D87EA@oracle.com>
 <1996939d-54af-c34c-c796-cab3ec92b445@oracle.com>
Message-ID: <92CAC22E-53FE-422B-B5E0-EA6C959352FC@oracle.com>

Hi David,

> Please file a follow up RFE to look into this.

I created an issue to follow this up [1]

[1] https://bugs.openjdk.java.net/browse/JDK-8235681 

Thank you,
Daniil

On 12/10/19, 2:11 AM, "David Holmes" <david.holmes at oracle.com> wrote:

    On 10/12/2019 5:31 am, Daniil Titov wrote:
    > Hi David,
    > 
    >>      So we never try to access the uninitialized counters.cpus array which is
    >>     good but we still return garbage for counters.jvmTicks and
    >>    counters.cpuTicks - surely that should have been noticeable?
    > 
    > It only affected the first time the CPU load was requested. Function get_cpuload_internal(...)
    > calls get_totalticks() and get_jvmticks() functions that update these counters.
    > But on the first call, yes, it compares the newly received counters with the garbage.
    > 
    > 
    > It has code that seems to be written to mitigate this somehow and, in the worst case, just return
    > 0 or 1.0. But it could also be that there is some other problem this code tries to solve, so I'm not sure
    > we should remove these workarounds as part of the current fix.
    
    Please file a follow up RFE to look into this.
    
    Thanks,
    David
    
    >   274             // seems like we sometimes end up with less kernel ticks when
    >   275             // reading /proc/self/stat a second time, timing issue between cpus?
    >   276             if (pticks->usedKernel < tmp.usedKernel) {
    >   277                 kdiff = 0;
    >   278             } else {
    >   279                 kdiff = pticks->usedKernel - tmp.usedKernel;
    >   280             }
    >   281             tdiff = pticks->total - tmp.total;
    >   282             udiff = pticks->used - tmp.used;
    >   283
    >   284             if (tdiff == 0) {
    >   285                 user_load = 0;
    >   286             } else {
    >   287                 if (tdiff < (udiff + kdiff)) {
    >   288                     tdiff = udiff + kdiff;
    >   289                 }
    >   290                 *pkernelLoad = (kdiff / (double)tdiff);
    >   291                 // BUG9044876, normalize return values to sane values
    >   292                 *pkernelLoad = MAX(*pkernelLoad, 0.0);
    >   293                 *pkernelLoad = MIN(*pkernelLoad, 1.0);
    >   294
    >   295                 user_load = (udiff / (double)tdiff);
    >   296                 user_load = MAX(user_load, 0.0);
    >   297                 user_load = MIN(user_load, 1.0);
    >   298             }
    >   299         }
    >   
    > Best regards,
    > Daniil
    > 
    > On 12/8/19, 8:49 PM, "David Holmes" <david.holmes at oracle.com> wrote:
    > 
    >      Hi Daniil,
    >      
    >      On 7/12/2019 11:41 am, Daniil Titov wrote:
    >      > Hi David, Mandy, and Bob,
    >      >
    >      > Thank you for reviewing this fix.
    >      >
    >      > Please review a new version of the fix [1] that includes the following changes compared to the previous version of the webrev (webrev.04):
    >      > 1. The Javadoc changes made in webrev.04 relative to webrev.03 and to the CSR [3] were discarded.
    >      
    >      Okay.
    >      
    >      > 2.  The implementation of the methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
    >      >       was also reverted to the webrev.03 version, which returns the host's values if the metrics are unavailable or cannot be properly read.
    >      
    >      Okay.
    >      
    >      >       I would like to mention that the native implementation of these methods can already return -1 in some circumstances,
    >      >       but I agree that the changes proposed in the previous version of the webrev make that more likely.
    >      >       I filed the follow-up issue [4] as Mandy suggested.
    >      
    >      I added a comment to the bug. This is potentially a difficult problem to
    >      resolve - it all depends on the likelihood of any errors and what they
    >      really indicate.
    >      
    >      > 3.  The legacy methods were renamed as David suggested.
    >      
    >      Thanks!
    >      
    >      >
    >      >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
    >      >> !     static int initialized=1;
    >      >>
    >      >>   Am I reading this right that the code currently fails to actually do the
    >      >> initialization because of this ???
    >      >
    >      > Yes, currently the code fails to do the initialization but it was unnoticed since method
    >      > get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which"
    >      > was always -1.
    >      
    >      So we never try to access the uninitialized counters.cpus array which is
    >      good but we still return garbage for counters.jvmTicks and
    >      counters.cpuTicks - surely that should have been noticeable?
    >      
    >      >>   test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
    >      >>
    >      >> System.out.println(String.format(...)
    >      >>
    >      >> Why not simply
    >      >>
    >      >> System.out.printf(..)
    >      >
    >      > As I tried to explain earlier, it would make the tests unstable.
    >      > System.out.printf(...) just delegates to System.out.format(...), which doesn't emit the string atomically.
    >      > Instead, it parses the format string into a list of FormatString objects and then iterates over the list.
    >      > As a result, other trace output occasionally gets printed between these iterations and breaks the pattern the test expects to find
    >      > in the output.
    >      
    >      Sorry I missed the earlier explanation. I find it somewhat surprising
    >      that format() works that way, but without unlimited buffering there will
    >      always be a need to flush the outputstream at some point.
    >      
    >      Thanks,
    >      David
    >      -----
    >      
    >      > For example, here is a sample of such output, with a trace message printed between "OperatingSystemMXBean.getFreePhysicalMemorySize:"
    >      > and "1030762496".
    >      >
    >      > <skipped>
    >      > [0.304s][trace][os,container] Memory Usage is: 42983424
    >      > OperatingSystemMXBean.getFreeMemorySize: 1030758400
    >      > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
    >      > [0.305s][trace][os,container] Memory Usage is: 42979328
    >      > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
    >      > OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
    >      > 1030762496
    >      > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
    >      >
    >      > <skipped>
    >      > java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr
    >      >
    >      > 	at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
    >      > 	at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
    >      > 	at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
    >      > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    >      > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    >      > 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    >      > 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
    >      > 	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
    >      > 	at java.base/java.lang.Thread.run(Thread.java:832)
    >      >
    >      > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
    >      >
    >      > [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
    >      > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
    >      > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
    >      > [4] https://bugs.openjdk.java.net/browse/JDK-8235522
    >      >
    >      > Thank you,
    >      > Daniil
    >      >
    >      > On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
    >      >
    >      >
    >      >
    >      >      On 12/6/19 5:59 AM, Bob Vandette wrote:
    >      >      >> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes at oracle.com>  wrote:
    >      >      >>
    >      >      >>
    >      >      >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
    >      >      >>
    >      >      >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
    >      >
    >      >      I thought that the error case we are referring to is limit == 0 which
    >      >      indicates something unexpected goes wrong.  So the compatibility concern
    >      >      should be low.  This is very specific to Metrics implementation for
    >      >      cgroup v1 and let me know if I'm wrong.
    >      >
    >      >      >> Surely there must always be some information available from the operating environment? I see from the impl file:
    >      >      >>
    >      >      >>     // the host data, value 0 indicates that something went wrong while the metric was read and
    >      >      >>    // in this case we return "information unavailable" code -1.
    >      >      >>
    >      >      >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
    >      >      > I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
    >      >      > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
    >      >      > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
    >      >      > limits.
    >      >      >
    >      >
    >      >      It's important to consider carefully if the monitoring API indicates an
    >      >      error vs unavailable and an application should continue to run when the
    >      >      monitoring system fails to get the metrics.
    >      >
    >      >      There are several choices to report "something goes wrong" scenarios
    >      >      (should unlikely happen???):
    >      >      1. fall back to a random positive value  (e.g. host value)
    >      >      2. return a negative value
    >      >      3. throw an exception
    >      >
    >      >      #3 is not an option as the application is not expecting this.  For #2,
    >      >      the application can filter bad values if desirable.
    >      >
    >      >      I'm okay if you want to file a JBS issue to follow up and thoroughly
    >      >      look at the cases that the metrics are unavailable and the cases when
    >      >      fails to obtain.
    >      >
    >      >      >> ---
    >      >      >>
    >      >      >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
    >      >      >>
    >      >      >> System.out.println(String.format(...)
    >      >      >>
    >      >      >> Why not simply
    >      >      >>
    >      >      >> System.out.printf(..)
    >      >      >>
    >      >      >> ?
    >      >
    >      >      or simply (as I commented [1])
    >      >           System.out.format
    >      >
    >      >      Mandy
    >      >      [1]
    >      >      https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
    >      >
    >      >
    >      >
    >      >
    >      
    > 
    > 
    

From daniil.x.titov at oracle.com  Tue Dec 10 18:29:56 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 10 Dec 2019 10:29:56 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <072d6861-1374-8190-135d-e30ece2ee380@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
 <072d6861-1374-8190-135d-e30ece2ee380@oracle.com>
Message-ID: <7720D36B-0505-49E3-8424-76ACEACAF0AB@oracle.com>

Hi Serguei,

>       Do we need to check if the (limit - memLimit) value is negative?
>       The same question is for getFreeSwapSpaceSize():
>         memSwapLimit - memLimit - (memSwapUsage - memUsage)
>   
>       and getFreeMemorySize():
>        101 return limit - usage;

I don't think we need such a check here. If it does happen, it indicates a serious system malfunction, and the negative value
the method returns would signal this (the native methods already return -1 if something goes wrong). But we could revisit it in the
follow-up issue I created for that [1].

[1] https://bugs.openjdk.java.net/browse/JDK-8235522 
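
If the follow-up issue does end up adding such a check, a guarded variant could look roughly like the sketch below. It is only an illustration under assumptions: the class and method names are hypothetical, the metric values are passed in as plain parameters instead of being read from containerMetrics, and falling back to the host value on a negative difference is just one possible policy, not something agreed in this review.

```java
// Hypothetical sketch, not the code in webrev.05.
final class SwapSizeSketch {

    /**
     * Returns (memAndSwapLimit - memLimit) when both metrics are available
     * and the difference is non-negative; otherwise returns hostValue.
     * The fallback policy is an assumption made for illustration.
     */
    static long totalSwapSpaceSize(long memAndSwapLimit, long memLimit, long hostValue) {
        if (memAndSwapLimit >= 0 && memLimit >= 0) {
            long swap = memAndSwapLimit - memLimit;
            if (swap >= 0) {
                return swap;
            }
            // A negative difference means the two metrics are inconsistent.
        }
        return hostValue;
    }

    public static void main(String[] args) {
        // Consistent metrics: the difference is used.
        System.out.println(totalSwapSpaceSize(300L, 200L, 999L));
        // Inconsistent metrics: the host value is used instead.
        System.out.println(totalSwapSpaceSize(100L, 200L, 999L));
    }
}
```

The same guard would apply to the getFreeSwapSpaceSize() and getFreeMemorySize() expressions quoted above.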

Thank you,
Daniil

On 12/9/19, 6:02 PM, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com> wrote:

    Hi Daniil,
    
    It is not a full review, just some minor comments.
    In fact, I do not see real problems yet.
    
    http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html
    
       55     public long getTotalSwapSpaceSize() {
       56         if (containerMetrics != null) {
       57             long limit = containerMetrics.getMemoryAndSwapLimit();
       58             // The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container)
       59             // or if a docker container was started without specifying a memory limit ( without '--memory='
       60             // Docker option). In latter case there is no limit on how much memory the container can use and
       61             // it can use as much memory as the host's OS allows.
       62             long memLimit = containerMetrics.getMemoryLimit();
       63             if (limit >= 0 && memLimit >= 0) {
       64                 return limit - memLimit;
       65             }
       66         }
       67         return getTotalSwapSpaceSize0();
       68     }
    
       Unneeded space after brackets '('.
       Do we need to check if the (limit - memLimit) value is negative?
       The same question is for getFreeSwapSpaceSize():
         memSwapLimit - memLimit - (memSwapUsage - memUsage)
    
       and getFreeMemorySize():
         101 return limit - usage;
    
       81                         // If this happens just retry the loop for a few iterations
    
       The period is missing at the end of the comment.
    
    
    http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html
    
       34 System.out.println(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors()));
       35 System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors()));
       36 System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize()));
       37 System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize()));
       38 System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize()));
       39 System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize()));
       40 System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize()));
       41 System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize()));
       42 System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad()));
       43 System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad()));
    
    
       To make the above lines a little bit shorter I'd suggest defining a 
    log() method like this:
          private static void log(String msg) { System.out.println(msg); }
    
       34 log(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors()));
       35 log(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors()));
       36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize()));
       37 log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize()));
       38 log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize()));
       39 log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize()));
       40 log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize()));
       41 log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize()));
       42 log(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad()));
       43 log(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad()));
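
A self-contained sketch of this idea might look as follows. It keeps the property discussed elsewhere in this thread: the whole line is formatted first and then handed to println in a single call, so the label and the value are not emitted as separate pieces. The class name and the line() helper are hypothetical and only used for illustration; the real test prints OperatingSystemMXBean values, while this sketch uses Runtime values to stay independent of the webrev.

```java
// Hypothetical sketch of the suggested log() helper.
final class LogHelperSketch {

    // Builds the complete output line up front (illustrative helper).
    static String line(String label, long value) {
        return String.format("%s: %d", label, value);
    }

    // Emits an already formatted line with one println call.
    private static void log(String msg) {
        System.out.println(msg);
    }

    public static void main(String[] args) {
        log(line("Runtime.availableProcessors", Runtime.getRuntime().availableProcessors()));
        log(line("Runtime.maxMemory", Runtime.getRuntime().maxMemory()));
    }
}
```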
    
    
    Thanks,
    Serguei
    
    
    On 12/6/19 17:41, Daniil Titov wrote:
    > Hi David, Mandy, and Bob,
    >
    > Thank you for reviewing this fix.
    >
    > Please review a new version of the fix [1] that includes the following changes compared to the previous version of the webrev (webrev.04):
    > 1. The Javadoc changes made in webrev.04 relative to webrev.03 and to the CSR [3] were discarded.
    > 2.  The implementation of the methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
    >       was also reverted to the webrev.03 version, which returns the host's values if the metrics are unavailable or cannot be properly read.
    >       I would like to mention that the native implementation of these methods can already return -1 in some circumstances,
    >       but I agree that the changes proposed in the previous version of the webrev make that more likely.
    >       I filed the follow-up issue [4] as Mandy suggested.
    > 3.  The legacy methods were renamed as David suggested.
    >
    >
    >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
    >> !     static int initialized=1;
    >>
    >>   Am I reading this right that the code currently fails to actually do the
    >> initialization because of this ???
    > Yes, currently the code fails to do the initialization but it was unnoticed since method
    > get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which"
    > was always -1.
    >
    >>   test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
    >>
    >> System.out.println(String.format(...)
    >>
    >> Why not simply
    >>
    >> System.out.printf(..)
    > As I tried to explain earlier, it would make the tests unstable.
    > System.out.printf(...) just delegates to System.out.format(...), which doesn't emit the string atomically.
    > Instead, it parses the format string into a list of FormatString objects and then iterates over the list.
    > As a result, other trace output occasionally gets printed between these iterations and breaks the pattern the test expects to find
    > in the output.
    >
    > For example, here is a sample of such output, with a trace message printed between "OperatingSystemMXBean.getFreePhysicalMemorySize:"
    > and "1030762496".
    >
    > <skipped>
    > [0.304s][trace][os,container] Memory Usage is: 42983424
    > OperatingSystemMXBean.getFreeMemorySize: 1030758400
    > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
    > [0.305s][trace][os,container] Memory Usage is: 42979328
    > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
    > OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
    > 1030762496
    > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
    >
    > <skipped>
    > java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr
    >
    > 	at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
    > 	at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
    > 	at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
    > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    > 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    > 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
    > 	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
    > 	at java.base/java.lang.Thread.run(Thread.java:832)
    >
    > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
    >
    > [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
    > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
    > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
    > [4] https://bugs.openjdk.java.net/browse/JDK-8235522
    >
    > Thank you,
    > Daniil
    >
    > On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
    >
    >      
    >      
    >      On 12/6/19 5:59 AM, Bob Vandette wrote:
    >      >> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes at oracle.com>  wrote:
    >      >>
    >      >>
    >      >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
    >      >>
    >      >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
    >      
    >      I thought that the error case we are referring to is limit == 0 which
    >      indicates something unexpected goes wrong.  So the compatibility concern
    >      should be low.  This is very specific to Metrics implementation for
    >      cgroup v1 and let me know if I'm wrong.
    >      
    >      >> Surely there must always be some information available from the operating environment? I see from the impl file:
    >      >>
    >      >>     // the host data, value 0 indicates that something went wrong while the metric was read and
    >      >>    // in this case we return "information unavailable" code -1.
    >      >>
    >      >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
    >      > I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
    >      > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
    >      > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
    >      > limits.
    >      >
    >      
    >      It's important to consider carefully if the monitoring API indicates an
    >      error vs unavailable and an application should continue to run when the
    >      monitoring system fails to get the metrics.
    >      
    >      There are several choices to report "something goes wrong" scenarios
    >      (should unlikely happen???):
    >      1. fall back to a random positive value  (e.g. host value)
    >      2. return a negative value
    >      3. throw an exception
    >      
    >      #3 is not an option as the application is not expecting this.  For #2,
    >      the application can filter bad values if desirable.
    >      
    >      I'm okay if you want to file a JBS issue to follow up and thoroughly
    >      look at the cases that the metrics are unavailable and the cases when
    >      fails to obtain.
    >      
    >      >> ---
    >      >>
    >      >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
    >      >>
    >      >> System.out.println(String.format(...)
    >      >>
    >      >> Why not simply
    >      >>
    >      >> System.out.printf(..)
    >      >>
    >      >> ?
    >      
    >      or simply (as I commented [1])
    >           System.out.format
    >      
    >      Mandy
    >      [1]
    >      https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
    >      
    >      
    >
    >
    
    
From richard.reingruber at sap.com  Tue Dec 10 21:45:28 2019
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Tue, 10 Dec 2019 21:45:28 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the
 Presence of JVMTI Agents
Message-ID: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>

Hi,

I would like to get reviews please for

http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/

Corresponding RFE:
https://bugs.openjdk.java.net/browse/JDK-8227745

Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]

Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues (thanks!). In addition the
change is being tested at SAP since I posted the first RFR some months ago.

The intention of this enhancement is to benefit performance wise from escape analysis even if JVMTI
agents request capabilities that allow them to access local variable values. E.g. if you start-up
with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape analysis is disabled right
from the beginning, well before a debugger attaches -- if ever one should do so. With the
enhancement, escape analysis will remain enabled until and after a debugger attaches. EA based
optimizations are reverted just before an agent acquires the reference to an object. In the JBS item
you'll find more details.
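
As a rough illustration of the kind of code this is about (a hypothetical example, not taken from the webrev): in the method below the Point object never escapes the loop body, so a JIT compiler with escape analysis may avoid the heap allocation altogether. If an agent then asks for the value of the local variable p, the object has to exist on the heap again, which is where reverting the optimization comes in. The class and method names are made up for this sketch, and whether the allocation is actually eliminated depends on the JIT and its settings.

```java
// Hypothetical example of a non-escaping allocation.
final class EscapeSketch {

    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            // p is never stored in a field, returned, or passed to another
            // method, so it does not escape this loop iteration.
            Point p = new Point(i, i + 1);
            sum += (long) p.x * p.x + (long) p.y * p.y;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1_000));
    }
}
```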

Thanks,
Richard.

[1] Experimental fix for JDK-8214584 based on JDK-8227745
    http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch

From chris.plummer at oracle.com  Wed Dec 11 02:52:19 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 10 Dec 2019 18:52:19 -0800
Subject: Removal of SA javascript support
Message-ID: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>

Hi,

I'd like to propose the removal of SA javascript support. Few people even 
realize this support exists, and hopefully even fewer are using it since 
I'd like to remove it. Since I'm new to this myself, let me first 
explain what I know about its existence, and then explain why I want to 
remove it.

If you run "jhsdb clhsdb", there are jsload and jseval commands. Don't 
look for them in anything post JDK 8. I'll explain why later. jsload is 
used to load a javascript file. In that file you can register new clhsdb 
commands that are written in javascript. You can also evaluate 
javascript using the jseval command. Some of this is explained in [1], 
which is the only place I can find any reference to this support. It 
does not appear to be officially supported, nor is there any oracle 
provided documentation.

There also appear to be a few clhsdb commands that are written in 
javascript. Doing a grep for "registerCommand" in sa.js shows the following:

  registerCommand("class", "class name", "jclass");
  registerCommand("classes", "classes", "jclasses");
  registerCommand("dumpclass", "dumpclass { address | name } [ directory ]", "dclass");
  registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap");
  registerCommand("mem", "mem address [ length ]", "printMem");
  registerCommand("sysprops", "sysprops", "sysProps");
  registerCommand("whatis", "whatis address", "printWhatis");

Once again, don't go looking for these in anything newer than JDK 8. You 
won't find them. Again, the only documentation I can find is [1].

The other use of Javascript is the SOQL command (Simple Object Query 
Language), a tool used to query the heap, and also the JSDB command. 
The only SOQL documentation I could find is the blog reference [2]. I 
could not find JSDB documentation, but I believe it is javascript 
support for looking at hotspot. So once again, neither of these seems to 
be officially supported or documented.

The real purpose of the email is to propose removal of this support. 
Here are the reasons:

(1) It's broken, and has been since 9. See [3]. This is why you don't 
see the javascript related commands in clhsdb. Javascript fails to 
initialize, so none of the javascript related commands are registered.
(2) Nashorn is deprecated and will be removed eventually.
(3) We have very little understanding of the javascript support.
(4) No resources to work on it (unless there is a community volunteer).
(5) Very questionable value (lack of users). The fact this support has 
been broken since JDK 9 and no bug was filed until I did so this week is 
a good indication of that. Another is that there are no other SA 
Javascript related bugs filed. Lastly, the lack of any official 
documentation and only minimal mention of it on the web is another good 
indication of its (lack of) value.

Also, regarding the 7 commands listed above that would be lost (but 
currently don't work now anyway), if they are really wanted, they could 
be implemented in java instead of javascript.

I'd like to remove javascript support in two steps. The first is simply to 
disable the clhsdb code that tries to initialize the javascript support. 
I'd like to do this in 14 (actually as soon as possible). I'd like to 
actually do this now even if we decide to keep javascript support and 
eventually fix it because it will get rid of the warning you see 
whenever you attach from clhsdb:

     Warning! JS Engine can't start, some commands will not be available.

This warning will become more of an issue for the clhsdb tests after I 
push [4] because then you will also see the full stacktrace for the 
underlying exception that caused the Javascript to fail to start. 
Besides being unnecessary noise in passing test cases, it can also be 
misleading in any test that fails because the exception will be 
unrelated to the failure. This is actually what got me going down this 
path of what the javascript support is all about.

The next step would be to strip out all Javascript related code, 
including the SOQL and JSDB tools. This would be done in 15.

Please let me know what you think.

thanks,

Chris

[1] 
https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
[2] 
http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
[3] https://bugs.openjdk.java.net/browse/JDK-8235594
[4] https://bugs.openjdk.java.net/browse/JDK-8234277


From rednaxelafx at gmail.com  Wed Dec 11 03:26:54 2019
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Tue, 10 Dec 2019 19:26:54 -0800
Subject: Removal of SA javascript support
In-Reply-To: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
Message-ID: <CA+cQ+tR5yxEjctALG3b+QByOYD7Y7UyBH7Ocipx-VFoRhRS8hw@mail.gmail.com>

Hi Chris,

Thanks for the proposal. I used to be one of the few heavy users of jsload
/ jseval in CLHSDB back in the JDK6 to JDK8 era. The way I used to use it
is to quickly prototype new functionality in JS and later bake it into Java
code, and also for exploring heap dumps beneath the existing commands
available in CLHSDB (i.e. the underlying SA API is far more powerful than
the set of commands exposed in HSDB).

I even collected my own library of SA-based JS functions for easy
navigation of Java heap dumps. e.g. this objtree command:
https://gist.github.com/rednaxelafx/1393698#file-objtree-js

I'm sad to see it go but given its current state I'd +1 on the proposal to
remove it now.

Best regards,
Kris

On Tue, Dec 10, 2019 at 6:52 PM Chris Plummer <chris.plummer at oracle.com>
wrote:

> Hi,
>
> I'd like to propose the removal of SA javascript support. Few people even
> realize this support exists, and hopefully even fewer are using it since
> I'd like to remove it. Since I'm new to this myself, let me first
> explain what I know about its existence, and then explain why I want to
> remove it.
>
> If you run "jhsdb clhsdb", there are jsload and jseval commands. Don't
> look for them in anything post JDK 8. I'll explain why later. jsload is
> used to load a javascript file. In that file you can register new clhsdb
> commands that are written in javascript. You can also evaluate
> javascript using the jseval command. Some of this is explained in [1],
> which is the only place I can find any reference to this support. It
> does not appear to be officially supported, nor is there any oracle
> provided documentation.
>
> There also appear to be a few clhsdb commands that are written in
> javascript. Doing a grep for "registerCommand" in sa.js shows the
> following:
>
>   registerCommand("class", "class name", "jclass");
>   registerCommand("classes", "classes", "jclasses");
>   registerCommand("dumpclass", "dumpclass { address | name } [ directory
> ]", "dclass");
>   registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap");
>   registerCommand("mem", "mem address [ length ]", "printMem");
>   registerCommand("sysprops", "sysprops", "sysProps");
>   registerCommand("whatis", "whatis address", "printWhatis");
>
> Once again, don't go looking for these in anything newer than JDK8. You
> won't find them. Again the only documentation I can find is [1].
>
> The other use of Javascript is the SOQL command (Simple Object Query
> Language), a tool used to query the heap, and also the JSDB command.
> The only SOQL documentation I could find is the blog reference [2]. I
> could not find JSDB documentation, but I believe it is javascript
> support for looking at hotspot. So once again, neither of these seem to
> be officially supported or documented.
>
> The real purpose of the email is to propose removal of this support.
> Here are the reasons:
>
> (1) It's broken, and has been since 9. See [3]. This is why you don't
> see the javascript related commands in clhsdb. Javascript fails to
> initialize, so none of the javascript related commands are registered.
> (2) Nashorn is deprecated and will be removed eventually.
> (3) We have very little understanding of the javascript support.
> (4) No resources to work on it (unless there is a community volunteer).
> (5) Very questionable value (lack of users). The fact this support has
> been broken since JDK 9 and no bug was filed until I did so this week is
> a good indication of that. Another is that there are no other SA
> Javascript related bugs filed. Lastly, the lack of any official
> documentation and only minimal mention of it on the web is another good
> indication of its (lack of) value.
>
> Also, regarding the 7 commands listed above that would be lost (but
> don't currently work anyway), if they are really wanted, they could
> be implemented in Java instead of javascript.
>
> I'd like to remove javascript support in two steps. The first is simply
> to disable the clhsdb code that tries to initialize the javascript support.
> I'd like to do this in 14 (actually as soon as possible). I'd like to
> actually do this now even if we decide to keep javascript support and
> eventually fix it because it will get rid of the warning you see
> whenever you attach from clhsdb:
>
>       Warning! JS Engine can't start, some commands will not be available.
>
> This warning will become more of an issue for the clhsdb tests after I
> push [4] because then you will also see the full stacktrace for the
> underlying exception that caused the Javascript to fail to start.
> Besides being unnecessary noise in passing test cases, it can also be
> misleading in any test that fails because the exception will be
> unrelated to the failure. This is actually what got me going down this
> path of what the javascript support is all about.
>
> The next step would be to strip out all Javascript related code,
> including the SOQL and JSDB tools. This would be done in 15.
>
> Please let me know what you think.
>
> thanks,
>
> Chris
>
> [1]
>
> https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
> [2]
>
> http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
> [3] https://bugs.openjdk.java.net/browse/JDK-8235594
> [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>
>

From sundararajan.athijegannathan at oracle.com  Wed Dec 11 03:38:16 2019
From: sundararajan.athijegannathan at oracle.com (sundararajan.athijegannathan at oracle.com)
Date: Wed, 11 Dec 2019 09:08:16 +0530
Subject: Removal of SA javascript support
In-Reply-To: <CA+cQ+tR5yxEjctALG3b+QByOYD7Y7UyBH7Ocipx-VFoRhRS8hw@mail.gmail.com>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
 <CA+cQ+tR5yxEjctALG3b+QByOYD7Y7UyBH7Ocipx-VFoRhRS8hw@mail.gmail.com>
Message-ID: <b2bc72f0-18b1-fc76-69d7-a956ebc6de74@oracle.com>

Hi Kris,

Glad to hear that someone used the JS interface of SA :) Quick prototyping +
interactive debugger scripting were the goals of the JS interface! As you
mentioned, given the current state of the SA JS interface, it has to be
removed :(

Thanks

-Sundar

On 11/12/19 8:56 am, Krystal Mok wrote:
> Hi Chris,
>
> Thanks for the proposal. I used to be one of the few heavy users of 
> jsload / jseval in CLHSDB back in the JDK6 to JDK8 era. The way I used 
> to use it is to quickly prototype new functionality in JS and later 
> bake it into Java code, and also for exploring heap dumps beneath the 
> existing commands available in CLHSDB (i.e. the underlying SA API is 
> far more powerful than the set of commands exposed in HSDB).
>
> I even collected my own library of SA-based JS functions for easy 
> navigation of Java heap dumps. e.g. this objtree command: 
> https://gist.github.com/rednaxelafx/1393698#file-objtree-js
>
> I'm sad to see it go but given its current state I'd +1 on the
> proposal to remove it now.
>
> Best regards,
> Kris

From rednaxelafx at gmail.com  Wed Dec 11 03:49:37 2019
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Tue, 10 Dec 2019 19:49:37 -0800
Subject: Removal of SA javascript support
In-Reply-To: <b2bc72f0-18b1-fc76-69d7-a956ebc6de74@oracle.com>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
 <CA+cQ+tR5yxEjctALG3b+QByOYD7Y7UyBH7Ocipx-VFoRhRS8hw@mail.gmail.com>
 <b2bc72f0-18b1-fc76-69d7-a956ebc6de74@oracle.com>
Message-ID: <CA+cQ+tRVxPFYotAxZR5fDhTip-EU5N8kNBGK5gWuJ3JOPhXF5w@mail.gmail.com>

Thank you very much for the work on the JS support in SA, Sundar! I really
loved it and depended on it. That's why back in the day when the JS support
was broken from time to time I'd get affected and annoyed and send fixes
for them...

A suggestion: I still see a lot of value in having a proper REPL
beyond CLHSDB for the same purposes as the JS support. Would it be possible
for the serviceability team or someone from the community to invest in
integrating JShell into the SA world?

Thanks,
Kris

On Tue, Dec 10, 2019 at 7:38 PM <sundararajan.athijegannathan at oracle.com>
wrote:

> Hi Kris,
>
> Glad to hear that someone used JS interface of SA :) Quick prototyping +
> debugger interactive scripting were the goals of JS interface! As you
> mentioned, given the current state of SA JS interface, it has to be removed
> :(
>
> Thanks
>
> -Sundar

From suenaga at oss.nttdata.com  Wed Dec 11 05:33:41 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Wed, 11 Dec 2019 14:33:41 +0900
Subject: Removal of SA javascript support
In-Reply-To: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
Message-ID: <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>

Hi Chris,

It's a sad proposal, but I agree with you. Maintaining the JS support in SA has been difficult since Jigsaw.
However, I would like SA to provide a plugin mechanism.
I use a custom script to list compiled code in the CodeCache.

I guess other troubleshooters will also want a similar feature (via jsload) in the future when they encounter a JVM crash.


Thanks,

Yasumasa



From rednaxelafx at gmail.com  Wed Dec 11 05:39:27 2019
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Tue, 10 Dec 2019 21:39:27 -0800
Subject: Removal of SA javascript support
In-Reply-To: <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
 <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
Message-ID: <CA+cQ+tRjiThKZJHnjDSDnM2ZvXyjjLpXBgc2_yQxDs5UWsBtsA@mail.gmail.com>

Hi Yasumasa,

That's a very nice idea. Basically what you're asking for is exposing the
Command interface [1] so that plugins can implement it and get dynamically
loaded / registered into CLHSDB / HSDB, right?

[1]:
http://hg.openjdk.java.net/jdk/jdk/file/c71ec1f09f21/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/CommandProcessor.java#l246
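
For illustration only, here is a minimal sketch of what such a plugin
mechanism might look like. Every name in it (ClhsdbCommandPlugin,
PluginCommandRegistry, EchoArgsPlugin) is hypothetical and does not exist in
the SA sources. The registry merely stands in for the command table inside
CommandProcessor, and discovery via java.util.ServiceLoader is an assumption
about how plugins could be located, not something SA does today.

```java
import java.io.PrintStream;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.ServiceLoader;

// Hypothetical plugin contract. SA exposes no such interface today; the
// Command class nested in CommandProcessor is only the inspiration.
interface ClhsdbCommandPlugin {
    String name();                                 // command name typed at the prompt
    String usage();                                // one-line help text
    void execute(String[] args, PrintStream out);  // run the command
}

// Hypothetical registry standing in for CommandProcessor's command table.
final class PluginCommandRegistry {
    private final Map<String, ClhsdbCommandPlugin> commands = new LinkedHashMap<>();

    // Register one plugin, rejecting duplicate command names.
    void register(ClhsdbCommandPlugin plugin) {
        if (commands.putIfAbsent(plugin.name(), plugin) != null) {
            throw new IllegalArgumentException("duplicate command: " + plugin.name());
        }
    }

    // Discover providers through the standard ServiceLoader mechanism
    // (assumed discovery strategy; providers would be declared by plugin jars).
    void registerAllFromClassPath() {
        for (ClhsdbCommandPlugin plugin : ServiceLoader.load(ClhsdbCommandPlugin.class)) {
            register(plugin);
        }
    }

    // Split a command line on whitespace and run the matching plugin.
    // Returns false if no plugin is registered under the first token.
    boolean dispatch(String line, PrintStream out) {
        String[] tokens = line.trim().split("\\s+");
        ClhsdbCommandPlugin plugin = commands.get(tokens[0]);
        if (plugin == null) {
            out.println("Unrecognized command: " + tokens[0]);
            return false;
        }
        plugin.execute(Arrays.copyOfRange(tokens, 1, tokens.length), out);
        return true;
    }
}

// Toy plugin used only to exercise the registry; it does not touch SA.
final class EchoArgsPlugin implements ClhsdbCommandPlugin {
    public String name() { return "echoargs"; }
    public String usage() { return "echoargs [ word ... ]"; }
    public void execute(String[] args, PrintStream out) {
        out.println(String.join(" ", args));
    }
}

final class PluginDemo {
    public static void main(String[] args) {
        PluginCommandRegistry registry = new PluginCommandRegistry();
        registry.register(new EchoArgsPlugin());
        registry.dispatch("echoargs hello from a plugin", System.out);
    }
}
```

A real plugin would of course need access to SA objects rather than just its
arguments, which this sketch deliberately leaves out.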

- Kris

On Tue, Dec 10, 2019 at 9:33 PM Yasumasa Suenaga <suenaga at oss.nttdata.com>
wrote:

> Hi Chris,
>
> It's a sad proposal, but I agree with you. Maintaining the JS support in
> SA has been difficult since Jigsaw.
> However, I would like SA to provide a plugin mechanism.
> I use a custom script to list compiled code in the CodeCache.
>
> I guess other troubleshooters will also want a similar feature (via jsload)
> in the future when they encounter a JVM crash.
>
>
> Thanks,
>
> Yasumasa

From suenaga at oss.nttdata.com  Wed Dec 11 05:56:46 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Wed, 11 Dec 2019 14:56:46 +0900
Subject: Removal of SA javascript support
In-Reply-To: <CA+cQ+tRjiThKZJHnjDSDnM2ZvXyjjLpXBgc2_yQxDs5UWsBtsA@mail.gmail.com>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
 <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
 <CA+cQ+tRjiThKZJHnjDSDnM2ZvXyjjLpXBgc2_yQxDs5UWsBtsA@mail.gmail.com>
Message-ID: <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com>

On 2019/12/11 14:39, Krystal Mok wrote:
> Hi Yasumasa,
> 
> That's a very nice idea. Basically what you're asking for is exposing the Command interface [1] so that plugins can implement it and get dynamically loaded / registered into CLHSDB / HSDB, right?

Yes, but we also need a proxy API to access internal SA objects, e.g. CodeCache, JavaThread, TypeDataBase, etc.
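
A minimal sketch of the kind of pluggable command support being discussed, in Java. Everything in it is hypothetical: SaPluginCommand, SaCommandRegistry and CountArgsCommand are illustration-only names, not the existing Command class in CommandProcessor.java, and no proxy API for SA objects is shown. It only illustrates plugins implementing one interface and being discovered through java.util.ServiceLoader.

```java
import java.io.PrintStream;
import java.util.Arrays;
import java.util.Map;
import java.util.ServiceLoader;
import java.util.TreeMap;

// Demo entry point: registers one built-in command and dispatches a line to it.
final class SaPluginDemo {
    public static void main(String[] args) {
        SaCommandRegistry registry = new SaCommandRegistry();
        registry.loadPlugins();                     // finds nothing unless providers are on the class path
        registry.register(new CountArgsCommand());
        registry.dispatch("countargs a b c", System.out);
    }
}

// Hypothetical plugin contract (an assumption, not an SA class).
interface SaPluginCommand {
    String name();
    String usage();
    void execute(String[] args, PrintStream out);
}

// Hypothetical stand-in for the clhsdb command table.
final class SaCommandRegistry {
    private final Map<String, SaPluginCommand> commands = new TreeMap<>();

    // Adds a command; rejects a second command with the same name.
    void register(SaPluginCommand cmd) {
        if (commands.putIfAbsent(cmd.name(), cmd) != null) {
            throw new IllegalArgumentException("duplicate command: " + cmd.name());
        }
    }

    // Registers every provider that ServiceLoader can locate.
    void loadPlugins() {
        for (SaPluginCommand cmd : ServiceLoader.load(SaPluginCommand.class)) {
            register(cmd);
        }
    }

    // Splits the line on whitespace and runs the named command.
    // Returns true if the command was found and executed.
    boolean dispatch(String line, PrintStream out) {
        String[] tokens = line.trim().split("\\s+");
        SaPluginCommand cmd = commands.get(tokens[0]);
        if (cmd == null) {
            out.println("Unrecognized command: " + tokens[0]);
            return false;
        }
        cmd.execute(Arrays.copyOfRange(tokens, 1, tokens.length), out);
        return true;
    }
}

// Trivial example command that does not touch any SA object.
final class CountArgsCommand implements SaPluginCommand {
    public String name() { return "countargs"; }
    public String usage() { return "countargs [ arg ... ]"; }
    public void execute(String[] args, PrintStream out) {
        out.println(args.length + " argument(s)");
    }
}
```

A real plugin would also need some way to reach CodeCache, JavaThread and friends, which is the part this sketch leaves out.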


Yasumasa


> [1]: http://hg.openjdk.java.net/jdk/jdk/file/c71ec1f09f21/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/CommandProcessor.java#l246
> 
> - Kris
> 
> On Tue, Dec 10, 2019 at 9:33 PM Yasumasa Suenaga <suenaga at oss.nttdata.com <mailto:suenaga at oss.nttdata.com>> wrote:
> 
>     Hi Chris,
> 
>     It's a sad proposal, but I agree with you. Maintaining JS in SA has been difficult since Jigsaw.
>     However, I want SA to provide a pluggable feature.
>     I use a custom script to list compiled code in the CodeCache.
> 
>     I guess other troubleshooters will also want a similar feature (via jsload) in the future if they encounter a JVM crash.
> 
> 
>     Thanks,
> 
>     Yasumasa
> 
> 
>     On 2019/12/11 11:52, Chris Plummer wrote:
>      > Hi,
>      >
>      > I'd like to propose the removal of SA javascript support. Few people even realize this support exists, and hopefully even fewer are using it since I'd like to remove it. Since I'm new to this myself, let me first explain what I know about its existence, and then explain why I want to remove it.
>      >
>      > If you run "jhsdb clhsdb", there are jsload and jseval commands. Don't look for them in anything post JDK 8. I'll explain why later. jsload is used to load a javascript file. In that file you can register new clhsdb commands that are written in javascript. You can also evaluate javascript using the jseval command. Some of this is explained in [1], which is the only place I can find any reference to this support. It does not appear to be officially supported, nor is there any Oracle-provided documentation.
>      >
>      > There also appear to be a few clhsdb commands that are written in javascript. Doing a grep for "registerCommand" in sa.js shows the following:
>      >
>      >   registerCommand("class", "class name", "jclass");
>      >   registerCommand("classes", "classes", "jclasses");
>      >   registerCommand("dumpclass", "dumpclass { address | name } [ directory ]", "dclass");
>      >   registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap");
>      >   registerCommand("mem", "mem address [ length ]", "printMem");
>      >   registerCommand("sysprops", "sysprops", "sysProps");
>      >   registerCommand("whatis", "whatis address", "printWhatis");
>      >
>      > Once again, don't go looking for these in anything newer than JDK 8. You won't find them. Again the only documentation I can find is [1].
>      >
>      > The other use of Javascript is the SOQL command (Simple Object Query Language), a tool used to query the heap, and also the JSDB command. The only SOQL documentation I could find is the blog reference [2]. I could not find JSDB documentation, but I believe it is javascript support for looking at hotspot. So once again, neither of these seem to be officially supported or documented.
>      >
>      > The real purpose of the email is to propose removal of this support. Here are the reasons:
>      >
>      > (1) It's broken, and has been since 9. See [3]. This is why you don't see the javascript related commands in clhsdb. Javascript fails to initialize, so none of the javascript related commands are registered.
>      > (2) Nashorn is deprecated and will be removed eventually.
>      > (3) We have very little understanding of the javascript support.
>      > (4) No resources to work on it (unless there is a community volunteer).
>      > (5) Very questionable value (lack of users). The fact this support has been broken since JDK 9 and no bug was filed until I did so this week is a good indication of that. Another is that there are no other SA Javascript related bugs filed. Lastly, the lack of any official documentation and only minimal mention of it on the web is another good indication of its (lack of) value.
>      >
>      > Also, regarding the 7 commands listed above that would be lost (but currently don't work now anyway), if they are really wanted, they could be implemented in java instead of javascript.
>      >
>      > I'd like to remove javascript support in two steps. The first is simply to disable the clhsdb code that tries to initialize the javascript support. I'd like to do this in 14 (actually as soon as possible). I'd like to actually do this now even if we decide to keep javascript support and eventually fix it because it will get rid of the warning you see whenever you attach from clhsdb:
>      >
>      >       Warning! JS Engine can't start, some commands will not be available.
>      >
>      > This warning will become more of an issue for the clhsdb tests after I push [4] because then you will also see the full stacktrace for the underlying exception that caused the Javascript to fail to start. Besides being unnecessary noise in passing test cases, it can also be misleading in any test that fails because the exception will be unrelated to the failure. This is actually what got me going down this path of what the javascript support is all about.
>      >
>      > The next step would be to strip out all Javascript related code, including the SOQL and JSDB tools. This would be done in 15.
>      >
>      > Please let me know what you think.
>      >
>      > thanks,
>      >
>      > Chris
>      >
>      > [1] https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
>      > [2] http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
>      > [3] https://bugs.openjdk.java.net/browse/JDK-8235594
>      > [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>      >
> 

From chris.plummer at oracle.com  Wed Dec 11 06:00:05 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 10 Dec 2019 22:00:05 -0800
Subject: Removal of SA javascript support
In-Reply-To: <CA+cQ+tRjiThKZJHnjDSDnM2ZvXyjjLpXBgc2_yQxDs5UWsBtsA@mail.gmail.com>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
 <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
 <CA+cQ+tRjiThKZJHnjDSDnM2ZvXyjjLpXBgc2_yQxDs5UWsBtsA@mail.gmail.com>
Message-ID: <3a11d62c-60fa-e0ac-f4f2-475445729cb3@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20191210/6bc14f96/attachment.htm>

From chris.plummer at oracle.com  Wed Dec 11 06:00:54 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 10 Dec 2019 22:00:54 -0800
Subject: Removal of SA javascript support
In-Reply-To: <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
 <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
 <CA+cQ+tRjiThKZJHnjDSDnM2ZvXyjjLpXBgc2_yQxDs5UWsBtsA@mail.gmail.com>
 <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com>
Message-ID: <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com>

On 12/10/19 9:56 PM, Yasumasa Suenaga wrote:
> On 2019/12/11 14:39, Krystal Mok wrote:
>> Hi Yasumasa,
>>
>> That's a very nice idea. Basically what you're asking for is exposing 
>> the Command interface [1] so that plugins can implement it and get 
>> dynamically loaded / registered into CLHSDB / HSDB, right?
>
> Yes, but we also need proxy API to access internal SA objects e.g. 
> CodeCache, JavaThread, TypeDataBase, etc...
>
Yes, or export them. I should have read this email before posting my 
previous one.

Chris
>
> Yasumasa
>
>
>> [1]: 
>> http://hg.openjdk.java.net/jdk/jdk/file/c71ec1f09f21/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/CommandProcessor.java#l246
>>
>> - Kris


From suenaga at oss.nttdata.com  Wed Dec 11 06:27:40 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Wed, 11 Dec 2019 15:27:40 +0900
Subject: Removal of SA javascript support
In-Reply-To: <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
 <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
 <CA+cQ+tRjiThKZJHnjDSDnM2ZvXyjjLpXBgc2_yQxDs5UWsBtsA@mail.gmail.com>
 <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com>
 <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com>
Message-ID: <a62d8f4c-a1b0-6b15-646f-3ba2c77db5a4@oss.nttdata.com>

Hi,

IMHO we need to export all packages in SA if we do not provide a new API for SA.
sa.js in jdk.hotspot.agent could access all SA classes until JDK 8 (before Jigsaw), so we could write whatever functions we needed.

OTOH we cannot know which classes SA users need. Every package in the jdk.hotspot.agent module provides features, and each depends on other packages. For example, sun.jvm.hotspot.oops.Oop requires sun.jvm.hotspot.types, which in turn requires sun.jvm.hotspot.debugger.
It is difficult to track these dependencies and export a minimal set.
(I worked on this in JDK-8157947, but I gave up...)

Thus I guess it is a big challenge to export SA classes without refactoring.
If we provide a new API for SA plugins, I guess we will need to do some refactoring as well.
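
To make the export question concrete, one can ask the runtime which packages of a module are exported. This is a minimal sketch, assuming JDK 9 or later. SaExportCheck and exportStatus are hypothetical helper names; ModuleLayer.boot(), Module.getPackages() and Module.isExported(String) are standard java.lang APIs. On a default launch jdk.hotspot.agent may not be resolved in the boot layer, in which case the helper reports "module not found" (launching with --add-modules jdk.hotspot.agent is the usual way to resolve it).

```java
import java.util.Optional;

final class SaExportCheck {
    // Reports whether packageName is exported unconditionally by moduleName
    // in the boot layer of the running JVM.
    static String exportStatus(String moduleName, String packageName) {
        Optional<Module> found = ModuleLayer.boot().findModule(moduleName);
        if (!found.isPresent()) {
            return "module not found";
        }
        Module module = found.get();
        if (!module.getPackages().contains(packageName)) {
            return "package not in module";
        }
        // isExported(String) is true only for unqualified exports.
        return module.isExported(packageName) ? "exported" : "not exported";
    }

    public static void main(String[] args) {
        // The three packages named in the dependency chain above.
        String[] packages = {
            "sun.jvm.hotspot.oops",
            "sun.jvm.hotspot.types",
            "sun.jvm.hotspot.debugger"
        };
        for (String pkg : packages) {
            System.out.println(pkg + ": " + exportStatus("jdk.hotspot.agent", pkg));
        }
    }
}
```

A single package can be opened up at launch time with --add-exports jdk.hotspot.agent/sun.jvm.hotspot.oops=ALL-UNNAMED, but because of the chain of package dependencies described above, keeping such a list minimal is the hard part.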


Yasumasa


On 2019/12/11 15:00, Chris Plummer wrote:
> On 12/10/19 9:56 PM, Yasumasa Suenaga wrote:
>> On 2019/12/11 14:39, Krystal Mok wrote:
>>> Hi Yasumasa,
>>>
>>> That's a very nice idea. Basically what you're asking for is exposing the Command interface [1] so that plugins can implement it and get dynamically loaded / registered into CLHSDB / HSDB, right?
>>
>> Yes, but we also need proxy API to access internal SA objects e.g. CodeCache, JavaThread, TypeDataBase, etc...
>>
> Yes, or export them. I should have read this email before posting my previous one.
> 
> Chris
>>
>> Yasumasa
>>
>>
>>> [1]: http://hg.openjdk.java.net/jdk/jdk/file/c71ec1f09f21/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/CommandProcessor.java#l246
>>>
>>> - Kris

From david.holmes at oracle.com  Wed Dec 11 07:02:31 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 11 Dec 2019 17:02:31 +1000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
Message-ID: <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>

Hi Richard,

On 11/12/2019 7:45 am, Reingruber, Richard wrote:
> Hi,
> 
> I would like to get reviews please for
> 
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
> 
> Corresponding RFE:
> https://bugs.openjdk.java.net/browse/JDK-8227745
> 
> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
> 
> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues (thanks!). In addition the
> change is being tested at SAP since I posted the first RFR some months ago.
> 
> The intention of this enhancement is to benefit performance wise from escape analysis even if JVMTI
> agents request capabilities that allow them to access local variable values. E.g. if you start-up
> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape analysis is disabled right
> from the beginning, well before a debugger attaches -- if ever one should do so. With the
> enhancement, escape analysis will remain enabled until and after a debugger attaches. EA based
> optimizations are reverted just before an agent acquires the reference to an object. In the JBS item
> you'll find more details.

Most of the details here are in areas I can't comment on in detail, but I 
did take an initial general look at things.

The only thing that jumped out at me is that I think the 
DeoptimizeObjectsALotThread should be a hidden thread.

+  bool is_hidden_from_external_view() const { return true; }

Also I don't see any testing of the DeoptimizeObjectsALotThread. Without 
active testing this will just bit-rot.

Also on the tests I don't understand your @requires clause:

  @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & (vm.opt.TieredCompilation != true))

This seems to require that TieredCompilation is disabled, but tiered is 
our normal mode of operation. ??

Thanks,
David

> Thanks,
> Richard.
> 
> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>      http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
> 

From dms at samersoff.net  Wed Dec 11 07:44:32 2019
From: dms at samersoff.net (Dmitry Samersoff)
Date: Wed, 11 Dec 2019 10:44:32 +0300
Subject: Removal of SA javascript support
In-Reply-To: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
Message-ID: <030fbae7-f75a-1714-a98c-7139e817967f@samersoff.net>

Hello Chris,

I support this decision.

PS: For people who want SA scripting -

One thing I experimented with a long time ago
was exporting some SA capabilities to Jython.
This might be the way to go.

-Dmitry



From sundararajan.athijegannathan at oracle.com  Wed Dec 11 12:47:15 2019
From: sundararajan.athijegannathan at oracle.com (sundararajan.athijegannathan at oracle.com)
Date: Wed, 11 Dec 2019 18:17:15 +0530
Subject: Removal of SA javascript support
In-Reply-To: <a62d8f4c-a1b0-6b15-646f-3ba2c77db5a4@oss.nttdata.com>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
 <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
 <CA+cQ+tRjiThKZJHnjDSDnM2ZvXyjjLpXBgc2_yQxDs5UWsBtsA@mail.gmail.com>
 <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com>
 <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com>
 <a62d8f4c-a1b0-6b15-646f-3ba2c77db5a4@oss.nttdata.com>
Message-ID: <0944c2b1-a0b4-2631-24de-3789e6aeec7b@oracle.com>

Effectively you're asking for SA as an API. I don't think that is a good 
idea. That implies supporting HotSpot data structures as a Java *API*. 
That would be a maintainability nightmare: we already have to keep the SA 
code in sync with HotSpot data structures, and that is problematic in itself. 
An API would be a next-level nightmare.

-Sundar

On 11/12/19 11:57 am, Yasumasa Suenaga wrote:
> Hi,
>
> IMHO we need to export all packages in SA if we do not provide new API 
> for SA.
> sa.js in jdk.hotspot.agent could access all SA classes until JDK 8 
> (before Jigsaw), so we could make various functions if we need.
>
> OTOH we cannot know what classes are needed by the SA users. All 
> packages in jdk.hotspot.agent module provides features, and they 
> require other packages. For example, sun.jvm.hotspot.oops.Oop requires 
> sun.jvm.hotspot.types, and it requires sun.jvm.hotspot.debugger .
> It is difficult to track and to export minimally.
> (I worked for it in JDK-8157947, but I gave up...)
>
> Thus I guess it is a big challenge to export SA classes without 
> refactoring.
> If we provide new API for SA plugin, I guess we need to work some 
> refactoring.
>
>
> Yasumasa
>
>
> On 2019/12/11 15:00, Chris Plummer wrote:
>> On 12/10/19 9:56 PM, Yasumasa Suenaga wrote:
>>> On 2019/12/11 14:39, Krystal Mok wrote:
>>>> Hi Yasumasa,
>>>>
>>>> That's a very nice idea. Basically what you're asking for is 
>>>> exposing the Command interface [1] so that plugins can implement it 
>>>> and get dynamically loaded / registered into CLHSDB / HSDB, right?
>>>
>>> Yes, but we also need proxy API to access internal SA objects e.g. 
>>> CodeCache, JavaThread, TypeDataBase, etc...
>>>
>> Yes, or export them. I should have read this email before posting my 
>> previous one.
>>
>> Chris
>>>
>>> Yasumasa
>>>
>>>
>>>> [1]: 
>>>> http://hg.openjdk.java.net/jdk/jdk/file/c71ec1f09f21/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/CommandProcessor.java#l246
>>>>
>>>> - Kris
>>>>
>>>> On Tue, Dec 10, 2019 at 9:33 PM Yasumasa Suenaga 
>>>> <suenaga at oss.nttdata.com <mailto:suenaga at oss.nttdata.com>> wrote:
>>>>
>>>>     Hi Chris,
>>>>
>>>>     It's a sad proposal, but I agree with you. Maintaining the JS support
>>>>     in SA has been difficult since Jigsaw.
>>>>     However, I would like SA to have a plugin feature.
>>>>     I use a custom script to list compiled code in the CodeCache.
>>>>
>>>>     I guess other troubleshooters will also want a similar feature (via
>>>>     jsload) in the future if they encounter a JVM crash.
>>>>
>>>>
>>>>     Thanks,
>>>>
>>>>     Yasumasa
>>>>
>>>>
>>>>     On 2019/12/11 11:52, Chris Plummer wrote:
>>>>     > Hi,
>>>>     >
>>>>     > I'd like to propose the removal of SA javascript support. Few people even
>>>>     > realize this support exists, and hopefully even fewer are using it since
>>>>     > I'd like to remove it. Since I'm new to this myself, let me first
>>>>     > explain what I know about its existence, and then explain why I want to
>>>>     > remove it.
>>>>     >
>>>>     > If you run "jhsdb clhsdb", there are jsload and jseval commands. Don't
>>>>     > look for them in anything post JDK 8. I'll explain why later. jsload is
>>>>     > used to load a javascript file. In that file you can register new clhsdb
>>>>     > commands that are written in javascript. You can also evaluate
>>>>     > javascript using the jseval command. Some of this is explained in [1],
>>>>     > which is the only place I can find any reference to this support. It
>>>>     > does not appear to be officially supported, nor is there any oracle
>>>>     > provided documentation.
>>>>     >
>>>>     > There also appear to be a few clhsdb commands that are written in
>>>>     > javascript. Doing a grep for "registerCommand" in sa.js shows the
>>>>     > following:
>>>>     >
>>>>     >   registerCommand("class", "class name", "jclass");
>>>>     >   registerCommand("classes", "classes", "jclasses");
>>>>     >   registerCommand("dumpclass", "dumpclass { address | name } [ directory ]", "dclass");
>>>>     >   registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap");
>>>>     >   registerCommand("mem", "mem address [ length ]", "printMem");
>>>>     >   registerCommand("sysprops", "sysprops", "sysProps");
>>>>     >   registerCommand("whatis", "whatis address", "printWhatis");
>>>>     >
>>>>     > Once again, don't go looking for these in anything newer than JDK8. You
>>>>     > won't find them. Again the only documentation I can find is [1].
>>>>     >
>>>>     > The other use of Javascript is the SOQL command (Simple Object Query
>>>>     > Language), a tool used to query the heap, and also the JSDB command.
>>>>     > The only SOQL documentation I could find is the blog reference [2]. I
>>>>     > could not find HSDB documentation, but I believe it is javascript
>>>>     > support for looking at hotspot. So once again, neither of these seem to
>>>>     > be officially supported or documented.
>>>>     >
>>>>     > The real purpose of the email is to propose removal of this support.
>>>>     > Here are the reasons:
>>>>     >
>>>>     > (1) It's broken, and has been since 9. See [3]. This is why you don't
>>>>     > see the javascript related commands in clhsdb. Javascript fails to
>>>>     > initialize, so none of the javascript related commands are registered.
>>>>     > (2) Nashorn is deprecated and will be removed eventually.
>>>>     > (3) We have very little understanding of the javascript support.
>>>>     > (4) No resources to work on it (unless there is a community volunteer).
>>>>     > (5) Very questionable value (lack of users). The fact this support has
>>>>     > been broken since JDK 9 and no bug was filed until I did so this week is
>>>>     > a good indication of that. Another is that there are no other SA
>>>>     > Javascript related bugs filed. Lastly, the lack of any official
>>>>     > documentation and only minimal mention of it on the web is another good
>>>>     > indication of its (lack of) value.
>>>>     >
>>>>     > Also, regarding the 7 commands listed above that would be lost (but
>>>>     > currently don't work now anyway), if they are really wanted, they could
>>>>     > be implemented in java instead of javascript.
>>>>     >
>>>>     > I'd like to remove javascript support in two steps. The first is simply
>>>>     > to disable the clhsdb code that tries to initialize the javascript support.
>>>>     > I'd like to do this in 14 (actually as soon as possible). I'd like to
>>>>     > actually do this now even if we decide to keep javascript support and
>>>>     > eventually fix it because it will get rid of the warning you see
>>>>     > whenever you attach from clhsdb:
>>>>     >
>>>>     >     Warning! JS Engine can't start, some commands will not be available.
>>>>     >
>>>>     > This warning will become more of an issue for the clhsdb tests after I
>>>>     > push [4] because then you will also see the full stacktrace for the
>>>>     > underlying exception that caused the Javascript to fail to start.
>>>>     > Besides being unnecessary noise in passing test cases, it can also be
>>>>     > misleading in any test that fails because the exception will be
>>>>     > unrelated to the failure. This is actually what got me going down this
>>>>     > path of what the javascript support is all about.
>>>>     >
>>>>     > The next step would be to strip out all Javascript related code,
>>>>     > including the SOQL and JSDB tools. This would be done in 15.
>>>>     >
>>>>     > Please let me know what you think.
>>>>     >
>>>>     > thanks,
>>>>     >
>>>>     > Chris
>>>>     >
>>>>     > [1]
>>>>     > https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
>>>>     > [2]
>>>>     > http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
>>>>     > [3] https://bugs.openjdk.java.net/browse/JDK-8235594
>>>>     > [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>>>>     >
>>>>
>>
>>

From sundararajan.athijegannathan at oracle.com  Wed Dec 11 12:49:21 2019
From: sundararajan.athijegannathan at oracle.com (sundararajan.athijegannathan at oracle.com)
Date: Wed, 11 Dec 2019 18:19:21 +0530
Subject: Removal of SA javascript support
In-Reply-To: <030fbae7-f75a-1714-a98c-7139e817967f@samersoff.net>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
 <030fbae7-f75a-1714-a98c-7139e817967f@samersoff.net>
Message-ID: <0ca8ae29-0645-e00b-5896-9248f71645fe@oracle.com>

Replacing one scripting language with another (jython) does not solve
anything. You'd still face the same issue: accessing module-private
stuff in the SA module from scripts. Besides, you'd have a new problem in
addition: how to bundle jython? We've been using the bundled scripting
engine (nashorn) so far.
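
To make the module-private point concrete, here is a minimal sketch of
asking the module system whether a package is exported to everyone. It is
not SA code: the class and method names are made up for illustration, and
it queries java.base only because that module is always present. As far
as I can tell, the same query against jdk.hotspot.agent would report its
sun.jvm.hotspot.* packages as not exported, which is the wall any script
engine hits.

```java
import java.util.Optional;

// Hypothetical helper, for illustration only.
final class ExportCheck {
    // True only if the module is in the boot layer and exports the
    // package unconditionally (i.e. to all modules).
    static boolean exportedToAll(String moduleName, String pkg) {
        Optional<Module> module = ModuleLayer.boot().findModule(moduleName);
        return module.isPresent() && module.get().isExported(pkg);
    }

    public static void main(String[] args) {
        // A public API package versus an internal one in the same module.
        System.out.println(exportedToAll("java.base", "java.lang"));
        System.out.println(exportedToAll("java.base", "jdk.internal.misc"));
    }
}
```

Switching the scripting language does not change the answer to this
query; only exporting packages or adding a supported API would.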

-Sundar

On 11/12/19 1:14 pm, Dmitry Samersoff wrote:
> Hello Chris,
>
> I'm supporting you with this decision.
>
> PS: For people who want SA scripting -
>
> One thing I experimented with a long time ago -
> has been exporting of some SA capabilities to jython.
> This might be the way to go.
>
> -Dmitry
>
>
>
> On 11.12.19 05:52, Chris Plummer wrote:
>> Hi,
>>
>> I'd like to propose the removal of SA javascript support. Few people even
>> realize this support exists, and hopefully even fewer are using it since
>> I'd like to remove it. Since I'm new to this myself, let me first
>> explain what I know about its existence, and then explain why I want to
>> remove it.
>>
>> If you run "jhsdb clhsdb", there are jsload and jseval commands. Don't
>> look for them in anything post JDK 8. I'll explain why later. jsload is
>> used to load a javascript file. In that file you can register new clhsdb
>> commands that are written in javascript. You can also evaluate
>> javascript using the jseval command. Some of this is explained in [1],
>> which is the only place I can find any reference to this support. It
>> does not appear to be officially supported, nor is there any oracle
>> provided documentation.
>>
>> There also appear to be a few clhsdb commands that are written in
>> javascript. Doing a grep for "registerCommand" in sa.js shows the
>> following:
>>
>>   registerCommand("class", "class name", "jclass");
>>   registerCommand("classes", "classes", "jclasses");
>>   registerCommand("dumpclass", "dumpclass { address | name } [ directory ]", "dclass");
>>   registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap");
>>   registerCommand("mem", "mem address [ length ]", "printMem");
>>   registerCommand("sysprops", "sysprops", "sysProps");
>>   registerCommand("whatis", "whatis address", "printWhatis");
>>
>> Once again, don't go looking for these in anything newer than JDK8. You
>> won't find them. Again the only documentation I can find is [1].
>>
>> The other use of Javascript is the SOQL command (Simple Object Query
>> Language), a tool used to query the heap, and also the JSDB command.
>> The only SOQL documentation I could find is the blog reference [2]. I
>> could not find HSDB documentation, but I believe it is javascript
>> support for looking at hotspot. So once again, neither of these seem to
>> be officially supported or documented.
>>
>> The real purpose of the email is to propose removal of this support.
>> Here are the reasons:
>>
>> (1) It's broken, and has been since 9. See [3]. This is why you don't
>> see the javascript related commands in clhsdb. Javascript fails to
>> initialize, so none of the javascript related commands are registered.
>> (2) Nashorn is deprecated and will be removed eventually.
>> (3) We have very little understanding of the javascript support.
>> (4) No resources to work on it (unless there is a community volunteer).
>> (5) Very questionable value (lack of users). The fact this support has
>> been broken since JDK 9 and no bug was filed until I did so this week is
>> a good indication of that. Another is that there are no other SA
>> Javascript related bugs filed. Lastly, the lack of any official
>> documentation and only minimal mention of it on the web is another good
>> indication of its (lack of) value.
>>
>> Also, regarding the 7 commands listed above that would be lost (but
>> currently don't work now anyway), if they are really wanted, they could
>> be implemented in java instead of javascript.
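
As a rough idea of what "implemented in java" could look like, the sketch
below keeps a small name-to-handler table in the spirit of the
registerCommand(name, usage, function) calls listed above. Everything in
it (CommandTable, register, run, usageOf) is hypothetical and is not the
CommandProcessor API; the sysprops handler is a stand-in that reads the
local JVM rather than a debuggee.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical command table, for illustration only.
final class CommandTable {
    private final Map<String, Function<String[], String>> handlers = new LinkedHashMap<>();
    private final Map<String, String> usages = new LinkedHashMap<>();

    // Registers a command under a name together with its usage string.
    void register(String name, String usage, Function<String[], String> handler) {
        handlers.put(name, handler);
        usages.put(name, usage);
    }

    // Returns the usage string, or null if the command is unknown.
    String usageOf(String name) {
        return usages.get(name);
    }

    // Splits the line on whitespace and dispatches on the first token.
    String run(String line) {
        String[] tokens = line.trim().split("\\s+");
        Function<String[], String> handler = handlers.get(tokens[0]);
        if (handler == null) {
            return "Unrecognized command: " + tokens[0];
        }
        return handler.apply(Arrays.copyOfRange(tokens, 1, tokens.length));
    }

    public static void main(String[] args) {
        CommandTable table = new CommandTable();
        table.register("sysprops", "sysprops",
                a -> "java.version=" + System.getProperty("java.version"));
        System.out.println(table.run("sysprops"));
    }
}
```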
>>
>> I'd like to remove javascript support in two steps. The first is simply
>> to disable the clhsdb code that tries to initialize the javascript support.
>> I'd like to do this in 14 (actually as soon as possible). I'd like to
>> actually do this now even if we decide to keep javascript support and
>> eventually fix it because it will get rid of the warning you see
>> whenever you attach from clhsdb:
>>
>>      Warning! JS Engine can't start, some commands will not be available.
>>
>> This warning will become more of an issue for the clhsdb tests after I
>> push [4] because then you will also see the full stacktrace for the
>> underlying exception that caused the Javascript to fail to start.
>> Besides being unnecessary noise in passing test cases, it can also be
>> misleading in any test that fails because the exception will be
>> unrelated to the failure. This is actually what got me going down this
>> path of what the javascript support is all about.
>>
>> The next step would be to strip out all Javascript related code,
>> including the SOQL and JSDB tools. This would be done in 15.
>>
>> Please let me know what you think.
>>
>> thanks,
>>
>> Chris
>>
>> [1]
>> https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
>>
>> [2]
>> http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
>>
>> [3] https://bugs.openjdk.java.net/browse/JDK-8235594
>> [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>>

From dms at samersoff.net  Wed Dec 11 15:03:10 2019
From: dms at samersoff.net (Dmitry Samersoff)
Date: Wed, 11 Dec 2019 18:03:10 +0300
Subject: Removal of SA javascript support
In-Reply-To: <0944c2b1-a0b4-2631-24de-3789e6aeec7b@oracle.com>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
 <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
 <CA+cQ+tRjiThKZJHnjDSDnM2ZvXyjjLpXBgc2_yQxDs5UWsBtsA@mail.gmail.com>
 <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com>
 <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com>
 <a62d8f4c-a1b0-6b15-646f-3ba2c77db5a4@oss.nttdata.com>
 <0944c2b1-a0b4-2631-24de-3789e6aeec7b@oracle.com>
Message-ID: <6faf0cb5-7b4a-5e35-ed7c-90b817235031@samersoff.net>

Sundar,

Supporting hotspot data structures in SA is already a maintenance
nightmare ;)

So we could consider providing a high-level API, like find_class_by_name,
to script writers.

  That would allow anybody interested in quick prototyping to write
their own program on top of SA in any language they want.
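
A minimal sketch of what I mean by a narrow facade, assuming a single
lookup call. The names (SaScriptApi, InMemorySaApi) are hypothetical and
the backing map merely stands in for a real walk over the debuggee's
loaded classes; nothing here is existing SA code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical facade: scripts see this interface, never SA internals.
interface SaScriptApi {
    // Returns a printable address for the named class, if it is known.
    Optional<String> findClassByName(String name);
}

// Stand-in implementation backed by a map instead of a real debuggee.
final class InMemorySaApi implements SaScriptApi {
    private final Map<String, String> addressByName = new LinkedHashMap<>();

    void add(String name, String address) {
        addressByName.put(name, address);
    }

    @Override
    public Optional<String> findClassByName(String name) {
        return Optional.ofNullable(addressByName.get(name));
    }

    public static void main(String[] args) {
        InMemorySaApi api = new InMemorySaApi();
        api.add("java/lang/String", "0x0000000800001000"); // made-up address
        System.out.println(api.findClassByName("java/lang/String").orElse("not found"));
    }
}
```

Only strings cross the boundary, so hotspot data structure changes would
stay behind the facade.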

-Dmitry

On 11.12.19 15:47, sundararajan.athijegannathan at oracle.com wrote:
> Effectively you're asking for SA as an API. I don't think that is a good
> idea. That implies supporting hotspot data structures as a Java *API*.
> That would be a maintainability nightmare - we already have to keep
> tracking hotspot data structures in SA code. That itself is problematic.
> An API would be a next-level nightmare.
> 
> -Sundar
> 
> On 11/12/19 11:57 am, Yasumasa Suenaga wrote:
>> Hi,
>>
>> IMHO we need to export all packages in SA if we do not provide a new API
>> for SA.
>> sa.js in jdk.hotspot.agent could access all SA classes until JDK 8
>> (before Jigsaw), so we could write various functions as needed.
>>
>> OTOH we cannot know which classes SA users need. All packages in the
>> jdk.hotspot.agent module provide features, and they require other
>> packages. For example, sun.jvm.hotspot.oops.Oop requires
>> sun.jvm.hotspot.types, which in turn requires sun.jvm.hotspot.debugger.
>> It is difficult to track these dependencies and to export a minimal set.
>> (I worked on it in JDK-8157947, but I gave up...)
>>
>> Thus I guess it is a big challenge to export SA classes without
>> refactoring.
>> If we provide a new API for SA plugins, I guess we will need to do some
>> refactoring.
>>
>>
>> Yasumasa
>>
>>
>> On 2019/12/11 15:00, Chris Plummer wrote:
>>> On 12/10/19 9:56 PM, Yasumasa Suenaga wrote:
>>>> On 2019/12/11 14:39, Krystal Mok wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> That's a very nice idea. Basically what you're asking for is
>>>>> exposing the Command interface [1] so that plugins can implement it
>>>>> and get dynamically loaded / registered into CLHSDB / HSDB, right?
>>>>
>>>> Yes, but we also need proxy API to access internal SA objects e.g.
>>>> CodeCache, JavaThread, TypeDataBase, etc...
>>>>
>>> Yes, or export them. I should have read this email before posting my
>>> previous one.
>>>
>>> Chris
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>>> [1]:
>>>>> http://hg.openjdk.java.net/jdk/jdk/file/c71ec1f09f21/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/CommandProcessor.java#l246
>>>>>
>>>>>
>>>>> - Kris
>>>>>
>>>>> On Tue, Dec 10, 2019 at 9:33 PM Yasumasa Suenaga
>>>>> <suenaga at oss.nttdata.com <mailto:suenaga at oss.nttdata.com>> wrote:
>>>>>
>>>>>     Hi Chris,
>>>>>
>>>>>     It's a sad proposal, but I agree with you. Maintaining the JS support
>>>>>     in SA has been difficult since Jigsaw.
>>>>>     However, I would like SA to have a plugin feature.
>>>>>     I use a custom script to list compiled code in the CodeCache.
>>>>>
>>>>>     I guess other troubleshooters will also want a similar feature (via
>>>>>     jsload) in the future if they encounter a JVM crash.
>>>>>
>>>>>
>>>>>     Thanks,
>>>>>
>>>>>     Yasumasa
>>>>>
>>>>>
>>>>>     On 2019/12/11 11:52, Chris Plummer wrote:
>>>>>     > Hi,
>>>>>     >
>>>>>     > I'd like to propose the removal of SA javascript support. Few people even
>>>>>     > realize this support exists, and hopefully even fewer are using it since
>>>>>     > I'd like to remove it. Since I'm new to this myself, let me first
>>>>>     > explain what I know about its existence, and then explain why I want to
>>>>>     > remove it.
>>>>>     >
>>>>>     > If you run "jhsdb clhsdb", there are jsload and jseval commands. Don't
>>>>>     > look for them in anything post JDK 8. I'll explain why later. jsload is
>>>>>     > used to load a javascript file. In that file you can register new clhsdb
>>>>>     > commands that are written in javascript. You can also evaluate
>>>>>     > javascript using the jseval command. Some of this is explained in [1],
>>>>>     > which is the only place I can find any reference to this support. It
>>>>>     > does not appear to be officially supported, nor is there any oracle
>>>>>     > provided documentation.
>>>>>     >
>>>>>     > There also appear to be a few clhsdb commands that are written in
>>>>>     > javascript. Doing a grep for "registerCommand" in sa.js shows the
>>>>>     > following:
>>>>>     >
>>>>>     >   registerCommand("class", "class name", "jclass");
>>>>>     >   registerCommand("classes", "classes", "jclasses");
>>>>>     >   registerCommand("dumpclass", "dumpclass { address | name } [ directory ]", "dclass");
>>>>>     >   registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap");
>>>>>     >   registerCommand("mem", "mem address [ length ]", "printMem");
>>>>>     >   registerCommand("sysprops", "sysprops", "sysProps");
>>>>>     >   registerCommand("whatis", "whatis address", "printWhatis");
>>>>>     >
>>>>>     > Once again, don't go looking for these in anything newer than JDK8. You
>>>>>     > won't find them. Again the only documentation I can find is [1].
>>>>>     >
>>>>>     > The other use of Javascript is the SOQL command (Simple Object Query
>>>>>     > Language), a tool used to query the heap, and also the JSDB command.
>>>>>     > The only SOQL documentation I could find is the blog reference [2]. I
>>>>>     > could not find HSDB documentation, but I believe it is javascript
>>>>>     > support for looking at hotspot. So once again, neither of these seem to
>>>>>     > be officially supported or documented.
>>>>>     >
>>>>>     > The real purpose of the email is to propose removal of this support.
>>>>>     > Here are the reasons:
>>>>>     >
>>>>>     > (1) It's broken, and has been since 9. See [3]. This is why you don't
>>>>>     > see the javascript related commands in clhsdb. Javascript fails to
>>>>>     > initialize, so none of the javascript related commands are registered.
>>>>>     > (2) Nashorn is deprecated and will be removed eventually.
>>>>>     > (3) We have very little understanding of the javascript support.
>>>>>     > (4) No resources to work on it (unless there is a community volunteer).
>>>>>     > (5) Very questionable value (lack of users). The fact this support has
>>>>>     > been broken since JDK 9 and no bug was filed until I did so this week is
>>>>>     > a good indication of that. Another is that there are no other SA
>>>>>     > Javascript related bugs filed. Lastly, the lack of any official
>>>>>     > documentation and only minimal mention of it on the web is another good
>>>>>     > indication of its (lack of) value.
>>>>>     >
>>>>>     > Also, regarding the 7 commands listed above that would be lost (but
>>>>>     > currently don't work now anyway), if they are really wanted, they could
>>>>>     > be implemented in java instead of javascript.
>>>>>     >
>>>>>     > I'd like to remove javascript support in two steps. The first is simply
>>>>>     > to disable the clhsdb code that tries to initialize the javascript support.
>>>>>     > I'd like to do this in 14 (actually as soon as possible). I'd like to
>>>>>     > actually do this now even if we decide to keep javascript support and
>>>>>     > eventually fix it because it will get rid of the warning you see
>>>>>     > whenever you attach from clhsdb:
>>>>>     >
>>>>>     >     Warning! JS Engine can't start, some commands will not be available.
>>>>>     >
>>>>>     > This warning will become more of an issue for the clhsdb tests after I
>>>>>     > push [4] because then you will also see the full stacktrace for the
>>>>>     > underlying exception that caused the Javascript to fail to start.
>>>>>     > Besides being unnecessary noise in passing test cases, it can also be
>>>>>     > misleading in any test that fails because the exception will be
>>>>>     > unrelated to the failure. This is actually what got me going down this
>>>>>     > path of what the javascript support is all about.
>>>>>     >
>>>>>     > The next step would be to strip out all Javascript related code,
>>>>>     > including the SOQL and JSDB tools. This would be done in 15.
>>>>>     >
>>>>>     > Please let me know what you think.
>>>>>     >
>>>>>     > thanks,
>>>>>     >
>>>>>     > Chris
>>>>>     >
>>>>>     > [1]
>>>>>     > https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
>>>>>     > [2]
>>>>>     > http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
>>>>>     > [3] https://bugs.openjdk.java.net/browse/JDK-8235594
>>>>>     > [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>>>>>     >
>>>>>
>>>
>>>

From richard.reingruber at sap.com  Wed Dec 11 15:07:29 2019
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Wed, 11 Dec 2019 15:07:29 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
Message-ID: <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>

Hi David,

  > Most of the details here are in areas I can comment on in detail, but I 
  > did take an initial general look at things.

Thanks for taking the time!

  > The only thing that jumped out at me is that I think the 
  > DeoptimizeObjectsALotThread should be a hidden thread.
  > 
  > +  bool is_hidden_from_external_view() const { return true; }

Yes, it should. Will add the method like above.

  > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without 
  > active testing this will just bit-rot.

DeoptimizeObjectsALot is meant for stress testing with a larger workload. I will add a minimal test
to keep it fresh.

  > Also on the tests I don't understand your @requires clause:
  > 
  >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & 
  > (vm.opt.TieredCompilation != true))
  > 
  > This seems to require that TieredCompilation is disabled, but tiered is 
  > our normal mode of operation. ??
  > 

I removed the clause. I guess I wanted to target the tests towards the code they are supposed to
test, and it's easier to analyze failures w/o tiered compilation and with just one compiler thread.

Additionally I will make use of compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.

Thanks,
Richard.

-----Original Message-----
From: David Holmes <david.holmes at oracle.com> 
Sent: Mittwoch, 11. Dezember 2019 08:03
To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard,

On 11/12/2019 7:45 am, Reingruber, Richard wrote:
> Hi,
> 
> I would like to get reviews please for
> 
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
> 
> Corresponding RFE:
> https://bugs.openjdk.java.net/browse/JDK-8227745
> 
> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
> 
> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues (thanks!). In addition the
> change is being tested at SAP since I posted the first RFR some months ago.
> 
> The intention of this enhancement is to benefit performance wise from escape analysis even if JVMTI
> agents request capabilities that allow them to access local variable values. E.g. if you start-up
> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape analysis is disabled right
> from the beginning, well before a debugger attaches -- if ever one should do so. With the
> enhancement, escape analysis will remain enabled until and after a debugger attaches. EA based
> optimizations are reverted just before an agent acquires the reference to an object. In the JBS item
> you'll find more details.

Most of the details here are in areas I can comment on in detail, but I 
did take an initial general look at things.

The only thing that jumped out at me is that I think the 
DeoptimizeObjectsALotThread should be a hidden thread.

+  bool is_hidden_from_external_view() const { return true; }

Also I don't see any testing of the DeoptimizeObjectsALotThread. Without 
active testing this will just bit-rot.

Also on the tests I don't understand your @requires clause:

  @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & 
(vm.opt.TieredCompilation != true))

This seems to require that TieredCompilation is disabled, but tiered is 
our normal mode of operation. ??

Thanks,
David

> Thanks,
> Richard.
> 
> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>      http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
> 

From dms at samersoff.net  Wed Dec 11 15:34:02 2019
From: dms at samersoff.net (Dmitry Samersoff)
Date: Wed, 11 Dec 2019 18:34:02 +0300
Subject: PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <de7b6cb5-88c3-dc0c-48d3-a3ecf66eff9f@oss.nttdata.com>
References: <eb284894-0801-d04d-c5c2-68ea3faaef98@oss.nttdata.com>
 <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <de7b6cb5-88c3-dc0c-48d3-a3ecf66eff9f@oss.nttdata.com>
Message-ID: <2c066d67-aa3c-83e2-632a-1ba3114d1538@samersoff.net>

Hello Yasumasa,

Please,

1. Consider using mmap for reading ELF sections.

2. Please move all platform-specific parts of the native code to a separate
file/directory. The current patch will break the AARCH64 build.

3. I didn't find any tests here. How did you test the changes?
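
On point 1, a small illustration of the map-instead-of-read idea. The
patch itself is native C, where mmap(2) would be the call; the sketch
below only shows the same technique through Java's FileChannel.map, and
ElfPeek and hasElfMagic are names invented for this example. It maps the
first four bytes of a file and compares them with the ELF magic.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example class; not part of the webrev.
final class ElfPeek {
    // Maps the file read-only and checks the 4-byte ELF magic (0x7f 'E' 'L' 'F').
    static boolean hasElfMagic(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            if (channel.size() < 4) {
                return false; // too short to hold the magic
            }
            MappedByteBuffer header = channel.map(FileChannel.MapMode.READ_ONLY, 0, 4);
            return header.get(0) == 0x7f
                && header.get(1) == 'E'
                && header.get(2) == 'L'
                && header.get(3) == 'F';
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 1) {
            System.out.println(hasElfMagic(Path.of(args[0])));
        }
    }
}
```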


libproc_impl.c

131: The if is not necessary; free() handles a NULL pointer gracefully.


-Dmitry


On 04.12.19 03:54, Yasumasa Suenaga wrote:
> PING: Could you review it?
> 
>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
> 
> This bug is targeted to JDK 14.
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
> On 2019/11/28 21:39, Yasumasa Suenaga wrote:
>> Hi,
>>
>> I refactored LinuxAMD64CFrame.java . It works fine in
>> serviceability/sa tests and
>> all tests on submit repo
>> (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>> Could you review new webrev?
>>
>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>
>> The diff from previous webrev is here:
>>    http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>> Hi all,
>>>
>>> Please review this change:
>>>
>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>
>>>
>>> According to 2.7 Stack Unwind Algorithm in System V Application
>>> Binary Interface AMD64
>>> Architecture Processor Supplement [1], we need to use DWARF in
>>> .eh_frame or .debug_frame
>>> for stack unwinding.
>>>
>>> As JDK-8022183 said, omit-frame-pointer is enabled by default since
>>> GCC 4.6, so system
>>> library (e.g. libc) might be compiled with this feature.
>>>
>>> However, `jhsdb jstack --mixed` does not do so; it uses the base pointer
>>> register (RBP).
>>> So some stack frames might be missing.
>>>
>>> I guess JDK-8219201 is caused by the same issue.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> [1]
>>> https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>

From daniil.x.titov at oracle.com  Wed Dec 11 17:12:40 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Wed, 11 Dec 2019 09:12:40 -0800 (PST)
Subject: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <072d6861-1374-8190-135d-e30ece2ee380@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
 <072d6861-1374-8190-135d-e30ece2ee380@oracle.com>
Message-ID: <2A286F91-6CB8-4A32-B103-2C28D52E5A1A@oracle.com>

Hi Serguei,

Thank you for your comments. I will correct these nits before pushing the changes.

Hi Bob and David,

> [Mandy Chung]
>> I reviewed Metrics and Subsystem in this version.
>> I don't need to see a new webrev.

As I understand it, Mandy has finished reviewing this fix. I just wanted to confirm with you whether you are okay with that version of the fix (webrev.06).

Mach5 testing: tier1-tier6 and open/test/hotspot/jtreg/containers/docker tests passed. 

Thank you,
Daniil


On 12/9/19, 6:02 PM, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com> wrote:

    Hi Daniil,
    
    It is not a full review, just some minor comments.
    In fact, I do not see real problems yet.
    
    http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html
    
       55     public long getTotalSwapSpaceSize() {
       56         if (containerMetrics != null) {
       57             long limit = containerMetrics.getMemoryAndSwapLimit();
       58             // The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container)
       59             // or if a docker container was started without specifying a memory limit ( without '--memory='
       60             // Docker option). In latter case there is no limit on how much memory the container can use and
       61             // it can use as much memory as the host's OS allows.
       62             long memLimit = containerMetrics.getMemoryLimit();
       63             if (limit >= 0 && memLimit >= 0) {
       64                 return limit - memLimit;
       65             }
       66         }
       67         return getTotalSwapSpaceSize0();
       68     }
    
       Unneeded space after brackets '('.
       Do we need to check if the (limit - memLimit) value is negative?
       The same question is for getFreeSwapSpaceSize():
         memSwapLimit - memLimit - (memSwapUsage - memUsage)
    
       and getFreeMemorySize():
         101 return limit - usage;
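
On the negative-value question, one possible guard is sketched below. It
is not the webrev code: SwapMath and totalSwap are made-up names, the
limits are passed in as plain longs, and falling back to the host value
when the difference is negative is an assumption made for the example,
not a decision taken in this review.

```java
// Hypothetical helper, for illustration only.
final class SwapMath {
    // Returns memAndSwapLimit - memLimit when both limits are known
    // (non-negative) and the difference is non-negative; otherwise
    // returns the host value supplied by the caller.
    static long totalSwap(long memAndSwapLimit, long memLimit, long hostSwap) {
        if (memAndSwapLimit >= 0 && memLimit >= 0) {
            long swap = memAndSwapLimit - memLimit;
            if (swap >= 0) {
                return swap;
            }
        }
        return hostSwap;
    }

    public static void main(String[] args) {
        System.out.println(totalSwap(300L, 100L, 999L));
        System.out.println(totalSwap(100L, 300L, 999L));
    }
}
```

The same shape would apply to the free swap and free memory
computations mentioned above.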
    
       81                         // If this happens just retry the loop for a few iterations
    
       A period is missing at the end of the comment.
    
    
    http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html
    
       34 System.out.println(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors()));
       35 System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors()));
       36 System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize()));
       37 System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize()));
       38 System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize()));
       39 System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize()));
       40 System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize()));
       41 System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize()));
       42 System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad()));
       43 System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad()));
    
    
       To make the above lines a little bit shorter I'd suggest defining a log() method like this:
          private static void log(String msg) { System.out.println(msg); }
    
       34         log(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors()));
       35         log(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors()));
       36         log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize()));
       37         log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize()));
       38         log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize()));
       39         log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize()));
       40         log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize()));
       41         log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize()));
       42         log(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad()));
       43         log(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad()));
    
    
    Thanks,
    Serguei
    
    
    On 12/6/19 17:41, Daniil Titov wrote:
    > Hi David, Mandy, and Bob,
    >
    > Thank you for reviewing this fix.
    >
    > Please review a new version of the fix [1] that includes the following changes compared to the previous version of the webrev (webrev.04):
    > 1. The Javadoc changes made in webrev.04 relative to webrev.03 and to the CSR [3] were discarded.
    > 2.  The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
    >       was also reverted to the webrev.03 version, which returns the host's values if the metrics are unavailable or cannot be read properly.
    >       I would like to mention that currently the native implementation of these methods may de facto return -1 in some circumstances,
    >       but I agree that the changes proposed in the previous version of the webrev increase that probability.
    >       I filed the follow-up issue [4] as Mandy suggested.
    > 3.  The legacy methods were renamed as David suggested.
    >
    >
    >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
    >> !     static int initialized=1;
    >>
    >>   Am I reading this right that the code currently fails to actually do the
    >> initialization because of this ???
    > Yes, currently the code fails to do the initialization, but this went unnoticed because
    > get_cpuload_internal(...) was never called for a specific CPU; the first parameter, "which",
    > was always -1.
    >
    >>   test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
    >>
    >> System.out.println(String.format(...)
    >>
    >> Why not simply
    >>
    >> System.out.printf(..)
    > As I tried to explain earlier, it would make the tests unstable.
    > System.out.printf(...) just delegates the call to System.out.format(...), which doesn't emit the string atomically.
    > Instead it parses the format string into a list of FormatString objects and then iterates over the list.
    > As a result, other traces are occasionally printed between these iterations and break the pattern the test expects to find
    > in the output.
    >
    > For example, here is a sample of such output, with a trace message printed between "OperatingSystemMXBean.getFreePhysicalMemorySize:"
    > and "1030762496".
    >
    > <skipped>
    > [0.304s][trace][os,container] Memory Usage is: 42983424
    > OperatingSystemMXBean.getFreeMemorySize: 1030758400
    > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
    > [0.305s][trace][os,container] Memory Usage is: 42979328
    > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
    > OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
    > 1030762496
    > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
    >
    > <skipped>
    > java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr
    >
    > 	at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
    > 	at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
    > 	at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
    > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    > 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    > 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
    > 	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
    > 	at java.base/java.lang.Thread.run(Thread.java:832)
    >
    > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
    >
    > [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
    > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
    > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
    > [4] https://bugs.openjdk.java.net/browse/JDK-8235522
    >
    > Thank you,
    > Daniil
    >
    > On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
    >
    >      
    >      
    >      On 12/6/19 5:59 AM, Bob Vandette wrote:
    >      >> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes at oracle.com>  wrote:
    >      >>
    >      >>
    >      >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
    >      >>
    >      >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
    >      
    >      I thought that the error case we are referring to is limit == 0 which
    >      indicates something unexpected goes wrong.  So the compatibility concern
    >      should be low.  This is very specific to Metrics implementation for
    >      cgroup v1 and let me know if I'm wrong.
    >      
    >      >> Surely there must always be some information available from the operating environment? I see from the impl file:
    >      >>
    >      >>     // the host data, value 0 indicates that something went wrong while the metric was read and
    >      >>    // in this case we return "information unavailable" code -1.
    >      >>
    >      >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
    >      > I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
    >      > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
    >      > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
    >      > limits.
    >      >
    >      
    >      It's important to consider carefully if the monitoring API indicates an
    >      error vs unavailable and an application should continue to run when the
    >      monitoring system fails to get the metrics.
    >      
    >      There are several choices to report "something goes wrong" scenarios
    >      (should unlikely happen???):
    >      1. fall back to a random positive value  (e.g. host value)
    >      2. return a negative value
    >      3. throw an exception
    >      
    >      #3 is not an option as the application is not expecting this.  For #2,
    >      the application can filter bad values if desirable.
    >      
    >      I'm okay if you want to file a JBS issue to follow up and thoroughly
    >      look at the cases that the metrics are unavailable and the cases when
    >      fails to obtain.
    >      
    >      >> ---
    >      >>
    >      >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
    >      >>
    >      >> System.out.println(String.format(...)
    >      >>
    >      >> Why not simply
    >      >>
    >      >> System.out.printf(..)
    >      >>
    >      >> ?
    >      
    >      or simply (as I commented [1])
    >           System.out.format
    >      
    >      Mandy
    >      [1]
    >      https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
    >      
    >      
    >
    >
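
Editor's sketch of the pattern discussed in this message. It is only an
illustration, not code from the webrev: the class name LogSketch and both
helper methods are hypothetical. The idea is to build the complete line with
String.format(...) first and then hand the finished string to one println(...)
call, so that the label and the value are passed to the stream together rather
than piece by piece as with printf(...). This narrows the window in which other
output can land between the two parts; it is not a guarantee against
interleaving with writers that bypass System.out.

```java
import java.io.PrintStream;

public class LogSketch {
    // Hypothetical helper: build the whole line before any output happens.
    static String line(String label, long value) {
        return String.format("%s: %d", label, value);
    }

    // Emit an already formatted line with a single println call.
    static void log(PrintStream out, String msg) {
        out.println(msg);
    }

    public static void main(String[] args) {
        log(System.out, line("Runtime.availableProcessors",
                Runtime.getRuntime().availableProcessors()));
    }
}
```

A test that matches a pattern such as 'Label: [1-9][0-9]+' against the output
then sees the label and the number on the same line whenever the line is
produced this way.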
    
    
From bob.vandette at oracle.com  Wed Dec 11 17:21:11 2019
From: bob.vandette at oracle.com (Bob Vandette)
Date: Wed, 11 Dec 2019 12:21:11 -0500
Subject: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <2A286F91-6CB8-4A32-B103-2C28D52E5A1A@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
 <072d6861-1374-8190-135d-e30ece2ee380@oracle.com>
 <2A286F91-6CB8-4A32-B103-2C28D52E5A1A@oracle.com>
Message-ID: <EB00EA71-C340-418F-9D9C-9A3EA77EEDBA@oracle.com>

Yes, I defer to Mandy on the best way to express the various Java exceptions.
I'm ok with the changes.

Thanks for getting this done for JDK14!

Bob.

> On Dec 11, 2019, at 12:12 PM, Daniil Titov <daniil.x.titov at oracle.com> wrote:
> 
> Hi Serguei,
> 
> Thank you for your comments. I will correct these nits before pushing the changes.
> 
> Hi Bob and David,
> 
>> [Mandy Chung]
>>> I reviewed Metrics and Subsystem in this version.
>>> I don't need to see a new webrev.
> 
> As I understand it, Mandy has finished reviewing this fix. I just wanted to confirm that you are okay with that version of the fix (webrev.06).
> 
> Mach5 testing: tier1-tier6 and open/test/hotspot/jtreg/containers/docker tests passed. 
> 
> Thank you,
> Daniil
> 
> 
> 
> On 12/9/19, 6:02 PM, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com> wrote:
> 
>    Hi Daniil,
> 
>    It is not a full review, just some minor comments.
>    In fact, I do not see real problems yet.
> 
>    http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html
> 
>       55     public long getTotalSwapSpaceSize() {
>       56         if (containerMetrics != null) {
>       57             long limit = containerMetrics.getMemoryAndSwapLimit();
>       58             // The memory limit metrics is not available if JVM 
>    runs on Linux host ( not in a docker container)
>       59             // or if a docker container was started without 
>    specifying a memory limit ( without '--memory='
>       60             // Docker option). In latter case there is no limit on 
>    how much memory the container can use and
>       61             // it can use as much memory as the host's OS allows.
>       62             long memLimit = containerMetrics.getMemoryLimit();
>       63             if (limit >= 0 && memLimit >= 0) {
>       64                 return limit - memLimit;
>       65             }
>       66         }
>       67         return getTotalSwapSpaceSize0();
>       68     }
> 
>       Unneeded space after brackets '('.
>       Do we need to check if the (limit - memLimit) value is negative?
>       The same question is for getFreeSwapSpaceSize():
>         memSwapLimit - memLimit - (memSwapUsage - memUsage)
> 
>       and getFreeMemorySize():
>         101 return limit - usage;
> 
>       81                         // If this happens just retry the loop for 
>    a few iterations
> 
>       The period is missing at the end of the comment.
> 
> 
>    http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html
> 
>       34 System.out.println(String.format("Runtime.availableProcessors: 
>    %d", Runtime.getRuntime().availableProcessors()));
>       35 
>    System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: 
>    %d", osBean.getAvailableProcessors()));
>       36 
>    System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: 
>    %d", osBean.getTotalMemorySize()));
>       37 
>    System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: 
>    %d", osBean.getTotalPhysicalMemorySize()));
>       38 
>    System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: 
>    %d", osBean.getFreeMemorySize()));
>       39 
>    System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: 
>    %d", osBean.getFreePhysicalMemorySize()));
>       40 
>    System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: 
>    %d", osBean.getTotalSwapSpaceSize()));
>       41 
>    System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: 
>    %d", osBean.getFreeSwapSpaceSize()));
>       42 
>    System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f", 
>    osBean.getCpuLoad()));
>       43 
>    System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad: 
>    %f", osBean.getSystemCpuLoad()));
> 
> 
>       To make the above lines a little shorter, I'd suggest defining a 
>    log() method like this:
>          private static void log(String msg) { System.out.println(msg); }
> 
>       34         log(String.format("Runtime.availableProcessors: %d", 
>    Runtime.getRuntime().availableProcessors()));
>       35 log(String.format("OperatingSystemMXBean.getAvailableProcessors: 
>    %d", osBean.getAvailableProcessors()));
>       36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", 
>    osBean.getTotalMemorySize()));
>       37 
>    log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: 
>    %d", osBean.getTotalPhysicalMemorySize()));
>       38 log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", 
>    osBean.getFreeMemorySize()));
>       39 
>    log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", 
>    osBean.getFreePhysicalMemorySize()));
>       40 log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: 
>    %d", osBean.getTotalSwapSpaceSize()));
>       41 log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: 
>    %d", osBean.getFreeSwapSpaceSize()));
>       42         log(String.format("OperatingSystemMXBean.getCpuLoad: %f", 
>    osBean.getCpuLoad()));
>       43 log(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", 
>    osBean.getSystemCpuLoad()));
> 
> 
>    Thanks,
>    Serguei
> 
> 
> 
>    On 12/6/19 17:41, Daniil Titov wrote:
>> Hi David, Mandy, and Bob,
>> 
>> Thank you for reviewing this fix.
>> 
>> Please review a new version of the fix [1] that includes the following changes compared to the previous version of the webrev (webrev.04):
>> 1. The Javadoc changes made in webrev.04 relative to webrev.03 and to the CSR [3] were discarded.
>> 2.  The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
>>      was also reverted to the webrev.03 version, which returns the host's values if the metrics are unavailable or cannot be read properly.
>>      I would like to mention that currently the native implementation of these methods may de facto return -1 in some circumstances,
>>      but I agree that the changes proposed in the previous version of the webrev increase that probability.
>>      I filed the follow-up issue [4] as Mandy suggested.
>> 3.  The legacy methods were renamed as David suggested.
>> 
>> 
>>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
>>> !     static int initialized=1;
>>> 
>>>  Am I reading this right that the code currently fails to actually do the
>>> initialization because of this ???
>> Yes, currently the code fails to do the initialization, but this went unnoticed because
>> get_cpuload_internal(...) was never called for a specific CPU; the first parameter, "which",
>> was always -1.
>> 
>>>  test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>>> 
>>> System.out.println(String.format(...)
>>> 
>>> Why not simply
>>> 
>>> System.out.printf(..)
>> As I tried to explain earlier, it would make the tests unstable.
>> System.out.printf(...) just delegates the call to System.out.format(...), which doesn't emit the string atomically.
>> Instead it parses the format string into a list of FormatString objects and then iterates over the list.
>> As a result, other traces are occasionally printed between these iterations and break the pattern the test expects to find
>> in the output.
>> 
>> For example, here is a sample of such output, with a trace message printed between "OperatingSystemMXBean.getFreePhysicalMemorySize:"
>> and "1030762496".
>> 
>> <skipped>
>> [0.304s][trace][os,container] Memory Usage is: 42983424
>> OperatingSystemMXBean.getFreeMemorySize: 1030758400
>> [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
>> [0.305s][trace][os,container] Memory Usage is: 42979328
>> [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
>> OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
>> 1030762496
>> OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
>> 
>> <skipped>
>> java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr
>> 
>> 	at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
>> 	at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
>> 	at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
>> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>> 	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
>> 	at java.base/java.lang.Thread.run(Thread.java:832)
>> 
>> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
>> 
>> [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
>> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
>> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
>> [4] https://bugs.openjdk.java.net/browse/JDK-8235522
>> 
>> Thank you,
>> Daniil
>> 
>> On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
>> 
>> 
>> 
>>     On 12/6/19 5:59 AM, Bob Vandette wrote:
>>>> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes at oracle.com>  wrote:
>>>> 
>>>> 
>>>> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
>>>> 
>>>> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
>> 
>>     I thought that the error case we are referring to is limit == 0 which
>>     indicates something unexpected goes wrong.  So the compatibility concern
>>     should be low.  This is very specific to Metrics implementation for
>>     cgroup v1 and let me know if I'm wrong.
>> 
>>>> Surely there must always be some information available from the operating environment? I see from the impl file:
>>>> 
>>>>    // the host data, value 0 indicates that something went wrong while the metric was read and
>>>>   // in this case we return "information unavailable" code -1.
>>>> 
>>>> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
>>> I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
>>> Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
>>> are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
>>> limits.
>>> 
>> 
>>     It's important to consider carefully if the monitoring API indicates an
>>     error vs unavailable and an application should continue to run when the
>>     monitoring system fails to get the metrics.
>> 
>>     There are several choices to report "something goes wrong" scenarios
>>     (should unlikely happen???):
>>     1. fall back to a random positive value  (e.g. host value)
>>     2. return a negative value
>>     3. throw an exception
>> 
>>     #3 is not an option as the application is not expecting this.  For #2,
>>     the application can filter bad values if desirable.
>> 
>>     I'm okay if you want to file a JBS issue to follow up and thoroughly
>>     look at the cases that the metrics are unavailable and the cases when
>>     fails to obtain.
>> 
>>>> ---
>>>> 
>>>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>>>> 
>>>> System.out.println(String.format(...)
>>>> 
>>>> Why not simply
>>>> 
>>>> System.out.printf(..)
>>>> 
>>>> ?
>> 
>>     or simply (as I commented [1])
>>          System.out.format
>> 
>>     Mandy
>>     [1]
>>     https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
>> 
>> 
>> 
>> 
> 
> 
> 
> 
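
Editor's sketch relating to Serguei's question quoted above about whether
(limit - memLimit) can be negative. It is an illustration under stated
assumptions, not the code in webrev.05 or webrev.06: the class SwapSketch, the
method totalSwap and the hostValue parameter are hypothetical, and negative
inputs are assumed to mean "metric unavailable or unlimited", following the
">= 0" checks in the quoted getTotalSwapSpaceSize(). The sketch falls back to
the host value whenever the container figures cannot produce a non-negative
result.

```java
public class SwapSketch {
    // Returns the container swap size, or hostValue when the container
    // figures are unavailable or inconsistent (difference below zero).
    static long totalSwap(long memAndSwapLimit, long memLimit, long hostValue) {
        if (memAndSwapLimit >= 0 && memLimit >= 0) {
            long swap = memAndSwapLimit - memLimit;
            if (swap >= 0) {
                return swap;
            }
        }
        return hostValue;
    }

    public static void main(String[] args) {
        // Consistent limits: swap is the difference of the two limits.
        System.out.println(totalSwap(1000L, 600L, 5000L));
        // Memory limit larger than memory-and-swap limit: use the host value.
        System.out.println(totalSwap(500L, 600L, 5000L));
    }
}
```

The same guard could be applied to the free-swap and free-memory differences
mentioned in the review comment; whether a fallback or some other result is
appropriate there is for the reviewers to decide.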


From daniil.x.titov at oracle.com  Thu Dec  5 01:43:39 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Wed, 04 Dec 2019 17:43:39 -0800
Subject: RFR(S): 8234277:ClhsdbLauncher should enable verbose exceptions
 and do a better job of detecting SA failures
In-Reply-To: <0871624E-D09F-4F07-B49C-B4043EDCBEF8@oracle.com>
References: <c54eaee2-1db3-f23c-7cea-5ef623bc0f71@oracle.com>
 <8a972120-8ba3-a35e-b73f-e3d5faf68ce6@oracle.com>
 <0871624E-D09F-4F07-B49C-B4043EDCBEF8@oracle.com>
Message-ID: <AAB432E1-859E-4C3E-B6BE-746D37D1C8BE@oracle.com>

Hi Chris,

The change looks good to me.

Best regards,
Daniil

On 12/4/19, 5:39 PM, "serviceability-dev on behalf of Chris Plummer" <serviceability-dev-bounces at openjdk.java.net on behalf of chris.plummer at oracle.com> wrote:

    Can I get one more review please?
    
    thanks,
    
    Chris
    
    On 12/3/19 1:10 PM, serguei.spitsyn at oracle.com wrote:
    > Hi Chris,
    >
    > It looks good.
    >
    > Thanks,
    > Serguei
    >
    > On 12/3/19 12:45 PM, Chris Plummer wrote:
    >> Hello,
    >>
    >> Please review the following:
    >>
    >> https://bugs.openjdk.java.net/browse/JDK-8234277
    >> http://cr.openjdk.java.net/~cjplummer/8234277/webrev.00/
    >>
    >> No longer redirect stderr for the jhsdb/clhsdb process. It results in 
    >> not seeing attach failures in the output, so OutputAnalyzer can't 
    >> check for them.
    >>
    >> Execute "verbose true" as the first clhsdb command after launching. 
    >> This will result in verboseExceptions being true in 
    >> CommandProcessor.java, so full exception traces will appear in the 
    >> output. This will make debugging future SA test failures a lot easier.
    >>
    >> Add an extra check for any DebuggerException. This is mainly for 
    >> detecting that the attach failed. Previously this went 
    >> unnoticed, and instead the test would later fail because it noticed 
    >> some other issue, like missing output, which isn't very informative.
    >>
    >> Add checks for other unexpected SA exceptions that are caught and 
    >> printed by CommandProcessor. These will always have an "Error: " 
    >> prefix, making them easy to detect.
    >>
    >> Problem list ClhsdbScanOops.java. With the new error checking, it 
    >> will now always fail on windows due to JDK-8230731 and on macos and 
    >> linux due to JDK-8235220. These failures are not "new" per se, but 
    >> are just now being properly detected.
    >>
    >> thanks,
    >>
    >> Chris
    >
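
Editor's sketch of the kind of output check described in the quoted RFR. It is
a hypothetical helper, not the actual ClhsdbLauncher change: the class
OutputCheckSketch and the method findFailures are invented for illustration,
and the sample output in main is made up. It collects every output line that
starts with the "Error: " prefix or mentions DebuggerException, so that a
failed attach is reported directly instead of surfacing later as missing
output.

```java
import java.util.ArrayList;
import java.util.List;

public class OutputCheckSketch {
    // Returns the lines that indicate an SA failure according to the two
    // rules from the RFR text: an "Error: " prefix or a DebuggerException.
    static List<String> findFailures(String output) {
        List<String> hits = new ArrayList<>();
        for (String line : output.split("\\R")) {
            if (line.startsWith("Error: ") || line.contains("DebuggerException")) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample output, for illustration only.
        String sample = "hsdb> verbose true\n"
                + "Error: java.lang.RuntimeException: sample failure\n"
                + "sun.jvm.hotspot.debugger.DebuggerException: cannot attach\n";
        for (String failure : findFailures(sample)) {
            System.out.println("unexpected SA failure: " + failure);
        }
    }
}
```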
    
    
From david.holmes at oracle.com  Wed Dec 11 21:02:57 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 12 Dec 2019 07:02:57 +1000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
Message-ID: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>

On 12/12/2019 1:07 am, Reingruber, Richard wrote:
> Hi David,
> 
>    > Most of the details here are in areas I can comment on in detail, but I
>    > did take an initial general look at things.
> 
> Thanks for taking the time!

Apologies, the above should read:

"Most of the details here are in areas I *can't* comment on in detail ..."

David

>    > The only thing that jumped out at me is that I think the
>    > DeoptimizeObjectsALotThread should be a hidden thread.
>    >
>    > +  bool is_hidden_from_external_view() const { return true; }
> 
> Yes, it should. Will add the method like above.
> 
>    > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>    > active testing this will just bit-rot.
> 
> DeoptimizeObjectsALot is meant for stress testing with a larger workload. I will add a minimal test
> to keep it fresh.
> 
>    > Also on the tests I don't understand your @requires clause:
>    >
>    >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>    > (vm.opt.TieredCompilation != true))
>    >
>    > This seems to require that TieredCompilation is disabled, but tiered is
>    > our normal mode of operation. ??
>    >
> 
> I removed the clause. I guess I wanted to target the tests towards the code they are supposed to
> test, and it's easier to analyze failures w/o tiered compilation and with just one compiler thread.
> 
> Additionally I will make use of compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
> 
> Thanks,
> Richard.
> 
> -----Original Message-----
> From: David Holmes <david.holmes at oracle.com>
> Sent: Mittwoch, 11. Dezember 2019 08:03
> To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
> 
> Hi Richard,
> 
> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
>> Hi,
>>
>> I would like to get reviews please for
>>
>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>
>> Corresponding RFE:
>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>
>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>
>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues (thanks!). In addition the
>> change is being tested at SAP since I posted the first RFR some months ago.
>>
>> The intention of this enhancement is to benefit performance wise from escape analysis even if JVMTI
>> agents request capabilities that allow them to access local variable values. E.g. if you start-up
>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape analysis is disabled right
>> from the beginning, well before a debugger attaches -- if ever one should do so. With the
>> enhancement, escape analysis will remain enabled until and after a debugger attaches. EA based
>> optimizations are reverted just before an agent acquires the reference to an object. In the JBS item
>> you'll find more details.
> 
> Most of the details here are in areas I can comment on in detail, but I
> did take an initial general look at things.
> 
> The only thing that jumped out at me is that I think the
> DeoptimizeObjectsALotThread should be a hidden thread.
> 
> +  bool is_hidden_from_external_view() const { return true; }
> 
> Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
> active testing this will just bit-rot.
> 
> Also on the tests I don't understand your @requires clause:
> 
>    @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> (vm.opt.TieredCompilation != true))
> 
> This seems to require that TieredCompilation is disabled, but tiered is
> our normal mode of operation. ??
> 
> Thanks,
> David
> 
>> Thanks,
>> Richard.
>>
>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>       http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>>

From ioi.lam at oracle.com  Wed Dec 11 22:24:02 2019
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 11 Dec 2019 14:24:02 -0800
Subject: Removal of SA javascript support
In-Reply-To: <6faf0cb5-7b4a-5e35-ed7c-90b817235031@samersoff.net>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
 <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
 <CA+cQ+tRjiThKZJHnjDSDnM2ZvXyjjLpXBgc2_yQxDs5UWsBtsA@mail.gmail.com>
 <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com>
 <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com>
 <a62d8f4c-a1b0-6b15-646f-3ba2c77db5a4@oss.nttdata.com>
 <0944c2b1-a0b4-2631-24de-3789e6aeec7b@oracle.com>
 <6faf0cb5-7b4a-5e35-ed7c-90b817235031@samersoff.net>
Message-ID: <6e887f3b-7930-728e-8f84-89b8498e446d@oracle.com>

Regarding maintaining hotspot data structures in SA, I think we often 
break it without knowing, especially when we are adding data structures 
that are not currently exposed by SA.

Does anyone have a sense of the state of SA in newer versions of the 
JDK? Is SA still doing what you expect, or do you see a declining level 
of usefulness because SA is getting more out-of-sync?

Thanks


On 12/11/19 7:03 AM, Dmitry Samersoff wrote:
> Sundar,
>
> Supporting hotspot data structure in SA is already a maintenance
> nightmare ;)
>
> So we could consider providing a high-level API, like find_class_by_name, to
> script writers.
>
>    It would allow anybody who is interested in quick prototyping to write their
> own program on top of SA in any language they want.
>
> -Dmitry
>
> On 11.12.19 15:47, sundararajan.athijegannathan at oracle.com wrote:
>> Effectively you're asking for SA as API. I don't think that is a good
>> idea. That implies supporting hotspot data structures as Java *API*.
>> That will be maintainability nightmare - we've to keep tracking hotspot
>> data structures in SA code. That itself is problematic. API would be
>> next level nightmare.
>>
>> -Sundar
>>
>> On 11/12/19 11:57 am, Yasumasa Suenaga wrote:
>>> Hi,
>>>
>>> IMHO we need to export all packages in SA if we do not provide new API
>>> for SA.
>>> sa.js in jdk.hotspot.agent could access all SA classes until JDK 8
>>> (before Jigsaw), so we could make various functions if we need.
>>>
>>> OTOH we cannot know what classes are needed by the SA users. All
>>> packages in the jdk.hotspot.agent module provide features, and they
>>> require other packages. For example, sun.jvm.hotspot.oops.Oop requires
>>> sun.jvm.hotspot.types, and it requires sun.jvm.hotspot.debugger.
>>> It is difficult to track these dependencies and to export minimally.
>>> (I worked for it in JDK-8157947, but I gave up...)
>>>
>>> Thus I guess it is a big challenge to export SA classes without
>>> refactoring.
>>> If we provide a new API for SA plugins, I guess we need to do some
>>> refactoring.
>>>
>>>
>>> Yasumasa
>>>
>>>
>>> On 2019/12/11 15:00, Chris Plummer wrote:
>>>> On 12/10/19 9:56 PM, Yasumasa Suenaga wrote:
>>>>> On 2019/12/11 14:39, Krystal Mok wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> That's a very nice idea. Basically what you're asking for is
>>>>>> exposing the Command interface [1] so that plugins can implement it
>>>>>> and get dynamically loaded / registered into CLHSDB / HSDB, right?
>>>>> Yes, but we also need proxy API to access internal SA objects e.g.
>>>>> CodeCache, JavaThread, TypeDataBase, etc...
>>>>>
>>>> Yes, or export them. I should have read this email before posting my
>>>> previous one.
>>>>
>>>> Chris
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>> [1]:
>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/c71ec1f09f21/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/CommandProcessor.java#l246
>>>>>>
>>>>>>
>>>>>> - Kris
>>>>>>
>>>>>> On Tue, Dec 10, 2019 at 9:33 PM Yasumasa Suenaga
>>>>>> <suenaga at oss.nttdata.com <mailto:suenaga at oss.nttdata.com>> wrote:
>>>>>>
>>>>>>     Hi Chris,
>>>>>>
>>>>>>     It's a sad proposal, but I agree with you. Maintaining SA in JS
>>>>>> has been difficult since Jigsaw.
>>>>>>     However, I want SA to offer a pluggable feature.
>>>>>>     I use a custom script to list compiled code in the CodeCache.
>>>>>>
>>>>>>     I guess other troubleshooters will also want a similar feature (via
>>>>>> jsload) in the future if they encounter a JVM crash.
>>>>>>
>>>>>>
>>>>>>     Thanks,
>>>>>>
>>>>>>     Yasumasa
>>>>>>
>>>>>>
>>>>>>     On 2019/12/11 11:52, Chris Plummer wrote:
>>>>>>     > Hi,
>>>>>>     >
>>>>>>     > I'd like to propose the removal of SA javascript support. Few
>>>>>> people even realize this support exists, and hopefully even fewer
>>>>>> are using it since I'd like to remove it. Since I'm new to this
>>>>>> myself, let me first explain what I know about its existence, and
>>>>>> then explain why I want to remove it.
>>>>>>     >
>>>>>>     > If you run "jhsdb clhsdb", there are jsload and jseval
>>>>>> commands. Don't look for them in anything post JDK 8. I'll explain
>>>>>> why later. jsload is used to load a javascript file. In that file
>>>>>> you can register new clhsdb commands that are written in
>>>>>> javascript. You can also evaluate javascript using the jseval
>>>>>> command. Some of this is explained in [1], which is the only place
>>>>>> I can find any reference to this support. It does not appear to be
>>>>>> officially supported, nor is there any oracle provided documentation.
>>>>>>     >
>>>>>>     > There also appear to be a few clhsdb commands that are
>>>>>> written in javascript. Doing a grep for "registerCommand" in sa.js
>>>>>> shows the following:
>>>>>>     >
>>>>>>     >   registerCommand("class", "class name", "jclass");
>>>>>>     >   registerCommand("classes", "classes", "jclasses");
>>>>>>     >   registerCommand("dumpclass", "dumpclass { address | name } [ directory ]", "dclass");
>>>>>>     >   registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap");
>>>>>>     >   registerCommand("mem", "mem address [ length ]", "printMem");
>>>>>>     >   registerCommand("sysprops", "sysprops", "sysProps");
>>>>>>     >   registerCommand("whatis", "whatis address", "printWhatis");
>>>>>>     >
>>>>>>     > Once again, don't go looking for these in anything newer
>>>>>> than JDK8. You won't find them. Again the only documentation I can
>>>>>> find is [1].
>>>>>>     >
>>>>>>     > The other use of Javascript is the SOQL command (Simple
>>>>>> Object Query Language), a tool used to query the heap, and also the
>>>>>> JSDB command. The only SOQL documentation I could find is the blog
>>>>>> reference [2]. I could not find JSDB documentation, but I believe
>>>>>> it is javascript support for looking at hotspot. So once again,
>>>>>> neither of these seem to be officially supported or documented.
>>>>>>     >
>>>>>>     > The real purpose of the email is to propose removal of this
>>>>>> support. Here are the reasons:
>>>>>>     >
>>>>>>     > (1) It's broken, and has been since 9. See [3]. This is why
>>>>>> you don't see the javascript related commands in clhsdb. Javascript
>>>>>> fails to initialize, so none of the javascript related commands are
>>>>>> registered.
>>>>>>     > (2) Nashorn is deprecated and will be removed eventually.
>>>>>>     > (3) We have very little understanding of the javascript
>>>>>> support.
>>>>>>     > (4) No resources to work on it (unless there is a community
>>>>>> volunteer).
>>>>>>     > (5) Very questionable value (lack of users). The fact this
>>>>>> support has been broken since JDK 9 and no bug was filed until I
>>>>>> did so this week is a good indication of that. Another is that
>>>>>> there are no other SA Javascript related bugs filed. Lastly, the
>>>>>> lack of any official documentation and only minimal mention of it
>>>>>> on the web is another good indication of its (lack of) value.
>>>>>>     >
>>>>>>     > Also, regarding the 7 commands listed above that would be
>>>>>> lost (but don't work now anyway), if they are really
>>>>>> wanted, they could be implemented in java instead of javascript.
>>>>>>     >
>>>>>>     > I'd like to remove javascript support in two steps. The
>>>>>> first is to simply disable the clhsdb code that tries to initialize
>>>>>> the javascript support. I'd like to do this in 14 (actually as soon
>>>>>> as possible). I'd like to actually do this now even if we decide to
>>>>>> keep javascript support and eventually fix it because it will get
>>>>>> rid of the warning you see whenever you attach from clhsdb:
>>>>>>     >
>>>>>>     >       Warning! JS Engine can't start, some commands will not be available.
>>>>>>     >
>>>>>>     > This warning will become more of an issue for the clhsdb
>>>>>> tests after I push [4] because then you will also see the full
>>>>>> stacktrace for the underlying exception that caused the Javascript
>>>>>> to fail to start. Besides being unnecessary noise in passing test
>>>>>> cases, it can also be misleading in any test that fails because the
>>>>>> exception will be unrelated to the failure. This is actually what
>>>>>> got me going down this path of what the javascript support is all
>>>>>> about.
>>>>>>     >
>>>>>>     > The next step would be to strip out all Javascript related
>>>>>> code, including the SOQL and JSDB tools. This would be done in 15.
>>>>>>     >
>>>>>>     > Please let me know what you think.
>>>>>>     >
>>>>>>     > thanks,
>>>>>>     >
>>>>>>     > Chris
>>>>>>     >
>>>>>>     > [1]
>>>>>> https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
>>>>>>
>>>>>>     > [2]
>>>>>> http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
>>>>>>
>>>>>>     > [3] https://bugs.openjdk.java.net/browse/JDK-8235594
>>>>>>     > [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>>>>>>     >
>>>>>>
>>>>
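
The plugin idea discussed in the quoted thread (a command interface that
plugins implement and that gets registered into CLHSDB, with commands such
as sysprops written in Java instead of JavaScript) could look roughly like
the sketch below. Everything here is hypothetical: PluginCommand,
CommandRegistrySketch and their methods are illustrative names, not the
real sun.jvm.hotspot.CommandProcessor API, and this stand-in "sysprops"
reads the properties of the local JVM, not those of a target VM.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical plugin contract; the real SA Command class differs.
interface PluginCommand {
    String name();               // command word typed by the user
    String usage();              // one-line usage string
    String run(String[] args);   // executes the command, returns its output
}

// Hypothetical registry that a command-line debugger could consult.
final class CommandRegistrySketch {
    private final Map<String, PluginCommand> commands = new LinkedHashMap<>();

    void register(PluginCommand command) {
        commands.put(command.name(), command);
    }

    // Splits the input line on whitespace and dispatches on the first token.
    String dispatch(String line) {
        String[] tokens = line.trim().split("\\s+");
        PluginCommand command = commands.get(tokens[0]);
        if (command == null) {
            return "Unrecognized command: " + tokens[0];
        }
        return command.run(Arrays.copyOfRange(tokens, 1, tokens.length));
    }

    // Stand-in for a Java "sysprops" command: prints one local property.
    static PluginCommand sysprops() {
        return new PluginCommand() {
            public String name() { return "sysprops"; }
            public String usage() { return "sysprops key"; }
            public String run(String[] args) {
                if (args.length != 1) {
                    return "Usage: " + usage();
                }
                return args[0] + "=" + System.getProperty(args[0]);
            }
        };
    }

    public static void main(String[] args) {
        CommandRegistrySketch registry = new CommandRegistrySketch();
        registry.register(sysprops());
        System.out.println(registry.dispatch("sysprops java.version"));
        System.out.println(registry.dispatch("jsload foo.js"));
    }
}
```

A real plugin would still need access to SA objects such as CodeCache or
JavaThread, which is the export/proxy problem raised in the thread; the
sketch only covers registration and dispatch.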


From serguei.spitsyn at oracle.com  Wed Dec 11 23:13:34 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 11 Dec 2019 15:13:34 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <7720D36B-0505-49E3-8424-76ACEACAF0AB@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
 <072d6861-1374-8190-135d-e30ece2ee380@oracle.com>
 <7720D36B-0505-49E3-8424-76ACEACAF0AB@oracle.com>
Message-ID: <3172f6e2-5330-91eb-35b4-03bec407ee5b@oracle.com>

Hi Daniil,

One of my concerns was the non-atomic read of multiple metrics before comparison.
It creates the potential for a mismatch in the result.
However, the probability of getting a negative value is pretty low, I think.
The other concern (if incorrect metrics are returned) is covered by 
JDK-8235522.
Revising all concerns in JDK-8235522 sounds good to me.
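
For reference, the negative-value check asked about for (limit - memLimit)
could be sketched as below. This is only an illustration of the question,
not what webrev.05 does: SwapSizeSketch and totalSwapOrHost are
hypothetical names, and falling back to the host value on a negative
difference is an assumption modeled on the existing fallback for
unavailable metrics.

```java
import java.util.function.LongSupplier;

// Hypothetical helper class; not part of webrev.05.
final class SwapSizeSketch {

    /**
     * Returns (memAndSwapLimit - memLimit) when both limits are available
     * and the difference is non-negative; otherwise defers to hostValue.
     */
    static long totalSwapOrHost(long memAndSwapLimit, long memLimit, LongSupplier hostValue) {
        if (memAndSwapLimit >= 0 && memLimit >= 0) {
            long swap = memAndSwapLimit - memLimit;
            if (swap >= 0) {
                return swap;
            }
        }
        // Assumption: an inconsistent pair is treated like an unavailable metric.
        return hostValue.getAsLong();
    }

    public static void main(String[] args) {
        LongSupplier host = () -> 4096L; // stand-in for getTotalSwapSpaceSize0()
        System.out.println(totalSwapOrHost(2048L, 1024L, host)); // prints 1024
        System.out.println(totalSwapOrHost(1024L, 2048L, host)); // prints 4096
        System.out.println(totalSwapOrHost(-1L, 1024L, host));   // prints 4096
    }
}
```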

Thanks,
Serguei

On 12/10/19 10:29, Daniil Titov wrote:
> Hi Serguei,
>
>>        Do we need to check if the (limit - memLimit) value is negative?
>>        The same question is for getFreeSwapSpaceSize():
>>          memSwapLimit - memLimit - (memSwapUsage - memUsage)
>>    
>>        and getFreeMemorySize():
>>         101 return limit - usage;
> I don't think we need such a check here. If it happens, it means a serious system malfunction, and the negative value this method
> returns would indicate this (currently the native methods already return -1 if something went wrong).  But we could revise it in the follow-up
> issue I created for that [1].
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8235522
>
> Thank you,
> Daniil
>
> On 12/9/19, 6:02 PM, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com> wrote:
>
>      Hi Daniil,
>      
>      It is not a full review, just some minor comments.
>      In fact, I do not see real problems yet.
>      
>      http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html
>      
>         55     public long getTotalSwapSpaceSize() {
>         56         if (containerMetrics != null) {
>         57             long limit = containerMetrics.getMemoryAndSwapLimit();
>         58             // The memory limit metrics is not available if JVM
>      runs on Linux host ( not in a docker container)
>         59             // or if a docker container was started without
>      specifying a memory limit ( without '--memory='
>         60             // Docker option). In latter case there is no limit on
>      how much memory the container can use and
>         61             // it can use as much memory as the host's OS allows.
>         62             long memLimit = containerMetrics.getMemoryLimit();
>         63             if (limit >= 0 && memLimit >= 0) {
>         64                 return limit - memLimit;
>         65             }
>         66         }
>         67         return getTotalSwapSpaceSize0();
>         68     }
>      
>         Unneeded space after brackets '('.
>         Do we need to check if the (limit - memLimit) value is negative?
>         The same question is for getFreeSwapSpaceSize():
>           memSwapLimit - memLimit - (memSwapUsage - memUsage)
>      
>         and getFreeMemorySize():
>           101 return limit - usage;
>      
>         81                         // If this happens just retry the loop for
>      a few iterations
>      
>         Dot is missed at the end of comment.
>      
>      
>      http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html
>      
>         34 System.out.println(String.format("Runtime.availableProcessors:
>      %d", Runtime.getRuntime().availableProcessors()));
>         35
>      System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors:
>      %d", osBean.getAvailableProcessors()));
>         36
>      System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize:
>      %d", osBean.getTotalMemorySize()));
>         37
>      System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize:
>      %d", osBean.getTotalPhysicalMemorySize()));
>         38
>      System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize:
>      %d", osBean.getFreeMemorySize()));
>         39
>      System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize:
>      %d", osBean.getFreePhysicalMemorySize()));
>         40
>      System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize:
>      %d", osBean.getTotalSwapSpaceSize()));
>         41
>      System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize:
>      %d", osBean.getFreeSwapSpaceSize()));
>         42
>      System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f",
>      osBean.getCpuLoad()));
>         43
>      System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad:
>      %f", osBean.getSystemCpuLoad()));
>      
>      
>         To make the above lines a little bit shorter I'd suggest defining a
>      log() method like this:
>            private static void log(String msg) { System.out.println(msg); }
>      
>         34         log(String.format("Runtime.availableProcessors: %d",
>      Runtime.getRuntime().availableProcessors()));
>         35 log(String.format("OperatingSystemMXBean.getAvailableProcessors:
>      %d", osBean.getAvailableProcessors()));
>         36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d",
>      osBean.getTotalMemorySize()));
>         37
>      log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize:
>      %d", osBean.getTotalPhysicalMemorySize()));
>         38 log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d",
>      osBean.getFreeMemorySize()));
>         39
>      log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d",
>      osBean.getFreePhysicalMemorySize()));
>         40 log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize:
>      %d", osBean.getTotalSwapSpaceSize()));
>         41 log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize:
>      %d", osBean.getFreeSwapSpaceSize()));
>         42         log(String.format("OperatingSystemMXBean.getCpuLoad: %f",
>      osBean.getCpuLoad()));
>         43 log(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f",
>      osBean.getSystemCpuLoad()));
>      
>      
>      Thanks,
>      Serguei
>      
>      
>      
>      On 12/6/19 17:41, Daniil Titov wrote:
>      > Hi David, Mandy, and Bob,
>      >
>      > Thank you for reviewing this fix.
>      >
>      > Please review a new version of the fix [1] that includes the following changes compared to the previous version of the webrev (webrev.04):
>      > 1. The changes in Javadoc made in webrev.04 compared to webrev.03 and to CSR [3] were discarded.
>      > 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
>      >       was also reverted to the webrev.03 version, which returns the host's values if the metrics are unavailable or cannot be properly read.
>      >       I would like to mention that currently the native implementation of these methods may de facto return -1 in some circumstances,
>      >       but I agree that the changes proposed in the previous version of the webrev increase that probability.
>      >       I filed the follow-up issue [4] as Mandy suggested.
>      > 3.  The legacy methods were renamed as David suggested.
>      >
>      >
>      >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
>      >> !     static int initialized=1;
>      >>
>      >>   Am I reading this right that the code currently fails to actually do the
>      >> initialization because of this ???
>      > Yes, currently the code fails to do the initialization but it was unnoticed since method
>      > get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which"
>      > was always -1.
>      >
>      >>   test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>      >>
>      >> System.out.println(String.format(...)
>      >>
>      >> Why not simply
>      >>
>      >> System.out.printf(..)
>      > As I tried to explain earlier, it would make the tests unstable.
>      > System.out.printf(...) just delegates the call to System.out.format(...), which doesn't emit the string atomically.
>      > Instead it parses the format string into a list of FormatString objects and then iterates over the list.
>      > As a result, other traces occasionally get printed between these iterations and break the pattern the test expects to find
>      > in the output.
>      >
>      > For example, here is a sample of such output, with a trace message printed between "OperatingSystemMXBean.getFreePhysicalMemorySize:"
>      > and "1030762496".
>      >
>      > <skipped>
>      > [0.304s][trace][os,container] Memory Usage is: 42983424
>      > OperatingSystemMXBean.getFreeMemorySize: 1030758400
>      > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
>      > [0.305s][trace][os,container] Memory Usage is: 42979328
>      > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
>      > OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
>      > 1030762496
>      > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
>      >
>      > <skipped>
>      > java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr
>      >
>      > 	at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
>      > 	at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
>      > 	at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
>      > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>      > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>      > 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>      > 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>      > 	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
>      > 	at java.base/java.lang.Thread.run(Thread.java:832)
>      >
>      > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
>      >
>      > [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
>      > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
>      > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
>      > [4] https://bugs.openjdk.java.net/browse/JDK-8235522
>      >
>      > Thank you,
>      > Daniil
>      >
>      > On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
>      >
>      >
>      >
>      >      On 12/6/19 5:59 AM, Bob Vandette wrote:
>      >      >> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes at oracle.com>  wrote:
>      >      >>
>      >      >>
>      >      >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
>      >      >>
>      >      >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
>      >
>      >      I thought that the error case we are referring to is limit == 0 which
>      >      indicates something unexpected goes wrong.  So the compatibility concern
>      >      should be low.  This is very specific to Metrics implementation for
>      >      cgroup v1 and let me know if I'm wrong.
>      >
>      >      >> Surely there must always be some information available from the operating environment? I see from the impl file:
>      >      >>
>      >      >>     // the host data, value 0 indicates that something went wrong while the metric was read and
>      >      >>    // in this case we return "information unavailable" code -1.
>      >      >>
>      >      >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
>      >      > I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
>      >      > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
>      >      > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
>      >      > limits.
>      >      >
>      >
>      >      It's important to consider carefully if the monitoring API indicates an
>      >      error vs unavailable and an application should continue to run when the
>      >      monitoring system fails to get the metrics.
>      >
>      >      There are several choices to report "something goes wrong" scenarios
>      >      (should unlikely happen???):
>      >      1. fall back to a random positive value  (e.g. host value)
>      >      2. return a negative value
>      >      3. throw an exception
>      >
>      >      #3 is not an option as the application is not expecting this.  For #2,
>      >      the application can filter bad values if desirable.
>      >
>      >      I'm okay if you want to file a JBS issue to follow up and thoroughly
>      >      look at the cases where the metrics are unavailable and the cases where
>      >      reading them fails.
>      >
>      >      >> ---
>      >      >>
>      >      >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>      >      >>
>      >      >> System.out.println(String.format(...)
>      >      >>
>      >      >> Why not simply
>      >      >>
>      >      >> System.out.printf(..)
>      >      >>
>      >      >> ?
>      >
>      >      or simply (as I commented [1])
>      >           System.out.format
>      >
>      >      Mandy
>      >      [1]
>      >      https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
>      >
>      >
>      >
>      >
>      
>      
>
>
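
The printf interleaving described in the quoted 12/6 message can be made
visible by recording each write that reaches the stream underneath a
PrintStream. The sketch below is an illustration under assumptions:
ChunkRecorder and PrintAtomicitySketch are hypothetical names, the label
and value are taken from the sample output in the thread, and the number
of chunks printed for the printf path depends on the JDK implementation,
so no particular count is claimed.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sink that remembers every separate write it receives.
final class ChunkRecorder extends OutputStream {
    final List<String> chunks = new ArrayList<>();

    @Override
    public void write(int b) {
        chunks.add(String.valueOf((char) b));
    }

    @Override
    public void write(byte[] buf, int off, int len) {
        chunks.add(new String(buf, off, len, StandardCharsets.UTF_8));
    }
}

final class PrintAtomicitySketch {

    // Formats directly on the stream: pieces may reach the sink separately.
    static List<String> viaPrintf(String label, long value) {
        ChunkRecorder sink = new ChunkRecorder();
        PrintStream out = new PrintStream(sink, true);
        out.printf(Locale.ROOT, "%s: %d%n", label, value);
        out.flush();
        return sink.chunks;
    }

    // Formats first, then prints: label and value reach the sink together.
    static List<String> viaPrintlnFormat(String label, long value) {
        ChunkRecorder sink = new ChunkRecorder();
        PrintStream out = new PrintStream(sink, true);
        out.println(String.format(Locale.ROOT, "%s: %d", label, value));
        out.flush();
        return sink.chunks;
    }

    public static void main(String[] args) {
        String label = "OperatingSystemMXBean.getFreePhysicalMemorySize";
        System.out.println("printf chunks:         " + viaPrintf(label, 1030762496L).size());
        System.out.println("println+format chunks: " + viaPrintlnFormat(label, 1030762496L).size());
    }
}
```

Any other writer to the same file descriptor (for example JVM trace
logging) can slip in between two chunks, which is why the test keeps
println(String.format(...)).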


From daniil.x.titov at oracle.com  Wed Dec 11 23:33:05 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Wed, 11 Dec 2019 15:33:05 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <3172f6e2-5330-91eb-35b4-03bec407ee5b@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
 <072d6861-1374-8190-135d-e30ece2ee380@oracle.com>
 <7720D36B-0505-49E3-8424-76ACEACAF0AB@oracle.com>
 <3172f6e2-5330-91eb-35b4-03bec407ee5b@oracle.com>
Message-ID: <D124C665-4626-48DD-8FF3-3D40642B868F@oracle.com>

Hi Serguei,

Thank you for reviewing this change.

Just wanted to add that the only "volatile" metrics are the "usage" ones (memoryUsage and
memoryAndSwapUsage). The "limit" metrics (memoryLimit and memoryAndSwapLimit) are set
when the container starts and are not subject to change. The only method that reads more than one
"volatile" metric is getFreeSwapSpaceSize(), and it has code that retries if the calculated swapUsage
is negative as a result of non-atomic reads.
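
A minimal sketch of that retry, for readers following along. It is not
the code from the webrev: FreeSwapRetrySketch, freeSwap, MAX_ATTEMPTS and
the fallback supplier are hypothetical, and the attempt count and the
fallback behavior are assumptions. The formula follows the one quoted
below: memSwapLimit - memLimit - (memSwapUsage - memUsage).

```java
import java.util.function.LongSupplier;

// Hypothetical illustration of retrying two non-atomic "usage" reads.
final class FreeSwapRetrySketch {
    static final int MAX_ATTEMPTS = 5; // assumption: "a few iterations"

    static long freeSwap(long memSwapLimit, long memLimit,
                         LongSupplier memSwapUsage, LongSupplier memUsage,
                         LongSupplier fallback) {
        long totalSwap = memSwapLimit - memLimit; // limits are fixed at container start
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            // The two reads are not atomic, so usage may change between them.
            long swapAndMemUsed = memSwapUsage.getAsLong();
            long memUsed = memUsage.getAsLong();
            long swapUsed = swapAndMemUsed - memUsed;
            if (swapUsed >= 0) {
                return totalSwap - swapUsed;
            }
            // Negative swap usage means the reads were inconsistent; retry.
        }
        // Assumption: give up and defer to another source (e.g. the host value).
        return fallback.getAsLong();
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // First memUsage read is inconsistent (150 > 100), later reads are fine.
        LongSupplier memUsage = () -> calls[0]++ == 0 ? 150L : 80L;
        System.out.println(freeSwap(1000L, 600L, () -> 100L, memUsage, () -> -1L)); // prints 380
    }
}
```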


Thank you,
Daniil

On 12/11/19, 3:13 PM, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com> wrote:

    Hi Daniil,
    
    One of my concerns was the non-atomic read of multiple metrics before comparison.
    It creates the potential for a mismatch in the result.
    However, the probability of getting a negative value is pretty low, I think.
    The other concern (if incorrect metrics are returned) is covered by 
    JDK-8235522.
    Revising all concerns in JDK-8235522 sounds good to me.
    
    Thanks,
    Serguei
    
    On 12/10/19 10:29, Daniil Titov wrote:
    > Hi Serguei,
    >
    >>        Do we need to check if the (limit - memLimit) value is negative?
    >>        The same question is for getFreeSwapSpaceSize():
    >>          memSwapLimit - memLimit - (memSwapUsage - memUsage)
    >>    
    >>        and getFreeMemorySize():
    >>         101 return limit - usage;
    > I don't think we need such a check here. If it happens, it means a serious system malfunction, and the negative value this method
    > returns would indicate this (currently the native methods already return -1 if something went wrong).  But we could revise it in the follow-up
    > issue I created for that [1].
    >
    > [1] https://bugs.openjdk.java.net/browse/JDK-8235522
    >
    > Thank you,
    > Daniil
    >
    > On 12/9/19, 6:02 PM, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com> wrote:
    >
    >      Hi Daniil,
    >      
    >      It is not a full review, just some minor comments.
    >      In fact, I do not see real problems yet.
    >      
    >      http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html
    >      
    >         55     public long getTotalSwapSpaceSize() {
    >         56         if (containerMetrics != null) {
    >         57             long limit = containerMetrics.getMemoryAndSwapLimit();
    >         58             // The memory limit metrics is not available if JVM
    >      runs on Linux host ( not in a docker container)
    >         59             // or if a docker container was started without
    >      specifying a memory limit ( without '--memory='
    >         60             // Docker option). In latter case there is no limit on
    >      how much memory the container can use and
    >         61             // it can use as much memory as the host's OS allows.
    >         62             long memLimit = containerMetrics.getMemoryLimit();
    >         63             if (limit >= 0 && memLimit >= 0) {
    >         64                 return limit - memLimit;
    >         65             }
    >         66         }
    >         67         return getTotalSwapSpaceSize0();
    >         68     }
    >      
    >         Unneeded space after brackets '('.
    >         Do we need to check if the (limit - memLimit) value is negative?
    >         The same question is for getFreeSwapSpaceSize():
    >           memSwapLimit - memLimit - (memSwapUsage - memUsage)
    >      
    >         and getFreeMemorySize():
    >           101 return limit - usage;
    >      
    >         81                         // If this happens just retry the loop for
    >      a few iterations
    >      
    >         Dot is missed at the end of comment.
    >      
    >      
    >      http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html
    >      
    >         34 System.out.println(String.format("Runtime.availableProcessors:
    >      %d", Runtime.getRuntime().availableProcessors()));
    >         35
    >      System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors:
    >      %d", osBean.getAvailableProcessors()));
    >         36
    >      System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize:
    >      %d", osBean.getTotalMemorySize()));
    >         37
    >      System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize:
    >      %d", osBean.getTotalPhysicalMemorySize()));
    >         38
    >      System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize:
    >      %d", osBean.getFreeMemorySize()));
    >         39
    >      System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize:
    >      %d", osBean.getFreePhysicalMemorySize()));
    >         40
    >      System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize:
    >      %d", osBean.getTotalSwapSpaceSize()));
    >         41
    >      System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize:
    >      %d", osBean.getFreeSwapSpaceSize()));
    >         42
    >      System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f",
    >      osBean.getCpuLoad()));
    >         43
    >      System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad:
    >      %f", osBean.getSystemCpuLoad()));
    >      
    >      
    >         To make the above lines a little bit shorter I'd suggest defining a
    >      log() method like this:
    >            private static void log(String msg) { System.out.println(msg); }
    >      
    >         34         log(String.format("Runtime.availableProcessors: %d",
    >      Runtime.getRuntime().availableProcessors()));
    >         35 log(String.format("OperatingSystemMXBean.getAvailableProcessors:
    >      %d", osBean.getAvailableProcessors()));
    >         36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d",
    >      osBean.getTotalMemorySize()));
    >         37
    >      log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize:
    >      %d", osBean.getTotalPhysicalMemorySize()));
    >         38 log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d",
    >      osBean.getFreeMemorySize()));
    >         39
    >      log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d",
    >      osBean.getFreePhysicalMemorySize()));
    >         40 log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize:
    >      %d", osBean.getTotalSwapSpaceSize()));
    >         41 log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize:
    >      %d", osBean.getFreeSwapSpaceSize()));
    >         42         log(String.format("OperatingSystemMXBean.getCpuLoad: %f",
    >      osBean.getCpuLoad()));
    >         43 log(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f",
    >      osBean.getSystemCpuLoad()));
    >      
    >      
    >      Thanks,
    >      Serguei
    >      
    >      
    >      
    >      On 12/6/19 17:41, Daniil Titov wrote:
    >      > Hi David, Mandy, and Bob,
    >      >
    >      > Thank you for reviewing this fix.
    >      >
    >      > Please review a new version of the fix [1] that includes the following changes compared to the previous version of the webrev (webrev.04):
    >      > 1. The changes in Javadoc made in webrev.04 compared to webrev.03 and to CSR [3] were discarded.
    >      > 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
    >      >       was also reverted to the webrev.03 version, which returns the host's values if the metrics are unavailable or cannot be properly read.
    >      >       I would like to mention that currently the native implementation of these methods may de facto return -1 in some circumstances,
    >      >       but I agree that the changes proposed in the previous version of the webrev increase that probability.
    >      >       I filed the follow-up issue [4] as Mandy suggested.
    >      > 3.  The legacy methods were renamed as David suggested.
    >      >
    >      >
    >      >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
    >      >> !     static int initialized=1;
    >      >>
    >      >>   Am I reading this right that the code currently fails to actually do the
    >      >> initialization because of this ???
    >      > Yes, currently the code fails to do the initialization but it was unnoticed since method
    >      > get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which"
    >      > was always -1.
    >      >
    >      >>   test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
    >      >>
    >      >> System.out.println(String.format(...)
    >      >>
    >      >> Why not simply
    >      >>
    >      >> System.out.printf(..)
    >      > As I tried to explain earlier, it would make the tests unstable.
    >      > System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically.
    >      > Instead it parses the format string into a list of FormatString objects and then iterates over the list.
    >      > As a result, the other traces occasionally got printed between these iterations  and break the pattern the test is expected to find
    >      > in the output.
    >      >
    >      > For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:"
    >      > and "1030762496".
    >      >
    >      > <skipped>
    >      > [0.304s][trace][os,container] Memory Usage is: 42983424
    >      > OperatingSystemMXBean.getFreeMemorySize: 1030758400
    >      > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
    >      > [0.305s][trace][os,container] Memory Usage is: 42979328
    >      > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
    >      > OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
    >      > 1030762496
    >      > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
    >      >
    >      > <skipped>
    >      > java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr
    >      >
    >      > 	at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
    >      > 	at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
    >      > 	at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
    >      > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    >      > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    >      > 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    >      > 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
    >      > 	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
    >      > 	at java.base/java.lang.Thread.run(Thread.java:832)
    >      >
    >      > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
    >      >
    >      > [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
    >      > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
    >      > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
    >      > [4] https://bugs.openjdk.java.net/browse/JDK-8235522
    >      >
    >      > Thank you,
    >      > Daniil
    >      >
    >      > On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
    >      >
    >      >
    >      >
    >      >      On 12/6/19 5:59 AM, Bob Vandette wrote:
    >      >      >> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes at oracle.com>  wrote:
    >      >      >>
    >      >      >>
    >      >      >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
    >      >      >>
    >      >      >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
    >      >
    >      >      I thought that the error case we are referring to is limit == 0 which
    >      >      indicates something unexpected goes wrong.  So the compatibility concern
    >      >      should be low.  This is very specific to Metrics implementation for
    >      >      cgroup v1 and let me know if I'm wrong.
    >      >
    >      >      >> Surely there must always be some information available from the operating environment? I see from the impl file:
    >      >      >>
    >      >      >>     // the host data, value 0 indicates that something went wrong while the metric was read and
    >      >      >>    // in this case we return "information unavailable" code -1.
    >      >      >>
    >      >      >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
    >      >      > I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
    >      >      > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
    >      >      > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
    >      >      > limits.
    >      >      >
    >      >
    >      >      It's important to consider carefully if the monitoring API indicates an
    >      >      error vs unavailable and an application should continue to run when the
    >      >      monitoring system fails to get the metrics.
    >      >
    >      >      There are several choices to report "something goes wrong" scenarios
    >      >      (should unlikely happen???):
    >      >      1. fall back to a random positive value  (e.g. host value)
    >      >      2. return a negative value
    >      >      3. throw an exception
    >      >
    >      >      #3 is not an option as the application is not expecting this.  For #2,
    >      >      the application can filter bad values if desirable.
    >      >
    >      >      I'm okay if you want to file a JBS issue to follow up and thoroughly
    >      >      look at the cases that the metrics are unavailable and the cases when
    >      >      fails to obtain.
    >      >
    >      >      >> ---
    >      >      >>
    >      >      >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
    >      >      >>
    >      >      >> System.out.println(String.format(...)
    >      >      >>
    >      >      >> Why not simply
    >      >      >>
    >      >      >> System.out.printf(..)
    >      >      >>
    >      >      >> ?
    >      >
    >      >      or simply (as I commented [1])
    >      >           System.out.format
    >      >
    >      >      Mandy
    >      >      [1]
    >      >      https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
    >      >
    >      >
    >      >
    >      >
    >      
    >      
    >
    >
    
    
From daniil.x.titov at oracle.com  Wed Dec 11 23:35:10 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Wed, 11 Dec 2019 15:35:10 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <E07AF01E-1369-4471-BE7A-630170F2898A@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
 <072d6861-1374-8190-135d-e30ece2ee380@oracle.com>
 <7720D36B-0505-49E3-8424-76ACEACAF0AB@oracle.com>
 <3172f6e2-5330-91eb-35b4-03bec407ee5b@oracle.com>
 <E07AF01E-1369-4471-BE7A-630170F2898A@oracle.com>
Message-ID: <8E4D2767-A223-46AC-A541-D51CC5D3D7AF@oracle.com>

Typo fixed...

.. that the only "volatile" metrics are "usage" ones ( memoryUsage and   *memoryAndSwapUsage*).
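
For illustration only, the retry on inconsistent "usage" reads described in the quoted message below can be sketched roughly as follows. This is not the code from the webrev: the class name, the freeSwap helper, the LongSupplier parameters, the retry count and the hostFallback argument are all hypothetical and were chosen just for this sketch. It assumes the two "limit" values are stable and that only the two "usage" values can change between reads.

```java
import java.util.function.LongSupplier;

class FreeSwapSketch {
    // Hypothetical helper, not the webrev code. Computes free swap as
    //   (memSwapLimit - memLimit) - (memSwapUsage - memUsage)
    // and retries when the two volatile usage reads are mutually inconsistent.
    static long freeSwap(long memSwapLimit, long memLimit,
                         LongSupplier memSwapUsage, LongSupplier memUsage,
                         int maxRetries, long hostFallback) {
        if (memSwapLimit < 0 || memLimit < 0) {
            // Assumption: unavailable limits mean "use the host value".
            return hostFallback;
        }
        long totalSwap = memSwapLimit - memLimit;
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            // The two reads are not atomic, so usage may change in between.
            long swapAndMem = memSwapUsage.getAsLong();
            long mem = memUsage.getAsLong();
            long swapUsage = swapAndMem - mem;
            if (swapUsage >= 0) {
                return totalSwap - swapUsage;
            }
            // A negative swap usage means the reads raced; try again.
        }
        // Assumption: give up after a few attempts and use the host value.
        return hostFallback;
    }

    public static void main(String[] args) {
        long free = freeSwap(200, 100, () -> 130, () -> 100, 3, -1);
        System.out.println("free swap: " + free);
    }
}
```

Bounding the number of retries keeps the call from spinning if the readings never become consistent; what to return in that case is one of the questions left to JDK-8235522.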

Best regards,
Daniil

On 12/11/19, 3:33 PM, "Daniil Titov" <daniil.x.titov at oracle.com> wrote:

    Hi Serguei,
    
    Thank you for reviewing this change.
    
    Just wanted to add that the only "volatile" metrics are "usage" ones ( memoryUsage and  
    memoryAndSwapLimit). The  "limit" metrics (memoryLimit and memoryAndSwapLimit) are set 
    when the container starts and are not subject to change. The only method that reads more than one
    "volatile" metric is getFreeSwapSpaceSize(), and it has code that retries if the calculated swapUsage
    is negative as a result of non-atomic reads.
    
    
    Thank you,
    Daniil
    
    On 12/11/19, 3:13 PM, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com> wrote:
    
        Hi Daniil,
        
        One of my concerns was a non-atomic read of multiple metrics before comparison.
        It creates a potential to get a mismatch in result.
        However, the probability to get a negative value is pretty low, I think.
        The other concern (if incorrect metrics are returned) is covered by 
        JDK-8235522.
        Revising all concerns in JDK-8235522 sounds good to me.
        
        Thanks,
        Serguei
        
        On 12/10/19 10:29, Daniil Titov wrote:
        > Hi Serguei,
        >
        >>        Do we need to check if the (limit - memLimit) value is negative?
        >>        The same question is for getFreeSwapSpaceSize():
        >>          memSwapLimit - memLimit - (memSwapUsage - memUsage)
        >>    
        >>        and getFreeMemorySize():
        >>         101 return limit - usage;
        > I don't think we need such a check here. If it happens, it means a serious system malfunction, and the negative value this method
        > returns would indicate this (currently the native methods already return -1 if something goes wrong). But we could revise it in the
        > follow-up issue I created for that [1].
        >
        > [1] https://bugs.openjdk.java.net/browse/JDK-8235522
        >
        > Thank you,
        > Daniil
        >
        > On 12/9/19, 6:02 PM, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com> wrote:
        >
        >      Hi Daniil,
        >      
        >      It is not a full review, just some minor comments.
        >      In fact, I do not see real problems yet.
        >      
        >      http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html
        >      
        >         55     public long getTotalSwapSpaceSize() {
        >         56         if (containerMetrics != null) {
        >         57             long limit = containerMetrics.getMemoryAndSwapLimit();
        >         58             // The memory limit metrics is not available if JVM
        >      runs on Linux host ( not in a docker container)
        >         59             // or if a docker container was started without
        >      specifying a memory limit ( without '--memory='
        >         60             // Docker option). In latter case there is no limit on
        >      how much memory the container can use and
        >         61             // it can use as much memory as the host's OS allows.
        >         62             long memLimit = containerMetrics.getMemoryLimit();
        >         63             if (limit >= 0 && memLimit >= 0) {
        >         64                 return limit - memLimit;
        >         65             }
        >         66         }
        >         67         return getTotalSwapSpaceSize0();
        >         68     }
        >      
        >         Unneeded space after brackets '('.
        >         Do we need to check if the (limit - memLimit) value is negative?
        >         The same question is for getFreeSwapSpaceSize():
        >           memSwapLimit - memLimit - (memSwapUsage - memUsage)
        >      
        >         and getFreeMemorySize():
        >           101 return limit - usage;
        >      
        >         81                         // If this happens just retry the loop for
        >      a few iterations
        >      
        >         Dot is missed at the end of comment.
        >      
        >      
        >      http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html
        >      
        >         34 System.out.println(String.format("Runtime.availableProcessors:
        >      %d", Runtime.getRuntime().availableProcessors()));
        >         35
        >      System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors:
        >      %d", osBean.getAvailableProcessors()));
        >         36
        >      System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize:
        >      %d", osBean.getTotalMemorySize()));
        >         37
        >      System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize:
        >      %d", osBean.getTotalPhysicalMemorySize()));
        >         38
        >      System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize:
        >      %d", osBean.getFreeMemorySize()));
        >         39
        >      System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize:
        >      %d", osBean.getFreePhysicalMemorySize()));
        >         40
        >      System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize:
        >      %d", osBean.getTotalSwapSpaceSize()));
        >         41
        >      System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize:
        >      %d", osBean.getFreeSwapSpaceSize()));
        >         42
        >      System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f",
        >      osBean.getCpuLoad()));
        >         43
        >      System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad:
        >      %f", osBean.getSystemCpuLoad()));
        >      
        >      
        >         To make the above lines a little bit shorter I'd suggest to define a
        >      log() method like this:
        >            private static void log(String msg) { System.out.println(msg); }
        >      
        >         34         log(String.format("Runtime.availableProcessors: %d",
        >      Runtime.getRuntime().availableProcessors()));
        >         35 log(String.format("OperatingSystemMXBean.getAvailableProcessors:
        >      %d", osBean.getAvailableProcessors()));
        >         36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d",
        >      osBean.getTotalMemorySize()));
        >         37
        >      log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize:
        >      %d", osBean.getTotalPhysicalMemorySize()));
        >         38 log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d",
        >      osBean.getFreeMemorySize()));
        >         39
        >      log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d",
        >      osBean.getFreePhysicalMemorySize()));
        >         40 log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize:
        >      %d", osBean.getTotalSwapSpaceSize()));
        >         41 log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize:
        >      %d", osBean.getFreeSwapSpaceSize()));
        >         42         log(String.format("OperatingSystemMXBean.getCpuLoad: %f",
        >      osBean.getCpuLoad()));
        >         43 log(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f",
        >      osBean.getSystemCpuLoad()));
        >      
        >      
        >      Thanks,
        >      Serguei
        >      
        >      
        >      
        >      On 12/6/19 17:41, Daniil Titov wrote:
        >      > Hi David, Mandy, and Bob,
        >      >
        >      > Thank you for reviewing this fix.
        >      >
        >      > Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04)
        >      > 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded.
        >      > 2.  The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
        >      >       was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read.
        >      >       I would like to mention that  currently the native implementation of these methods de-facto may return -1 at some circumstances,
        >      >       but I agree that the changes proposed in the previous version of the webrev increase such probability.
        >      >       I filed the follow-up issue [4] as Mandy suggested.
        >      > 3.  The legacy methods were renamed as David suggested.
        >      >
        >      >
        >      >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
        >      >> !     static int initialized=1;
        >      >>
        >      >>   Am I reading this right that the code currently fails to actually do the
        >      >> initialization because of this ???
        >      > Yes, currently the code fails to do the initialization but it was unnoticed since method
        >      > get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which"
        >      > was always -1.
        >      >
        >      >>   test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
        >      >>
        >      >> System.out.println(String.format(...)
        >      >>
        >      >> Why not simply
        >      >>
        >      >> System.out.printf(..)
        >      > As I tried to explain earlier, it would make the tests unstable.
        >      > System.out.printf(...) just delegates the call to System.out.format(...) that doesn't emit the string atomically.
        >      > Instead it parses the format string into a list of FormatString objects and then iterates over the list.
        >      > As a result, the other traces occasionally got printed between these iterations  and break the pattern the test is expected to find
        >      > in the output.
        >      >
        >      > For example, here is the sample of a such output that has the trace message printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:"
        >      > and "1030762496".
        >      >
        >      > <skipped>
        >      > [0.304s][trace][os,container] Memory Usage is: 42983424
        >      > OperatingSystemMXBean.getFreeMemorySize: 1030758400
        >      > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
        >      > [0.305s][trace][os,container] Memory Usage is: 42979328
        >      > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
        >      > OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
        >      > 1030762496
        >      > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
        >      >
        >      > <skipped>
        >      > java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr
        >      >
        >      > 	at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
        >      > 	at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
        >      > 	at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
        >      > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        >      > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        >      > 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        >      > 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
        >      > 	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
        >      > 	at java.base/java.lang.Thread.run(Thread.java:832)
        >      >
        >      > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
        >      >
        >      > [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
        >      > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
        >      > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
        >      > [4] https://bugs.openjdk.java.net/browse/JDK-8235522
        >      >
        >      > Thank you,
        >      > Daniil
        >      >
        >      > On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
        >      >
        >      >
        >      >
        >      >      On 12/6/19 5:59 AM, Bob Vandette wrote:
        >      >      >> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes at oracle.com>  wrote:
        >      >      >>
        >      >      >>
        >      >      >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
        >      >      >>
        >      >      >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
        >      >
        >      >      I thought that the error case we are referring to is limit == 0 which
        >      >      indicates something unexpected goes wrong.  So the compatibility concern
        >      >      should be low.  This is very specific to Metrics implementation for
        >      >      cgroup v1 and let me know if I'm wrong.
        >      >
        >      >      >> Surely there must always be some information available from the operating environment? I see from the impl file:
        >      >      >>
        >      >      >>     // the host data, value 0 indicates that something went wrong while the metric was read and
        >      >      >>    // in this case we return "information unavailable" code -1.
        >      >      >>
        >      >      >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
        >      >      > I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
        >      >      > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
        >      >      > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
        >      >      > limits.
        >      >      >
        >      >
        >      >      It's important to consider carefully if the monitoring API indicates an
        >      >      error vs unavailable and an application should continue to run when the
        >      >      monitoring system fails to get the metrics.
        >      >
        >      >      There are several choices to report "something goes wrong" scenarios
        >      >      (should unlikely happen???):
        >      >      1. fall back to a random positive value  (e.g. host value)
        >      >      2. return a negative value
        >      >      3. throw an exception
        >      >
        >      >      #3 is not an option as the application is not expecting this.  For #2,
        >      >      the application can filter bad values if desirable.
        >      >
        >      >      I'm okay if you want to file a JBS issue to follow up and thoroughly
        >      >      look at the cases that the metrics are unavailable and the cases when
        >      >      fails to obtain.
        >      >
        >      >      >> ---
        >      >      >>
        >      >      >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
        >      >      >>
        >      >      >> System.out.println(String.format(...)
        >      >      >>
        >      >      >> Why not simply
        >      >      >>
        >      >      >> System.out.printf(..)
        >      >      >>
        >      >      >> ?
        >      >
        >      >      or simply (as I commented [1])
        >      >           System.out.format
        >      >
        >      >      Mandy
        >      >      [1]
        >      >      https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
        >      >
        >      >
        >      >
        >      >
        >      
        >      
        >
        >
        
        
From serguei.spitsyn at oracle.com  Wed Dec 11 23:51:12 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 11 Dec 2019 15:51:12 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <D124C665-4626-48DD-8FF3-3D40642B868F@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
 <072d6861-1374-8190-135d-e30ece2ee380@oracle.com>
 <7720D36B-0505-49E3-8424-76ACEACAF0AB@oracle.com>
 <3172f6e2-5330-91eb-35b4-03bec407ee5b@oracle.com>
 <D124C665-4626-48DD-8FF3-3D40642B868F@oracle.com>
Message-ID: <eb13ef1e-b208-9b32-6d5e-afd93c4e784c@oracle.com>

Hi Daniil,

Got it, thanks!
Serguei


On 12/11/19 15:33, Daniil Titov wrote:
> Hi Serguei,
>
> Thank you for reviewing this change.
>
> Just wanted to add that the only "volatile" metrics are "usage" ones ( memoryUsage and
> memoryAndSwapLimit). The  "limit" metrics (memoryLimit and memoryAndSwapLimit) are set
> when the container starts and are not subject to change. The only method that reads more than one
> "volatile" metric is getFreeSwapSpaceSize(), and it has code that retries if the calculated swapUsage
> is negative as a result of non-atomic reads.
>
>
> Thank you,
> Daniil
>
> On 12/11/19, 3:13 PM, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com> wrote:
>
>      Hi Daniil,
>      
>      One of my concerns was a non-atomic read of multiple metrics before comparison.
>      It creates a potential to get a mismatch in result.
>      However, the probability to get a negative value is pretty low, I think.
>      The other concern (if incorrect metrics are returned) is covered by
>      JDK-8235522.
>      Revising all concerns in JDK-8235522 sounds good to me.
>      
>      Thanks,
>      Serguei
>      
>      On 12/10/19 10:29, Daniil Titov wrote:
>      > Hi Serguei,
>      >
>      >>        Do we need to check if the (limit - memLimit) value is negative?
>      >>        The same question is for getFreeSwapSpaceSize():
>      >>          memSwapLimit - memLimit - (memSwapUsage - memUsage)
>      >>
>      >>        and getFreeMemorySize():
>      >>         101 return limit - usage;
>      > I don't think we need such a check here. If it happens, it means a serious system malfunction, and the negative value this method
>      > returns would indicate this (currently the native methods already return -1 if something goes wrong). But we could revise it in the
>      > follow-up issue I created for that [1].
>      >
>      > [1] https://bugs.openjdk.java.net/browse/JDK-8235522
>      >
>      > Thank you,
>      > Daniil
>      >
>      > On 12/9/19, 6:02 PM, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com> wrote:
>      >
>      >      Hi Daniil,
>      >
>      >      It is not a full review, just some minor comments.
>      >      In fact, I do not see real problems yet.
>      >
>      >      http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html
>      >
>      >         55     public long getTotalSwapSpaceSize() {
>      >         56         if (containerMetrics != null) {
>      >         57             long limit = containerMetrics.getMemoryAndSwapLimit();
>      >         58             // The memory limit metrics is not available if JVM
>      >      runs on Linux host ( not in a docker container)
>      >         59             // or if a docker container was started without
>      >      specifying a memory limit ( without '--memory='
>      >         60             // Docker option). In latter case there is no limit on
>      >      how much memory the container can use and
>      >         61             // it can use as much memory as the host's OS allows.
>      >         62             long memLimit = containerMetrics.getMemoryLimit();
>      >         63             if (limit >= 0 && memLimit >= 0) {
>      >         64                 return limit - memLimit;
>      >         65             }
>      >         66         }
>      >         67         return getTotalSwapSpaceSize0();
>      >         68     }
>      >
>      >         Unneeded space after brackets '('.
>      >         Do we need to check if the (limit - memLimit) value is negative?
>      >         The same question is for getFreeSwapSpaceSize():
>      >           memSwapLimit - memLimit - (memSwapUsage - memUsage)
>      >
>      >         and getFreeMemorySize():
>      >           101 return limit - usage;
>      >
>      >         81                         // If this happens just retry the loop for
>      >      a few iterations
>      >
>      >         Dot is missed at the end of comment.
>      >
>      >
>      >      http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html
>      >
>      >         34 System.out.println(String.format("Runtime.availableProcessors:
>      >      %d", Runtime.getRuntime().availableProcessors()));
>      >         35
>      >      System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors:
>      >      %d", osBean.getAvailableProcessors()));
>      >         36
>      >      System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize:
>      >      %d", osBean.getTotalMemorySize()));
>      >         37
>      >      System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize:
>      >      %d", osBean.getTotalPhysicalMemorySize()));
>      >         38
>      >      System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize:
>      >      %d", osBean.getFreeMemorySize()));
>      >         39
>      >      System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize:
>      >      %d", osBean.getFreePhysicalMemorySize()));
>      >         40
>      >      System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize:
>      >      %d", osBean.getTotalSwapSpaceSize()));
>      >         41
>      >      System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize:
>      >      %d", osBean.getFreeSwapSpaceSize()));
>      >         42
>      >      System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f",
>      >      osBean.getCpuLoad()));
>      >         43
>      >      System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad:
>      >      %f", osBean.getSystemCpuLoad()));
>      >
>      >
>      >         To make the above lines a little bit shorter I'd suggest to define a
>      >      log() method like this:
>      >            private static void log(String msg) { System.out.println(msg); }
>      >
>      >         34         log(String.format("Runtime.availableProcessors: %d",
>      >      Runtime.getRuntime().availableProcessors()));
>      >         35 log(String.format("OperatingSystemMXBean.getAvailableProcessors:
>      >      %d", osBean.getAvailableProcessors()));
>      >         36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d",
>      >      osBean.getTotalMemorySize()));
>      >         37
>      >      log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize:
>      >      %d", osBean.getTotalPhysicalMemorySize()));
>      >         38 log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d",
>      >      osBean.getFreeMemorySize()));
>      >         39
>      >      log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d",
>      >      osBean.getFreePhysicalMemorySize()));
>      >         40 log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize:
>      >      %d", osBean.getTotalSwapSpaceSize()));
>      >         41 log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize:
>      >      %d", osBean.getFreeSwapSpaceSize()));
>      >         42         log(String.format("OperatingSystemMXBean.getCpuLoad: %f",
>      >      osBean.getCpuLoad()));
>      >         43 log(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f",
>      >      osBean.getSystemCpuLoad()));
>      >
>      >
>      >      Thanks,
>      >      Serguei
>      >
>      >
>      >
>      >      On 12/6/19 17:41, Daniil Titov wrote:
>      >      > Hi David, Mandy, and Bob,
>      >      >
>      >      > Thank you for reviewing this fix.
>      >      >
>      >      > Please review a new version of the fix [1] that includes the following changes comparing to the previous version of the webrev ( webrev.04)
>      >      > 1. The changes in Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded.
>      >      > 2.  The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
>      >      >       was also reverted to webrev.03 version that return host's values if the metrics are unavailable or cannot be properly read.
>      >      >       I would like to mention that  currently the native implementation of these methods de-facto may return -1 at some circumstances,
>      >      >       but I agree that the changes proposed in the previous version of the webrev increase such probability.
>      >      >       I filed the follow-up issue [4] as Mandy suggested.
>      >      > 3.  The legacy methods were renamed as David suggested.
>      >      >
>      >      >
>      >      >> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
>      >      >> !     static int initialized=1;
>      >      >>
>      >      >>   Am I reading this right that the code currently fails to actually do the
>      >      >> initialization because of this ???
>      >      > Yes, currently the code fails to do the initialization but it was unnoticed since method
>      >      > get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which"
>      >      > was always -1.
>      >      >
>      >      >>   test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>      >      >>
>      >      >> System.out.println(String.format(...)
>      >      >>
>      >      >> Why not simply
>      >      >>
>      >      >> System.out.printf(..)
>      >      > As I tried to explain earlier, it would make the tests unstable.
>      >      > System.out.printf(...) just delegates the call to System.out.format(...), which doesn't emit the string atomically.
>      >      > Instead it parses the format string into a list of FormatString objects and then iterates over the list.
>      >      > As a result, other trace output occasionally gets printed between these iterations and breaks the pattern the test expects to find
>      >      > in the output.
>      >      >
>      >      > For example, here is a sample of such output, with a trace message printed between "OperatingSystemMXBean.getFreePhysicalMemorySize:"
>      >      > and "1030762496".
>      >      >
>      >      > <skipped>
>      >      > [0.304s][trace][os,container] Memory Usage is: 42983424
>      >      > OperatingSystemMXBean.getFreeMemorySize: 1030758400
>      >      > [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
>      >      > [0.305s][trace][os,container] Memory Usage is: 42979328
>      >      > [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
>      >      > OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
>      >      > 1030762496
>      >      > OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
>      >      >
>      >      > <skipped>
>      >      > java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr
>      >      >
>      >      > 	at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
>      >      > 	at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
>      >      > 	at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
>      >      > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>      >      > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>      >      > 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>      >      > 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>      >      > 	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
>      >      > 	at java.base/java.lang.Thread.run(Thread.java:832)
>      >      >
>      >      > Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
>      >      >
>      >      > [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
>      >      > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
>      >      > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
>      >      > [4] https://bugs.openjdk.java.net/browse/JDK-8235522
>      >      >
>      >      > Thank you,
>      >      > Daniil
>      >      >
>      >      > On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
>      >      >
>      >      >
>      >      >
>      >      >      On 12/6/19 5:59 AM, Bob Vandette wrote:
>      >      >      >> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes at oracle.com>  wrote:
>      >      >      >>
>      >      >      >>
>      >      >      >> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
>      >      >      >>
>      >      >      >> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (assumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
>      >      >
>      >      >      I thought that the error case we are referring to is limit == 0 which
>      >      >      indicates something unexpected goes wrong.  So the compatibility concern
>      >      >      should be low.  This is very specific to Metrics implementation for
>      >      >      cgroup v1 and let me know if I'm wrong.
>      >      >
>      >      >      >> Surely there must always be some information available from the operating environment? I see from the impl file:
>      >      >      >>
>      >      >      >>     // the host data, value 0 indicates that something went wrong while the metric was read and
>      >      >      >>    // in this case we return "information unavailable" code -1.
>      >      >      >>
>      >      >      >> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
>      >      >      > I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
>      >      >      > Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
>      >      >      > are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
>      >      >      > limits.
>      >      >      >
>      >      >
>      >      >      It's important to consider carefully if the monitoring API indicates an
>      >      >      error vs unavailable and an application should continue to run when the
>      >      >      monitoring system fails to get the metrics.
>      >      >
>      >      >      There are several choices to report "something goes wrong" scenarios
>      >      >      (should unlikely happen???):
>      >      >      1. fall back to a random positive value  (e.g. host value)
>      >      >      2. return a negative value
>      >      >      3. throw an exception
>      >      >
>      >      >      #3 is not an option as the application is not expecting this.  For #2,
>      >      >      the application can filter bad values if desirable.
>      >      >
>      >      >      I'm okay if you want to file a JBS issue to follow up and thoroughly
>      >      >      look at the cases that the metrics are unavailable and the cases when
>      >      >      fails to obtain.
>      >      >
>      >      >      >> ---
>      >      >      >>
>      >      >      >> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>      >      >      >>
>      >      >      >> System.out.println(String.format(...)
>      >      >      >>
>      >      >      >> Why not simply
>      >      >      >>
>      >      >      >> System.out.printf(..)
>      >      >      >>
>      >      >      >> ?
>      >      >
>      >      >      or simply (as I commented [1])
>      >      >           System.out.format
>      >      >
>      >      >      Mandy
>      >      >      [1]
>      >      >      https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
>      >      >
>      >      >
>      >      >
>      >      >
>      >
>      >
>      >
>      >
>      
>      
>
>
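
Two points from the review thread quoted above can be sketched in code: Serguei's question about whether (limit - memLimit) can be negative, and the reason for building each output line with String.format before printing it. This is a minimal sketch, not the code in the webrev. The class name, the method names, and the choice to fall back to the host value when the difference is negative are all assumptions made for illustration.

```java
import java.io.PrintStream;

// Illustrative sketch only; not the code from the webrev under review.
final class SwapAndLogSketch {

    // Hypothetical helper: swap size derived from the container limits.
    // Falls back to the host value when either limit is unavailable
    // (negative) or when the difference would be negative (assumption).
    static long totalSwapSpaceSize(long memAndSwapLimit, long memLimit, long hostValue) {
        if (memAndSwapLimit >= 0 && memLimit >= 0) {
            long swap = memAndSwapLimit - memLimit;
            if (swap >= 0) {
                return swap;
            }
        }
        return hostValue;
    }

    // Hypothetical helper: format the whole line first, then pass the
    // finished string to the stream in a single println call.
    static void log(PrintStream out, String format, Object... args) {
        out.println(String.format(format, args));
    }

    public static void main(String[] args) {
        // Both limits known and consistent: the difference is used.
        log(System.out, "swap: %d", totalSwapSpaceSize(300L, 200L, 999L));
        // Limit unavailable: the host value is used.
        log(System.out, "swap: %d", totalSwapSpaceSize(-1L, 200L, 999L));
        // Inconsistent limits: the host value is used instead of a negative result.
        log(System.out, "swap: %d", totalSwapSpaceSize(100L, 200L, 999L));
    }
}
```

Whether a negative difference should fall back to the host value or be reported some other way is the kind of question the follow-up issue JDK-8235522 is meant to settle; the fallback here is only one option.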


From suenaga at oss.nttdata.com  Thu Dec 12 01:07:37 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Thu, 12 Dec 2019 10:07:37 +0900
Subject: PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <2c066d67-aa3c-83e2-632a-1ba3114d1538@samersoff.net>
References: <eb284894-0801-d04d-c5c2-68ea3faaef98@oss.nttdata.com>
 <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <de7b6cb5-88c3-dc0c-48d3-a3ecf66eff9f@oss.nttdata.com>
 <2c066d67-aa3c-83e2-632a-1ba3114d1538@samersoff.net>
Message-ID: <635516c4-8f89-c9e6-75c4-9debd8a315be@oss.nttdata.com>

Hi Dmitry,

Thanks for your comment!

On 2019/12/12 0:34, Dmitry Samersoff wrote:
> Hello Yasumasa,
> 
> Please,
> 
> 1. Consider to use mmap for reading elf sections.

Do you mean `read_section_data()`?

   lib->eh_frame.data = read_section_data(lib->fd, &ehdr, sh);

I did not change the implementation of `read_section_data()`.
If you want it changed to use mmap, I think that should be handled as a separate issue.


> 2. Please move all platform-specific parts of native code to a separate
> file/directory. Current patch will break AARCH64 build.

Unfortunately, the makefiles for JDK libraries (shared libraries other than HotSpot) do not seem to take the CPU type into account.

   http://hg.openjdk.java.net/jdk/jdk/file/f22d91b2d072/make/common/JdkNativeCompilation.gmk#l38

I believe my patch does not call any platform-specific functions.
Can you share your concern?


> 3. I didn't find any tests here. How did you test the changes?

TestJhsdbJstackMixed and ClhsdbPstack can check that mixed-mode jstack works without error.

We could add a test that checks whether native frames appear in the result, but I found the same issue on Windows.
So I do not want to add it now.


> libproc_impl.c
> 
> 131: The if is not necessary; free() handles a NULL pointer gracefully.

Thanks, I will fix it.


Yasumasa


> -Dmitry
> 
> 
> On 04.12.19 03:54, Yasumasa Suenaga wrote:
>> PING: Could you review it?
>>
>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>
>> This bug is targeted to JDK 14.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2019/11/28 21:39, Yasumasa Suenaga wrote:
>>> Hi,
>>>
>>> I refactored LinuxAMD64CFrame.java . It works fine in
>>> serviceability/sa tests and
>>> all tests on submit repo
>>> (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>> Could you review new webrev?
>>>
>>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>
>>> The diff from previous webrev is here:
>>>    http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>> Hi all,
>>>>
>>>> Please review this change:
>>>>
>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>
>>>>
>>>> According to 2.7 Stack Unwind Algorithm in System V Application
>>>> Binary Interface AMD64
>>>> Architecture Processor Supplement [1], we need to use DWARF in
>>>> .eh_frame or .debug_frame
>>>> for stack unwinding.
>>>>
>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default since
>>>> GCC 4.6, so system
>>>> library (e.g. libc) might be compiled with this feature.
>>>>
>>>> However, `jhsdb jstack --mixed` does not do so; it uses the base pointer
>>>> register (RBP).
>>>> So some stack frames might be missing.
>>>>
>>>> I guess JDK-8219201 is caused by the same issue.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> [1]
>>>> https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>
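
The RBP-based unwinding described in the original review request above can be illustrated with a toy model. This is a minimal sketch, not the code in LinuxAMD64CFrame.java or in the webrev. The class name, the simulated memory map, and all addresses are made up for illustration. It relies only on the conventional AMD64 frame layout, where [rbp] holds the caller's saved RBP and [rbp + 8] holds the return address.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of frame-pointer unwinding; memory contents are invented.
final class FramePointerWalkSketch {

    // Follows the chain of saved RBP values and collects return addresses.
    // "memory" maps an address to the 8-byte value stored there.
    static List<Long> walk(Map<Long, Long> memory, long rbp, int maxFrames) {
        List<Long> returnAddresses = new ArrayList<>();
        while (rbp != 0 && returnAddresses.size() < maxFrames) {
            Long savedRbp = memory.get(rbp);          // [rbp]     = caller's RBP
            Long returnAddress = memory.get(rbp + 8); // [rbp + 8] = return address
            if (savedRbp == null || returnAddress == null) {
                break; // RBP did not point at a frame record: the chain is lost
            }
            returnAddresses.add(returnAddress);
            rbp = savedRbp;
        }
        return returnAddresses;
    }

    public static void main(String[] args) {
        Map<Long, Long> memory = new HashMap<>();
        // Two frames built with frame pointers.
        memory.put(0x1000L, 0x2000L);
        memory.put(0x1008L, 0x401000L);
        memory.put(0x2000L, 0L);
        memory.put(0x2008L, 0x402000L);

        // RBP points at a real frame record: both frames are found.
        System.out.println(walk(memory, 0x1000L, 16));

        // Code built with -fomit-frame-pointer may use RBP as a general
        // register, so it can hold an arbitrary value: no frames are found.
        System.out.println(walk(memory, 0x1234L, 16));
    }
}
```

The second call shows the failure mode: once RBP no longer points at a frame record, the walk stops early, which is why the unwind information in .eh_frame or .debug_frame is needed.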

From suenaga at oss.nttdata.com  Thu Dec 12 01:27:07 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Thu, 12 Dec 2019 10:27:07 +0900
Subject: Removal of SA javascript support
In-Reply-To: <0944c2b1-a0b4-2631-24de-3789e6aeec7b@oracle.com>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
 <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
 <CA+cQ+tRjiThKZJHnjDSDnM2ZvXyjjLpXBgc2_yQxDs5UWsBtsA@mail.gmail.com>
 <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com>
 <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com>
 <a62d8f4c-a1b0-6b15-646f-3ba2c77db5a4@oss.nttdata.com>
 <0944c2b1-a0b4-2631-24de-3789e6aeec7b@oracle.com>
Message-ID: <1e497ba4-7897-8b23-a53a-255fa3ee0aea@oss.nttdata.com>

I discussed this with Kris at the OpenJDK Committers' Workshop last year.
In the case of .NET Core, SOS is provided to integrate runtime debugging features into the native debugger.
If the same kind of feature were provided, I would be very happy!

For example, GDB and WinDbg provide remote debug servers.
If (CL)HSDB could connect to a native debugger, we could gather data in the Java layer via it.
I think we could delegate most memory access to the native debugger.
We could also run custom scripts on GDB. Then I think we would need only minimal support for Java call frames, OOPs, and the SymbolTable.


Yasumasa


On 2019/12/11 21:47, sundararajan.athijegannathan at oracle.com wrote:
> Effectively you're asking for SA as API. I don't think that is a good idea. That implies supporting hotspot data structures as Java *API*. That will be maintainability nightmare - we've to keep tracking hotspot data structures in SA code. That itself is problematic. API would be next level nightmare.
> 
> -Sundar
> 
> On 11/12/19 11:57 am, Yasumasa Suenaga wrote:
>> Hi,
>>
>> IMHO we need to export all packages in SA if we do not provide new API for SA.
>> sa.js in jdk.hotspot.agent could access all SA classes until JDK 8 (before Jigsaw), so we could make various functions if we need.
>>
>> OTOH we cannot know what classes are needed by the SA users. All packages in jdk.hotspot.agent module provides features, and they require other packages. For example, sun.jvm.hotspot.oops.Oop requires sun.jvm.hotspot.types, and it requires sun.jvm.hotspot.debugger .
>> It is difficult to track and to export minimally.
>> (I worked for it in JDK-8157947, but I gave up...)
>>
>> Thus I guess it is a big challenge to export SA classes without refactoring.
>> If we provide new API for SA plugin, I guess we need to work some refactoring.
>>
>>
>> Yasumasa
>>
>>
>> On 2019/12/11 15:00, Chris Plummer wrote:
>>> On 12/10/19 9:56 PM, Yasumasa Suenaga wrote:
>>>> On 2019/12/11 14:39, Krystal Mok wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> That's a very nice idea. Basically what you're asking for is exposing the Command interface [1] so that plugins can implement it and get dynamically loaded / registered into CLHSDB / HSDB, right?
>>>>
>>>> Yes, but we also need proxy API to access internal SA objects e.g. CodeCache, JavaThread, TypeDataBase, etc...
>>>>
>>> Yes, or export them. I should have read this email before posting my previous one.
>>>
>>> Chris
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>>> [1]: http://hg.openjdk.java.net/jdk/jdk/file/c71ec1f09f21/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/CommandProcessor.java#l246
>>>>>
>>>>> - Kris
>>>>>
>>>>> On Tue, Dec 10, 2019 at 9:33 PM Yasumasa Suenaga <suenaga at oss.nttdata.com <mailto:suenaga at oss.nttdata.com>> wrote:
>>>>>
>>>>>     Hi Chris,
>>>>>
>>>>>     It's a sad proposal, but I agree with you. Maintaining SA in JS has been difficult since Jigsaw.
>>>>>     However, I want SA to provide a pluggable feature.
>>>>>     I use a custom script to list compiled code in the CodeCache.
>>>>>
>>>>>     I guess other troubleshooters will also want a similar feature (via jsload) in the future if they encounter a JVM crash.
>>>>>
>>>>>
>>>>>     Thanks,
>>>>>
>>>>>     Yasumasa
>>>>>
>>>>>
>>>>>     On 2019/12/11 11:52, Chris Plummer wrote:
>>>>>      > Hi,
>>>>>      >
>>>>>      > I'd like to propose the removal of SA javascript support. Few people even realize this support exists, and hopefully even fewer are using it since I'd like to remove it. Since I'm new to this myself, let me first explain what I know about its existence, and then explain why I want to remove it.
>>>>>      >
>>>>>      > If you run "jhsdb clhsdb", there are jsload and jseval commands. Don't look for them in anything post JDK 8. I'll explain why later. jsload is used to load a javascript file. In that file you can register new clhsdb commands that are written in javascript. You can also evaluate javascript using the jseval command. Some of this is explained in [1], which is the only place I can find any reference to this support. It does not appear to be officially supported, nor is there any Oracle-provided documentation.
>>>>>      >
>>>>>      > There also appear to be a few clhsdb commands that are written in javascript. Doing a grep for "registerCommand" in sa.js shows the following:
>>>>>      >
>>>>>      >   registerCommand("class", "class name", "jclass");
>>>>>      >   registerCommand("classes", "classes", "jclasses");
>>>>>      >   registerCommand("dumpclass", "dumpclass { address | name } [ directory ]", "dclass");
>>>>>      >   registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap");
>>>>>      >   registerCommand("mem", "mem address [ length ]", "printMem");
>>>>>      >   registerCommand("sysprops", "sysprops", "sysProps");
>>>>>      >   registerCommand("whatis", "whatis address", "printWhatis");
>>>>>      >
>>>>>      > Once again, don't go looking for these in anything newer than JDK8. You won't find them. Again the only documentation I can find is [1].
>>>>>      >
>>>>>      > The other use of Javascript is the SOQL command (Simple Object Query Language), a tool used to query the heap, and also the JSDB command. The only SOQL documentation I could find is the blog reference [2]. I could not find HSDB documentation, but I believe it is javascript support for looking at hotspot. So once again, neither of these seem to be officially supported or documented.
>>>>>      >
>>>>>      > The real purpose of the email is to propose removal of this support. Here are the reasons:
>>>>>      >
>>>>>      > (1) It's broken, and has been since 9. See [3]. This is why you don't see the javascript related commands in clhsdb. Javascript fails to initialize, so none of the javascript related commands are registered.
>>>>>      > (2) Nashorn is deprecated and will be removed eventually.
>>>>>      > (3) We have very little understanding of the javascript support.
>>>>>      > (4) No resources to work on it (unless there is a community volunteer).
>>>>>      > (5) Very questionable value (lack of users). The fact this support has been broken since JDK 9 and no bug was filed until I did so this week is a good indication of that. Another is that there are no other SA Javascript related bugs filed. Lastly, the lack of any official documentation and only minimal mention of it on the web is another good indication of its (lack of) value.
>>>>>      >
>>>>>      > Also, regarding the 7 commands listed above that would be lost (but currently don't work now anyway), if they are really wanted, they could be implemented in java instead of javascript.
>>>>>      >
>>>>>      > I'd like to remove javascript support in two steps. The first is simply to disable the clhsdb code that tries to initialize the javascript support. I'd like to do this in 14 (actually as soon as possible). I'd like to actually do this now even if we decide to keep javascript support and eventually fix it because it will get rid of the warning you see whenever you attach from clhsdb:
>>>>>      >
>>>>>      >       Warning! JS Engine can't start, some commands will not be available.
>>>>>      >
>>>>>      > This warning will become more of an issue for the clhsdb tests after I push [4] because then you will also see the full stacktrace for the underlying exception that caused the Javascript to fail to start. Besides being unnecessary noise in passing test cases, it can also be misleading in any test that fails because the exception will be unrelated to the failure. This is actually what got me going down this path of what the javascript support is all about.
>>>>>      >
>>>>>      > The next step would be to strip out all Javascript related code, including the SOQL and JSDB tools. This would be done in 15.
>>>>>      >
>>>>>      > Please let me know what you think.
>>>>>      >
>>>>>      > thanks,
>>>>>      >
>>>>>      > Chris
>>>>>      >
>>>>>      > [1] https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
>>>>>      > [2] http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
>>>>>      > [3] https://bugs.openjdk.java.net/browse/JDK-8235594
>>>>>      > [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>>>>>      >
>>>>>
>>>
>>>
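
The pluggable-command idea discussed in the thread above (plugins implement a command interface and are registered with CLHSDB at run time) could look roughly like the following. This is a minimal sketch under stated assumptions: none of these types exist in jdk.hotspot.agent, and the names CommandRegistrySketch, PluginCommand, register and execute are hypothetical. It does not show the proxy API for SA internals such as CodeCache or JavaThread, which is the harder part raised in the discussion.

```java
import java.io.PrintStream;
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical plugin registry; not part of the real SA code base.
final class CommandRegistrySketch {

    // Hypothetical contract a plugin command would implement.
    interface PluginCommand {
        String usage();
        void run(String[] args, PrintStream out);
    }

    private final Map<String, PluginCommand> commands = new TreeMap<>();

    // Registers a command under a unique name.
    void register(String name, PluginCommand command) {
        if (commands.putIfAbsent(name, command) != null) {
            throw new IllegalArgumentException("duplicate command: " + name);
        }
    }

    // Splits the line on whitespace, looks up the first token and
    // passes the remaining tokens to the command as arguments.
    boolean execute(String line, PrintStream out) {
        String[] tokens = line.trim().split("\\s+");
        PluginCommand command = commands.get(tokens[0]);
        if (command == null) {
            out.println("Unrecognized command: " + tokens[0]);
            return false;
        }
        command.run(Arrays.copyOfRange(tokens, 1, tokens.length), out);
        return true;
    }

    public static void main(String[] args) {
        CommandRegistrySketch registry = new CommandRegistrySketch();
        registry.register("echo", new PluginCommand() {
            @Override public String usage() { return "echo text ..."; }
            @Override public void run(String[] a, PrintStream out) {
                out.println(String.join(" ", a));
            }
        });
        registry.execute("echo hello from a plugin", System.out);
        registry.execute("whatis 0x1234", System.out);
    }
}
```

A real design would also need a discovery mechanism (for example java.util.ServiceLoader) and a decision about which SA packages to export, which is the maintainability concern Sundar raises.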

From fairoz.matte at oracle.com  Thu Dec 12 03:10:56 2019
From: fairoz.matte at oracle.com (Fairoz Matte)
Date: Wed, 11 Dec 2019 19:10:56 -0800 (PST)
Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink
 is enabled
Message-ID: <de7f162a-34ed-4b8a-a7da-13aed83ecf8b@default>

Hi,

Please review this small change,
Updating error handling, to make sure "lib_base_diff = 0" is still a valid scenario even after calc_prelinked_load_address() call. 

JBS - https://bugs.openjdk.java.net/browse/JDK-8235637
Webrev - http://cr.openjdk.java.net/~fmatte/8235637/webrev.00/

This patch is provided by Yasumasa Suenaga

Thanks,
Fairoz

From daniil.x.titov at oracle.com  Thu Dec 12 03:25:49 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Wed, 11 Dec 2019 19:25:49 -0800
Subject: RFR: 8226575: OperatingSystemMXBean should be made container aware
In-Reply-To: <EB00EA71-C340-418F-9D9C-9A3EA77EEDBA@oracle.com>
References: <99DD810A-E931-47D9-9968-065385C91752@oracle.com>
 <e84ab494-3271-c226-fcde-d6a73980160f@oracle.com>
 <6BDD7672-2B08-4988-997D-F5942FF3C9E6@oracle.com>
 <427C2F30-9D91-4DFF-B3C2-F81DF697B881@oracle.com>
 <b731bc1e-842b-bae9-a801-69b1b55e0e52@oracle.com>
 <71B27770-C551-415B-8A26-1C6C38E7819B@oracle.com>
 <E464A7E7-541C-4932-AF28-6DEDB14C2A4F@oracle.com>
 <e627ad92-9ba8-66af-72e4-ef2071f2b7c4@oracle.com>
 <6CEB6280-69C7-4524-BF24-17CDB8364706@oracle.com>
 <cbd5e520-783c-b01d-4837-2d6fdb6bbab7@oracle.com>
 <E6D51CF8-84F4-426B-AA99-2753E780E8E5@oracle.com>
 <4e4f0bef-c222-22c9-adef-a58a0d43f116@oracle.com>
 <FC74225E-09B0-4CFC-A00D-5EBBB0C4FCC0@oracle.com>
 <072d6861-1374-8190-135d-e30ece2ee380@oracle.com>
 <2A286F91-6CB8-4A32-B103-2C28D52E5A1A@oracle.com>
 <EB00EA71-C340-418F-9D9C-9A3EA77EEDBA@oracle.com>
Message-ID: <0569FA2A-AEB3-41E1-8306-11290159B030@oracle.com>

Hi Bob, David, Mandy, and Serguei,

Thank you for reviewing this change!

Best regards,
Daniil

On 12/11/19, 9:21 AM, "Bob Vandette" <bob.vandette at oracle.com> wrote:

    Yes, I defer to Mandy on the best way to express the various Java exceptions.
    I'm ok with the changes.
    
    Thanks for getting this done for JDK14!
    
    Bob.
    
    > On Dec 11, 2019, at 12:12 PM, Daniil Titov <daniil.x.titov at oracle.com> wrote:
    > 
    > Hi Serguei,
    > 
    > Thank you for your comments. I will correct these nits before pushing the changes.
    > 
    > Hi Bob and David,
    > 
    >> [Mandy Chung]
    >>> I reviewed Metrics and Subsystem in this version.
    >>> I don't need to see a new webrev.
    > 
    > As I understand it, Mandy has finished reviewing this fix. I just wanted to confirm that you are okay with that version of the fix (webrev.06).
    > 
    > Mach5 testing: tier1-tier6 and open/test/hotspot/jtreg/containers/docker tests passed. 
    > 
    > Thank you,
    > Daniil
    > 
    > 
    > 
    > On 12/9/19, 6:02 PM, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com> wrote:
    > 
    >    Hi Daniil,
    > 
    >    It is not a full review, just some minor comments.
    >    In fact, I do not see real problems yet.
    > 
    >    http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html
    > 
    >       55     public long getTotalSwapSpaceSize() {
    >       56         if (containerMetrics != null) {
    >       57             long limit = containerMetrics.getMemoryAndSwapLimit();
    >       58             // The memory limit metrics is not available if JVM 
    >    runs on Linux host ( not in a docker container)
    >       59             // or if a docker container was started without 
    >    specifying a memory limit ( without '--memory='
    >       60             // Docker option). In latter case there is no limit on 
    >    how much memory the container can use and
    >       61             // it can use as much memory as the host's OS allows.
    >       62             long memLimit = containerMetrics.getMemoryLimit();
    >       63             if (limit >= 0 && memLimit >= 0) {
    >       64                 return limit - memLimit;
    >       65             }
    >       66         }
    >       67         return getTotalSwapSpaceSize0();
    >       68     }
    > 
    >       Unneeded space after brackets '('.
    >       Do we need to check if the (limit - memLimit) value is negative?
    >       The same question is for getFreeSwapSpaceSize():
    >         memSwapLimit - memLimit - (memSwapUsage - memUsage)
    > 
    >       and getFreeMemorySize():
    >         101 return limit - usage;
    > 
    >       81                         // If this happens just retry the loop for 
    >    a few iterations
    > 
    >       Dot is missed at the end of comment.
    > 
    > 
    >    http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html
    > 
    >       34 System.out.println(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors()));
    >       35 System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors()));
    >       36 System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize()));
    >       37 System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize()));
    >       38 System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize()));
    >       39 System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize()));
    >       40 System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize()));
    >       41 System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize()));
    >       42 System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad()));
    >       43 System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad()));
    > 
    > 
    >       To make the above lines a little shorter, I'd suggest defining a 
    >    log() method like this:
    >          private static void log(String msg) { System.out.println(msg); }
    > 
    >       34 log(String.format("Runtime.availableProcessors: %d", Runtime.getRuntime().availableProcessors()));
    >       35 log(String.format("OperatingSystemMXBean.getAvailableProcessors: %d", osBean.getAvailableProcessors()));
    >       36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", osBean.getTotalMemorySize()));
    >       37 log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: %d", osBean.getTotalPhysicalMemorySize()));
    >       38 log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", osBean.getFreeMemorySize()));
    >       39 log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", osBean.getFreePhysicalMemorySize()));
    >       40 log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: %d", osBean.getTotalSwapSpaceSize()));
    >       41 log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: %d", osBean.getFreeSwapSpaceSize()));
    >       42 log(String.format("OperatingSystemMXBean.getCpuLoad: %f", osBean.getCpuLoad()));
    >       43 log(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", osBean.getSystemCpuLoad()));
    > 
    > 
    >    Thanks,
    >    Serguei
    > 
    > 
    > 
    >    On 12/6/19 17:41, Daniil Titov wrote:
    >> Hi David, Mandy, and Bob,
    >> 
    >> Thank you for reviewing this fix.
    >> 
    >> Please review a new version of the fix [1] that includes the following changes compared to the previous version of the webrev (webrev.04):
    >> 1. The changes in Javadoc made in webrev.04 compared to webrev.03 and to the CSR [3] were discarded.
    >> 2.  The implementation of methods getFreeMemorySize, getTotalMemorySize, getFreeSwapSpaceSize and getTotalSwapSpaceSize
    >>      was also reverted to the webrev.03 version, which returns the host's values if the metrics are unavailable or cannot be properly read.
    >>      I would like to mention that currently the native implementation of these methods may de facto return -1 in some circumstances,
    >>      but I agree that the changes proposed in the previous version of the webrev increase that probability.
    >>      I filed the follow-up issue [4] as Mandy suggested.
    >> 3.  The legacy methods were renamed as David suggested.
    >> 
    >> 
    >>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
    >>> !     static int initialized=1;
    >>> 
    >>>  Am I reading this right that the code currently fails to actually do the
    >>> initialization because of this ???
    >> Yes, currently the code fails to do the initialization but it was unnoticed since method
    >> get_cpuload_internal(...) was never called for a specific CPU, the first parameter "which"
    >> was always -1.
    >> 
    >>>  test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
    >>> 
    >>> System.out.println(String.format(...)
    >>> 
    >>> Why not simply
    >>> 
    >>> System.out.printf(..)
    >> As I tried to explain earlier, it would make the tests unstable.
    >> System.out.printf(...) just delegates the call to System.out.format(...), which doesn't emit the string atomically.
    >> Instead, it parses the format string into a list of FormatString objects and then iterates over the list.
    >> As a result, other traces occasionally get printed between these iterations and break the pattern the test expects to find
    >> in the output.
    >> 
    >> For example, here is a sample of such output, with a trace message printed between "OperatingSystemMXBean.getFreePhysicalMemorySize:"
    >> and "1030762496".
    >> 
    >> <skipped>
    >> [0.304s][trace][os,container] Memory Usage is: 42983424
    >> OperatingSystemMXBean.getFreeMemorySize: 1030758400
    >> [0.305s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
    >> [0.305s][trace][os,container] Memory Usage is: 42979328
    >> [0.306s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
    >> OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory Usage is: 42975232
    >> 1030762496
    >> OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
    >> 
    >> <skipped>
    >> java.lang.RuntimeException: 'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from stdout/stderr
    >> 
    >> 	at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
    >> 	at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
    >> 	at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
    >> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    >> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    >> 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    >> 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
    >> 	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
    >> 	at java.base/java.lang.Thread.run(Thread.java:832)
    >> 
    >> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.
    >> 
    >> [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
    >> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
    >> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
    >> [4] https://bugs.openjdk.java.net/browse/JDK-8235522
    >> 
    >> Thank you,
    >> Daniil
    >> 
    >> On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
    >> 
    >> 
    >> 
    >>     On 12/6/19 5:59 AM, Bob Vandette wrote:
    >>>> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes at oracle.com>  wrote:
    >>>> 
    >>>> 
    >>>> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
    >>>> 
    >>>> The changes to allow for a return of -1 are somewhat more extensive than we have previously discussed. These methods previously were (per the spec) guaranteed to return some (presumably) meaningful value but now they are effectively allowed to fail by returning -1. No existing code is expecting to have to handle a return of -1 so I see this as a significant compatibility issue.
    >> 
    >>     I thought that the error case we are referring to is limit == 0 which
    >>     indicates something unexpected goes wrong.  So the compatibility concern
    >>     should be low.  This is very specific to Metrics implementation for
    >>     cgroup v1 and let me know if I'm wrong.
    >> 
    >>>> Surely there must always be some information available from the operating environment? I see from the impl file:
    >>>> 
    >>>>    // the host data, value 0 indicates that something went wrong while the metric was read and
    >>>>   // in this case we return "information unavailable" code -1.
    >>>> 
    >>>> I don't agree with this. If the container metrics are messed up somehow we should either fallback to the host value or else abort with some kind of exception. Returning -1 is not an option here IMO.
    >>> I agree with David on the compatibility concern.  I originally thought that -1 was already a specified return for all of these methods.
    >>> Since the 0 return failure from the Metrics API should only occur if one of the cgroup subsystems is not enabled while others
    >>> are, I'd suggest we keep Daniil's original logic to fall back to the host value since a disabled subsystem is equivalent to no
    >>> limits.
    >>> 
    >> 
    >>     It's important to consider carefully if the monitoring API indicates an
    >>     error vs unavailable and an application should continue to run when the
    >>     monitoring system fails to get the metrics.
    >> 
    >>     There are several choices to report "something goes wrong" scenarios
    >>     (should unlikely happen???):
    >>     1. fall back to a random positive value  (e.g. host value)
    >>     2. return a negative value
    >>     3. throw an exception
    >> 
    >>     #3 is not an option as the application is not expecting this.  For #2,
    >>     the application can filter bad values if desirable.
    >> 
    >>     I'm okay if you want to file a JBS issue to follow up and thoroughly
    >>     look at the cases where the metrics are unavailable and the cases where
    >>     obtaining them fails.
    >> 
    >>>> ---
    >>>> 
    >>>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
    >>>> 
    >>>> System.out.println(String.format(...)
    >>>> 
    >>>> Why not simply
    >>>> 
    >>>> System.out.printf(..)
    >>>> 
    >>>> ?
    >> 
    >>     or simply (as I commented [1])
    >>          System.out.format
    >> 
    >>     Mandy
    >>     [1]
    >>     https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
    >> 
    >> 
    >> 
    >> 
    > 
    > 
    > 
    > 
    
    
From suenaga at oss.nttdata.com  Thu Dec 12 03:43:40 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Thu, 12 Dec 2019 12:43:40 +0900
Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if
 prelink is enabled
In-Reply-To: <de7f162a-34ed-4b8a-a7da-13aed83ecf8b@default>
References: <de7f162a-34ed-4b8a-a7da-13aed83ecf8b@default>
Message-ID: <2ec88d9c-ddd1-9da1-f919-781ccfa8099f@oss.nttdata.com>

Hi Fairoz,

Looks good!
I want you to backport this change to both jdk11u and 8u.


Thanks,

Yasumasa


On 2019/12/12 12:10, Fairoz Matte wrote:
> Hi,
> 
> Please review this small change,
> Updating error handling, to make sure "lib_base_diff = 0" is still a valid scenario even after calc_prelinked_load_address() call.
> 
> JBS - https://bugs.openjdk.java.net/browse/JDK-8235637
> Webrev - http://cr.openjdk.java.net/~fmatte/8235637/webrev.00/
> 
> This patch is provided by Yasumasa Suenaga
> 
> Thanks,
> Fairoz
> 

From fairoz.matte at oracle.com  Thu Dec 12 08:30:13 2019
From: fairoz.matte at oracle.com (Fairoz Matte)
Date: Thu, 12 Dec 2019 00:30:13 -0800 (PST)
Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if
 prelink is enabled
In-Reply-To: <2ec88d9c-ddd1-9da1-f919-781ccfa8099f@oss.nttdata.com>
References: <de7f162a-34ed-4b8a-a7da-13aed83ecf8b@default>
 <2ec88d9c-ddd1-9da1-f919-781ccfa8099f@oss.nttdata.com>
Message-ID: <e5423010-4d9d-46cb-83f5-ba9b3b82505d@default>

Hi Yasumasa,

Thanks for the review.
Sure, I will get them on 8u and 11u.

Thanks,
Fairoz

> -----Original Message-----
> From: Yasumasa Suenaga <suenaga at oss.nttdata.com>
> Sent: Thursday, December 12, 2019 9:14 AM
> To: Fairoz Matte <fairoz.matte at oracle.com>; serviceability-
> dev at openjdk.java.net
> Subject: Re: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if
> prelink is enabled
> 
> Hi Fairoz,
> 
> Looks good!
> I want you to backport this change to both jdk11u and 8u.
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
> On 2019/12/12 12:10, Fairoz Matte wrote:
> > Hi,
> >
> > Please review this small change,
> > Updating error handling, to make sure "lib_base_diff = 0" is still a valid
> scenario even after calc_prelinked_load_address() call.
> >
> > JBS - https://bugs.openjdk.java.net/browse/JDK-8235637
> > Webrev - http://cr.openjdk.java.net/~fmatte/8235637/webrev.00/
> >
> > This patch is provided by Yasumasa Suenaga
> >
> > Thanks,
> > Fairoz
> >

From christoph.langer at sap.com  Thu Dec 12 10:00:30 2019
From: christoph.langer at sap.com (Langer, Christoph)
Date: Thu, 12 Dec 2019 10:00:30 +0000
Subject: RFR [XS]: 8234968: check calloc rv in libinstrument
 InvocationAdapter
In-Reply-To: <AM6PR02MB50781EF60E251FE2536021FC93460@AM6PR02MB5078.eurprd02.prod.outlook.com>
References: <AM6PR02MB5078DF8D13F8D5C7DC08117893470@AM6PR02MB5078.eurprd02.prod.outlook.com>
 <CAA-vtUziv2i+7YueBLn=Pvd8jfAjUi_RsxFqHWjjFbHwUMBRDg@mail.gmail.com>
 <AM6PR02MB50781EF60E251FE2536021FC93460@AM6PR02MB5078.eurprd02.prod.outlook.com>
Message-ID: <AM6PR02MB55410D34D1D0E92EF886DE808A550@AM6PR02MB5541.eurprd02.prod.outlook.com>

Hi Matthias,

I think your current patch is good as it is - at least it wouldn't make things worse, AFAICS.

Further improvements can probably be done under another issue.

Cheers
Christoph

From: serviceability-dev <serviceability-dev-bounces at openjdk.java.net> On Behalf Of Baesken, Matthias
Sent: Freitag, 29. November 2019 08:18
To: Thomas Stüfe <thomas.stuefe at gmail.com>
Cc: serviceability-dev at openjdk.java.net
Subject: [CAUTION] RE: RFR [XS]: 8234968: check calloc rv in libinstrument InvocationAdapter

Hi Thomas, Christoph, thanks for the comments. Of course the initialization of *decodedLen must be added.
In case decodePath returns NULL, we would have tmp == NULL (in char* tmp = func;), assign tmp to res, and then hit jplis_assert, see:

#define TRANSFORM(res,func) {    \
    char* tmp = func;            \
    if (tmp != res) {            \
        free(res);               \
        res = tmp;               \
    }                            \
    jplis_assert((void*)res != (void*)NULL);     \
}
...
TRANSFORM(path, decodePath(path,&len));
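
As an illustration of the NULL flow described above, here is a minimal C++ sketch. It is not the libinstrument code: decode_or_null, TRANSFORM_CHECKED and run_step are hypothetical names, and decode_or_null only stands in for a decodePath-like function that returns NULL when its allocation fails. It shows one way a TRANSFORM-style step could report the NULL result to its caller instead of relying on the assert alone.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a decodePath-like function: returns a freshly
// allocated copy of the input, or NULL when allocation (simulated) fails.
static char* decode_or_null(const char* in, bool simulate_oom) {
    if (simulate_oom) return nullptr;
    char* out = static_cast<char*>(std::calloc(std::strlen(in) + 1, 1));
    if (out != nullptr) std::strcpy(out, in);
    return out;
}

// Same shape as the quoted TRANSFORM macro, but a NULL result sets a flag
// that the caller can check (an assumption, not the patch under review).
#define TRANSFORM_CHECKED(res, func, failed) {   \
    char* tmp = (func);                          \
    if (tmp != (res)) {                          \
        std::free(res);                          \
        (res) = tmp;                             \
    }                                            \
    if ((res) == nullptr) { (failed) = true; }   \
}

// Runs one transform step; returns true when the step failed.
static bool run_step(bool simulate_oom) {
    char* path = static_cast<char*>(std::calloc(8, 1));
    if (path == nullptr) return true;
    std::strcpy(path, "a%20b");
    bool failed = false;
    TRANSFORM_CHECKED(path, decode_or_null(path, simulate_oom), failed);
    std::free(path);  // free(NULL) is a no-op, so this is safe on failure
    return failed;
}
```

With such a flag, a caller could stop the chain of transforms as soon as one step fails, rather than passing NULL on to the next function.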


New webrev :

http://cr.openjdk.java.net/~mbaesken/webrevs/8234968.2/


Best regards, Matthias


From: Thomas Stüfe <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>
Sent: Freitag, 29. November 2019 07:30
To: Baesken, Matthias <matthias.baesken at sap.com<mailto:matthias.baesken at sap.com>>
Cc: serviceability-dev at openjdk.java.net<mailto:serviceability-dev at openjdk.java.net>
Subject: Re: RFR [XS]: 8234968: check calloc rv in libinstrument InvocationAdapter

Hi Matthias,

I am not certain the callers are prepared to handle NULL.

This is used in a chain of TRANSFORM macro calls which AFAICS do not handle NULL; e.g., at 872, we pass the returned pointer to convertUft8ToPlatformString, which passes it on (on Windows) to MultiByteToWideChar, which does not handle NULL input.

So I wonder whether a clear error message with an exit would be better in this case. Otherwise we may get a crash just some instructions later.

Cheers, Thomas


On Thu, Nov 28, 2019 at 5:21 PM Baesken, Matthias <matthias.baesken at sap.com<mailto:matthias.baesken at sap.com>> wrote:
Hello, please review this small patch.
It adds return value checking for calloc at one place where it is missing.

Thanks, Matthias

Bug/webrev :

https://bugs.openjdk.java.net/browse/JDK-8234968

http://cr.openjdk.java.net/~mbaesken/webrevs/8234968.1/



From Alan.Bateman at oracle.com  Thu Dec 12 11:37:33 2019
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Thu, 12 Dec 2019 11:37:33 +0000
Subject: RFR [XS]: 8234968: check calloc rv in libinstrument
 InvocationAdapter
In-Reply-To: <AM6PR02MB55410D34D1D0E92EF886DE808A550@AM6PR02MB5541.eurprd02.prod.outlook.com>
References: <AM6PR02MB5078DF8D13F8D5C7DC08117893470@AM6PR02MB5078.eurprd02.prod.outlook.com>
 <CAA-vtUziv2i+7YueBLn=Pvd8jfAjUi_RsxFqHWjjFbHwUMBRDg@mail.gmail.com>
 <AM6PR02MB50781EF60E251FE2536021FC93460@AM6PR02MB5078.eurprd02.prod.outlook.com>
 <AM6PR02MB55410D34D1D0E92EF886DE808A550@AM6PR02MB5541.eurprd02.prod.outlook.com>
Message-ID: <155511e7-f336-5280-2d9d-06c48270b1f2@oracle.com>


On 12/12/2019 10:00, Langer, Christoph wrote:
>
> Hi Matthias,
>
> I think your current patch is good as it is - at least it wouldn't 
> make things worse, AFAICS.
>
> Further improvements can probably be done under another issue.
>
>
Yes, another issue is fine. If decodePath can't allocate during the 
onload phase (-javaagent case) then it would be better to have the VM 
initialization abort. The late binding agent case is trickier but it is 
wrong to continue with the Boot-Class-Path attribute ignored.

-Alan

From stefan.karlsson at oracle.com  Thu Dec 12 12:01:05 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 12 Dec 2019 13:01:05 +0100
Subject: RFR: 8226797: serviceability/tmtools/jstat/GcCapacityTest.java fails
 with Exception: java.lang.RuntimeException: OGCMN > OGCMX (min
 generation capacity > max generation capacity)
Message-ID: <93c9ec14-a371-0121-d758-7e213cde9c85@oracle.com>

Hi all,

Please review this patch to fix a problem with uninitialized values in 
our generation counters.

https://cr.openjdk.java.net/~stefank/8226797/webrev.01/
https://bugs.openjdk.java.net/browse/JDK-8226797

The jstat values NGCMN and OGCMN both return uninitialized values.

I stumbled upon this while creating a patch to remove the GenerationSpec 
class.

GenerationSpec::_min_size is never initialized, and then used to create 
the generations:

     case Generation::DefNew:
       return new DefNewGeneration(rs, _init_size, _min_size, _max_size);

     case Generation::MarkSweepCompact:
       return new TenuredGeneration(rs, _init_size, _min_size, 
_max_size, remset);

That in turn uses it to initialize the perf counters:
DefNewGeneration::DefNewGeneration(ReservedSpace rs,
                                    size_t initial_size,
                                    size_t min_size,
                                    size_t max_size,
                                    const char* policy)
...
   _gen_counters = new GenerationCounters("new", 0, 3,
       min_size, max_size, &_virtual_space);

I'm setting the value to _init_size, because it reflects how MinNewSize 
and MinOldSize relate to NewSize and OldSize.
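
As a rough illustration of the idea, and not the actual patch, here is a minimal C++ sketch. GenSpecSketch and capacity_is_consistent are hypothetical names, and the class is a heavily simplified stand-in for a generation spec. It only shows the minimum size being initialized from the initial size, so that a "min capacity <= max capacity" check has a defined value to work with.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for a generation spec (not HotSpot code).
class GenSpecSketch {
    std::size_t _init_size;
    std::size_t _min_size;
    std::size_t _max_size;
public:
    GenSpecSketch(std::size_t init_size, std::size_t max_size)
        : _init_size(init_size),
          _min_size(init_size),  // initialized from the initial size instead of left unset
          _max_size(max_size) {}

    std::size_t init_size() const { return _init_size; }
    std::size_t min_size() const { return _min_size; }
    std::size_t max_size() const { return _max_size; }
};

// The kind of invariant named in the subject line: min capacity must not
// exceed max capacity.
static bool capacity_is_consistent(const GenSpecSketch& s) {
    return s.min_size() <= s.max_size();
}
```

For example, GenSpecSketch(1024, 4096) reports a minimum of 1024, which satisfies the check as long as the initial size does not exceed the maximum.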

Thanks,
StefanK

From stefan.karlsson at oracle.com  Thu Dec 12 15:23:09 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 12 Dec 2019 16:23:09 +0100
Subject: RFR: 8226797: serviceability/tmtools/jstat/GcCapacityTest.java
 fails with Exception: java.lang.RuntimeException: OGCMN > OGCMX (min
 generation capacity > max generation capacity)
In-Reply-To: <93c9ec14-a371-0121-d758-7e213cde9c85@oracle.com>
References: <93c9ec14-a371-0121-d758-7e213cde9c85@oracle.com>
Message-ID: <1fc22506-3e64-fb98-8f7d-acb460c986f5@oracle.com>

In the interest of getting this integrated before the RDP cut-off, I'm going 
to push this ASAP. This has gone through tier1-tier3 testing.

StefanK

On 2019-12-12 13:01, Stefan Karlsson wrote:
> Hi all,
> 
> Please review this patch to fix a problem with uninitialized values in 
> our generation counters.
> 
> https://cr.openjdk.java.net/~stefank/8226797/webrev.01/
> https://bugs.openjdk.java.net/browse/JDK-8226797
> 
> The jstat values NGCMN and OGCMN both return uninitialized values.
> 
> I stumbled upon this while creating a patch to remove the GenerationSpec 
> class.
> 
> GenerationSpec::_min_size is never initialized, and then used to create 
> the generations:
> 
>      case Generation::DefNew:
>        return new DefNewGeneration(rs, _init_size, _min_size, _max_size);
> 
>      case Generation::MarkSweepCompact:
>        return new TenuredGeneration(rs, _init_size, _min_size, 
> _max_size, remset);
> 
> That in turn uses it to initialize the perf counters:
> DefNewGeneration::DefNewGeneration(ReservedSpace rs,
>                                     size_t initial_size,
>                                     size_t min_size,
>                                     size_t max_size,
>                                     const char* policy)
> ...
>    _gen_counters = new GenerationCounters("new", 0, 3,
>        min_size, max_size, &_virtual_space);
> 
> I'm setting the value to _init_size, because it reflects how MinNewSize 
> and MinOldSize relates to NewSize and OldSize.
> 
> Thanks,
> StefanK

From daniel.daugherty at oracle.com  Thu Dec 12 16:06:14 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 12 Dec 2019 11:06:14 -0500
Subject: RFR: 8226797: serviceability/tmtools/jstat/GcCapacityTest.java
 fails with Exception: java.lang.RuntimeException: OGCMN > OGCMX (min
 generation capacity > max generation capacity)
In-Reply-To: <1fc22506-3e64-fb98-8f7d-acb460c986f5@oracle.com>
References: <93c9ec14-a371-0121-d758-7e213cde9c85@oracle.com>
 <1fc22506-3e64-fb98-8f7d-acb460c986f5@oracle.com>
Message-ID: <8ee949e7-b7c2-b8b7-e7fc-eaaff444e59f@oracle.com>

src/hotspot/share/gc/shared/generationSpec.hpp
    No comments.

test/hotspot/jtreg/serviceability/tmtools/jstat/utils/JstatGcCapacityResults.java
    No comments.

Thumbs up.

Dan


On 12/12/19 10:23 AM, Stefan Karlsson wrote:
> In the interest to get this integrated before the RDP cut-off I'm 
> going to push this ASAP. This has gone through tier1-tier3 testing.
>
> StefanK
>
> On 2019-12-12 13:01, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this patch to fix a problem with uninitialized values in 
>> our generation counters.
>>
>> https://cr.openjdk.java.net/~stefank/8226797/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8226797
>>
>> The jstat values NGCMN and OGCMN both return uninitialized values.
>>
>> I stumbled upon this while creating a patch to remove the 
>> GenerationSpec class.
>>
>> GenerationSpec::_min_size is never initialized, and then used to 
>> create the generations:
>>
>>      case Generation::DefNew:
>>        return new DefNewGeneration(rs, _init_size, _min_size, 
>> _max_size);
>>
>>      case Generation::MarkSweepCompact:
>>        return new TenuredGeneration(rs, _init_size, _min_size, 
>> _max_size, remset);
>>
>> That in turn uses it to initialize the perf counters:
>> DefNewGeneration::DefNewGeneration(ReservedSpace rs,
>>                                     size_t initial_size,
>>                                     size_t min_size,
>>                                     size_t max_size,
>>                                     const char* policy)
>> ...
>>    _gen_counters = new GenerationCounters("new", 0, 3,
>>        min_size, max_size, &_virtual_space);
>>
>> I'm setting the value to _init_size, because it reflects how 
>> MinNewSize and MinOldSize relates to NewSize and OldSize.
>>
>> Thanks,
>> StefanK


From chris.plummer at oracle.com  Thu Dec 12 16:18:13 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 12 Dec 2019 08:18:13 -0800
Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if
 prelink is enabled
In-Reply-To: <de7f162a-34ed-4b8a-a7da-13aed83ecf8b@default>
References: <de7f162a-34ed-4b8a-a7da-13aed83ecf8b@default>
Message-ID: <0c5f8a55-e7e6-ba87-58a8-e6a7edd55f01@oracle.com>

Can you use a macro for -1L? Maybe INVALID_LOAD_ADDRESS or 
LOAD_ADDRESS_ERROR.
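
Something along these lines, as a minimal C++ sketch rather than the actual change: the macro name is just the one suggested above, and calc_load_address_sketch and is_load_address_error are hypothetical helpers standing in for the real calculation and its caller. The point is that 0 stays a valid result and only the named sentinel signals an error.

```cpp
#include <cassert>
#include <cstdint>

// Named sentinel replacing a bare -1L (name as suggested above; an assumption).
#define INVALID_LOAD_ADDRESS (static_cast<std::uintptr_t>(-1L))

// Hypothetical stand-in for a load-address calculation that can fail.
static std::uintptr_t calc_load_address_sketch(bool ok, std::uintptr_t addr) {
    return ok ? addr : INVALID_LOAD_ADDRESS;
}

// Only the sentinel counts as an error; 0 is accepted as a valid value.
static bool is_load_address_error(std::uintptr_t value) {
    return value == INVALID_LOAD_ADDRESS;
}
```

A caller would then compare against INVALID_LOAD_ADDRESS instead of against a literal, which makes the "0 is still valid" case easier to see in the code.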

Chris

On 12/11/19 7:10 PM, Fairoz Matte wrote:
> Hi,
>
> Please review this small change,
> Updating error handling, to make sure "lib_base_diff = 0" is still a valid scenario even after calc_prelinked_load_address() call.
>
> JBS - https://bugs.openjdk.java.net/browse/JDK-8235637
> Webrev - http://cr.openjdk.java.net/~fmatte/8235637/webrev.00/
>
> This patch is provided by Yasumasa Suenaga
>
> Thanks,
> Fairoz


From stefan.karlsson at oracle.com  Thu Dec 12 16:19:23 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 12 Dec 2019 17:19:23 +0100
Subject: RFR: 8226797: serviceability/tmtools/jstat/GcCapacityTest.java
 fails with Exception: java.lang.RuntimeException: OGCMN > OGCMX (min
 generation capacity > max generation capacity)
In-Reply-To: <8ee949e7-b7c2-b8b7-e7fc-eaaff444e59f@oracle.com>
References: <93c9ec14-a371-0121-d758-7e213cde9c85@oracle.com>
 <1fc22506-3e64-fb98-8f7d-acb460c986f5@oracle.com>
 <8ee949e7-b7c2-b8b7-e7fc-eaaff444e59f@oracle.com>
Message-ID: <50c7ec6f-5cd1-3363-02be-1a9058942d89@oracle.com>

Thanks, Dan.

StefanK

On 2019-12-12 17:06, Daniel D. Daugherty wrote:
> src/hotspot/share/gc/shared/generationSpec.hpp
>     No comments.
> 
> test/hotspot/jtreg/serviceability/tmtools/jstat/utils/JstatGcCapacityResults.java
>     No comments.
> 
> Thumbs up.
> 
> Dan
> 
> 
> On 12/12/19 10:23 AM, Stefan Karlsson wrote:
>> In the interest to get this integrated before the RDP cut-off I'm 
>> going to push this ASAP. This has gone through tier1-tier3 testing.
>>
>> StefanK
>>
>> On 2019-12-12 13:01, Stefan Karlsson wrote:
>>> Hi all,
>>>
>>> Please review this patch to fix a problem with uninitialized values in 
>>> our generation counters.
>>>
>>> https://cr.openjdk.java.net/~stefank/8226797/webrev.01/
>>> https://bugs.openjdk.java.net/browse/JDK-8226797
>>>
>>> The jstat values NGCMN and OGCMN both return uninitialized values.
>>>
>>> I stumbled upon this while creating a patch to remove the 
>>> GenerationSpec class.
>>>
>>> GenerationSpec::_min_size is never initialized, and then used to 
>>> create the generations:
>>>
>>>      case Generation::DefNew:
>>>        return new DefNewGeneration(rs, _init_size, _min_size, 
>>> _max_size);
>>>
>>>      case Generation::MarkSweepCompact:
>>>        return new TenuredGeneration(rs, _init_size, _min_size, 
>>> _max_size, remset);
>>>
>>> That in turn uses it to initialize the perf counters:
>>> DefNewGeneration::DefNewGeneration(ReservedSpace rs,
>>>                                     size_t initial_size,
>>>                                     size_t min_size,
>>>                                     size_t max_size,
>>>                                     const char* policy)
>>> ...
>>>    _gen_counters = new GenerationCounters("new", 0, 3,
>>>        min_size, max_size, &_virtual_space);
>>>
>>> I'm setting the value to _init_size, because it reflects how 
>>> MinNewSize and MinOldSize relates to NewSize and OldSize.
>>>
>>> Thanks,
>>> StefanK
> 

From vladimir.kozlov at oracle.com  Thu Dec 12 18:20:25 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 12 Dec 2019 10:20:25 -0800
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <863a7bfc-5656-2a4d-6c22-c1fc22968d11@oracle.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <a690384a-2002-2d95-ad69-45fb66bc3452@oracle.com>
 <863a7bfc-5656-2a4d-6c22-c1fc22968d11@oracle.com>
Message-ID: <00ac0f71-fe90-9ead-8696-6e7f96ff6a17@oracle.com>

Hi David,

Tiered is disabled because we don't want to see compilations and outputs 
from C1 compiler which does not have EA.

The test is specifically written for C2 only (not for C1 or Graal) to 
verify its Escape Analysis optimization.
I did not look in great details into test's code but its analysis may be 
affected if C1 compiler is also used.

Richard may clarify this.

thanks,
Vladimir

On 12/11/19 1:04 PM, David Holmes wrote:
> On 12/12/2019 5:21 am, Vladimir Kozlov wrote:
>> I will do full review later. I want to comment about test command line.
>>
>> You don't need vm.opt.TieredCompilation != true in @requires because 
>> you specified -XX:-TieredCompilation in @run command.
> 
> And per my comment this should be being tested with tiered as well.
> 
> David
> 
>> Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip 
>> test from running in Interpreter mode too.
>>
>> Thanks,
>> Vladimir
>>
>> On 12/11/19 7:07 AM, Reingruber, Richard wrote:
>>> Hi David,
>>>
>>>    > Most of the details here are in areas I can comment on in detail, but I
>>>    > did take an initial general look at things.
>>>
>>> Thanks for taking the time!
>>>
>>>    > The only thing that jumped out at me is that I think the
>>>    > DeoptimizeObjectsALotThread should be a hidden thread.
>>>    >
>>>    > +  bool is_hidden_from_external_view() const { return true; }
>>>
>>> Yes, it should. Will add the method like above.
>>>
>>>    > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>>    > active testing this will just bit-rot.
>>>
>>> DeoptimizeObjectsALot is meant for stress testing with a larger 
>>> workload. I will add a minimal test
>>> to keep it fresh.
>>>
>>>    > Also on the tests I don't understand your @requires clause:
>>>    >
>>>    >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>    > (vm.opt.TieredCompilation != true))
>>>    >
>>>    > This seems to require that TieredCompilation is disabled, but tiered is
>>>    > our normal mode of operation. ??
>>>    >
>>>
>>> I removed the clause. I guess I wanted to target the tests towards 
>>> the code they are supposed to
>>> test, and it's easier to analyze failures w/o tiered compilation and 
>>> with just one compiler thread.
>>>
>>> Additionally I will make use of 
>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
>>>
>>> Thanks,
>>> Richard.
>>>
>>> -----Original Message-----
>>> From: David Holmes <david.holmes at oracle.com>
>>> Sent: Mittwoch, 11. Dezember 2019 08:03
>>> To: Reingruber, Richard <richard.reingruber at sap.com>; 
>>> serviceability-dev at openjdk.java.net; 
>>> hotspot-compiler-dev at openjdk.java.net; 
>>> hotspot-runtime-dev at openjdk.java.net
>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better 
>>> Performance in the Presence of JVMTI Agents
>>>
>>> Hi Richard,
>>>
>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
>>>> Hi,
>>>>
>>>> I would like to get reviews please for
>>>>
>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>>>
>>>> Corresponding RFE:
>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>>>
>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>>>
>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without 
>>>> issues (thanks!). In addition the
>>>> change is being tested at SAP since I posted the first RFR some 
>>>> months ago.
>>>>
>>>> The intention of this enhancement is to benefit performance wise 
>>>> from escape analysis even if JVMTI
>>>> agents request capabilities that allow them to access local variable 
>>>> values. E.g. if you start-up
>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then 
>>>> escape analysis is disabled right
>>>> from the beginning, well before a debugger attaches -- if ever one 
>>>> should do so. With the
>>>> enhancement, escape analysis will remain enabled until and after a 
>>>> debugger attaches. EA based
>>>> optimizations are reverted just before an agent acquires the 
>>>> reference to an object. In the JBS item
>>>> you'll find more details.
>>>
>>> Most of the details here are in areas I can comment on in detail, but I
>>> did take an initial general look at things.
>>>
>>> The only thing that jumped out at me is that I think the
>>> DeoptimizeObjectsALotThread should be a hidden thread.
>>>
>>> +  bool is_hidden_from_external_view() const { return true; }
>>>
>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>> active testing this will just bit-rot.
>>>
>>> Also on the tests I don't understand your @requires clause:
>>>
>>>    @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>> (vm.opt.TieredCompilation != true))
>>>
>>> This seems to require that TieredCompilation is disabled, but tiered is
>>> our normal mode of operation. ??
>>>
>>> Thanks,
>>> David
>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch 
>>>>
>>>>

From richard.reingruber at sap.com  Thu Dec 12 23:02:26 2019
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Thu, 12 Dec 2019 23:02:26 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <00ac0f71-fe90-9ead-8696-6e7f96ff6a17@oracle.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <a690384a-2002-2d95-ad69-45fb66bc3452@oracle.com>
 <863a7bfc-5656-2a4d-6c22-c1fc22968d11@oracle.com>
 <00ac0f71-fe90-9ead-8696-6e7f96ff6a17@oracle.com>
Message-ID: <DB7PR02MB3612074D337EA1B024063A289B550@DB7PR02MB3612.eurprd02.prod.outlook.com>

Hello Vladimir,

thanks for having a look.

  > Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip
  > test from running in Interpreter mode too.

Done.
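
The difference between the two conditions can be sketched as plain predicates. This is only an illustration: the function names are hypothetical, and it assumes vm.compMode takes one of the values "Xmixed", "Xint" or "Xcomp".

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the two jtreg @requires predicates.
// Assumption: vm.compMode is one of "Xmixed", "Xint" or "Xcomp".
bool requires_not_xcomp(const std::string& comp_mode, bool c2_enabled) {
  return comp_mode != "Xcomp" && c2_enabled;
}

bool requires_xmixed(const std::string& comp_mode, bool c2_enabled) {
  return comp_mode == "Xmixed" && c2_enabled;
}

// Self-check: only the second predicate filters out interpreter-only runs.
void check_requires_predicates() {
  assert(requires_not_xcomp("Xint", true));   // old clause still selects -Xint
  assert(!requires_xmixed("Xint", true));     // new clause skips -Xint
  assert(requires_xmixed("Xmixed", true));    // mixed mode is still selected
  assert(!requires_xmixed("Xcomp", true));    // -Xcomp is skipped as before
}
```

Under that assumption the "!=" form still admits interpreter-only runs, while the "==" form admits mixed mode only.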

  > You don't need vm.opt.TieredCompilation != true in @requires because you
  > specified -XX:-TieredCompilation in @run command.

Ok.

  > The test is specifically written for C2 only (not for C1 or Graal) to
  > verify its Escape Analysis optimization.
  > I did not look in great details into test's code but its analysis may be
  > affected if C1 compiler is also used.
  > 
  > Richard may clarify this.

The test cases aim to get their test method 'dontinline_testMethod' compiled by C2. Whether they get
C1-compiled beforehand doesn't matter all that much. I have a slight preference for disabling tiered
compilation for simplicity.

Thanks, Richard.

-----Original Message-----
From: Vladimir Kozlov <vladimir.kozlov at oracle.com> 
Sent: Thursday, 12 December 2019 19:20
To: David Holmes <david.holmes at oracle.com>; hotspot-runtime-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; serviceability-dev at openjdk.java.net; Reingruber, Richard <richard.reingruber at sap.com>
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi David,

Tiered is disabled because we don't want to see compilations and outputs 
from the C1 compiler, which does not have EA.

The test is specifically written for C2 only (not for C1 or Graal) to 
verify its Escape Analysis optimization.
I did not look at the test's code in great detail, but its analysis may be 
affected if the C1 compiler is also used.

Richard may clarify this.

thanks,
Vladimir

On 12/11/19 1:04 PM, David Holmes wrote:
> On 12/12/2019 5:21 am, Vladimir Kozlov wrote:
>> I will do full review later. I want to comment about test command line.
>>
>> You don't need vm.opt.TieredCompilation != true in @requires because 
>> you specified -XX:-TieredCompilation in @run command.
> 
> And per my comment this should be being tested with tiered as well.
> 
> David
> 
>> Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip 
>> test from running in Interpreter mode too.
>>
>> Thanks,
>> Vladimir
>>
>> On 12/11/19 7:07 AM, Reingruber, Richard wrote:
>>> Hi David,
>>>
>>>    > Most of the details here are in areas I can comment on in detail, but I
>>>    > did take an initial general look at things.
>>>
>>> Thanks for taking the time!
>>>
>>>    > The only thing that jumped out at me is that I think the
>>>    > DeoptimizeObjectsALotThread should be a hidden thread.
>>>    >
>>>    > +  bool is_hidden_from_external_view() const { return true; }
>>>
>>> Yes, it should. Will add the method like above.
>>>
>>>    > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>>    > active testing this will just bit-rot.
>>>
>>> DeoptimizeObjectsALot is meant for stress testing with a larger 
>>> workload. I will add a minimal test
>>> to keep it fresh.
>>>
>>>    > Also on the tests I don't understand your @requires clause:
>>>    >
>>>    >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>    > (vm.opt.TieredCompilation != true))
>>>    >
>>>    > This seems to require that TieredCompilation is disabled, but tiered is
>>>    > our normal mode of operation. ??
>>>    >
>>>
>>> I removed the clause. I guess I wanted to target the tests towards 
>>> the code they are supposed to
>>> test, and it's easier to analyze failures w/o tiered compilation and 
>>> with just one compiler thread.
>>>
>>> Additionally I will make use of 
>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
>>>
>>> Thanks,
>>> Richard.
>>>
>>> -----Original Message-----
>>> From: David Holmes <david.holmes at oracle.com>
>>> Sent: Wednesday, 11 December 2019 08:03
>>> To: Reingruber, Richard <richard.reingruber at sap.com>; 
>>> serviceability-dev at openjdk.java.net; 
>>> hotspot-compiler-dev at openjdk.java.net; 
>>> hotspot-runtime-dev at openjdk.java.net
>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better 
>>> Performance in the Presence of JVMTI Agents
>>>
>>> Hi Richard,
>>>
>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
>>>> Hi,
>>>>
>>>> I would like to get reviews please for
>>>>
>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>>>
>>>> Corresponding RFE:
>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>>>
>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>>>
>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without 
>>>> issues (thanks!). In addition the
>>>> change is being tested at SAP since I posted the first RFR some 
>>>> months ago.
>>>>
>>>> The intention of this enhancement is to benefit performance wise 
>>>> from escape analysis even if JVMTI
>>>> agents request capabilities that allow them to access local variable 
>>>> values. E.g. if you start-up
>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then 
>>>> escape analysis is disabled right
>>>> from the beginning, well before a debugger attaches -- if ever one 
>>>> should do so. With the
>>>> enhancement, escape analysis will remain enabled until and after a 
>>>> debugger attaches. EA based
>>>> optimizations are reverted just before an agent acquires the 
>>>> reference to an object. In the JBS item
>>>> you'll find more details.
>>>
>>> Most of the details here are in areas I can comment on in detail, but I
>>> did take an initial general look at things.
>>>
>>> The only thing that jumped out at me is that I think the
>>> DeoptimizeObjectsALotThread should be a hidden thread.
>>>
>>> +  bool is_hidden_from_external_view() const { return true; }
>>>
>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>> active testing this will just bit-rot.
>>>
>>> Also on the tests I don't understand your @requires clause:
>>>
>>>    @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>> (vm.opt.TieredCompilation != true))
>>>
>>> This seems to require that TieredCompilation is disabled, but tiered is
>>> our normal mode of operation. ??
>>>
>>> Thanks,
>>> David
>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch 
>>>>
>>>>

From david.holmes at oracle.com  Thu Dec 12 23:32:40 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 13 Dec 2019 09:32:40 +1000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <DB7PR02MB3612074D337EA1B024063A289B550@DB7PR02MB3612.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <a690384a-2002-2d95-ad69-45fb66bc3452@oracle.com>
 <863a7bfc-5656-2a4d-6c22-c1fc22968d11@oracle.com>
 <00ac0f71-fe90-9ead-8696-6e7f96ff6a17@oracle.com>
 <DB7PR02MB3612074D337EA1B024063A289B550@DB7PR02MB3612.eurprd02.prod.outlook.com>
Message-ID: <57c09482-7f0a-e666-2bd5-f4b43ae8b32a@oracle.com>

On 13/12/2019 9:02 am, Reingruber, Richard wrote:
> Hello Vladimir,
> 
> thanks for having a look.
> 
>    > Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip
>    > test from running in Interpreter mode too.
> 
> Done.
> 
>    > You don't need vm.opt.TieredCompilation != true in @requires because you
>    > specified -XX:-TieredCompilation in @run command.
> 
> Ok.
> 
>    > The test is specifically written for C2 only (not for C1 or Graal) to
>    > verify its Escape Analysis optimization.
>    > I did not look in great details into test's code but its analysis may be
>    > affected if C1 compiler is also used.
>    >
>    > Richard may clarify this.
> 
> The test cases aim to get their testmethod 'dontinline_testMethod' compiled by C2. If they get C1
> compiled before doesn't matter all that much. I've got a slight preference to disabled tiered
> compilation for simplicity.

My concern - perhaps unfounded - is that this seems to be being tested 
only in a pure C2 environment when the actual changes will have to 
operate correctly in a tiered environment (and JVMCI).

Thanks,
David

> Thanks, Richard.
> 
> -----Original Message-----
> From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
> Sent: Thursday, 12 December 2019 19:20
> To: David Holmes <david.holmes at oracle.com>; hotspot-runtime-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; serviceability-dev at openjdk.java.net; Reingruber, Richard <richard.reingruber at sap.com>
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
> 
> Hi David,
> 
> Tiered is disabled because we don't want to see compilations and outputs
> from C1 compiler which does not have EA.
> 
> The test is specifically written for C2 only (not for C1 or Graal) to
> verify its Escape Analysis optimization.
> I did not look in great details into test's code but its analysis may be
> affected if C1 compiler is also used.
> 
> Richard may clarify this.
> 
> thanks,
> Vladimir
> 
> On 12/11/19 1:04 PM, David Holmes wrote:
>> On 12/12/2019 5:21 am, Vladimir Kozlov wrote:
>>> I will do full review later. I want to comment about test command line.
>>>
>>> You don't need vm.opt.TieredCompilation != true in @requires because
>>> you specified -XX:-TieredCompilation in @run command.
>>
>> And per my comment this should be being tested with tiered as well.
>>
>> David
>>
>>> Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip
>>> test from running in Interpreter mode too.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 12/11/19 7:07 AM, Reingruber, Richard wrote:
>>>> Hi David,
>>>>
>>>>     > Most of the details here are in areas I can comment on in detail, but I
>>>>     > did take an initial general look at things.
>>>>
>>>> Thanks for taking the time!
>>>>
>>>>     > The only thing that jumped out at me is that I think the
>>>>     > DeoptimizeObjectsALotThread should be a hidden thread.
>>>>     >
>>>>     > +  bool is_hidden_from_external_view() const { return true; }
>>>>
>>>> Yes, it should. Will add the method like above.
>>>>
>>>>     > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>>>     > active testing this will just bit-rot.
>>>>
>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
>>>> workload. I will add a minimal test
>>>> to keep it fresh.
>>>>
>>>>     > Also on the tests I don't understand your @requires clause:
>>>>     >
>>>>     >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>     > (vm.opt.TieredCompilation != true))
>>>>     >
>>>>     > This seems to require that TieredCompilation is disabled, but tiered is
>>>>     > our normal mode of operation. ??
>>>>     >
>>>>
>>>> I removed the clause. I guess I wanted to target the tests towards
>>>> the code they are supposed to
>>>> test, and it's easier to analyze failures w/o tiered compilation and
>>>> with just one compiler thread.
>>>>
>>>> Additionally I will make use of
>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
>>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>> -----Original Message-----
>>>> From: David Holmes <david.holmes at oracle.com>
>>>> Sent: Wednesday, 11 December 2019 08:03
>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
>>>> serviceability-dev at openjdk.java.net;
>>>> hotspot-compiler-dev at openjdk.java.net;
>>>> hotspot-runtime-dev at openjdk.java.net
>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>>> Performance in the Presence of JVMTI Agents
>>>>
>>>> Hi Richard,
>>>>
>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
>>>>> Hi,
>>>>>
>>>>> I would like to get reviews please for
>>>>>
>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>>>>
>>>>> Corresponding RFE:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>>>>
>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>>>>
>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
>>>>> issues (thanks!). In addition the
>>>>> change is being tested at SAP since I posted the first RFR some
>>>>> months ago.
>>>>>
>>>>> The intention of this enhancement is to benefit performance wise
>>>>> from escape analysis even if JVMTI
>>>>> agents request capabilities that allow them to access local variable
>>>>> values. E.g. if you start-up
>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then
>>>>> escape analysis is disabled right
>>>>> from the beginning, well before a debugger attaches -- if ever one
>>>>> should do so. With the
>>>>> enhancement, escape analysis will remain enabled until and after a
>>>>> debugger attaches. EA based
>>>>> optimizations are reverted just before an agent acquires the
>>>>> reference to an object. In the JBS item
>>>>> you'll find more details.
>>>>
>>>> Most of the details here are in areas I can comment on in detail, but I
>>>> did take an initial general look at things.
>>>>
>>>> The only thing that jumped out at me is that I think the
>>>> DeoptimizeObjectsALotThread should be a hidden thread.
>>>>
>>>> +  bool is_hidden_from_external_view() const { return true; }
>>>>
>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>>> active testing this will just bit-rot.
>>>>
>>>> Also on the tests I don't understand your @requires clause:
>>>>
>>>>     @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>> (vm.opt.TieredCompilation != true))
>>>>
>>>> This seems to require that TieredCompilation is disabled, but tiered is
>>>> our normal mode of operation. ??
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> Thanks,
>>>>> Richard.
>>>>>
>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>>>>>
>>>>>

From david.holmes at oracle.com  Thu Dec 12 23:55:35 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 13 Dec 2019 09:55:35 +1000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
Message-ID: <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>

Hi Richard,

Some further queries/concerns:

src/hotspot/share/runtime/objectMonitor.cpp

Can you please explain the changes to ObjectMonitor::wait:

!   _recursions = save      // restore the old recursion count
!                 + jt->get_and_reset_relock_count_after_wait(); // increased by the deferred relock count

what is the "deferred relock count"? I gather it relates to

"The code was extended to be able to deoptimize objects of a frame that 
is not the top frame and to let another thread than the owning thread do 
it."

which I don't like the sound of at all when it comes to ObjectMonitor 
state. So I'd like to understand in detail exactly what is going on here 
and why.  This is a very intrusive change that seems to badly break 
encapsulation and impacts future changes to ObjectMonitor that are under 
investigation.
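
Read literally, the changed line restores the saved recursion count and adds whatever the thread's counter holds, resetting that counter in the same step. A toy model of just that arithmetic follows. Only the name get_and_reset_relock_count_after_wait() comes from the webrev; ToyThread, ToyMonitor, inc_relock_count_after_wait() and restore_after_wait() are hypothetical, and the real HotSpot semantics may differ.

```cpp
#include <cassert>

// Toy model only, not HotSpot code. Assumption: the per-thread counter is
// incremented by some other party while the owner waits (hypothetical
// inc_relock_count_after_wait()) and is consumed exactly once afterwards.
class ToyThread {
  int _relock_count_after_wait = 0;
 public:
  void inc_relock_count_after_wait() { _relock_count_after_wait++; }
  int get_and_reset_relock_count_after_wait() {
    int n = _relock_count_after_wait;
    _relock_count_after_wait = 0;   // reset so the count is applied only once
    return n;
  }
};

struct ToyMonitor {
  int _recursions = 0;
  // Same shape as the changed line: restore the saved count and add the
  // deferred relock count.
  void restore_after_wait(int save, ToyThread* jt) {
    _recursions = save + jt->get_and_reset_relock_count_after_wait();
  }
};

// Self-check of the arithmetic: saved count 1 plus two deferred relocks gives 3.
void check_restore_after_wait() {
  ToyThread t;
  t.inc_relock_count_after_wait();
  t.inc_relock_count_after_wait();
  ToyMonitor m;
  m.restore_after_wait(1, &t);
  assert(m._recursions == 3);
  assert(t.get_and_reset_relock_count_after_wait() == 0);
}
```

The sketch says nothing about who increments the counter or under which lock, which is the part the question above is about.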

---

src/hotspot/share/runtime/thread.cpp

Can you please explain why JavaThread::wait_for_object_deoptimization 
has to be handcrafted in this way rather than using proper transitions.

We got rid of "deopt suspend" some time ago and it is disturbing to see 
it being added back (effectively). This seems like it may be something 
that handshakes could be used for.

Thanks,
David
-----

On 12/12/2019 7:02 am, David Holmes wrote:
> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
>> Hi David,
>>
>>    > Most of the details here are in areas I can comment on in detail, but I
>>    > did take an initial general look at things.
>>
>> Thanks for taking the time!
> 
> Apologies the above should read:
> 
> "Most of the details here are in areas I *can't* comment on in detail ..."
> 
> David
> 
>>    > The only thing that jumped out at me is that I think the
>>    > DeoptimizeObjectsALotThread should be a hidden thread.
>>    >
>>    > +  bool is_hidden_from_external_view() const { return true; }
>>
>> Yes, it should. Will add the method like above.
>>
>>    > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>    > active testing this will just bit-rot.
>>
>> DeoptimizeObjectsALot is meant for stress testing with a larger 
>> workload. I will add a minimal test
>> to keep it fresh.
>>
>>    > Also on the tests I don't understand your @requires clause:
>>    >
>>    >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>    > (vm.opt.TieredCompilation != true))
>>    >
>>    > This seems to require that TieredCompilation is disabled, but tiered is
>>    > our normal mode of operation. ??
>>    >
>>
>> I removed the clause. I guess I wanted to target the tests towards the 
>> code they are supposed to
>> test, and it's easier to analyze failures w/o tiered compilation and 
>> with just one compiler thread.
>>
>> Additionally I will make use of 
>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
>>
>> Thanks,
>> Richard.
>>
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Wednesday, 11 December 2019 08:03
>> To: Reingruber, Richard <richard.reingruber at sap.com>; 
>> serviceability-dev at openjdk.java.net; 
>> hotspot-compiler-dev at openjdk.java.net; 
>> hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better 
>> Performance in the Presence of JVMTI Agents
>>
>> Hi Richard,
>>
>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
>>> Hi,
>>>
>>> I would like to get reviews please for
>>>
>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>>
>>> Corresponding RFE:
>>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>>
>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>>
>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without 
>>> issues (thanks!). In addition the
>>> change is being tested at SAP since I posted the first RFR some 
>>> months ago.
>>>
>>> The intention of this enhancement is to benefit performance wise from 
>>> escape analysis even if JVMTI
>>> agents request capabilities that allow them to access local variable 
>>> values. E.g. if you start-up
>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then 
>>> escape analysis is disabled right
>>> from the beginning, well before a debugger attaches -- if ever one 
>>> should do so. With the
>>> enhancement, escape analysis will remain enabled until and after a 
>>> debugger attaches. EA based
>>> optimizations are reverted just before an agent acquires the 
>>> reference to an object. In the JBS item
>>> you'll find more details.
>>
>> Most of the details here are in areas I can comment on in detail, but I
>> did take an initial general look at things.
>>
>> The only thing that jumped out at me is that I think the
>> DeoptimizeObjectsALotThread should be a hidden thread.
>>
>> +  bool is_hidden_from_external_view() const { return true; }
>>
>> Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>> active testing this will just bit-rot.
>>
>> Also on the tests I don't understand your @requires clause:
>>
>>    @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>> (vm.opt.TieredCompilation != true))
>>
>> This seems to require that TieredCompilation is disabled, but tiered is
>> our normal mode of operation. ??
>>
>> Thanks,
>> David
>>
>>> Thanks,
>>> Richard.
>>>
>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>       
>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch 
>>>
>>>

From serguei.spitsyn at oracle.com  Fri Dec 13 00:41:12 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 12 Dec 2019 16:41:12 -0800
Subject: RFR: 8226797: serviceability/tmtools/jstat/GcCapacityTest.java
 fails with Exception: java.lang.RuntimeException: OGCMN > OGCMX (min
 generation capacity > max generation capacity)
In-Reply-To: <1fc22506-3e64-fb98-8f7d-acb460c986f5@oracle.com>
References: <93c9ec14-a371-0121-d758-7e213cde9c85@oracle.com>
 <1fc22506-3e64-fb98-8f7d-acb460c986f5@oracle.com>
Message-ID: <35c25ed3-948b-482d-2f21-1ffffdf1afd9@oracle.com>

Hi Stefan,

It looks good to me.

Sorry, I was in a meeting, wrote this email and forgot to press the 'send' 
button.
I only just discovered that it was never actually sent. :(

Thanks,
Serguei


On 12/12/19 07:23, Stefan Karlsson wrote:
> In the interest of getting this integrated before the RDP cut-off I'm 
> going to push this ASAP. This has gone through tier1-tier3 testing.
>
> StefanK
>
> On 2019-12-12 13:01, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this patch to fix a problem with uninitialized values in 
>> our generation counters.
>>
>> https://cr.openjdk.java.net/~stefank/8226797/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8226797
>>
>> The jstat values NGCMN and OGCMN both return uninitialized values.
>>
>> I stumbled upon this while creating a patch to remove the 
>> GenerationSpec class.
>>
>> GenerationSpec::_min_size is never initialized, and then used to 
>> create the generations:
>>
>>      case Generation::DefNew:
>>        return new DefNewGeneration(rs, _init_size, _min_size, _max_size);
>>
>>      case Generation::MarkSweepCompact:
>>        return new TenuredGeneration(rs, _init_size, _min_size, _max_size, remset);
>>
>> That in turn uses it to initialize the perf counters:
>> DefNewGeneration::DefNewGeneration(ReservedSpace rs,
>>                                    size_t initial_size,
>>                                    size_t min_size,
>>                                    size_t max_size,
>>                                    const char* policy)
>> ...
>>    _gen_counters = new GenerationCounters("new", 0, 3,
>>        min_size, max_size, &_virtual_space);
>>
>> I'm setting the value to _init_size, because it reflects how 
>> MinNewSize and MinOldSize relate to NewSize and OldSize.
>>
>> Thanks,
>> StefanK
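
The fix described in the quoted review can be pictured with a minimal sketch. ToyGenerationSpec is a hypothetical, simplified stand-in and not the HotSpot class; the only detail taken from the message is that _min_size is set to _init_size.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for GenerationSpec, reduced to its three sizes.
class ToyGenerationSpec {
  std::size_t _init_size;
  std::size_t _min_size;
  std::size_t _max_size;
 public:
  ToyGenerationSpec(std::size_t init_size, std::size_t max_size)
    : _init_size(init_size),
      _min_size(init_size),   // the member that was previously never set
      _max_size(max_size) {}

  std::size_t init_size() const { return _init_size; }
  std::size_t min_size()  const { return _min_size; }
  std::size_t max_size()  const { return _max_size; }
};

// Self-check: with _min_size initialized, the invariant from the test failure
// (min generation capacity <= max generation capacity) holds for sane input.
void check_generation_spec() {
  ToyGenerationSpec spec(64, 256);
  assert(spec.min_size() == spec.init_size());
  assert(spec.min_size() <= spec.max_size());
}
```

With a defined minimum, a counter built from min_size and max_size can no longer report the "OGCMN > OGCMX" condition because of stale memory, assuming the initial size does not exceed the maximum.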


From vladimir.kozlov at oracle.com  Fri Dec 13 00:56:16 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 12 Dec 2019 16:56:16 -0800
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <57c09482-7f0a-e666-2bd5-f4b43ae8b32a@oracle.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <a690384a-2002-2d95-ad69-45fb66bc3452@oracle.com>
 <863a7bfc-5656-2a4d-6c22-c1fc22968d11@oracle.com>
 <00ac0f71-fe90-9ead-8696-6e7f96ff6a17@oracle.com>
 <DB7PR02MB3612074D337EA1B024063A289B550@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <57c09482-7f0a-e666-2bd5-f4b43ae8b32a@oracle.com>
Message-ID: <b71d94e4-abfb-5384-c6fd-1a302f8fd2f2@oracle.com>

Yes, David

You are correct: these changes touch all parts of the VM and may affect Graal (which also has EA) too.
Changes should be tested in all our modes: tiered, C1 only, Graal, Interpreter. And I realized that 
I only ran tier3-graal testing, so I have now submitted the rest of Graal's tiers.

I had assumed that our current testing (I ran everything from tier1 to tier8) should exercise all paths in 
the VM that these changes touch. But I may be wrong, and it is correct to ask the author to add testing in 
all VM modes to make sure the new code in the VM's runtime and JVMTI is tested.

I would like to keep what the current test is doing with C2. Maybe add another test for the other modes, or 
modify the current one so that it can run in other modes.

Thanks,
Vladimir

On 12/12/19 3:32 PM, David Holmes wrote:
> On 13/12/2019 9:02 am, Reingruber, Richard wrote:
>> Hello Vladimir,
>>
>> thanks for having a look.
>>
>>    > Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip
>>    > test from running in Interpreter mode too.
>>
>> Done.
>>
>>    > You don't need vm.opt.TieredCompilation != true in @requires because you
>>    > specified -XX:-TieredCompilation in @run command.
>>
>> Ok.
>>
>>    > The test is specifically written for C2 only (not for C1 or Graal) to
>>    > verify its Escape Analysis optimization.
>>    > I did not look in great details into test's code but its analysis may be
>>    > affected if C1 compiler is also used.
>>    >
>>    > Richard may clarify this.
>>
>> The test cases aim to get their testmethod 'dontinline_testMethod' compiled by C2. If they get C1
>> compiled before doesn't matter all that much. I've got a slight preference to disabled tiered
>> compilation for simplicity.
> 
> My concern - perhaps unfounded - is that this seems to be being tested only in a pure C2 environment 
> when the actual changes will have to operate correctly in a tiered environment (and JVMCI).
> 
> Thanks,
> David
> 
>> Thanks, Richard.
>>
>> -----Original Message-----
>> From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
>> Sent: Thursday, 12 December 2019 19:20
>> To: David Holmes <david.holmes at oracle.com>; hotspot-runtime-dev at openjdk.java.net; 
>> hotspot-compiler-dev at openjdk.java.net; serviceability-dev at openjdk.java.net; Reingruber, Richard 
>> <richard.reingruber at sap.com>
>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of 
>> JVMTI Agents
>>
>> Hi David,
>>
>> Tiered is disabled because we don't want to see compilations and outputs
>> from C1 compiler which does not have EA.
>>
>> The test is specifically written for C2 only (not for C1 or Graal) to
>> verify its Escape Analysis optimization.
>> I did not look in great details into test's code but its analysis may be
>> affected if C1 compiler is also used.
>>
>> Richard may clarify this.
>>
>> thanks,
>> Vladimir
>>
>> On 12/11/19 1:04 PM, David Holmes wrote:
>>> On 12/12/2019 5:21 am, Vladimir Kozlov wrote:
>>>> I will do full review later. I want to comment about test command line.
>>>>
>>>> You don't need vm.opt.TieredCompilation != true in @requires because
>>>> you specified -XX:-TieredCompilation in @run command.
>>>
>>> And per my comment this should be being tested with tiered as well.
>>>
>>> David
>>>
>>>> Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip
>>>> test from running in Interpreter mode too.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 12/11/19 7:07 AM, Reingruber, Richard wrote:
>>>>> Hi David,
>>>>>
>>>>>     > Most of the details here are in areas I can comment on in detail, but I
>>>>>     > did take an initial general look at things.
>>>>>
>>>>> Thanks for taking the time!
>>>>>
>>>>>     > The only thing that jumped out at me is that I think the
>>>>>     > DeoptimizeObjectsALotThread should be a hidden thread.
>>>>>     >
>>>>>     > +  bool is_hidden_from_external_view() const { return true; }
>>>>>
>>>>> Yes, it should. Will add the method like above.
>>>>>
>>>>>     > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>>>>     > active testing this will just bit-rot.
>>>>>
>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
>>>>> workload. I will add a minimal test
>>>>> to keep it fresh.
>>>>>
>>>>>     > Also on the tests I don't understand your @requires clause:
>>>>>     >
>>>>>     >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>>     > (vm.opt.TieredCompilation != true))
>>>>>     >
>>>>>     > This seems to require that TieredCompilation is disabled, but tiered is
>>>>>     > our normal mode of operation. ??
>>>>>     >
>>>>>
>>>>> I removed the clause. I guess I wanted to target the tests towards
>>>>> the code they are supposed to
>>>>> test, and it's easier to analyze failures w/o tiered compilation and
>>>>> with just one compiler thread.
>>>>>
>>>>> Additionally I will make use of
>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
>>>>>
>>>>> Thanks,
>>>>> Richard.
>>>>>
>>>>> -----Original Message-----
>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>> Sent: Wednesday, 11 December 2019 08:03
>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
>>>>> serviceability-dev at openjdk.java.net;
>>>>> hotspot-compiler-dev at openjdk.java.net;
>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>>>> Performance in the Presence of JVMTI Agents
>>>>>
>>>>> Hi Richard,
>>>>>
>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I would like to get reviews please for
>>>>>>
>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>>>>>
>>>>>> Corresponding RFE:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>>>>>
>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>>>>>
>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
>>>>>> issues (thanks!). In addition the
>>>>>> change is being tested at SAP since I posted the first RFR some
>>>>>> months ago.
>>>>>>
>>>>>> The intention of this enhancement is to benefit performance wise
>>>>>> from escape analysis even if JVMTI
>>>>>> agents request capabilities that allow them to access local variable
>>>>>> values. E.g. if you start-up
>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then
>>>>>> escape analysis is disabled right
>>>>>> from the beginning, well before a debugger attaches -- if ever one
>>>>>> should do so. With the
>>>>>> enhancement, escape analysis will remain enabled until and after a
>>>>>> debugger attaches. EA based
>>>>>> optimizations are reverted just before an agent acquires the
>>>>>> reference to an object. In the JBS item
>>>>>> you'll find more details.
>>>>>
>>>>> Most of the details here are in areas I can comment on in detail, but I
>>>>> did take an initial general look at things.
>>>>>
>>>>> The only thing that jumped out at me is that I think the
>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
>>>>>
>>>>> +  bool is_hidden_from_external_view() const { return true; }
>>>>>
>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>>>> active testing this will just bit-rot.
>>>>>
>>>>> Also on the tests I don't understand your @requires clause:
>>>>>
>>>>>     @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>> (vm.opt.TieredCompilation != true))
>>>>>
>>>>> This seems to require that TieredCompilation is disabled, but tiered is
>>>>> our normal mode of operation. ??
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> Thanks,
>>>>>> Richard.
>>>>>>
>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>>>>>>
>>>>>>

From david.holmes at oracle.com  Fri Dec 13 01:52:48 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 13 Dec 2019 11:52:48 +1000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <b71d94e4-abfb-5384-c6fd-1a302f8fd2f2@oracle.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <a690384a-2002-2d95-ad69-45fb66bc3452@oracle.com>
 <863a7bfc-5656-2a4d-6c22-c1fc22968d11@oracle.com>
 <00ac0f71-fe90-9ead-8696-6e7f96ff6a17@oracle.com>
 <DB7PR02MB3612074D337EA1B024063A289B550@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <57c09482-7f0a-e666-2bd5-f4b43ae8b32a@oracle.com>
 <b71d94e4-abfb-5384-c6fd-1a302f8fd2f2@oracle.com>
Message-ID: <01509185-7b0b-a269-deb1-799444cf082f@oracle.com>

On 13/12/2019 10:56 am, Vladimir Kozlov wrote:
> Yes, David
> 
> You are correct, these changes touch all parts of the VM and may affect Graal 
> (which also has EA) too.
> The changes should be tested in all our modes: tiered, C1 only, Graal, 
> Interpreter. And I realized that I only ran tier3-graal testing, so I 
> have now submitted the rest of Graal's tiers.
> 
> I had assumed that our current testing (I ran everything from tier1 to tier8) 
> should exercise all paths in the VM that these changes touch. But I may be wrong, 
> and it is correct to ask the author to add testing in all VM modes to make 
> sure the new code in the VM's runtime and JVMTI is tested.

It may be that our existing JVM TI tests will exercise this adequately 
and that the new tests are more "whitebox" testing than general 
functional tests. But it is not obvious to me that we do have the 
coverage we need.

Cheers,
David

> I would like to keep what the current test is doing with C2. Maybe add 
> another test for the other modes, or modify the current one so that it can 
> run in other modes.
> 
> Thanks,
> Vladimir
> 
> On 12/12/19 3:32 PM, David Holmes wrote:
>> On 13/12/2019 9:02 am, Reingruber, Richard wrote:
>>> Hello Vladimir,
>>>
>>> thanks for having a look.
>>>
>>>    > Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip
>>>    > test from running in Interpreter mode too.
>>>
>>> Done.
>>>
>>>    > You don't need vm.opt.TieredCompilation != true in @requires because you
>>>    > specified -XX:-TieredCompilation in @run command.
>>>
>>> Ok.
>>>
>>>    > The test is specifically written for C2 only (not for C1 or Graal) to
>>>    > verify its Escape Analysis optimization.
>>>    > I did not look in great details into test's code but its analysis may be
>>>    > affected if C1 compiler is also used.
>>>    >
>>>    > Richard may clarify this.
>>>
>>> The test cases aim to get their test method 'dontinline_testMethod' 
>>> compiled by C2. Whether it gets compiled by C1 first doesn't matter 
>>> all that much. I have a slight preference for disabling tiered 
>>> compilation for simplicity.
>>
>> My concern - perhaps unfounded - is that this seems to be being tested 
>> only in a pure C2 environment when the actual changes will have to 
>> operate correctly in a tiered environment (and JVMCI).
>>
>> Thanks,
>> David
>>
>>> Thanks, Richard.
>>>
>>> -----Original Message-----
>>> From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
>>> Sent: Donnerstag, 12. Dezember 2019 19:20
>>> To: David Holmes <david.holmes at oracle.com>; 
>>> hotspot-runtime-dev at openjdk.java.net; 
>>> hotspot-compiler-dev at openjdk.java.net; 
>>> serviceability-dev at openjdk.java.net; Reingruber, Richard 
>>> <richard.reingruber at sap.com>
>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better 
>>> Performance in the Presence of JVMTI Agents
>>>
>>> Hi David,
>>>
>>> Tiered is disabled because we don't want to see compilations and outputs
>>> from C1 compiler which does not have EA.
>>>
>>> The test is specifically written for C2 only (not for C1 or Graal) to
>>> verify its Escape Analysis optimization.
>>> I did not look in great details into test's code but its analysis may be
>>> affected if C1 compiler is also used.
>>>
>>> Richard may clarify this.
>>>
>>> thanks,
>>> Vladimir
>>>
>>> On 12/11/19 1:04 PM, David Holmes wrote:
>>>> On 12/12/2019 5:21 am, Vladimir Kozlov wrote:
>>>>> I will do full review later. I want to comment about test command 
>>>>> line.
>>>>>
>>>>> You don't need vm.opt.TieredCompilation != true in @requires because
>>>>> you specified -XX:-TieredCompilation in @run command.
>>>>
>>>> And per my comment this should be being tested with tiered as well.
>>>>
>>>> David
>>>>
>>>>> Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip
>>>>> test from running in Interpreter mode too.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 12/11/19 7:07 AM, Reingruber, Richard wrote:
>>>>>> Hi David,
>>>>>>
>>>>>>     > Most of the details here are in areas I can't comment on in detail, but I
>>>>>>     > did take an initial general look at things.
>>>>>>
>>>>>> Thanks for taking the time!
>>>>>>
>>>>>>     > The only thing that jumped out at me is that I think the
>>>>>>     > DeoptimizeObjectsALotThread should be a hidden thread.
>>>>>>     >
>>>>>>     > +  bool is_hidden_from_external_view() const { return true; }
>>>>>>
>>>>>> Yes, it should. Will add the method like above.
>>>>>>
>>>>>>     > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>>>>>     > active testing this will just bit-rot.
>>>>>>
>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
>>>>>> workload. I will add a minimal test
>>>>>> to keep it fresh.
>>>>>>
>>>>>>     > Also on the tests I don't understand your @requires clause:
>>>>>>     >
>>>>>>     >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>>>     > (vm.opt.TieredCompilation != true))
>>>>>>     >
>>>>>>     > This seems to require that TieredCompilation is disabled, but tiered is
>>>>>>     > our normal mode of operation. ??
>>>>>>     >
>>>>>>
>>>>>> I removed the clause. I guess I wanted to target the tests towards
>>>>>> the code they are supposed to
>>>>>> test, and it's easier to analyze failures w/o tiered compilation and
>>>>>> with just one compiler thread.
>>>>>>
>>>>>> Additionally I will make use of
>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
>>>>>>
>>>>>> Thanks,
>>>>>> Richard.
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>> Sent: Mittwoch, 11. Dezember 2019 08:03
>>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
>>>>>> serviceability-dev at openjdk.java.net;
>>>>>> hotspot-compiler-dev at openjdk.java.net;
>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>>>>> Performance in the Presence of JVMTI Agents
>>>>>>
>>>>>> Hi Richard,
>>>>>>
>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I would like to get reviews please for
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>>>>>>
>>>>>>> Corresponding RFE:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>>>>>>
>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>>>>>>
>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
>>>>>>> issues (thanks!). In addition the
>>>>>>> change is being tested at SAP since I posted the first RFR some
>>>>>>> months ago.
>>>>>>>
>>>>>>> The intention of this enhancement is to benefit performance wise
>>>>>>> from escape analysis even if JVMTI
>>>>>>> agents request capabilities that allow them to access local variable
>>>>>>> values. E.g. if you start-up
>>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then
>>>>>>> escape analysis is disabled right
>>>>>>> from the beginning, well before a debugger attaches -- if ever one
>>>>>>> should do so. With the
>>>>>>> enhancement, escape analysis will remain enabled until and after a
>>>>>>> debugger attaches. EA based
>>>>>>> optimizations are reverted just before an agent acquires the
>>>>>>> reference to an object. In the JBS item
>>>>>>> you'll find more details.
>>>>>>
>>>>>> Most of the details here are in areas I can't comment on in detail, but I
>>>>>> did take an initial general look at things.
>>>>>>
>>>>>> The only thing that jumped out at me is that I think the
>>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
>>>>>>
>>>>>> +  bool is_hidden_from_external_view() const { return true; }
>>>>>>
>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. 
>>>>>> Without
>>>>>> active testing this will just bit-rot.
>>>>>>
>>>>>> Also on the tests I don't understand your @requires clause:
>>>>>>
>>>>>>     @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>>> (vm.opt.TieredCompilation != true))
>>>>>>
>>>>>> This seems to require that TieredCompilation is disabled, but 
>>>>>> tiered is
>>>>>> our normal mode of operation. ??
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>> Thanks,
>>>>>>> Richard.
>>>>>>>
>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch 
>>>>>>>
>>>>>>>
>>>>>>>

From linzang at tencent.com  Fri Dec 13 06:22:16 2019
From: linzang at tencent.com (linzang(臧琳))
Date: Fri, 13 Dec 2019 06:22:16 +0000
Subject: Discuss the design of parallel and incremental jmap histo.
Message-ID: <72276240-61D3-4101-A435-717F16CC6FC3@tencent.com>

Dear All,

   I would like to revive the discussion about the implementation of parallel and incremental "jmap -histo".
   The goal of these changes is to solve the problem that "jmap -histo" may time out or be killed by a timer when the heap is large. Today the result of "jmap -histo" is "all or nothing": if it gets killed before it exits, the user gets no information about the heap.
   "Incremental" means that jmap -histo dumps intermediate results while it is iterating over the heap, so if it is interrupted, the user still gets some meaningful information.
   "Parallel" aims to speed up the heap iteration with multi-threading.

   Originally I implemented the "incremental dump" so that it writes the intermediate data into a separate file such as <IncrementalHisto.dump>, while the final result is saved to another file, <HistoResult.dump>. So when jmap -histo gets interrupted, the user can get information from <IncrementalHisto.dump>, and if jmap -histo completes normally, the final result is in <HistoResult.dump>.

   The parallel dump has multiple threads working on the heap iteration, and each thread generates intermediate data periodically.

   The main reason for using a separate file for the incremental dump is the parallel incremental dump implementation: every heap-iteration thread can dump its own data to a separate file, which avoids the need for a file lock.

   However, it seems that the original design might confuse users by producing two or more result files (intermediate results and the final result). So I would like to ask for your help in discussing the following options:


  1.  For incremental dump without parallel, the intermediate results and the final result are dumped to the same file:

In this case, the intermediate data are generated in the middle of the heap iteration and written to the file <HistoResult.dump> as they are produced. If jmap -histo exits normally, the final result is also dumped to <HistoResult.dump>, and all intermediate data are flushed.


  2.  For parallel dump without incremental:
Every thread fills its own thread-local dump buffer, and all thread-local buffers are merged and written to the <HistoResult.dump> file at the end.
There is no incremental support, so the result is "all or nothing".


  3.  For parallel + incremental dump, I think it's a little complicated because of the intermediate data processing:

     *   Every thread has its own thread-local intermediate data buffer, and all the thread-local buffers are written to the <HistoResult.dump> file while holding a file lock. So only one data file is generated, and if jmap -histo is interrupted, the intermediate data are saved in the same file.

The problem is that the file write lock can be heavy, which may slow down the parallel heap dump.


     *   Every thread has its own thread-local intermediate data buffer, and every thread saves its result in a temp file named <IntermediatedResult_[tid].dump>.

So there is no file lock, and the parallel iteration can be fast. But the problem is that multiple files are generated to hold the thread-local intermediate results, and this might confuse the user.


     *   Every thread has its own thread-local intermediate data buffer, and an additional "data-merging-thread" is created.

The parallel threads write data to their thread-local buffers, and enqueue a buffer when its data reach some threshold. The "data-merging-thread" consumes the queue, merges the data from the different threads, and saves the merged data to the result file.

In this case, only one <HistoResult.dump> file is generated. No file lock is needed, but there is a queue lock and a separate "merging thread" implementation. Do you think this is a reasonable solution?
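
The last option above (thread-local buffers plus a merging thread) can be sketched roughly as follows. This is only a minimal illustration in standard C++, not HotSpot code and not the proposed patch: HistoQueue, iterate_slice, parallel_histo and the Histo map type are hypothetical names, the "heap" is just a list of class names per worker, and writing the merged data to <HistoResult.dump> is left out.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical histogram type: class name -> instance count.
using Histo = std::map<std::string, std::size_t>;

// Hypothetical queue shared by the heap-iteration workers and the merging thread.
class HistoQueue {
 public:
  // Called by a worker when its thread-local buffer reaches the threshold.
  void enqueue(Histo&& part) {
    {
      std::lock_guard<std::mutex> guard(_lock);
      _parts.push(std::move(part));
    }
    _ready.notify_one();
  }

  // Called once after all workers have finished.
  void close() {
    {
      std::lock_guard<std::mutex> guard(_lock);
      _closed = true;
    }
    _ready.notify_all();
  }

  // Blocks until a partial histogram is available; returns false once the
  // queue is closed and drained.
  bool dequeue(Histo& out) {
    std::unique_lock<std::mutex> guard(_lock);
    _ready.wait(guard, [this] { return !_parts.empty() || _closed; });
    if (_parts.empty()) return false;
    out = std::move(_parts.front());
    _parts.pop();
    return true;
  }

 private:
  std::mutex _lock;
  std::condition_variable _ready;
  std::queue<Histo> _parts;
  bool _closed = false;
};

// Worker: counts its slice of "objects" (here just class names) into a
// thread-local buffer and hands the buffer over every `threshold` objects.
void iterate_slice(const std::vector<std::string>& slice, std::size_t threshold,
                   HistoQueue& queue) {
  Histo local;
  std::size_t pending = 0;
  for (const std::string& klass : slice) {
    ++local[klass];
    if (++pending == threshold) {
      queue.enqueue(std::move(local));
      local.clear();  // bring the moved-from buffer back to a known empty state
      pending = 0;
    }
  }
  if (pending > 0) queue.enqueue(std::move(local));
}

// Runs the workers plus one merging thread and returns the merged histogram.
// A real implementation would also write `merged` to the result file after
// merge steps; that part is omitted here.
Histo parallel_histo(const std::vector<std::vector<std::string>>& slices,
                     std::size_t threshold) {
  HistoQueue queue;
  Histo merged;
  std::thread merger([&] {
    Histo part;
    while (queue.dequeue(part)) {
      for (const auto& entry : part) merged[entry.first] += entry.second;
    }
  });
  std::vector<std::thread> workers;
  for (const auto& slice : slices) {
    workers.emplace_back(iterate_slice, std::cref(slice), threshold, std::ref(queue));
  }
  for (std::thread& w : workers) w.join();
  queue.close();
  merger.join();
  return merged;
}
```

In this sketch the only shared lock is the queue lock, and it is held just long enough to push or pop one buffer. The merging thread is the only writer of the merged result, so the place where a single result file would be written needs no file lock. Whether the queue becomes a bottleneck would depend on the threshold chosen, which is an open tuning question here.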

So may I ask for your suggestions?

Details of previous discussion can be found at https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-June/028276.html

Thanks!

BRs,
Lin

From gnu.andrew at redhat.com  Fri Dec 13 06:35:38 2019
From: gnu.andrew at redhat.com (Andrew John Hughes)
Date: Fri, 13 Dec 2019 06:35:38 +0000
Subject: [8u] RFR: 8195088: [TEST_BUG] StartManagementAgent got unexpected
 exception
In-Reply-To: <85c52e84-e827-9af1-3cec-6dff26daeda0@oracle.com>
References: <da9b738515ede3ddac45d058016a6ac0ef9e52f1.camel@redhat.com>
 <85c52e84-e827-9af1-3cec-6dff26daeda0@oracle.com>
Message-ID: <446f1d58-eb0d-4944-55b1-5f691d566f6a@redhat.com>


On 03/10/2019 00:47, serguei.spitsyn at oracle.com wrote:
> Hi Severin,
> 
> It looks good and applies cleanly.
> So, I'm not sure you really need a review for this.
> 
> Thanks,
> Serguei
> 
> On 10/1/19 06:01, Severin Gehwolf wrote:
>> Hi,
>>
>> Please review this OpenJDK 8u vs. Oracle JDK 8 parity patch. I wasn't
>> sure whether I need review for this one as the bug in question is a JDK
>> 8-only bug and the patch applies as-is. Anyway, here it is:
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8195088
>> webrev:
>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8195088/01/webrev/
>>
>> Testing: StartManagementAgent.java test fails prior and passes after
>> this patch.
>>
>> Thoughts?
>>
>> Thanks,
>> Severin
>>
> 

What "applies cleanly"? As far as I can see, this is part of
JDK-8165736. As the rest of 8165736 is not being applied, then yes, it
needs review.

Approved. I'll push it as part of b05.

Thanks,
-- 
Andrew :)

Senior Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04  C5A0 CFDA 0F9B 3596 4222
https://keybase.io/gnu_andrew


From stefan.karlsson at oracle.com  Fri Dec 13 08:39:29 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Fri, 13 Dec 2019 09:39:29 +0100
Subject: RFR: 8226797: serviceability/tmtools/jstat/GcCapacityTest.java
 fails with Exception: java.lang.RuntimeException: OGCMN > OGCMX (min
 generation capacity > max generation capacity)
In-Reply-To: <35c25ed3-948b-482d-2f21-1ffffdf1afd9@oracle.com>
References: <93c9ec14-a371-0121-d758-7e213cde9c85@oracle.com>
 <1fc22506-3e64-fb98-8f7d-acb460c986f5@oracle.com>
 <35c25ed3-948b-482d-2f21-1ffffdf1afd9@oracle.com>
Message-ID: <5b5bbdd0-b1a0-4c1c-ee13-ce8adbbca592@oracle.com>

Hi Serguei,

On 2019-12-13 01:41, serguei.spitsyn at oracle.com wrote:
> Hi Stefan,
> 
> It looks good to me.

Thanks for reviewing.

> 
> Sorry, I was in a meeting, wrote this email and forgot to push the 'send' 
> button.
> I just now discovered that it was never actually sent. :(

No problem. I pushed this yesterday to make the JDK 14 fork cut-off.

Thanks,
StefanK

> 
> Thanks,
> Serguei
> 
> 
> On 12/12/19 07:23, Stefan Karlsson wrote:
>> In the interest of getting this integrated before the RDP cut-off I'm 
>> going to push this ASAP. This has gone through tier1-tier3 testing.
>>
>> StefanK
>>
>> On 2019-12-12 13:01, Stefan Karlsson wrote:
>>> Hi all,
>>>
>>> Please review this patch to fix a problem with uninitialized values in 
>>> our generation counters.
>>>
>>> https://cr.openjdk.java.net/~stefank/8226797/webrev.01/
>>> https://bugs.openjdk.java.net/browse/JDK-8226797
>>>
>>> The jstat values NGCMN and OGCMN both return uninitialized values.
>>>
>>> I stumbled upon this while creating a patch to remove the 
>>> GenerationSpec class.
>>>
>>> GenerationSpec::_min_size is never initialized, and then used to 
>>> create the generations:
>>>
>>>      case Generation::DefNew:
>>>        return new DefNewGeneration(rs, _init_size, _min_size, _max_size);
>>>
>>>      case Generation::MarkSweepCompact:
>>>        return new TenuredGeneration(rs, _init_size, _min_size, _max_size, remset);
>>>
>>> That in turn uses it to initialize the perf counters:
>>> DefNewGeneration::DefNewGeneration(ReservedSpace rs,
>>>                                    size_t initial_size,
>>>                                    size_t min_size,
>>>                                    size_t max_size,
>>>                                    const char* policy)
>>> ...
>>>    _gen_counters = new GenerationCounters("new", 0, 3,
>>>        min_size, max_size, &_virtual_space);
>>>
>>> I'm setting the value to _init_size, because it reflects how 
>>> MinNewSize and MinOldSize relate to NewSize and OldSize.
>>>
>>> Thanks,
>>> StefanK
> 

From fairoz.matte at oracle.com  Fri Dec 13 13:02:21 2019
From: fairoz.matte at oracle.com (Fairoz Matte)
Date: Fri, 13 Dec 2019 05:02:21 -0800 (PST)
Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if
 prelink is enabled
In-Reply-To: <0c5f8a55-e7e6-ba87-58a8-e6a7edd55f01@oracle.com>
References: <de7f162a-34ed-4b8a-a7da-13aed83ecf8b@default>
 <0c5f8a55-e7e6-ba87-58a8-e6a7edd55f01@oracle.com>
Message-ID: <d360d727-b6e7-4f5f-a9c1-0f37727c7cce@default>

Hi Chris,

Thanks for the review,

Please find webrev.01, which uses the INVALID_LOAD_ADDRESS macro for -1L.
I have also added one more macro, ZERO_LOAD_ADDRESS, for 0x0L.
http://cr.openjdk.java.net/~fmatte/8235637/webrev.01/
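
For readers without the webrev at hand, the intent of the two macros can be sketched as below. This is a hedged illustration, not the code from the webrev: the macro values (-1L and 0x0L) come from this mail, the invalid-address macro is spelled INVALID_LOAD_ADDRESS here, and calc_prelinked_load_address_stub and resolve_lib_base_diff are hypothetical stand-ins written only for this example.

```cpp
#include <cassert>

// Macro values as described in this mail: -1L marks an error,
// 0x0L marks "no base difference known yet".
#define INVALID_LOAD_ADDRESS (-1L)
#define ZERO_LOAD_ADDRESS 0x0L

// Hypothetical stand-in for calc_prelinked_load_address(): returns the
// prelinked load address, or INVALID_LOAD_ADDRESS on failure.
static long calc_prelinked_load_address_stub(bool fail, long prelinked_addr) {
  return fail ? INVALID_LOAD_ADDRESS : prelinked_addr;
}

// Hypothetical helper: resolves the base difference for one library.
// Returns false only on a real error; a resulting difference of 0 is
// still treated as a valid outcome.
static bool resolve_lib_base_diff(long initial_diff, bool fail,
                                  long prelinked_addr, long* out) {
  long lib_base_diff = initial_diff;
  if (lib_base_diff == ZERO_LOAD_ADDRESS) {
    // Only consult the prelink information when no difference is known.
    lib_base_diff = calc_prelinked_load_address_stub(fail, prelinked_addr);
    if (lib_base_diff == INVALID_LOAD_ADDRESS) {
      return false;  // the lookup itself failed
    }
  }
  *out = lib_base_diff;
  return true;
}
```

The point of the distinct error value is that a difference of 0 after the prelink lookup no longer has to be read as a failure; only INVALID_LOAD_ADDRESS is.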

Thanks,
Fairoz

-----Original Message-----
From: Chris Plummer 
Sent: Thursday, December 12, 2019 9:48 PM
To: Fairoz Matte <fairoz.matte at oracle.com>; serviceability-dev at openjdk.java.net
Subject: Re: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled

Can you use a macro for -1L? Maybe INVALID_LOAD_ADDRESS or LOAD_ADDRESS_ERROR.

Chris

On 12/11/19 7:10 PM, Fairoz Matte wrote:
> Hi,
>
> Please review this small change,
> Updating error handling, to make sure "lib_base_diff = 0" is still a valid scenario even after calc_prelinked_load_address() call.
>
> JBS - https://bugs.openjdk.java.net/browse/JDK-8235637
> Webrev - http://cr.openjdk.java.net/~fmatte/8235637/webrev.00/
>
> This patch is provided by Yasumasa Suenaga
>
> Thanks,
> Fairoz


From richard.reingruber at sap.com  Fri Dec 13 14:17:00 2019
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Fri, 13 Dec 2019 14:17:00 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <01509185-7b0b-a269-deb1-799444cf082f@oracle.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <a690384a-2002-2d95-ad69-45fb66bc3452@oracle.com>
 <863a7bfc-5656-2a4d-6c22-c1fc22968d11@oracle.com>
 <00ac0f71-fe90-9ead-8696-6e7f96ff6a17@oracle.com>
 <DB7PR02MB3612074D337EA1B024063A289B550@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <57c09482-7f0a-e666-2bd5-f4b43ae8b32a@oracle.com>
 <b71d94e4-abfb-5384-c6fd-1a302f8fd2f2@oracle.com>
 <01509185-7b0b-a269-deb1-799444cf082f@oracle.com>
Message-ID: <DB7PR02MB3612973C0B2C24420C02243B9B540@DB7PR02MB3612.eurprd02.prod.outlook.com>

Hi David, Vladimir,

The tests are very targeted and customized towards the issues they solve. IMHO they should be run in
the configuration they are tailored for, but as I said, I'm ok with removing the tiered
options/conditions.

The enhancement should be covered also by existing JVMTI, JDI, JDWP tests, assuming they are also
executed with Xcomp.

If running the tests with Graal as C2 replacement you'll get failures, because the JVMCI compiler
does not provide the debug info required at runtime (see compiledVFrame::not_global_escape_in_scope()
and compiledVFrame::arg_escape). Still it would be possible to change the tests to expect these
failures when executed with Graal. Perhaps I should do this?

Thanks, Richard.

-----Original Message-----
From: David Holmes <david.holmes at oracle.com> 
Sent: Freitag, 13. Dezember 2019 02:53
To: Vladimir Kozlov <vladimir.kozlov at oracle.com>; Reingruber, Richard <richard.reingruber at sap.com>; hotspot-runtime-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; serviceability-dev at openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

On 13/12/2019 10:56 am, Vladimir Kozlov wrote:
> Yes, David
> 
> You are correct, these changes touch all parts of the VM and may affect Graal 
> (which also has EA) too.
> The changes should be tested in all our modes: tiered, C1 only, Graal, 
> Interpreter. And I realized that I only ran tier3-graal testing, so I 
> have now submitted the rest of Graal's tiers.
> 
> I had assumed that our current testing (I ran everything from tier1 to tier8) 
> should exercise all paths in the VM that these changes touch. But I may be wrong, 
> and it is correct to ask the author to add testing in all VM modes to make 
> sure the new code in the VM's runtime and JVMTI is tested.

It may be that our existing JVM TI tests will exercise this adequately 
and that the new tests are more "whitebox" testing than general 
functional tests. But it is not obvious to me that we do have the 
coverage we need.

Cheers,
David

> I would like to keep what the current test is doing with C2. Maybe add 
> another test for the other modes, or modify the current one so that it can 
> run in other modes.
> 
> Thanks,
> Vladimir
> 
> On 12/12/19 3:32 PM, David Holmes wrote:
>> On 13/12/2019 9:02 am, Reingruber, Richard wrote:
>>> Hello Vladimir,
>>>
>>> thanks for having a look.
>>>
>>>    > Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip
>>>    > test from running in Interpreter mode too.
>>>
>>> Done.
>>>
>>>    > You don't need vm.opt.TieredCompilation != true in @requires because you
>>>    > specified -XX:-TieredCompilation in @run command.
>>>
>>> Ok.
>>>
>>>    > The test is specifically written for C2 only (not for C1 or Graal) to
>>>    > verify its Escape Analysis optimization.
>>>    > I did not look in great details into test's code but its analysis may be
>>>    > affected if C1 compiler is also used.
>>>    >
>>>    > Richard may clarify this.
>>>
>>> The test cases aim to get their test method 'dontinline_testMethod' 
>>> compiled by C2. Whether it gets compiled by C1 first doesn't matter 
>>> all that much. I have a slight preference for disabling tiered 
>>> compilation for simplicity.
>>
>> My concern - perhaps unfounded - is that this seems to be being tested 
>> only in a pure C2 environment when the actual changes will have to 
>> operate correctly in a tiered environment (and JVMCI).
>>
>> Thanks,
>> David
>>
>>> Thanks, Richard.
>>>
>>> -----Original Message-----
>>> From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
>>> Sent: Donnerstag, 12. Dezember 2019 19:20
>>> To: David Holmes <david.holmes at oracle.com>; 
>>> hotspot-runtime-dev at openjdk.java.net; 
>>> hotspot-compiler-dev at openjdk.java.net; 
>>> serviceability-dev at openjdk.java.net; Reingruber, Richard 
>>> <richard.reingruber at sap.com>
>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better 
>>> Performance in the Presence of JVMTI Agents
>>>
>>> Hi David,
>>>
>>> Tiered is disabled because we don't want to see compilations and outputs
>>> from C1 compiler which does not have EA.
>>>
>>> The test is specifically written for C2 only (not for C1 or Graal) to
>>> verify its Escape Analysis optimization.
>>> I did not look in great details into test's code but its analysis may be
>>> affected if C1 compiler is also used.
>>>
>>> Richard may clarify this.
>>>
>>> thanks,
>>> Vladimir
>>>
>>> On 12/11/19 1:04 PM, David Holmes wrote:
>>>> On 12/12/2019 5:21 am, Vladimir Kozlov wrote:
>>>>> I will do full review later. I want to comment about test command 
>>>>> line.
>>>>>
>>>>> You don't need vm.opt.TieredCompilation != true in @requires because
>>>>> you specified -XX:-TieredCompilation in @run command.
>>>>
>>>> And per my comment this should be being tested with tiered as well.
>>>>
>>>> David
>>>>
>>>>> Use vm.compMode == "Xmixed" instead of vm.compMode != "Xcomp" to skip
>>>>> test from running in Interpreter mode too.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 12/11/19 7:07 AM, Reingruber, Richard wrote:
>>>>>> Hi David,
>>>>>>
>>>>>>     > Most of the details here are in areas I can't comment on in detail, but I
>>>>>>     > did take an initial general look at things.
>>>>>>
>>>>>> Thanks for taking the time!
>>>>>>
>>>>>>     > The only thing that jumped out at me is that I think the
>>>>>>     > DeoptimizeObjectsALotThread should be a hidden thread.
>>>>>>     >
>>>>>>     > +  bool is_hidden_from_external_view() const { return true; }
>>>>>>
>>>>>> Yes, it should. Will add the method like above.
>>>>>>
>>>>>>     > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>>>>>     > active testing this will just bit-rot.
>>>>>>
>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
>>>>>> workload. I will add a minimal test
>>>>>> to keep it fresh.
>>>>>>
>>>>>>     > Also on the tests I don't understand your @requires clause:
>>>>>>     >
>>>>>>     >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>>>     > (vm.opt.TieredCompilation != true))
>>>>>>     >
>>>>>>     > This seems to require that TieredCompilation is disabled, but tiered is
>>>>>>     > our normal mode of operation. ??
>>>>>>     >
>>>>>>
>>>>>> I removed the clause. I guess I wanted to target the tests towards
>>>>>> the code they are supposed to
>>>>>> test, and it's easier to analyze failures w/o tiered compilation and
>>>>>> with just one compiler thread.
>>>>>>
>>>>>> Additionally I will make use of
>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
>>>>>>
>>>>>> Thanks,
>>>>>> Richard.
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>> Sent: Mittwoch, 11. Dezember 2019 08:03
>>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
>>>>>> serviceability-dev at openjdk.java.net;
>>>>>> hotspot-compiler-dev at openjdk.java.net;
>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>>>>> Performance in the Presence of JVMTI Agents
>>>>>>
>>>>>> Hi Richard,
>>>>>>
>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I would like to get reviews please for
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>>>>>>
>>>>>>> Corresponding RFE:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>>>>>>
>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>>>>>>
>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
>>>>>>> issues (thanks!). In addition the
>>>>>>> change is being tested at SAP since I posted the first RFR some
>>>>>>> months ago.
>>>>>>>
>>>>>>> The intention of this enhancement is to benefit performance wise
>>>>>>> from escape analysis even if JVMTI
>>>>>>> agents request capabilities that allow them to access local variable
>>>>>>> values. E.g. if you start-up
>>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then
>>>>>>> escape analysis is disabled right
>>>>>>> from the beginning, well before a debugger attaches -- if ever one
>>>>>>> should do so. With the
>>>>>>> enhancement, escape analysis will remain enabled until and after a
>>>>>>> debugger attaches. EA based
>>>>>>> optimizations are reverted just before an agent acquires the
>>>>>>> reference to an object. In the JBS item
>>>>>>> you'll find more details.
>>>>>>
>>>>>> Most of the details here are in areas I can't comment on in detail, but I
>>>>>> did take an initial general look at things.
>>>>>>
>>>>>> The only thing that jumped out at me is that I think the
>>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
>>>>>>
>>>>>> +  bool is_hidden_from_external_view() const { return true; }
>>>>>>
>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. 
>>>>>> Without
>>>>>> active testing this will just bit-rot.
>>>>>>
>>>>>> Also on the tests I don't understand your @requires clause:
>>>>>>
>>>>>>     @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>>> (vm.opt.TieredCompilation != true))
>>>>>>
>>>>>> This seems to require that TieredCompilation is disabled, but 
>>>>>> tiered is
>>>>>> our normal mode of operation. ??
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>> Thanks,
>>>>>>> Richard.
>>>>>>>
>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch 
>>>>>>>
>>>>>>>
>>>>>>>

From chris.plummer at oracle.com  Fri Dec 13 17:39:20 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 13 Dec 2019 09:39:20 -0800
Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if
 prelink is enabled
In-Reply-To: <d360d727-b6e7-4f5f-a9c1-0f37727c7cce@default>
References: <de7f162a-34ed-4b8a-a7da-13aed83ecf8b@default>
 <0c5f8a55-e7e6-ba87-58a8-e6a7edd55f01@oracle.com>
 <d360d727-b6e7-4f5f-a9c1-0f37727c7cce@default>
Message-ID: <b6ddd41d-a1f8-4cf2-1df4-da7c2dd93302@oracle.com>

Looks good.

thanks,

Chris

On 12/13/19 5:02 AM, Fairoz Matte wrote:
> Hi Chris,
>
> Thanks for the review,
>
> Please find webrev.01, which uses the INVALID_LOAD_ADDRESS macro for -1L.
> I have also added one more macro, ZERO_LOAD_ADDRESS, for 0x0L.
> http://cr.openjdk.java.net/~fmatte/8235637/webrev.01/
>
> Thanks,
> Fairoz
>
> -----Original Message-----
> From: Chris Plummer
> Sent: Thursday, December 12, 2019 9:48 PM
> To: Fairoz Matte <fairoz.matte at oracle.com>; serviceability-dev at openjdk.java.net
> Subject: Re: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled
>
> Can you use a macro for -1L? Maybe INVALID_LOAD_ADDRESS or LOAD_ADDRESS_ERROR.
>
> Chris
>
> On 12/11/19 7:10 PM, Fairoz Matte wrote:
>> Hi,
>>
>> Please review this small change,
>> Updating error handling, to make sure "lib_base_diff = 0" is still a valid scenario even after calc_prelinked_load_address() call.
>>
>> JBS - https://bugs.openjdk.java.net/browse/JDK-8235637
>> Webrev - http://cr.openjdk.java.net/~fmatte/8235637/webrev.00/
>>
>> This patch is provided by Yasumasa Suenaga
>>
>> Thanks,
>> Fairoz
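
A minimal sketch of the error-handling idea in the quoted thread, written in
Java for illustration only (the actual change is in the SA's native C code,
which is not reproduced here). A base-address difference of 0 is a legitimate
result, so failure has to be signalled by a distinct sentinel. The constants
are named after the macros proposed in the thread; calcPrelinkedLoadAddress
and its parameters are hypothetical stand-ins.

```java
final class LoadAddressSketch {
    // Sentinel meaning "could not determine the address"; distinct from 0.
    static final long INVALID_LOAD_ADDRESS = -1L;
    static final long ZERO_LOAD_ADDRESS = 0x0L;

    // Hypothetical stand-in: returns the prelinked load address, or the
    // sentinel when the lookup fails (modelled here by a boolean flag).
    static long calcPrelinkedLoadAddress(long address, boolean lookupFailed) {
        return lookupFailed ? INVALID_LOAD_ADDRESS : address;
    }

    // Returns true if the value can be used; 0 is accepted as valid.
    static boolean isUsable(long libBaseDiff) {
        return libBaseDiff != INVALID_LOAD_ADDRESS;
    }
}
```

With this shape, a caller rejects only the sentinel and keeps treating
ZERO_LOAD_ADDRESS as a normal result.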


From chris.plummer at oracle.com  Fri Dec 13 18:44:57 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 13 Dec 2019 10:44:57 -0800
Subject: Discuss the design of parallel and incremental jmap histo.
In-Reply-To: <72276240-61D3-4101-A435-717F16CC6FC3@tencent.com>
References: <72276240-61D3-4101-A435-717F16CC6FC3@tencent.com>
Message-ID: <ebf1c04a-1b99-7b37-251f-c9c0cb5a16c4@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20191213/f3cf55fa/attachment-0001.htm>

From richard.reingruber at sap.com  Fri Dec 13 19:01:57 2019
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Fri, 13 Dec 2019 19:01:57 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
Message-ID: <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>

Hi David,

  > Some further queries/concerns:
  > 
  > src/hotspot/share/runtime/objectMonitor.cpp
  > 
  > Can you please explain the changes to ObjectMonitor::wait:
  > 
  > !   _recursions = save      // restore the old recursion count
  > !                 + jt->get_and_reset_relock_count_after_wait(); //
  > increased by the deferred relock count
  > 
  > what is the "deferred relock count"? I gather it relates to
  > 
  > "The code was extended to be able to deoptimize objects of a frame that
  > is not the top frame and to let another thread than the owning thread do
  > it."

Yes, these relate. Currently EA-based optimizations are reverted when a compiled frame is replaced
with the corresponding interpreter frames. Part of this is relocking objects with eliminated
locking. New with the enhancement is that we also do this just before object references are acquired
through JVMTI. In this case we also deoptimize the owning compiled frame C and register the
deoptimized objects as deferred updates. When control returns to C, it gets deoptimized; we notice
that the objects are already deoptimized (reallocated and relocked), so we don't do it again (relocking
twice would be incorrect, of course). Deferred updates are copied into the new interpreter frames.
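
For illustration, a minimal Java sketch of the kind of method these
optimizations apply to. Whether C2 actually scalar-replaces the object or
eliminates the lock depends on inlining and compilation state, so this is an
assumption about typical behaviour, not a guarantee; the class and method
names are made up for the example.

```java
final class EscapeSketch {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // 'p' never escapes this method, so escape analysis may replace the
    // allocation with scalars and drop the lock on it. If a JVMTI agent
    // asks for 'p' (e.g. via GetLocalObject), the object has to be
    // reallocated and relocked first.
    static int sum(int a, int b) {
        Point p = new Point(a, b);
        synchronized (p) {
            return p.x + p.y;
        }
    }
}
```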

Problem: relocking is not possible if the target thread T is waiting on the monitor that needs to be
relocked. This happens only with non-local objects with EliminateNestedLocks. Instead relocking is
deferred until T owns the monitor again. This is what the piece of code above does.
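
A toy model of that bookkeeping, in Java rather than HotSpot's C++ and with
hypothetical class names. It is single-threaded and models only the counters:
while the thread waits, relock operations that could not be performed are
counted; when the wait ends, the saved recursion count is restored and
increased by that deferred count, which is then reset.

```java
final class RelockSketch {
    // Stand-in for the per-thread property described above.
    static final class WaitingThread {
        private int relockCountAfterWait;

        void deferRelock() { relockCountAfterWait++; }

        int getAndResetRelockCountAfterWait() {
            int n = relockCountAfterWait;
            relockCountAfterWait = 0;
            return n;
        }
    }

    // Stand-in for the monitor; only the recursion count is modelled.
    static final class Monitor {
        int recursions;

        // Mirrors the quoted line: restore the old count, then add the
        // relocks that were deferred while the thread was waiting.
        void finishWait(int savedRecursions, WaitingThread t) {
            recursions = savedRecursions + t.getAndResetRelockCountAfterWait();
        }
    }
}
```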

  > which I don't like the sound of at all when it comes to ObjectMonitor
  > state. So I'd like to understand in detail exactly what is going on here
  > and why.  This is a very intrusive change that seems to badly break
  > encapsulation and impacts future changes to ObjectMonitor that are under
  > investigation.

I would not regard this as breaking encapsulation. Certainly not badly.

I've added a property relock_count_after_wait to JavaThread. The property is well
encapsulated. Future ObjectMonitor implementations have to deal with recursion too. They are free in
choosing a way to do that as long as that property is taken into account. This is hardly a
limitation.

Note also that the property is a straightforward extension of the existing concept of deferred
local updates. It is embedded into the structure holding them. So not even the footprint of a
JavaThread is enlarged if no deferred updates are generated.

  > ---
  > 
  > src/hotspot/share/runtime/thread.cpp
  > 
  > Can you please explain why JavaThread::wait_for_object_deoptimization
  > has to be handcrafted in this way rather than using proper transitions.
  > 

I wrote wait_for_object_deoptimization taking JavaThread::java_suspend_self_with_safepoint_check
as a template. So in short: for the same reasons :)

Threads reach both methods as part of thread state transitions, therefore special handling is
required to change thread state on top of ongoing transitions.

  > We got rid of "deopt suspend" some time ago and it is disturbing to see
  > it being added back (effectively). This seems like it may be something
  > that handshakes could be used for.

Deopt suspend used to be something rather different with a similar name[1]. It is not being added back.

I'm actually duplicating the existing external suspend mechanism, because a thread can be suspended
at most once. And hey, I don't like that either! But it seems not unlikely that the duplicate can
be removed together with the original, and that the new type of handshakes that will be used for
thread suspend can be used for object deoptimization too. See today's discussion in JDK-8227745 [2].

Thanks, Richard.

[1] Deopt suspend was something like an async. handshake for architectures with register windows,
    where patching the return pc for deoptimization of a compiled frame was racy if the owner thread
    was in native code. Instead a "deopt" suspend flag was set on which the thread patched its own
    frame upon return from native. So no thread was suspended. It got its name only from the name of
    the flags.

[2] Discussion about using handshakes to sync. with the target thread:
    https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727

-----Original Message-----
From: David Holmes <david.holmes at oracle.com> 
Sent: Freitag, 13. Dezember 2019 00:56
To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard,

Some further queries/concerns:

src/hotspot/share/runtime/objectMonitor.cpp

Can you please explain the changes to ObjectMonitor::wait:

!   _recursions = save      // restore the old recursion count
!                 + jt->get_and_reset_relock_count_after_wait(); // 
increased by the deferred relock count

what is the "deferred relock count"? I gather it relates to

"The code was extended to be able to deoptimize objects of a frame that 
is not the top frame and to let another thread than the owning thread do 
it."

which I don't like the sound of at all when it comes to ObjectMonitor 
state. So I'd like to understand in detail exactly what is going on here 
and why.  This is a very intrusive change that seems to badly break 
encapsulation and impacts future changes to ObjectMonitor that are under 
investigation.

---

src/hotspot/share/runtime/thread.cpp

Can you please explain why JavaThread::wait_for_object_deoptimization 
has to be handcrafted in this way rather than using proper transitions.

We got rid of "deopt suspend" some time ago and it is disturbing to see 
it being added back (effectively). This seems like it may be something 
that handshakes could be used for.

Thanks,
David
-----

On 12/12/2019 7:02 am, David Holmes wrote:
> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
>> Hi David,
>>
>>   > Most of the details here are in areas I can comment on in detail, but I
>>   > did take an initial general look at things.
>>
>> Thanks for taking the time!
> 
> Apologies the above should read:
> 
> "Most of the details here are in areas I *can't* comment on in detail ..."
> 
> David
> 
>>   > The only thing that jumped out at me is that I think the
>>   > DeoptimizeObjectsALotThread should be a hidden thread.
>>   >
>>   > +  bool is_hidden_from_external_view() const { return true; }
>>
>> Yes, it should. Will add the method like above.
>>
>>   > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>   > active testing this will just bit-rot.
>>
>> DeoptimizeObjectsALot is meant for stress testing with a larger 
>> workload. I will add a minimal test
>> to keep it fresh.
>>
>>   > Also on the tests I don't understand your @requires clause:
>>   >
>>   >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>   > (vm.opt.TieredCompilation != true))
>>   >
>>   > This seems to require that TieredCompilation is disabled, but tiered is
>>   > our normal mode of operation. ??
>>   >
>>
>> I removed the clause. I guess I wanted to target the tests towards the 
>> code they are supposed to
>> test, and it's easier to analyze failures w/o tiered compilation and 
>> with just one compiler thread.
>>
>> Additionally I will make use of 
>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
>>
>> Thanks,
>> Richard.
>>
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Mittwoch, 11. Dezember 2019 08:03
>> To: Reingruber, Richard <richard.reingruber at sap.com>; 
>> serviceability-dev at openjdk.java.net; 
>> hotspot-compiler-dev at openjdk.java.net; 
>> hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better 
>> Performance in the Presence of JVMTI Agents
>>
>> Hi Richard,
>>
>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
>>> Hi,
>>>
>>> I would like to get reviews please for
>>>
>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>>
>>> Corresponding RFE:
>>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>>
>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>>
>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without 
>>> issues (thanks!). In addition the
>>> change is being tested at SAP since I posted the first RFR some 
>>> months ago.
>>>
>>> The intention of this enhancement is to benefit performance wise from 
>>> escape analysis even if JVMTI
>>> agents request capabilities that allow them to access local variable 
>>> values. E.g. if you start-up
>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then 
>>> escape analysis is disabled right
>>> from the beginning, well before a debugger attaches -- if ever one 
>>> should do so. With the
>>> enhancement, escape analysis will remain enabled until and after a 
>>> debugger attaches. EA based
>>> optimizations are reverted just before an agent acquires the 
>>> reference to an object. In the JBS item
>>> you'll find more details.
>>
>> Most of the details here are in areas I can comment on in detail, but I
>> did take an initial general look at things.
>>
>> The only thing that jumped out at me is that I think the
>> DeoptimizeObjectsALotThread should be a hidden thread.
>>
>> +  bool is_hidden_from_external_view() const { return true; }
>>
>> Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>> active testing this will just bit-rot.
>>
>> Also on the tests I don't understand your @requires clause:
>>
>>    @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>> (vm.opt.TieredCompilation != true))
>>
>> This seems to require that TieredCompilation is disabled, but tiered is
>> our normal mode of operation. ??
>>
>> Thanks,
>> David
>>
>>> Thanks,
>>> Richard.
>>>
>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>       
>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch 
>>>
>>>

From harold.seigel at oracle.com  Fri Dec 13 19:35:17 2019
From: harold.seigel at oracle.com (Harold Seigel)
Date: Fri, 13 Dec 2019 14:35:17 -0500
Subject: RFR(T) 8235922: [TESTBUG]TestRecordAttrGenericSig.java and
 TestRecordAttr.java are failing
Message-ID: <18eb6f8c-125f-1ec0-e618-8416aea35d9b@oracle.com>

Hi,

Please review this trivial fix to prevent java/lang/instrument/... 
TestRecordAttr.java and TestRecordAttrGenericSig.java from failing. The 
fix replaces hard-wired JDK version 14 with mechanisms that get the 
latest JDK version.
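
As an illustration of that idea (a sketch, not the contents of the webrev), a
test can derive the release number from the running JDK with
Runtime.version().feature() instead of hard-coding 14. The helper name
previewCompileOptions is made up for this example.

```java
import java.util.List;

final class ReleaseSketch {
    // Builds javac options for preview features against the running JDK,
    // so the test keeps working when the JDK version number is bumped.
    static List<String> previewCompileOptions() {
        String release = Integer.toString(Runtime.version().feature());
        return List.of("--enable-preview", "--release", release);
    }
}
```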

Open Webrev: 
http://cr.openjdk.java.net/~hseigel/bug_8235922/webrev/index.html

JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235922

The fix was tested by running the tests locally on Linux-x64.

Thanks, Harold


From daniel.daugherty at oracle.com  Fri Dec 13 19:45:13 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 13 Dec 2019 14:45:13 -0500
Subject: RFR(T) 8235922: [TESTBUG]TestRecordAttrGenericSig.java and
 TestRecordAttr.java are failing
In-Reply-To: <18eb6f8c-125f-1ec0-e618-8416aea35d9b@oracle.com>
References: <18eb6f8c-125f-1ec0-e618-8416aea35d9b@oracle.com>
Message-ID: <bd536ea1-58d4-3e28-66a9-b7eca9322ac7@oracle.com>

On 12/13/19 2:35 PM, Harold Seigel wrote:
> Hi,
>
> Please review this trivial fix to prevent java/lang/instrument/... 
> TestRecordAttr.java and TestRecordAttrGenericSig.java from failing. 
> The fix replaces hard-wired JDK version 14 with mechanisms that get 
> the latest JDK version.
>
> Open Webrev: 
> http://cr.openjdk.java.net/~hseigel/bug_8235922/webrev/index.html

test/jdk/java/lang/instrument/RedefineRecordAttr/TestRecordAttr.java
    No comments.

test/jdk/java/lang/instrument/RedefineRecordAttrGenericSig/TestRecordAttrGenericSig.java
    No comments.

Thumbs up! I agree that this is a trivial fix.

Thanks for fixing this so quickly!

Dan

>
> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235922
>
> The fix was tested by running the tests locally on Linux-x64.
>
> Thanks, Harold
>


From harold.seigel at oracle.com  Fri Dec 13 19:46:50 2019
From: harold.seigel at oracle.com (Harold Seigel)
Date: Fri, 13 Dec 2019 14:46:50 -0500
Subject: RFR(T) 8235922: [TESTBUG]TestRecordAttrGenericSig.java and
 TestRecordAttr.java are failing
In-Reply-To: <bd536ea1-58d4-3e28-66a9-b7eca9322ac7@oracle.com>
References: <18eb6f8c-125f-1ec0-e618-8416aea35d9b@oracle.com>
 <bd536ea1-58d4-3e28-66a9-b7eca9322ac7@oracle.com>
Message-ID: <1d159e3f-ca09-bdfa-be63-7b33997e56e6@oracle.com>

Thanks Dan!

Harold

On 12/13/2019 2:45 PM, Daniel D. Daugherty wrote:
> On 12/13/19 2:35 PM, Harold Seigel wrote:
>> Hi,
>>
>> Please review this trivial fix to prevent java/lang/instrument/... 
>> TestRecordAttr.java and TestRecordAttrGenericSig.java from failing. 
>> The fix replaces hard-wired JDK version 14 with mechanisms that get 
>> the latest JDK version.
>>
>> Open Webrev: 
>> http://cr.openjdk.java.net/~hseigel/bug_8235922/webrev/index.html
>
> test/jdk/java/lang/instrument/RedefineRecordAttr/TestRecordAttr.java
>     No comments.
>
> test/jdk/java/lang/instrument/RedefineRecordAttrGenericSig/TestRecordAttrGenericSig.java
>     No comments.
>
> Thumbs up! I agree that this is a trivial fix.
>
> Thanks for fixing this so quickly!
>
> Dan
>
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235922
>>
>> The fix was tested by running the tests locally on Linux-x64.
>>
>> Thanks, Harold
>>
>

From serguei.spitsyn at oracle.com  Fri Dec 13 21:50:45 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 13 Dec 2019 13:50:45 -0800
Subject: RFR(T) 8235922: [TESTBUG]TestRecordAttrGenericSig.java and
 TestRecordAttr.java are failing
In-Reply-To: <bd536ea1-58d4-3e28-66a9-b7eca9322ac7@oracle.com>
References: <18eb6f8c-125f-1ec0-e618-8416aea35d9b@oracle.com>
 <bd536ea1-58d4-3e28-66a9-b7eca9322ac7@oracle.com>
Message-ID: <de7bfa80-8b7d-0585-bacd-30f51e012325@oracle.com>

Hi Harold,

+1

Thanks,
Serguei

On 12/13/19 11:45 AM, Daniel D. Daugherty wrote:
> On 12/13/19 2:35 PM, Harold Seigel wrote:
>> Hi,
>>
>> Please review this trivial fix to prevent java/lang/instrument/... 
>> TestRecordAttr.java and TestRecordAttrGenericSig.java from failing. 
>> The fix replaces hard-wired JDK version 14 with mechanisms that get 
>> the latest JDK version.
>>
>> Open Webrev: 
>> http://cr.openjdk.java.net/~hseigel/bug_8235922/webrev/index.html
>
> test/jdk/java/lang/instrument/RedefineRecordAttr/TestRecordAttr.java
>     No comments.
>
> test/jdk/java/lang/instrument/RedefineRecordAttrGenericSig/TestRecordAttrGenericSig.java
>     No comments.
>
> Thumbs up! I agree that this is a trivial fix.
>
> Thanks for fixing this so quickly!
>
> Dan
>
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235922
>>
>> The fix was tested by running the tests locally on Linux-x64.
>>
>> Thanks, Harold
>>
>


From serguei.spitsyn at oracle.com  Fri Dec 13 22:01:56 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 13 Dec 2019 14:01:56 -0800
Subject: RFR [XS]: 8234968: check calloc rv in libinstrument
 InvocationAdapter
In-Reply-To: <AM6PR02MB55410D34D1D0E92EF886DE808A550@AM6PR02MB5541.eurprd02.prod.outlook.com>
References: <AM6PR02MB5078DF8D13F8D5C7DC08117893470@AM6PR02MB5078.eurprd02.prod.outlook.com>
 <CAA-vtUziv2i+7YueBLn=Pvd8jfAjUi_RsxFqHWjjFbHwUMBRDg@mail.gmail.com>
 <AM6PR02MB50781EF60E251FE2536021FC93460@AM6PR02MB5078.eurprd02.prod.outlook.com>
 <AM6PR02MB55410D34D1D0E92EF886DE808A550@AM6PR02MB5541.eurprd02.prod.outlook.com>
Message-ID: <97f4c96f-73bb-b5fd-1b2f-91ef18012269@oracle.com>

Hi Matthias,

+1

Thanks,
Serguei

On 12/12/19 2:00 AM, Langer, Christoph wrote:
>
> Hi Matthias,
>
> I think your current patch is good as it is - at least it wouldn't 
> make things worse, AFAICS.
>
> Further improvements can probably be done under another issue.
>
> Cheers
>
> Christoph
>
> *From:* serviceability-dev 
> <serviceability-dev-bounces at openjdk.java.net> *On Behalf Of *Baesken, 
> Matthias
> *Sent:* Freitag, 29. November 2019 08:18
> *To:* Thomas Stüfe <thomas.stuefe at gmail.com>
> *Cc:* serviceability-dev at openjdk.java.net
> *Subject:* [CAUTION] RE: RFR [XS]: 8234968: check calloc rv in 
> libinstrument InvocationAdapter
>
> Hi Thomas, Christoph, thanks for the comments. Of course the init
> of *decodedLen must be added.
>
> In case of returning NULL from decodePath, we would have
> tmp == NULL (in char* tmp = func;), assign tmp to res and
> then we jplis_assert, see:
>
> #define TRANSFORM(res,func) {    \
>     char* tmp = func;            \
>     if (tmp != res) {            \
>         free(res);               \
>         res = tmp;               \
>     }                            \
>     jplis_assert((void*)res != (void*)NULL);     \
> }
>
> ...
>
> TRANSFORM(path, decodePath(path,&len));
>
> New webrev :
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8234968.2/
>
> Best regards, Matthias
>
> *From:* Thomas Stüfe <thomas.stuefe at gmail.com 
> <mailto:thomas.stuefe at gmail.com>>
> *Sent:* Freitag, 29. November 2019 07:30
> *To:* Baesken, Matthias <matthias.baesken at sap.com 
> <mailto:matthias.baesken at sap.com>>
> *Cc:* serviceability-dev at openjdk.java.net 
> <mailto:serviceability-dev at openjdk.java.net>
> *Subject:* Re: RFR [XS]: 8234968: check calloc rv in libinstrument 
> InvocationAdapter
>
> Hi Matthias,
>
> I am not certain the callers are prepared to handle NULL.
>
> This is used in a chain of TRANSFORM macro calls which AFAICS do not 
> handle NULL; e.g. , at 872, we pass the returned pointer to 
> convertUft8ToPlatformString which passes it on (on Windows) to 
> MultiByteToWideChar, which does not handle NULL input.
>
> So I wonder whether a clear error message with an exit would be better 
> in this case. Otherwise we may get a crash just some instructions later.
>
> Cheers, Thomas
>
> On Thu, Nov 28, 2019 at 5:21 PM Baesken, Matthias 
> <matthias.baesken at sap.com <mailto:matthias.baesken at sap.com>> wrote:
>
>     Hello, please review this small patch.
>
>     It adds return value checking for calloc at one place where it is
>     missing .
>
>     Thanks, Matthias
>
>     Bug/webrev :
>
>     https://bugs.openjdk.java.net/browse/JDK-8234968
>
>     http://cr.openjdk.java.net/~mbaesken/webrevs/8234968.1/
>
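
Thomas's point in the quoted thread can be sketched as follows, in Java rather
than the C of libinstrument and with made-up names: if one step in a chain of
transformations can fail, it may be safer to stop with a clear error than to
hand a null to the next step. This is an illustration of the idea only, not of
the TRANSFORM macro itself.

```java
import java.util.function.UnaryOperator;

final class TransformSketch {
    // Applies each step in order; fails fast with a clear message when a
    // step returns null instead of passing null on to the next step.
    @SafeVarargs
    static String transformAll(String input, UnaryOperator<String>... steps) {
        String res = input;
        for (int i = 0; i < steps.length; i++) {
            res = steps[i].apply(res);
            if (res == null) {
                throw new IllegalStateException("transform step " + i + " failed");
            }
        }
        return res;
    }
}
```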


From serguei.spitsyn at oracle.com  Sat Dec 14 01:02:13 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 13 Dec 2019 17:02:13 -0800
Subject: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
References: <eb284894-0801-d04d-c5c2-68ea3faaef98@oss.nttdata.com>
 <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
Message-ID: <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>

Hi Yasumasa,

This is a nice move in general.
Thank you for working on this!

http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html

  96     long libptr = dbg.findLibPtrByAddress(pc);
  97     if (libptr == 0L) { // Java frame
  98       Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
  99       if (rbp == null) {
 100         return null;
 101       }
 102       return new LinuxAMD64CFrame(dbg, rbp, pc, null);
 103     } else { // Native frame
 104       DwarfParser dwarf;
 105       try {
 106         dwarf = new DwarfParser(libptr);
 107       } catch (DebuggerException e) {
 108         Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
 109         if (rbp == null) {
 110           return null;
 111         }
 112         return new LinuxAMD64CFrame(dbg, rbp, pc, null);
 113       }
 114       dwarf.processDwarf(pc);
 115       Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
 116                      !dwarf.isBPOffsetAvailable())
 117                         ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
 118                         : context.getRegisterAsAddress(dwarf.getCFARegister())
 119                                  .addOffsetTo(dwarf.getCFAOffset());
 120       if (cfa == null) {
 121         return null;
 122       }
 123       return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
 124     }


I'd suggest simplifying the logic by refactoring to something like below:

          long libptr = dbg.findLibPtrByAddress(pc);
          Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
          DwarfParser dwarf = null;

          if (libptr != 0L) { // Native frame
            try {
              dwarf = new DwarfParser(libptr);
              dwarf.processDwarf(pc);
              cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
                     !dwarf.isBPOffsetAvailable())
                        ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
                        : context.getRegisterAsAddress(dwarf.getCFARegister())
                                 .addOffsetTo(dwarf.getCFAOffset());
            } catch (DebuggerException e) { // bail out to Java frame case
            }
          }
          if (cfa == null) {
            return null;
          }
          return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);

http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html

58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()

   Better to rename 'ofs' => 'offs'.

77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());

   Extra space after '-' sign.

71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {

   It feels like the logic has to be somehow refactored/simplified, as
   several typical fragments appear in slightly different contexts.
   But it is not easy to understand what it is.
   Could you please add some comments at key places explaining this logic?
   Then I'll check if it is possible to make it a little bit simpler.

 109   private CFrame javaSender(ThreadContext context) {
 110     Address nextCFA;
 111     Address nextPC;
 112
 113     nextPC = getNextPC(false);
 114     if (nextPC == null) {
 115       return null;
 116     }
 117
 118     DwarfParser nextDwarf = null;
 119     long libptr = dbg.findLibPtrByAddress(nextPC);
 120     if (libptr != 0L) { // Native frame
 121       try {
 122         nextDwarf = new DwarfParser(libptr);
 123       } catch (DebuggerException e) {
 124         nextCFA = getNextCFA(null, context);
 125         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
 126       }
 127       nextDwarf.processDwarf(nextPC);
 128     }
 129
 130     nextCFA = getNextCFA(nextDwarf, context);
 131     return (nextCFA == null) ? null
 132                              : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
 133   }

  The above can be simplified if a DebuggerException cannot be thrown from processDwarf(nextPC):

     private CFrame javaSender(ThreadContext context) {
       Address nextPC = getNextPC(false);
       if (nextPC == null) {
         return null;
       }
       long libptr = dbg.findLibPtrByAddress(nextPC);
       DwarfParser nextDwarf = null;

       if (libptr != 0L) { // Native frame
         try {
           nextDwarf = new DwarfParser(libptr);
           nextDwarf.processDwarf(nextPC);
         } catch (DebuggerException e) { // Bail out to Java frame
         }
       }
       Address nextCFA = getNextCFA(nextDwarf, context);
       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
     }

 135   public CFrame sender(ThreadProxy thread) {
 136     ThreadContext context = thread.getContext();
 137
 138     if (dwarf == null) { // Java frame
 139       return javaSender(context);
 140     }
 141
 142     Address nextPC = getNextPC(true);
 143     if (nextPC == null) {
 144       return null;
 145     }
 146
 147     Address nextCFA;
 148     DwarfParser nextDwarf = dwarf;
 149     if (!dwarf.isIn(nextPC)) {
 150       long libptr = dbg.findLibPtrByAddress(nextPC);
 151       if (libptr == 0L) {
 152         // Next frame might be Java frame
 153         nextCFA = getNextCFA(null, context);
 154         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
 155       }
 156       try {
 157         nextDwarf = new DwarfParser(libptr);
 158       } catch (DebuggerException e) {
 159         nextCFA = getNextCFA(null, context);
 160         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
 161       }
 162     }
 163
 164     nextDwarf.processDwarf(nextPC);
 165     nextCFA = getNextCFA(nextDwarf, context);
 166     return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
 167   }

  This one can also be simplified a little:

     public CFrame sender(ThreadProxy thread) {
       ThreadContext context = thread.getContext();

       if (dwarf == null) { // Java frame
         return javaSender(context);
       }
       Address nextPC = getNextPC(true);
       if (nextPC == null) {
         return null;
       }
       DwarfParser nextDwarf = null;
       if (!dwarf.isIn(nextPC)) {
         long libptr = dbg.findLibPtrByAddress(nextPC);
         if (libptr != 0L) {
           try {
             nextDwarf = new DwarfParser(libptr);
             nextDwarf.processDwarf(nextPC);
           } catch (DebuggerException e) { // Bail out to Java frame
           }
         }
       }
       Address nextCFA = getNextCFA(nextDwarf, context);
       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
     }

Finally, it looks like just one method could replace both
sender(ThreadProxy thread) and javaSender(ThreadContext context):

     private CFrame commonSender(ThreadProxy thread) {
       ThreadContext context = thread.getContext();
       Address nextPC = getNextPC(false);
       if (nextPC == null) {
         return null;
       }
       DwarfParser nextDwarf = null;

       if (dwarf == null || !dwarf.isIn(nextPC)) {
         long libptr = dbg.findLibPtrByAddress(nextPC);
         if (libptr != 0L) {
           try {
             nextDwarf = new DwarfParser(libptr);
             nextDwarf.processDwarf(nextPC);
           } catch (DebuggerException e) { // Bail out to Java frame
           }
         }
       }
       Address nextCFA = getNextCFA(nextDwarf, context);
       return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
     }
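
The pattern shared by these suggestions (try DWARF for native code, fall back
to the frame-pointer path on any failure) can be sketched independently of the
SA classes. Everything below is a hypothetical stand-in: DwarfLookup, dwarfFor
and the String result are not SA types, and the sketch does not cover the case
where the current frame's parser can be reused.

```java
import java.util.Optional;

final class FallbackSketch {
    // Hypothetical stand-in for creating and processing a DwarfParser.
    interface DwarfLookup {
        String parse(long libptr, long pc) throws Exception;
    }

    // Returns DWARF info for native frames, or empty when the pc is not in
    // a native library (libptr == 0) or the DWARF data cannot be parsed.
    // An empty result means "use the frame-pointer (Java frame) path".
    static Optional<String> dwarfFor(long libptr, long pc, DwarfLookup lookup) {
        if (libptr == 0L) {
            return Optional.empty();
        }
        try {
            return Optional.of(lookup.parse(libptr, pc));
        } catch (Exception e) {
            return Optional.empty(); // bail out to the Java frame case
        }
    }
}
```

Folding both failure modes into one empty result is what lets the caller
finish with a single getNextCFA-style call instead of three return paths.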

I'm still reviewing the dwarf parser files.

Thanks,
Serguei


On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
> Hi,
>
> I refactored LinuxAMD64CFrame.java . It works fine in 
> serviceability/sa tests and
> all tests on submit repo 
> (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
> Could you review new webrev?
>
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>
> The diff from previous webrev is here:
>   http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>> Hi all,
>>
>> Please review this change:
>>
>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>
>>
>> According to 2.7 Stack Unwind Algorithm in System V Application 
>> Binary Interface AMD64
>> Architecture Processor Supplement [1], we need to use DWARF in 
>> .eh_frame or .debug_frame
>> for stack unwinding.
>>
>> As JDK-8022183 said, omit-frame-pointer is enabled by default since 
>> GCC 4.6, so system
>> library (e.g. libc) might be compiled with this feature.
>>
>> However `jhsdb jstack --mixed` does not do so; it uses the base pointer 
>> register (RBP).
>> So some stack frames might be missing.
>>
>> I guess JDK-8219201 is caused by same issue.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> [1] 
>> https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
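
To make the difference described in the quoted message concrete, here is a
small simulation in Java (not SA code; the memory map, addresses and offsets
are invented). With a frame pointer, the return address sits just above the
saved RBP. With omit-frame-pointer there is no saved RBP to follow, so the
unwinder needs a DWARF rule of the form "CFA = register + offset" and reads
the return address at CFA - 8.

```java
import java.util.Map;

final class UnwindSketch {
    // Frame-pointer unwinding: the return address is stored at RBP + 8.
    static long returnAddressViaRbp(Map<Long, Long> stack, long rbp) {
        return stack.get(rbp + 8);
    }

    // DWARF-style unwinding: the CFA is a register value plus an offset
    // taken from the unwind table; the return address is stored at CFA - 8.
    static long returnAddressViaCfa(Map<Long, Long> stack, long reg, long cfaOffset) {
        long cfa = reg + cfaOffset;
        return stack.get(cfa - 8);
    }
}
```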


From suenaga at oss.nttdata.com  Sun Dec 15 01:51:36 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sun, 15 Dec 2019 10:51:36 +0900
Subject: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
References: <eb284894-0801-d04d-c5c2-68ea3faaef98@oss.nttdata.com>
 <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
Message-ID: <e819c572-2172-c3e2-9933-f3859662d400@oss.nttdata.com>

Hi Serguei,

Thanks for your comment!
I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev.
Also I fixed to free lib->eh_frame.data in libproc_impl.c as Dmitry said.

   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/

This change has been passed all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).


Thanks,

Yasumasa


On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
> Hi Yasumasa,
> 
> This is a nice move in general.
> Thank you for working on this!
> 
> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html
> 
>   96     long libptr = dbg.findLibPtrByAddress(pc);
>   97     if (libptr == 0L) { // Java frame
>   98       Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>   99       if (rbp == null) {
>  100         return null;
>  101       }
>  102       return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>  103     } else { // Native frame
>  104       DwarfParser dwarf;
>  105       try {
>  106         dwarf = new DwarfParser(libptr);
>  107       } catch (DebuggerException e) {
>  108         Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>  109         if (rbp == null) {
>  110           return null;
>  111         }
>  112         return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>  113       }
>  114       dwarf.processDwarf(pc);
>  115       Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>  116                      !dwarf.isBPOffsetAvailable())
>  117                         ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>  118                         : context.getRegisterAsAddress(dwarf.getCFARegister())
>  119                                  .addOffsetTo(dwarf.getCFAOffset());
>  120       if (cfa == null) {
>  121         return null;
>  122       }
>  123       return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>  124     }
> 
> 
> I'd suggest to simplify the logic by refactoring to something like below:
> 
>   long libptr = dbg.findLibPtrByAddress(pc);
>   Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
>   DwarfParser dwarf = null;
>
>   if (libptr != 0L) { // Native frame
>     try {
>       dwarf = new DwarfParser(libptr);
>       dwarf.processDwarf(pc);
>       // Assign to the cfa declared above; redeclaring it here would not compile.
>       cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>              !dwarf.isBPOffsetAvailable())
>                 ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>                 : context.getRegisterAsAddress(dwarf.getCFARegister())
>                          .addOffsetTo(dwarf.getCFAOffset());
>     } catch (DebuggerException e) { // bail out to Java frame case
>     }
>   }
>   if (cfa == null) {
>     return null;
>   }
>   return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
> 
> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
> 
> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
> 
>   Better to rename 'ofs' => 'offs'.
> 
> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());
> 
>   Extra space after '-' sign.
> 
> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {
> 
>   It feels like the logic has to be somehow refactored/simplified, as
>   several typical fragments appear in slightly different contexts.
>   But it is not easy to understand what it is.
>   Could you please add some comments in key places explaining this logic?
>   Then I'll check if it is possible to make it a little bit simpler.
> 
> Lines 109-133:
>
>   private CFrame javaSender(ThreadContext context) {
>     Address nextCFA;
>     Address nextPC;
>
>     nextPC = getNextPC(false);
>     if (nextPC == null) {
>       return null;
>     }
>
>     DwarfParser nextDwarf = null;
>     long libptr = dbg.findLibPtrByAddress(nextPC);
>     if (libptr != 0L) { // Native frame
>       try {
>         nextDwarf = new DwarfParser(libptr);
>       } catch (DebuggerException e) {
>         nextCFA = getNextCFA(null, context);
>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>       }
>       nextDwarf.processDwarf(nextPC);
>     }
>
>     nextCFA = getNextCFA(nextDwarf, context);
>     return (nextCFA == null) ? null
>                              : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>   }
> 
> The above can be simplified if a DebuggerException cannot be thrown from processDwarf(nextPC):
>
>   private CFrame javaSender(ThreadContext context) {
>     Address nextPC = getNextPC(false);
>     if (nextPC == null) {
>       return null;
>     }
>     long libptr = dbg.findLibPtrByAddress(nextPC);
>     DwarfParser nextDwarf = null;
>
>     if (libptr != 0L) { // Native frame
>       try {
>         nextDwarf = new DwarfParser(libptr);
>         nextDwarf.processDwarf(nextPC);
>       } catch (DebuggerException e) { // Bail out to Java frame
>       }
>     }
>     Address nextCFA = getNextCFA(nextDwarf, context);
>     return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>   }
> 
> Lines 135-167:
>
>   public CFrame sender(ThreadProxy thread) {
>     ThreadContext context = thread.getContext();
>
>     if (dwarf == null) { // Java frame
>       return javaSender(context);
>     }
>
>     Address nextPC = getNextPC(true);
>     if (nextPC == null) {
>       return null;
>     }
>
>     Address nextCFA;
>     DwarfParser nextDwarf = dwarf;
>     if (!dwarf.isIn(nextPC)) {
>       long libptr = dbg.findLibPtrByAddress(nextPC);
>       if (libptr == 0L) {
>         // Next frame might be Java frame
>         nextCFA = getNextCFA(null, context);
>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>       }
>       try {
>         nextDwarf = new DwarfParser(libptr);
>       } catch (DebuggerException e) {
>         nextCFA = getNextCFA(null, context);
>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>       }
>     }
>
>     nextDwarf.processDwarf(nextPC);
>     nextCFA = getNextCFA(nextDwarf, context);
>     return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>   }
> 
> This one can also be simplified a little:
>
>   public CFrame sender(ThreadProxy thread) {
>     ThreadContext context = thread.getContext();
>
>     if (dwarf == null) { // Java frame
>       return javaSender(context);
>     }
>     Address nextPC = getNextPC(true);
>     if (nextPC == null) {
>       return null;
>     }
>     DwarfParser nextDwarf = null;
>     if (!dwarf.isIn(nextPC)) {
>       long libptr = dbg.findLibPtrByAddress(nextPC);
>       if (libptr != 0L) {
>         try {
>           nextDwarf = new DwarfParser(libptr);
>           nextDwarf.processDwarf(nextPC);
>         } catch (DebuggerException e) { // Bail out to Java frame
>         }
>       }
>     }
>     Address nextCFA = getNextCFA(nextDwarf, context);
>     return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>   }
> 
> Finally, it looks like just one method could replace both
> sender(ThreadProxy thread) and javaSender(ThreadContext context):
> 
>   private CFrame commonSender(ThreadProxy thread) {
>     ThreadContext context = thread.getContext();
>     Address nextPC = getNextPC(false);
>     if (nextPC == null) {
>       return null;
>     }
>     DwarfParser nextDwarf = null;
>
>     if (dwarf == null || !dwarf.isIn(nextPC)) {
>       long libptr = dbg.findLibPtrByAddress(nextPC);
>       if (libptr != 0L) {
>         try {
>           nextDwarf = new DwarfParser(libptr);
>           nextDwarf.processDwarf(nextPC);
>         } catch (DebuggerException e) { // Bail out to Java frame
>         }
>       }
>     }
>     Address nextCFA = getNextCFA(nextDwarf, context);
>     return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>   }
> 
> I'm still reviewing the dwarf parser files.
> 
> Thanks,
> Serguei
> 
> 
> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>> Hi,
>>
>> I refactored LinuxAMD64CFrame.java. It works fine in serviceability/sa tests and
>> all tests on the submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>> Could you review the new webrev?
>>
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>
>> The diff from previous webrev is here:
>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>> Hi all,
>>>
>>> Please review this change:
>>>
>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>
>>>
>>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64
>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
>>> for stack unwinding.
>>>
>>> As JDK-8022183 said, omit-frame-pointer has been enabled by default since GCC 4.6, so
>>> system libraries (e.g. libc) might be compiled with this feature.
>>>
>>> However, `jhsdb jstack --mixed` does not use DWARF; it uses the base pointer register (RBP).
>>> So it might miss some stack frames.
>>>
>>> I guess JDK-8219201 is caused by the same issue.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
> 
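
For readers following this thread, the difference between RBP-based unwinding and DWARF CFA-based unwinding can be illustrated with a small Java sketch. This is not the SA agent code: the class CfaUnwindSketch, its method names, the addresses, and the modelling of stack memory as a Map are all assumptions made for illustration. It relies only on the usual AMD64 layout in which the return address is stored 8 bytes below the CFA.

```java
import java.util.Map;

// Hypothetical illustration only; not part of jdk.hotspot.agent.
class CfaUnwindSketch {
    // Frame-pointer unwinding: the return address sits just above the saved RBP.
    // This only works if the callee was compiled with a frame pointer.
    static long returnAddressViaRbp(Map<Long, Long> stack, long rbp) {
        return stack.get(rbp + 8);
    }

    // DWARF unwinding: the CFI rule for the current pc gives
    // CFA = register value + offset, and the return address is stored
    // at a fixed offset from the CFA (-8 on AMD64).
    static long returnAddressViaCfa(Map<Long, Long> stack, long cfaRegisterValue,
                                    long cfaOffset, long raOffsetFromCfa) {
        long cfa = cfaRegisterValue + cfaOffset;
        return stack.get(cfa + raOffsetFromCfa);
    }

    public static void main(String[] args) {
        // A callee built without a frame pointer: RBP is not usable here,
        // but a rule such as "CFA = RSP + 0x20" still locates the return address.
        Map<Long, Long> stack = Map.of(0x7018L, 0x401234L);
        long rsp = 0x7000L;
        long ra = returnAddressViaCfa(stack, rsp, 0x20, -8);
        System.out.println(Long.toHexString(ra));
    }
}
```

In the webrev, the register and offsets come from DwarfParser (getCFARegister(), getCFAOffset(), getReturnAddressOffsetFromCFA()); the sketch hard-codes them instead.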

From linzang at tencent.com  Mon Dec 16 01:38:23 2019
From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=)
Date: Mon, 16 Dec 2019 01:38:23 +0000
Subject: Discuss the design of parallel and incremental jmap
 histo.(Internet mail)
In-Reply-To: <ebf1c04a-1b99-7b37-251f-c9c0cb5a16c4@oracle.com>
References: <72276240-61D3-4101-A435-717F16CC6FC3@tencent.com>
 <ebf1c04a-1b99-7b37-251f-c9c0cb5a16c4@oracle.com>
Message-ID: <F01472E6-8A77-4677-8D50-75E7B980A88B@tencent.com>

Dear Chris,

>> why jmap is getting stuck or "killed by timer" as you mention below. Shouldn't this be considered a bug and addressed directly.
This "timer" is usually another process. In my experience with HDFS and ZKFC, ZKFC pings its NameNode periodically, and when the NameNode's heap is large (~180GB in my case), the heap iteration by jmap can stall the process, so ZKFC gets no response from the NameNode and the NameNode gets killed.

>>  How useful are intermediate results? How often can users come to reasonable conclusions about heap usage when the data is incomplete.
In my experience, I usually use jmap -histo to get information about the distribution of objects, and I have found that the object distribution in part of the heap is usually similar to the distribution in the whole heap. I agree that this does not hold in all cases, but since jmap -histo currently gives results only when it exits normally, I think information about part of the heap is better than nothing, especially for memory leak analysis.

>> Is there even an indication given of how much of the heap is accounted for in the output?
Yes, the incremental dump information shows the number of objects and the total bytes that have been iterated.

Thanks!

BRs,
Lin

From: Chris Plummer <chris.plummer at oracle.com>
Date: Saturday, December 14, 2019 at 2:46 AM
To: "linzang(臧琳)" <linzang at tencent.com>, "serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: Discuss the design of parallel and incremental jmap histo.(Internet mail)

Hi Lin,

I have a question regarding the need for incremental support. The CSR states:

Problem: Now, the "JMap -histo" tool can not dump intermediate result, which is useful if the heap is large and dumping the whole heap can be stuck.

Two questions. The first is why jmap is getting stuck or "killed by timer" as you mention below. Shouldn't this be considered a bug and addressed directly. Second question is how useful are intermediate results? How often can users come to reasonable conclusions about heap usage when the data is incomplete. Is there even an indication given of how much of the heap is accounted for in the output?

thanks,

Chris

On 12/12/19 10:22 PM, linzang(臧琳) wrote:
Dear All,

   I want to re-activate the discussion thread about the implementation of parallel and incremental "jmap -histo".
   The goal of these changes is to solve the problem that "jmap -histo" may time out or be killed by a timer when the heap is large. Also, the result of "jmap -histo" is "all or nothing": if it gets killed before it exits, the user gets no information about the heap.
   "Incremental" means that jmap -histo dumps intermediate results while it is iterating the heap, so if it is interrupted, the user still gets some meaningful information.
   "Parallel" aims to speed up the heap iteration with multi-threading.

   Originally I implemented the "incremental dump" so that it writes the intermediate data to a separate file like <IncrementalHisto.dump>, while the final result is saved to another file, <HistoResult.dump>. So when jmap -histo gets interrupted, the user can get information from <IncrementalHisto.dump>, and if jmap -histo works fine, the final result is in <HistoResult.dump>.

   The parallel dump has multiple threads working on the heap iteration, and each thread generates intermediate data periodically.

   The main reason for using a separate file for the incremental dump is the parallel incremental dump implementation: every heap-iteration thread can dump its own data to a separate file, which avoids a file lock.

   However, it seems that the original design might confuse users by producing two or more result files (intermediate results and the final result). So I would like to ask for your help in discussing it:


  1.  For incremental dump without parallel, the intermediate result and the final result are dumped to the same file:

In this case, the intermediate data are generated in the middle of the heap iteration and written to the file <HistoResult.dump> as they are produced. If jmap -histo exits normally, the final result is also dumped to <HistoResult.dump>, and all intermediate data are then flushed.


  2.  For parallel dump without incremental:
Every thread generates its own thread-local dump buffer, and all thread-local dumps are merged and written to the <HistoResult.dump> file at the end.
There is no incremental support, so the result is "all or nothing".


  3.  For parallel + incremental dump, I think it's a little complicated because of the intermediate data processing:

     *   Every thread has its own thread-local intermediate data buffer, and all the thread-local buffers are written to the <HistoResult.dump> file while holding a file lock. So only one data file is generated, and if jmap -histo is interrupted, the intermediate data are saved in the same file.

The problem is that the file write lock can be heavy, which may make the parallel heap dump slow.


     *   Every thread has its own thread-local intermediate data buffer, and every thread saves its result in a temp file named <IntermediatedResult_[tid].dump>.

So there is no file lock, and the parallel dump can be fast. But the problem is that multiple files are generated to save the thread-local intermediate results, and this might confuse the user.


     *   Every thread has its own thread-local intermediate data buffer, and a separate "data-merging-thread" is created.

Each parallel thread writes data to its thread-local buffer and enqueues the buffer when the data reach some threshold. The "data-merging-thread" consumes the queue, merges the data from the different threads, and saves the merged data to the result file.

In this case, only one <HistoResult.dump> file is generated. No file lock is needed, but there is a queue lock and a separate "merging thread" implementation. Do you think this is a reasonable solution?
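
The third option (thread-local buffers plus a merging thread) could look roughly like the Java sketch below. It only illustrates the queueing pattern and is not the proposed HotSpot implementation: the class MergingHistoSketch, the use of class-name lists as stand-ins for heap regions, and the in-memory result map (instead of rewriting <HistoResult.dump>) are all assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration of the "data-merging-thread" design.
class MergingHistoSketch {
    // Sentinel a worker enqueues once it has finished its part of the heap.
    private static final Map<String, Long> DONE = new HashMap<>();

    // Each shard stands in for the objects one worker thread iterates over.
    static Map<String, Long> histogram(List<List<String>> shards, int threshold) {
        BlockingQueue<Map<String, Long>> queue = new LinkedBlockingQueue<>();
        Map<String, Long> merged = new HashMap<>();

        // The merging thread is the only writer of 'merged', so no file or
        // result lock is needed; the queue is the only synchronization point.
        Thread merger = new Thread(() -> {
            int finished = 0;
            try {
                while (finished < shards.size()) {
                    Map<String, Long> part = queue.take();
                    if (part == DONE) {
                        finished++;
                    } else {
                        part.forEach((k, v) -> merged.merge(k, v, Long::sum));
                        // A real tool would write the merged snapshot to the result file here.
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        merger.start();

        List<Thread> workers = new ArrayList<>();
        for (List<String> shard : shards) {
            Thread worker = new Thread(() -> {
                Map<String, Long> local = new HashMap<>();
                int pending = 0;
                for (String klass : shard) {
                    local.merge(klass, 1L, Long::sum);
                    if (++pending >= threshold) { // hand over the buffer and start a fresh one
                        queue.add(local);
                        local = new HashMap<>();
                        pending = 0;
                    }
                }
                if (!local.isEmpty()) {
                    queue.add(local);
                }
                queue.add(DONE);
            });
            workers.add(worker);
            worker.start();
        }

        try {
            for (Thread worker : workers) {
                worker.join();
            }
            merger.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while building histogram", e);
        }
        return merged;
    }
}
```

If the process is interrupted, whatever the merging thread has already consumed is the intermediate result, which is the property the incremental mode is after.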

So may I ask for your suggestions?

Details of previous discussion can be found at https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-June/028276.html

Thanks!

BRs,
Lin



From fairoz.matte at oracle.com  Mon Dec 16 02:47:43 2019
From: fairoz.matte at oracle.com (Fairoz Matte)
Date: Sun, 15 Dec 2019 18:47:43 -0800 (PST)
Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if
 prelink is enabled
In-Reply-To: <b6ddd41d-a1f8-4cf2-1df4-da7c2dd93302@oracle.com>
References: <de7f162a-34ed-4b8a-a7da-13aed83ecf8b@default>
 <0c5f8a55-e7e6-ba87-58a8-e6a7edd55f01@oracle.com>
 <d360d727-b6e7-4f5f-a9c1-0f37727c7cce@default>
 <b6ddd41d-a1f8-4cf2-1df4-da7c2dd93302@oracle.com>
Message-ID: <4376860c-173f-4954-aba2-dae39986dbf2@default>

Thanks Chris,

> -----Original Message-----
> From: Chris Plummer
> Sent: Friday, December 13, 2019 11:09 PM
> To: Fairoz Matte <fairoz.matte at oracle.com>; serviceability-
> dev at openjdk.java.net
> Subject: Re: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if
> prelink is enabled
> 
> Looks good.
> 
> thanks,
> 
> Chris
> 
> On 12/13/19 5:02 AM, Fairoz Matte wrote:
> > Hi Chris,
> >
> > Thanks for the review,
> >
> > Please find webrev.01, which uses an INVALID_LOAD_ADDRESS macro for -1L.
> > I have also added a ZERO_LOAD_ADDRESS macro for 0x0L.
> > http://cr.openjdk.java.net/~fmatte/8235637/webrev.01/
> >
> > Thanks,
> > Fairoz
> >
> > -----Original Message-----
> > From: Chris Plummer
> > Sent: Thursday, December 12, 2019 9:48 PM
> > To: Fairoz Matte <fairoz.matte at oracle.com>; serviceability-
> dev at openjdk.java.net
> > Subject: Re: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if
> prelink is enabled
> >
> > Can you use a macro for -1L? Maybe INVALID_LOAD_ADDRESS or
> LOAD_ADDRESS_ERROR.
> >
> > Chris
> >
> > On 12/11/19 7:10 PM, Fairoz Matte wrote:
> >> Hi,
> >>
> >> Please review this small change,
> >> Updating error handling, to make sure "lib_base_diff = 0" is still a valid
> scenario even after calc_prelinked_load_address() call.
> >>
> >> JBS - https://bugs.openjdk.java.net/browse/JDK-8235637
> >> Webrev - http://cr.openjdk.java.net/~fmatte/8235637/webrev.00/
> >>
> >> This patch is provided by Yasumasa Suenaga
> >>
> >> Thanks,
> >> Fairoz
> 

From ioi.lam at oracle.com  Mon Dec 16 06:21:24 2019
From: ioi.lam at oracle.com (Ioi Lam)
Date: Sun, 15 Dec 2019 22:21:24 -0800
Subject: RFR(S) 8235970 [TESTBUG] Remove dependency of sun.tools.jar from
 RedefineClassHelper
Message-ID: <faba6e1f-bfe1-46d7-d4b3-c594135d4026@oracle.com>

https://bugs.openjdk.java.net/browse/JDK-8235970
http://cr.openjdk.java.net/~iklam/jdk15/8235970-RedefineClassHelper-no-sun-tools-jar.v01/

test/lib/RedefineClassHelper.java uses the internal sun.tools.jar.Main class
directly, causing a warning from javac. As a result, all tests that use
RedefineClassHelper need to have this line for the additional module 
dependency.

    @modules jdk.jartool/sun.tools.jar

The fix is to rewrite RedefineClassHelper to use ClassFileInstaller instead.

I removed "@modules jdk.jartool/sun.tools.jar" for all users of
RedefineClassHelper, except for the following (which use
sun.tools.jar.Main directly).

    test/hotspot/jtreg/serviceability/jvmti/RedefineClasses/*

-----

Tested with hs-tier1, hs-tier2, and hs-tier5-svc, which cover all the affected
test cases.

Thanks
- Ioi


From Alan.Bateman at oracle.com  Mon Dec 16 07:22:07 2019
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Mon, 16 Dec 2019 07:22:07 +0000
Subject: RFR(S) 8235970 [TESTBUG] Remove dependency of sun.tools.jar from
 RedefineClassHelper
In-Reply-To: <faba6e1f-bfe1-46d7-d4b3-c594135d4026@oracle.com>
References: <faba6e1f-bfe1-46d7-d4b3-c594135d4026@oracle.com>
Message-ID: <6e7b0a33-ab1b-6c9c-4e73-b0df4f7e2bc3@oracle.com>

On 16/12/2019 06:21, Ioi Lam wrote:
> :
>
> The fix is to rewrite RedefineClassHelper to use ClassFileInstaller 
> instead.
This looks okay but just to point out that the jar tool can be obtained 
via ToolProvider, e.g.
    ToolProvider jarTool = ToolProvider.findFirst("jar").orElseThrow();

so RedefineClassHelper, or better still ClassFileInstaller, could use 
that for cases where JAR files need to be created or updated in ways 
that would be easier if the jar tool could be used in the test. Avoids 
using some of the prickly APIs in java.util.zip|jar.
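
As a minimal sketch of that suggestion, a test helper could create a JAR through ToolProvider like this. The class name JarToolSketch, the helper names, and the temp-file layout are assumptions for illustration; only ToolProvider.findFirst and ToolProvider.run are the actual java.util.spi API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.spi.ToolProvider;

// Hypothetical helper; not the actual RedefineClassHelper or ClassFileInstaller code.
class JarToolSketch {
    // Packs everything under classesDir into jarFile using the "jar" tool
    // looked up via ToolProvider; returns the tool's exit code (0 on success).
    static int createJar(Path jarFile, Path classesDir) {
        ToolProvider jarTool = ToolProvider.findFirst("jar").orElseThrow();
        return jarTool.run(System.out, System.err,
                "--create",
                "--file", jarFile.toString(),
                "-C", classesDir.toString(), ".");
    }

    // Builds a throwaway input directory and reports whether a JAR was produced.
    static boolean demo() {
        try {
            Path input = Files.createTempDirectory("jar-sketch-in");
            Files.writeString(input.resolve("hello.txt"), "hello");
            Path jar = Files.createTempDirectory("jar-sketch-out").resolve("sketch.jar");
            return createJar(jar, input) == 0 && Files.exists(jar);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo() ? "jar created" : "jar tool failed");
    }
}
```

Because the tool is resolved at run time through the service-provider lookup, the test does not need an `@modules jdk.jartool/sun.tools.jar` line to reach the internal class.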

-Alan

From robbin.ehn at oracle.com  Mon Dec 16 09:47:33 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 16 Dec 2019 10:47:33 +0100
Subject: RFR(s): 8235912: JvmtiBreakpoint remove oops_do and metadata_do
Message-ID: <ae042006-8aa7-e7c8-0c35-7336eda849cd@oracle.com>

Hi all, please review.

 From issue, https://bugs.openjdk.java.net/browse/JDK-8235912:

JvmtiBreakpoints are walked via VMThread oops_do (the breakpoint is in a vm 
operation) before they are installed in the safepoint and after they have been 
installed, walked with JvmtiCurrentBreakpoints::oops_do().
By putting the class holder inside oopStorage there is no need for this.

JvmtiCurrentBreakpoints::metadata_do is not needed because redefine classes 
actually removes the breakpoints before updating them (so there are no 
breakpoints to update).
We can just remove metadata_do.


I also removed some unused code.

Changeset:
http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/

Passes several runs of nsk jvmti/jdi and t1-7.

Thanks, Robbin

From robbin.ehn at oracle.com  Mon Dec 16 10:20:39 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 16 Dec 2019 11:20:39 +0100
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
Message-ID: <2fbd9ac0-1cb0-98c4-c2c4-becd1cf49fef@oracle.com>

Hi Richard, as mentioned it would be better if you could do this with
handshakes, instead of using _suspend_flag (since they are going away).
But I can't think of a way of doing it without blocking safepoints, so we need to
add some more features to handshakes first.
When possible I hope you are willing to move this code to handshakes instead.

You could stop one thread with, e.g.:
class EscapeBarrierSuspendHandshake : public HandshakeClosure {
   Semaphore _is_waiting;
   Semaphore _wait;
   bool _started;
  public:
   EscapeBarrierSuspendHandshake()
     : HandshakeClosure("EscapeBarrierSuspend"), _wait(0), _started(false) { }
   void do_thread(Thread* th) {
     _is_waiting.signal();
     _wait.wait();
     Atomic::store(&_started, true);
   }
   void wait_until_eb_stopped() { _is_waiting.wait(); }
   void start_thread() {
     _wait.signal();
     while(!Atomic::load(&_started)) {
       os::naked_yield();
     }
   }
};

But it would block safepoints.
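
For readers less familiar with HotSpot internals, the two-semaphore rendezvous used in the closure above can be mirrored in plain Java. This is only an analogy under stated assumptions: the class SuspendRendezvousSketch and its method names are hypothetical, and nothing here models handshakes or safepoints themselves.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical Java analogy of the stop/resume protocol; not HotSpot code.
class SuspendRendezvousSketch {
    private final Semaphore isWaiting = new Semaphore(0);
    private final Semaphore resume = new Semaphore(0);
    private final AtomicBoolean started = new AtomicBoolean(false);

    // Runs on the target thread: announce the stop, then block until resumed.
    void doThread() {
        isWaiting.release();
        resume.acquireUninterruptibly();
        started.set(true);
    }

    // Runs on the requesting thread: returns once the target has reached the stop point.
    void waitUntilStopped() {
        isWaiting.acquireUninterruptibly();
    }

    // Runs on the requesting thread: let the target continue and wait until it has.
    void startThread() {
        resume.release();
        while (!started.get()) {
            Thread.yield();
        }
    }

    boolean hasRestarted() {
        return started.get();
    }
}
```

A requester would call waitUntilStopped(), do its work while the target is held, and then call startThread().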

Thanks, Robbin

On 12/10/19 10:45 PM, Reingruber, Richard wrote:
> Hi,
> 
> I would like to get reviews please for
> 
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
> 
> Corresponding RFE:
> https://bugs.openjdk.java.net/browse/JDK-8227745
> 
> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
> 
> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues (thanks!). In addition the
> change is being tested at SAP since I posted the first RFR some months ago.
> 
> The intention of this enhancement is to benefit performance wise from escape analysis even if JVMTI
> agents request capabilities that allow them to access local variable values. E.g. if you start-up
> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape analysis is disabled right
> from the beginning, well before a debugger attaches -- if ever one should do so. With the
> enhancement, escape analysis will remain enabled until and after a debugger attaches. EA based
> optimizations are reverted just before an agent acquires the reference to an object. In the JBS item
> you'll find more details.
> 
> Thanks,
> Richard.
> 
> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>      http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
> 

From coleen.phillimore at oracle.com  Mon Dec 16 11:41:05 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 16 Dec 2019 06:41:05 -0500
Subject: [14] RFR 8235829: graal crashes with Zombie.java test
Message-ID: <c78fce63-6dda-c190-0346-981804a10493@oracle.com>

Summary: Start ServiceThread before compiler threads, and run nmethod 
barriers for zgc before adding to the service thread queue, or posting 
the events on the java thread queue.

See bug for description of the problems found with the new Zombie.java test.

open webrev at http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8235829

Ran tier1 all platforms, and tier2-8 testing, as well as rerunning 
original test failure from bug 
https://bugs.openjdk.java.net/browse/JDK-8173361.

Thanks,
Coleen

From ralf.schmelter at sap.com  Mon Dec 16 12:27:58 2019
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Mon, 16 Dec 2019 12:27:58 +0000
Subject: RFR (M) 8234510: Remove file seeking requirement for writing a
 heap dump
In-Reply-To: <AM0PR02MB4500B68C26D844003956EC339F580@AM0PR02MB4500.eurprd02.prod.outlook.com>
References: <AM0PR02MB45008C66EC315E9836F7FF7A9F4A0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <CAA-vtUxV40m+-qQ0N7SZmOe9FotFfyr1snQGe0T5pK50LKmJXg@mail.gmail.com>
 <AM0PR02MB4500B68C26D844003956EC339F580@AM0PR02MB4500.eurprd02.prod.outlook.com>
Message-ID: <AM0PR02MB450094D67F68E9168B6293D39F510@AM0PR02MB4500.eurprd02.prod.outlook.com>

I forgot to post the updated webrev:
http://cr.openjdk.java.net/~rschmelter/webrevs/8234510/webrev.2/

In addition to the changes requested by Thomas, I also renamed the entries in the heap dump segment from entries to sub-records, since that is what they are called in the comment describing the format.

Best regards,
Ralf


From coleen.phillimore at oracle.com  Mon Dec 16 12:32:56 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 16 Dec 2019 07:32:56 -0500
Subject: RFR(s): 8235912: JvmtiBreakpoint remove oops_do and metadata_do
In-Reply-To: <ae042006-8aa7-e7c8-0c35-7336eda849cd@oracle.com>
References: <ae042006-8aa7-e7c8-0c35-7336eda849cd@oracle.com>
Message-ID: <c0094eb6-710f-19b6-981c-e83511ad0461@oracle.com>


I have to think about this. Could there be breakpoints in old emcp 
methods that we do not remove? The metadata_do function is trying to 
keep old Methods from being deleted while there are still references to 
them.

http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/src/hotspot/share/prims/jvmtiImpl.hpp.udiff.html 


+ oop* _class_holder; // keeps _method memory from being deallocated


We created the class OopHandle to encapsulate strong oopStorage 
references, although it's missing oop_store. Can you use that?

Coleen

On 12/16/19 4:47 AM, Robbin Ehn wrote:
> Hi all, please review.
>
> From issue, https://bugs.openjdk.java.net/browse/JDK-8235912:
>
> JvmtiBreakpoints are walked via VMThread oops_do (the breakpoint is in 
> a vm operation) before they are installed in the safepoint and after 
> they have been installed, walked with JvmtiCurrentBreakpoints::oops_do().
> By putting the class holder inside oopStorage there is no need for this.
>
> JvmtiCurrentBreakpoints::metadata_do is not needed because redefine 
> classes actually removes the breakpoints before updating them (so 
> there are no breakpoints to update).
> We can just remove metadata_do.
>
>
> I also removed some unused code.
>
> Changeset:
> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/
>
> Passes several runs of nsk jvmti/jdi and t1-7.
>
> Thanks, Robbin


From david.holmes at oracle.com  Mon Dec 16 13:04:34 2019
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 16 Dec 2019 23:04:34 +1000
Subject: [14] RFR 8235829: graal crashes with Zombie.java test
In-Reply-To: <c78fce63-6dda-c190-0346-981804a10493@oracle.com>
References: <c78fce63-6dda-c190-0346-981804a10493@oracle.com>
Message-ID: <447990f9-d276-56c3-aebe-fa84d0cfcc27@oracle.com>

Hi Coleen,

On 16/12/2019 9:41 pm, coleen.phillimore at oracle.com wrote:
> Summary: Start ServiceThread before compiler threads, and run nmethod 
> barriers for zgc before adding to the service thread queue, or posting 
> the events on the java thread queue.

I can't comment on most of this but the earlier starting of the service 
thread has some concerns:

- there is a lot of JDK level initialization which now will not have 
happened before the service thread is started and it is far from obvious 
that all possible initialization dependencies will be satisfied

- current starting of the service thread in Management::initialize is 
guarded by "#if INCLUDE_MANAGEMENT", but now you are starting the 
service thread unconditionally for all builds. Hmm just saw your latest 
comment to the bug report - so the service thread is now (for quite some 
time?) being used for other than management tasks and so should always 
be present even if INCLUDE_MANAGEMENT is not enabled. Is that sufficient 
or are there likely to be other changes needed to actually ensure that 
all works correctly, e.g. any code the service thread executes that is 
only defined for INCLUDE_MANAGEMENT will need to be compiled out explicitly.

- the service thread and the notification thread are (were?) closely 
related but now started at completely different times

The bug report states the problem as:

"The graal crash is because compiled_method_load events are added to the 
ServiceThread's deferred event queue before the ServiceThread is created 
so are not walked to keep them from being zombied."

so why isn't the solution to ensure the deferred event queue is walked? 
I'm not clear how starting the service thread relates to walking the queue.

Thanks,
David

> See bug for description of the problems found with the new Zombie.java 
> test.
> 
> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8235829
> 
> Ran tier1 all platforms, and tier2-8 testing, as well as rerunning 
> original test failure from bug 
> https://bugs.openjdk.java.net/browse/JDK-8173361.
> 
> Thanks,
> Coleen

From robbin.ehn at oracle.com  Mon Dec 16 13:13:18 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 16 Dec 2019 14:13:18 +0100
Subject: RFR(s): 8235912: JvmtiBreakpoint remove oops_do and metadata_do
In-Reply-To: <c0094eb6-710f-19b6-981c-e83511ad0461@oracle.com>
References: <ae042006-8aa7-e7c8-0c35-7336eda849cd@oracle.com>
 <c0094eb6-710f-19b6-981c-e83511ad0461@oracle.com>
Message-ID: <e803bda4-c57c-6eec-45e7-c2a6028d3ae6@oracle.com>

Hi Coleen, in VM_RedefineClasses::doit:

This updates the breakpoints:
   MetadataOnStackMark md_on_stack(/*walk_all_metadata*/true,
                                   /*redefinition_walk*/true);

And this removes breakpoints:
   for (int i = 0; i < _class_count; i++) {
     redefine_single_class(_class_defs[i].klass, _scratch_classes[i], thread);
   }

So we can skip updating them, since we remove them right after updating them anyway.
But you are the expert here. Let me know if there is something I missed.

OopHandle just adds more code.

Thanks for having a look, Robbin

On 12/16/19 1:32 PM, coleen.phillimore at oracle.com wrote:
> 
> I have to think about this. Could there be breakpoints in old emcp methods 
> that we do not remove? The metadata_do function is trying to keep old Methods 
> from being deleted while there are still references to them.
> 
> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/src/hotspot/share/prims/jvmtiImpl.hpp.udiff.html 
> 
> 
> + oop* _class_holder; // keeps _method memory from being deallocated
> 
> 
> We created the class OopHandle to encapsulate strong oopStorage references, 
> although it's missing oop_store. Can you use that?


> 
> Coleen
> 
> On 12/16/19 4:47 AM, Robbin Ehn wrote:
>> Hi all, please review.
>>
>> From issue, https://bugs.openjdk.java.net/browse/JDK-8235912:
>>
>> JvmtiBreakpoints are walked via VMThread oops_do (the breakpoint is in a vm 
>> operation) before they are installed in the safepoint and after they have been 
>> installed, walked with JvmtiCurrentBreakpoints::oops_do().
>> By putting the class holder inside oopStorage there is no need for this.
>>
>> JvmtiCurrentBreakpoints::metadata_do is not needed because redefine classes 
>> actually removes the breakpoints before updating them (so there are no 
>> breakpoints to update).
>> We can just remove metadata_do.
>>
>>
>> I also removed some unused code.
>>
>> Changeset:
>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/
>>
>> Passes several runs of nsk jvmti/jdi and t1-7.
>>
>> Thanks, Robbin
> 

From coleen.phillimore at oracle.com  Mon Dec 16 13:26:58 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 16 Dec 2019 08:26:58 -0500
Subject: [14] RFR 8235829: graal crashes with Zombie.java test
In-Reply-To: <447990f9-d276-56c3-aebe-fa84d0cfcc27@oracle.com>
References: <c78fce63-6dda-c190-0346-981804a10493@oracle.com>
 <447990f9-d276-56c3-aebe-fa84d0cfcc27@oracle.com>
Message-ID: <975b6e46-95c1-8a37-b160-b6a57a9633a8@oracle.com>


On 12/16/19 8:04 AM, David Holmes wrote:
> Hi Coleen,
>
> On 16/12/2019 9:41 pm, coleen.phillimore at oracle.com wrote:
>> Summary: Start ServiceThread before compiler threads, and run nmethod 
>> barriers for zgc before adding to the service thread queue, or 
>> posting the events on the java thread queue.
>
> I can't comment on most of this but the earlier starting of the 
> service thread has some concerns:
>
> - there is a lot of JDK level initialization which now will not have 
> happened before the service thread is started and it is far from 
> obvious that all possible initialization dependencies will be satisfied

I agree that the order of initialization is very sensitive.  Of the 
actions that the service thread does, the one that I found was a problem 
was that events were posted before the LIVE phase (see comment in 
has_events()), which could have happened with the existing code, but the 
window for the race is a lot smaller.  The other actions can be run if 
there's a GC before initialization but would be a bug in the 
initialization code, and I didn't find these bugs in all my testing.  
There are some ordering dependencies that do have odd side effects 
(between the compiler thread startup and initialization of the jsr292 
classes) which have comments.  This patch doesn't touch those.

>
> - current starting of the service thread in Management::initialize is 
> guarded by "#if INCLUDE_MANAGEMENT", but now you are starting the 
> service thread unconditionally for all builds. Hmm just saw your 
> latest comment to the bug report - so the service thread is now (for 
> quite some time?) being used for other than management tasks and so 
> should always be present even if INCLUDE_MANAGEMENT is not enabled. Is 
> that sufficient or are there likely to be other changes needed to 
> actually ensure that all works correctly? e.g. any code the service 
> thread executes that is only defined for INCLUDE_MANAGEMENT will need 
> to be compiled out explicitly.
>

I asked Jie offline to check the minimal build.  I don't think there are 
other INCLUDE_MANAGEMENT actions in the service thread and I'm not sure 
why it was initialized there in the first place.  The minimal vm would 
have been broken, i.e. hashtables would not have been cleaned up, etc, but 
I'm not sure how well that is tested or if one would notice.
> - the service thread and the notification thread are (were?) closely 
> related but now started at completely different times

The notification thread is limited to "services" so it makes sense where 
it is.  The ServiceThread does lots of other things.  Maybe it needs 
renaming in 15.
>
> The bug report states the problem as:
>
> "The graal crash is because compiled_method_load events are added to 
> the ServiceThread's deferred event queue before the ServiceThread is 
> created so are not walked to keep them from being zombied."
>
> so why isn't the solution to ensure the deferred event queue is 
> walked? I'm not clear how starting the service thread relates to 
> walking the queue.
>

The service thread is responsible for walking the deferred event 
queue.  See ServiceThread::oops_do/nmethods_do.  The design could be 
changed to have some global walk somewhere of this queue, but 
essentially this queue is processed by the service thread.

I had an additional change to make the queue non-static but want to 
limit the change at this point.
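
To make the dependency concrete, here is a toy model of it. This is not 
HotSpot code: NMethod, DeferredEventQueue, ServiceThreadModel and 
survives_sweep are made-up names for illustration, and the real sweeper 
and ServiceThread::nmethods_do are more involved. It only shows that an 
entry on the queue is kept alive if, and only if, the thread that owns 
the walk has been started.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Toy stand-in for a compiled method; 'marked' means a root walk reached it.
struct NMethod {
  bool marked = false;
};

// Toy deferred event queue: holds references to nmethods awaiting event posting.
class DeferredEventQueue {
  std::vector<NMethod*> _events;
 public:
  void enqueue(NMethod* nm) {
    assert(nm != nullptr);
    _events.push_back(nm);
  }
  // Apply f to every nmethod referenced from the queue.
  void nmethods_do(const std::function<void(NMethod*)>& f) const {
    for (NMethod* nm : _events) f(nm);
  }
};

// Toy service thread: in this model it is the only party that walks the queue.
class ServiceThreadModel {
  const DeferredEventQueue& _queue;
 public:
  explicit ServiceThreadModel(const DeferredEventQueue& q) : _queue(q) {}
  void nmethods_do(const std::function<void(NMethod*)>& f) const {
    _queue.nmethods_do(f);
  }
};

// Toy sweep: mark everything reachable from the started threads, then report
// whether nm was reached. An unreached nmethod would be made a zombie.
bool survives_sweep(NMethod& nm,
                    const std::vector<const ServiceThreadModel*>& started_threads) {
  nm.marked = false;
  for (const ServiceThreadModel* t : started_threads) {
    t->nmethods_do([](NMethod* m) { m->marked = true; });
  }
  return nm.marked;
}
```

With an event enqueued but no service thread in the started list, 
survives_sweep() returns false; once the service thread is in the list, 
the same entry is reached and survives.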

Thanks,
Coleen
> Thanks,
> David
>
>> See bug for description of the problems found with the new 
>> Zombie.java test.
>>
>> open webrev at 
>> http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8235829
>>
>> Ran tier1 all platforms, and tier2-8 testing, as well as rerunning 
>> original test failure from bug 
>> https://bugs.openjdk.java.net/browse/JDK-8173361.
>>
>> Thanks,
>> Coleen


From richard.reingruber at sap.com  Mon Dec 16 13:41:49 2019
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Mon, 16 Dec 2019 13:41:49 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <2fbd9ac0-1cb0-98c4-c2c4-becd1cf49fef@oracle.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <2fbd9ac0-1cb0-98c4-c2c4-becd1cf49fef@oracle.com>
Message-ID: <DB7PR02MB3612D26A0522C4B17B924E9A9B510@DB7PR02MB3612.eurprd02.prod.outlook.com>

Hi Robbin,

first of all: thanks a lot for providing feedback. I do appreciate it.

I am absolutely willing to move this to handshakes. Only I still can't see how to achieve it.

Could you explain the drafted class EscapeBarrierSuspendHandshake a little bit? [1]

I'd like to look at it by example of JvmtiEnv::GetOwnedMonitorStackDepthInfo() where calling_thread
T1 would apply it on another thread T2.

1. L13: is wait_until_eb_stopped to be called by T1 to wait until T2 cannot move anymore?

2. Handshakes between two threads are synchronous, correct? If so, then T1 will block handshaking
   T2, because either T2 or the VMThread will block in L10.

I cannot figure out what you mean by this. Only if a helper thread H handshook T2 could T1
continue and call wait_until_eb_stopped(). But on returning from there, T1 would block if reallocating
objects triggered a GC, or when attempting to execute the vm operation in
JvmtiEnv::GetOwnedMonitorStackDepthInfo().

It might be impossible to replace my suspend flag with handshakes that are available today, because
if it were possible, you could replace all the suspend flags right away, couldn't you?

Or I'm simply missing something... quite possible... :)

Thanks, Richard.

[1] Drafted by Robbin (thanks!)

     1	class EscapeBarrierSuspendHandshake : public HandshakeClosure {
     2	  Semaphore _is_waiting;
     3	  Semaphore _wait;
     4	  bool _started;
     5	public:
     6	  EscapeBarrierSuspendHandshake() : HandshakeClosure("EscapeBarrierSuspend"),
     7	  _wait(0), _started(false) { }
     8	  void do_thread(Thread* th) {
     9	    _is_waiting.signal();
    10	    _wait.wait();
    11	    Atomic::store(&_started, true);
    12	  }
    13	  void wait_until_eb_stopped() { _is_waiting.wait(); }
    14	  void start_thread() {
    15	    _wait.signal();
    16	    while(!Atomic::load(&_started)) {
    17	      os::naked_yield();
    18	    }
    19	  }
    20	};

-----Original Message-----
From: Robbin Ehn <robbin.ehn at oracle.com> 
Sent: Monday, 16 December 2019 11:21
To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard, as mentioned it would be better if you could do this with
handshakes, instead of using _suspend_flag (since they are going away).
But I can't think of a way of doing it without blocking safepoints, so we need to
add some more features in handshakes first.
When possible I hope you are willing to move this code to handshakes instead.

You could stop one thread with, e.g.:
class EscapeBarrierSuspendHandshake : public HandshakeClosure {
   Semaphore _is_waiting;
   Semaphore _wait;
   bool _started;
  public:
   EscapeBarrierSuspendHandshake() : HandshakeClosure("EscapeBarrierSuspend"), 
_wait(0), _started(false) { }
   void do_thread(Thread* th) {
     _is_waiting.signal();
     _wait.wait();
     Atomic::store(&_started, true);
   }
   void wait_until_eb_stopped() { _is_waiting.wait(); }
   void start_thread() {
     _wait.signal();
     while(!Atomic::load(&_started)) {
       os::naked_yield();
     }
   }
};

But it would block safepoints.

Thanks, Robbin

On 12/10/19 10:45 PM, Reingruber, Richard wrote:
> Hi,
> 
> I would like to get reviews please for
> 
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
> 
> Corresponding RFE:
> https://bugs.openjdk.java.net/browse/JDK-8227745
> 
> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
> 
> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues (thanks!). In addition the
> change is being tested at SAP since I posted the first RFR some months ago.
> 
> The intention of this enhancement is to benefit performance wise from escape analysis even if JVMTI
> agents request capabilities that allow them to access local variable values. E.g. if you start-up
> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape analysis is disabled right
> from the beginning, well before a debugger attaches -- if ever one should do so. With the
> enhancement, escape analysis will remain enabled until and after a debugger attaches. EA based
> optimizations are reverted just before an agent acquires the reference to an object. In the JBS item
> you'll find more details.
> 
> Thanks,
> Richard.
> 
> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>      http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
> 

From kevin.walls at oracle.com  Mon Dec 16 15:13:36 2019
From: kevin.walls at oracle.com (Kevin Walls)
Date: Mon, 16 Dec 2019 15:13:36 +0000
Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if
 prelink is enabled
In-Reply-To: <d360d727-b6e7-4f5f-a9c1-0f37727c7cce@default>
References: <de7f162a-34ed-4b8a-a7da-13aed83ecf8b@default>
 <0c5f8a55-e7e6-ba87-58a8-e6a7edd55f01@oracle.com>
 <d360d727-b6e7-4f5f-a9c1-0f37727c7cce@default>
Message-ID: <e1beb99a-cbd3-5254-6ee7-0b007537abb2@oracle.com>

Nice to know the difference between something that is zero, and 
something that has failed. 8-)

...oops, that says INAVLID_LOAD_ADDRESS

..not INVALID, so other than the typo yes it looks good.
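
For anyone skimming the thread, a minimal sketch of the zero-versus-failed 
distinction. The two macro values are the ones described in the webrev.01 
note quoted below (spelled here without the typo); calc_load_address_sketch() 
and resolve_lib_base_diff() are made-up helpers for illustration only, since 
the real calc_prelinked_load_address() signature is not shown in this thread.

```cpp
#include <cassert>

// Values from the review thread: -1L marks failure, 0x0L is a legal result.
#define INVALID_LOAD_ADDRESS (-1L)
#define ZERO_LOAD_ADDRESS 0x0L

// Hypothetical stand-in for calc_prelinked_load_address().
// Returns INVALID_LOAD_ADDRESS when the address cannot be determined.
long calc_load_address_sketch(bool readable, long prelinked_base) {
  if (!readable) return INVALID_LOAD_ADDRESS;
  return prelinked_base;  // may legitimately be ZERO_LOAD_ADDRESS
}

// Returns true on success and stores the result in *lib_base_diff.
// Only the sentinel is treated as an error; zero is accepted.
bool resolve_lib_base_diff(bool readable, long prelinked_base, long* lib_base_diff) {
  assert(lib_base_diff != nullptr);
  long addr = calc_load_address_sketch(readable, prelinked_base);
  if (addr == INVALID_LOAD_ADDRESS) {
    return false;  // a real failure
  }
  *lib_base_diff = addr;
  return true;
}
```

The point is only that the error check compares against the sentinel 
rather than against zero, so "lib_base_diff = 0" stays a valid outcome.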

---
Kevin


On 13/12/2019 13:02, Fairoz Matte wrote:
> Hi Chris,
>
> Thanks for the review,
>
> Please find the webrev.01 with usage of INAVLID_LOAD_ADDRESS macro for -1L.
> I have also added one more macro for ZERO_LOAD_ADDRESS for 0x0L.
> http://cr.openjdk.java.net/~fmatte/8235637/webrev.01/
>
> Thanks,
> Fairoz
>
> -----Original Message-----
> From: Chris Plummer
> Sent: Thursday, December 12, 2019 9:48 PM
> To: Fairoz Matte <fairoz.matte at oracle.com>; serviceability-dev at openjdk.java.net
> Subject: Re: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled
>
> Can you use a macro for -1L? Maybe INAVLID_LOAD_ADDRESS or LOAD_ADDRESS_ERROR.
>
> Chris
>
> On 12/11/19 7:10 PM, Fairoz Matte wrote:
>> Hi,
>>
>> Please review this small change,
>> Updating error handling, to make sure "lib_base_diff = 0" is still a valid scenario even after calc_prelinked_load_address() call.
>>
>> JBS - https://bugs.openjdk.java.net/browse/JDK-8235637
>> Webrev - http://cr.openjdk.java.net/~fmatte/8235637/webrev.00/
>>
>> This patch is provided by Yasumasa Suenaga
>>
>> Thanks,
>> Fairoz

From fairoz.matte at oracle.com  Mon Dec 16 15:36:34 2019
From: fairoz.matte at oracle.com (Fairoz Matte)
Date: Mon, 16 Dec 2019 07:36:34 -0800 (PST)
Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if
 prelink is enabled
In-Reply-To: <e1beb99a-cbd3-5254-6ee7-0b007537abb2@oracle.com>
References: <de7f162a-34ed-4b8a-a7da-13aed83ecf8b@default>
 <0c5f8a55-e7e6-ba87-58a8-e6a7edd55f01@oracle.com>
 <d360d727-b6e7-4f5f-a9c1-0f37727c7cce@default>
 <e1beb99a-cbd3-5254-6ee7-0b007537abb2@oracle.com>
Message-ID: <43fd053e-7fd4-41fe-8bf3-7382cf77ca33@default>

Oh yes,
Thanks Kevin for the review.

Corrected the same - http://cr.openjdk.java.net/~fmatte/8235637/webrev.02

Thanks,
Fairoz

-----Original Message-----
From: Kevin Walls 
Sent: Monday, December 16, 2019 8:44 PM
To: Fairoz Matte <fairoz.matte at oracle.com>; Chris Plummer <chris.plummer at oracle.com>; serviceability-dev at openjdk.java.net
Subject: Re: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if prelink is enabled

Nice to know the difference between something that is zero, and something that has failed. 8-)

...oops, that says INAVLID_LOAD_ADDRESS

..not INVALID, so other than the typo yes it looks good.

---
Kevin


On 13/12/2019 13:02, Fairoz Matte wrote:
> Hi Chris,
>
> Thanks for the review,
>
> Please find the webrev.01 with usage of INAVLID_LOAD_ADDRESS macro for -1L.
> I have also added one more macro for ZERO_LOAD_ADDRESS for 0x0L.
> http://cr.openjdk.java.net/~fmatte/8235637/webrev.01/
>
> Thanks,
> Fairoz
>
> -----Original Message-----
> From: Chris Plummer
> Sent: Thursday, December 12, 2019 9:48 PM
> To: Fairoz Matte <fairoz.matte at oracle.com>; 
> serviceability-dev at openjdk.java.net
> Subject: Re: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work 
> if prelink is enabled
>
> Can you use a macro for -1L? Maybe INAVLID_LOAD_ADDRESS or LOAD_ADDRESS_ERROR.
>
> Chris
>
> On 12/11/19 7:10 PM, Fairoz Matte wrote:
>> Hi,
>>
>> Please review this small change,
>> Updating error handling, to make sure "lib_base_diff = 0" is still a valid scenario even after calc_prelinked_load_address() call.
>>
>> JBS - https://bugs.openjdk.java.net/browse/JDK-8235637
>> Webrev - http://cr.openjdk.java.net/~fmatte/8235637/webrev.00/
>>
>> This patch is provided by Yasumasa Suenaga
>>
>> Thanks,
>> Fairoz

From kevin.walls at oracle.com  Mon Dec 16 16:35:26 2019
From: kevin.walls at oracle.com (Kevin Walls)
Date: Mon, 16 Dec 2019 16:35:26 +0000
Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if
 prelink is enabled
In-Reply-To: <43fd053e-7fd4-41fe-8bf3-7382cf77ca33@default>
References: <de7f162a-34ed-4b8a-a7da-13aed83ecf8b@default>
 <0c5f8a55-e7e6-ba87-58a8-e6a7edd55f01@oracle.com>
 <d360d727-b6e7-4f5f-a9c1-0f37727c7cce@default>
 <e1beb99a-cbd3-5254-6ee7-0b007537abb2@oracle.com>
 <43fd053e-7fd4-41fe-8bf3-7382cf77ca33@default>
Message-ID: <068d2c84-d865-3c78-1b53-eab3f3af660d@oracle.com>

Great! 8-)

On 16/12/2019 15:36, Fairoz Matte wrote:
> Oh yes,
> Thanks Kevin for the review.
>
> Corrected the same - http://cr.openjdk.java.net/~fmatte/8235637/webrev.02
>
> Thanks,
> Fairoz
>

From robbin.ehn at oracle.com  Mon Dec 16 17:20:50 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 16 Dec 2019 18:20:50 +0100
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <DB7PR02MB3612D26A0522C4B17B924E9A9B510@DB7PR02MB3612.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <2fbd9ac0-1cb0-98c4-c2c4-becd1cf49fef@oracle.com>
 <DB7PR02MB3612D26A0522C4B17B924E9A9B510@DB7PR02MB3612.eurprd02.prod.outlook.com>
Message-ID: <9f24ec2c-d737-f9b7-8821-5905264971a7@oracle.com>

Hi Richard,

On 2019-12-16 14:41, Reingruber, Richard wrote:
> Hi Robbin,
> 
> first of all: thanks a lot for providing feedback. I do appreciate it.
> 
> I am absolutely willing to move this to handshakes. Only I still can't see how to achieve it.
> 
> Could you explain the drafted class EscapeBarrierSuspendHandshake a little bit? [1]
> 
> I'd like to look at it by example of JvmtiEnv::GetOwnedMonitorStackDepthInfo() where calling_thread
> T1 would apply it on another thread T2.

Sorry I don't immediately see what issue there is in doing a handshake 
instead of:
VM_GetOwnedMonitorInfo op(this, calling_thread, java_thread, 
owned_monitors_list);

> 
> 1. L13: is wait_until_eb_stopped to be called by T1 to wait until T2 cannot move anymore?
> 
> 2. Handshakes between two threads are synchronous, correct? If so, then T1 will block handshaking
>     T2, because either T2 or the VMThread will block in L10.

Yes, sorry, I forgot/confused myself about asynch handshake.
(I have a test prototype for that, which removes suspend flag)

> 
> I cannot figure out what you mean by this. Only if a helper thread H handshook T2 could T1
> continue and call wait_until_eb_stopped(). But on returning from there, T1 would block if reallocating
> objects triggered a GC, or when attempting to execute the vm operation in
> JvmtiEnv::GetOwnedMonitorStackDepthInfo().
> 
> It might be impossible to replace my suspend flag with handshakes that are available today, because
> if it were possible, you could replace all the suspend flags right away, couldn't you?

So by adding asynch handshakes and a per-thread handshake queue, we can. 
(which this test prototype does)
The issue I'm thinking of is whether we need selective polling first.
Suspend flags are not checked in every transition, e.g. vm->native.
A JVM TI agent doesn't expect to suspend its own thread when suspending
all threads.
(that thread would be suspended when trying to get back to agent code
when it does the vm->native transition)

> 
> Or I'm simply missing something... quite possible... :)

No I think you got it right.

Thanks, Robbin

> 
> Thanks, Richard.
> 
> [1] Drafted by Robbin (thanks!)
> 
>       1	class EscapeBarrierSuspendHandshake : public HandshakeClosure {
>       2	  Semaphore _is_waiting;
>       3	  Semaphore _wait;
>       4	  bool _started;
>       5	public:
>       6	  EscapeBarrierSuspendHandshake() : HandshakeClosure("EscapeBarrierSuspend"),
>       7	  _wait(0), _started(false) { }
>       8	  void do_thread(Thread* th) {
>       9	    _is_waiting.signal();
>      10	    _wait.wait();
>      11	    Atomic::store(&_started, true);
>      12	  }
>      13	  void wait_until_eb_stopped() { _is_waiting.wait(); }
>      14	  void start_thread() {
>      15	    _wait.signal();
>      16	    while(!Atomic::load(&_started)) {
>      17	      os::naked_yield();
>      18	    }
>      19	  }
>      20	};
> 
> -----Original Message-----
> From: Robbin Ehn <robbin.ehn at oracle.com>
> Sent: Monday, 16 December 2019 11:21
> To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
> 
> Hi Richard, as mentioned it would be better if you could do this with
> handshakes, instead of using _suspend_flag (since they are going away).
> But I can't think of a way of doing it without blocking safepoints, so we need to
> add some more features in handshakes first.
> When possible I hope you are willing to move this code to handshakes instead.
> 
> You could stop one thread with, e.g.:
> class EscapeBarrierSuspendHandshake : public HandshakeClosure {
>     Semaphore _is_waiting;
>     Semaphore _wait;
>     bool _started;
>    public:
>     EscapeBarrierSuspendHandshake() : HandshakeClosure("EscapeBarrierSuspend"),
> _wait(0), _started(false) { }
>     void do_thread(Thread* th) {
>       _is_waiting.signal();
>       _wait.wait();
>       Atomic::store(&_started, true);
>     }
>     void wait_until_eb_stopped() { _is_waiting.wait(); }
>     void start_thread() {
>       _wait.signal();
>       while(!Atomic::load(&_started)) {
>         os::naked_yield();
>       }
>     }
> };
> 
> But it would block safepoints.
> 
> Thanks, Robbin
> 
> On 12/10/19 10:45 PM, Reingruber, Richard wrote:
>> Hi,
>>
>> I would like to get reviews please for
>>
>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>
>> Corresponding RFE:
>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>
>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>
>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues (thanks!). In addition the
>> change is being tested at SAP since I posted the first RFR some months ago.
>>
>> The intention of this enhancement is to benefit performance wise from escape analysis even if JVMTI
>> agents request capabilities that allow them to access local variable values. E.g. if you start-up
>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape analysis is disabled right
>> from the beginning, well before a debugger attaches -- if ever one should do so. With the
>> enhancement, escape analysis will remain enabled until and after a debugger attaches. EA based
>> optimizations are reverted just before an agent acquires the reference to an object. In the JBS item
>> you'll find more details.
>>
>> Thanks,
>> Richard.
>>
>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>       http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>>

From ioi.lam at oracle.com  Mon Dec 16 18:40:51 2019
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 16 Dec 2019 10:40:51 -0800
Subject: RFR(S) 8235970 [TESTBUG] Remove dependency of sun.tools.jar from
 RedefineClassHelper
In-Reply-To: <6e7b0a33-ab1b-6c9c-4e73-b0df4f7e2bc3@oracle.com>
References: <faba6e1f-bfe1-46d7-d4b3-c594135d4026@oracle.com>
 <6e7b0a33-ab1b-6c9c-4e73-b0df4f7e2bc3@oracle.com>
Message-ID: <3ca2fd1a-cbde-62e2-e8b6-dcce04742091@oracle.com>

Hi Alan,

Thanks for the review and the tip. I will use ToolProvider for
JDK-8236028 [TESTBUG] Remove dependency of sun.tools.jar from 
appcds/JarBuilder

- Ioi

On 12/15/19 11:22 PM, Alan Bateman wrote:
> On 16/12/2019 06:21, Ioi Lam wrote:
>> :
>>
>> The fix is to rewrite RedefineClassHelper to use ClassFileInstaller 
>> instead.
> This looks okay but just to point out that the jar tool can be 
> obtained via ToolProvider, e.g.
>    ToolProvider jarTool = ToolProvider.findFirst("jar").orElseThrow();
>
> so RedefineClassHelper, or better still ClassFileInstaller, could use 
> that for cases where JAR files need to be created or updated in ways 
> that would be easier if the jar tool could be used in the test. Avoids 
> using some of the prickly APIs in java.util.zip|jar.
>
> -Alan


From harold.seigel at oracle.com  Mon Dec 16 18:48:16 2019
From: harold.seigel at oracle.com (Harold Seigel)
Date: Mon, 16 Dec 2019 13:48:16 -0500
Subject: RFR(T) 8235922: [TESTBUG]TestRecordAttrGenericSig.java and
 TestRecordAttr.java are failing
In-Reply-To: <de7bfa80-8b7d-0585-bacd-30f51e012325@oracle.com>
References: <18eb6f8c-125f-1ec0-e618-8416aea35d9b@oracle.com>
 <bd536ea1-58d4-3e28-66a9-b7eca9322ac7@oracle.com>
 <de7bfa80-8b7d-0585-bacd-30f51e012325@oracle.com>
Message-ID: <adcb6214-a3ca-be69-f65f-fb5b0432b051@oracle.com>

Thanks Serguei!

Harold

On 12/13/2019 4:50 PM, serguei.spitsyn at oracle.com wrote:
> Hi Harold,
>
> +1
>
> Thanks,
> Serguei
>
> On 12/13/19 11:45 AM, Daniel D. Daugherty wrote:
>> On 12/13/19 2:35 PM, Harold Seigel wrote:
>>> Hi,
>>>
>>> Please review this trivial fix to prevent java/lang/instrument/... 
>>> TestRecordAttr.java and TestRecordAttrGenericSig.java from failing.  
>>> The fix replaces hard-wired JDK version 14 with mechanisms that get 
>>> the latest JDK version.
>>>
>>> Open Webrev: 
>>> http://cr.openjdk.java.net/~hseigel/bug_8235922/webrev/index.html
>>
>> test/jdk/java/lang/instrument/RedefineRecordAttr/TestRecordAttr.java
>>     No comments.
>>
>> test/jdk/java/lang/instrument/RedefineRecordAttrGenericSig/TestRecordAttrGenericSig.java 
>>
>>     No comments.
>>
>> Thumbs up! I agree that this is a trivial fix.
>>
>> Thanks for fixing this so quickly!
>>
>> Dan
>>
>>>
>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8235922
>>>
>>> The fix was tested by running the tests locally on Linux-x64.
>>>
>>> Thanks, Harold
>>>
>>
>

From chris.plummer at oracle.com  Mon Dec 16 19:09:10 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 16 Dec 2019 11:09:10 -0800
Subject: Discuss the design of parallel and incremental jmap
 histo.(Internet mail)
In-Reply-To: <F01472E6-8A77-4677-8D50-75E7B980A88B@tencent.com>
References: <72276240-61D3-4101-A435-717F16CC6FC3@tencent.com>
 <ebf1c04a-1b99-7b37-251f-c9c0cb5a16c4@oracle.com>
 <F01472E6-8A77-4677-8D50-75E7B980A88B@tencent.com>
Message-ID: <8b179abc-cda8-0ec7-88f3-5bfe1af78eb3@oracle.com>

On 12/15/19 5:38 PM, linzang(??) wrote:
>
> Dear Chris,
>
> >> why jmap is getting stuck or "killed by timer" as you mention below. 
> Shouldn't this be considered a bug and addressed directly.
>
> This "timer" is usually another process. In my experience it is HDFS and 
> ZKFC: the ZKFC pings its NameNode periodically, and when the 
> NameNode's heap is large (~180GB in my case), the heap iteration by 
> jmap can cause the process to get stuck, so ZKFC cannot get a response from 
> the NameNode, and the NameNode gets killed.
>
This is the first I've heard mentioned of any of these. I assume 
NameNode is the process you are getting the heap dump from, and while 
dumping the heap it can't respond to ZKFC? That still sounds like 
something to me that should be addressed directly, and not worked around 
with the incremental solution. The parallel solution is ok because it 
also has a performance benefit, so if as a side effect it helps prevent 
the timeout issue, then that's ok.
>
> >> How useful are intermediate results? How often can users come to 
> reasonable conclusions about heap usage when the data is incomplete.
>
> From my experience, I usually use jmap -histo to get information 
> about the distribution of objects, and I have found that the object 
> distribution of part of the heap is usually similar to the distribution of 
> the whole heap. I agree that this is not correct for all cases, but 
> since jmap -histo at present gives results only when it exits normally, 
> I think info about a partial heap is better than 
> nothing, especially for memory leak analysis.
>
Ok, but I still think avoiding the need for incremental dumps would be a 
better approach.

thanks,

Chris
>
> >> Is there even an indication given of how much of the heap is 
> accounted for in the output?
>
> Yes, the incremental dump information shows the number of objects 
> and the total bytes that have been iterated.
>
> Thanks!
>
> BRs,
>
> Lin
>
> *From: *Chris Plummer <chris.plummer at oracle.com>
> *Date: *Saturday, December 14, 2019 at 2:46 AM
> *To: *"linzang(??)" <linzang at tencent.com>, 
> "serviceability-dev at openjdk.java.net" 
> <serviceability-dev at openjdk.java.net>
> *Subject: *Re: Discuss the design of parallel and incremental jmap 
> histo.(Internet mail)
>
> Hi Lin,
>
> I have a question regarding the need for incremental support. The CSR 
> states:
>
> Problem: Now, the "JMap -histo" tool can not dump intermediate result, 
> which is useful if the heap is large and dumping the whole heap can be 
> stuck.
>
> Two questions. The first is why jmap is getting stuck or "killed by 
> timer" as you mention below. Shouldn't this be considered a bug and 
> addressed directly. Second question is how useful are intermediate 
> results? How often can users come to reasonable conclusions about heap 
> usage when the data is incomplete. Is there even an indication given 
> of how much of the heap is accounted for in the output?
>
> thanks,
>
> Chris
>
> On 12/12/19 10:22 PM, linzang(??) wrote:
>
>     Dear All,
>
>     I want to re-activate the discussion thread about the
>     implementation of parallel and incremental "jmap -histo".
>
>     The target of these changes is to solve the problem that "jmap
>     -histo" may "time out or be killed by timer" when the heap is large.
>     Also, the result of "jmap -histo" is "all or nothing", which means if it
>     gets killed before exit, the user gets no information about the heap.
>
>     "Incremental" means that jmap -histo dumps intermediate
>     results while it is iterating the heap, so if it is interrupted,
>     the user can still get some meaningful information.
>
>     "Parallel" aims to speed up the heap iteration with
>     multi-threading.
>
>     Originally I implemented the "incremental dump" so that it dumps the
>     intermediate data into a separate file like
>     <IncrementalHisto.dump>, and the final result is saved to
>     another file <HistoResult.dump>. So when jmap -histo gets
>     interrupted, the user can get information from
>     <IncrementalHisto.dump>, and if jmap -histo works fine, the final
>     result is in <HistoResult.dump>.
>
>     The parallel dump has multiple threads working on heap
>     iteration, and each thread generates intermediate data periodically.
>
>     The main reason for using a separate file for the incremental dump is
>     the parallel incremental dump
>     implementation: every heap-iteration thread can dump its
>     own data to a separate file, which avoids using a file lock.
>
>     However, it seems that the original design might confuse users by
>     having two or more result files (intermediate result and final
>     result).  So I want to ask for your help in discussing it:
>
>      1. For incremental dump without parallel, the intermediate result and
>         the final result are dumped to the same file:
>
>     In this case, the intermediate data are generated in the middle of
>     heap iteration and are written to the file <HistoResult.dump> at the
>     same time. If jmap -histo exits normally, the final result
>     is also dumped to <HistoResult.dump>, and then all intermediate
>     data are flushed.
>
>      2. For parallel dump without incremental:
>
>     Every thread generates its own thread-local dump buffer, and all
>     thread-local dumps are merged and written to the <HistoResult.dump>
>     file at the end.
>
>     There is no incremental support, so the result is "all or nothing".
>
>      3. For parallel + incremental dump, I think it's a little
>         complicated because of intermediate data processing:
>
>          1. Every thread has its own thread-local intermediate data
>             buffer, and all the thread-local buffers are written
>             to the <HistoResult.dump> file while holding a file lock. So
>             only one data file is generated, and if jmap -histo
>             is interrupted, the intermediate data are saved in the
>             same file.
>
>     The problem is that the file write lock can be heavy, which may
>     make the parallel heap dump slow.
>
>          2. Every thread has its own thread-local intermediate data
>             buffer, and every thread saves its result in a temp file
>             named <IntermediatedResult_[tid].dump>.
>
>     So there is no file lock, and the parallel dump can be fast. But the
>     problem is that multiple files are generated to save the
>     thread-local intermediate results, and this might confuse the user.
>
>          3. Every thread has its own thread-local intermediate data
>             buffer, and a separate "data-merging thread" is created.
>
>     The parallel threads write data to their thread-local buffers, and
>     enqueue a buffer when its data reaches some threshold. The
>     "data-merging thread" consumes the queue, merges the data from
>     the different threads, and saves the merged data to the result file.
>
>     In this case, only one <HistoResult.dump> file is generated.
>     No file lock is needed, but there is a queue lock and a
>     separate "merging thread" implementation. Do you think this is a
>     reasonable solution?
>
>     So may I ask for your suggestions?
>
>     Details of previous discussion can be found at
>     https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-June/028276.html
>
>     Thanks!
>
>     BRs,
>
>     Lin
>
>
>
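
The merge step that options 2 and 3 in the quoted proposal rely on can be 
sketched roughly as follows. This is only an illustration in plain C++, not 
jmap or HotSpot code: ClassStats, Histogram, record_object, merge_and_clear 
and bytes_covered are hypothetical names, and the worker threads, the queue 
and all locking are deliberately left out, so each "worker buffer" here is 
just a local map.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// One histogram row: instance count and total bytes for a class.
struct ClassStats {
  std::size_t instances = 0;
  std::size_t bytes = 0;
};

// Class name -> stats. A worker's local buffer and the merged result
// share this shape.
using Histogram = std::map<std::string, ClassStats>;

// Record one visited object in a worker-local buffer.
void record_object(Histogram& local, const std::string& klass, std::size_t size) {
  assert(size > 0);  // every heap object occupies some space
  ClassStats& s = local[klass];
  s.instances += 1;
  s.bytes += size;
}

// Fold a worker buffer into the running total, then empty the buffer so the
// worker can keep filling it. After each call, 'total' is a valid partial
// histogram that could be written out as an intermediate result.
void merge_and_clear(Histogram& total, Histogram& local) {
  for (const auto& entry : local) {
    ClassStats& s = total[entry.first];
    s.instances += entry.second.instances;
    s.bytes += entry.second.bytes;
  }
  local.clear();
}

// Bytes accounted for so far, i.e. how much of the heap the partial
// histogram covers.
std::size_t bytes_covered(const Histogram& total) {
  std::size_t sum = 0;
  for (const auto& entry : total) sum += entry.second.bytes;
  return sum;
}
```

Because the merge is a plain per-class addition, the order in which worker 
buffers arrive does not matter, which is what would let a single merging 
thread own the output file.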


From chris.plummer at oracle.com  Mon Dec 16 19:12:24 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 16 Dec 2019 11:12:24 -0800
Subject: RFR: 8235637: jhsdb jmap from OpenJDK 11.0.5 doesn't work if
 prelink is enabled
In-Reply-To: <068d2c84-d865-3c78-1b53-eab3f3af660d@oracle.com>
References: <de7f162a-34ed-4b8a-a7da-13aed83ecf8b@default>
 <0c5f8a55-e7e6-ba87-58a8-e6a7edd55f01@oracle.com>
 <d360d727-b6e7-4f5f-a9c1-0f37727c7cce@default>
 <e1beb99a-cbd3-5254-6ee7-0b007537abb2@oracle.com>
 <43fd053e-7fd4-41fe-8bf3-7382cf77ca33@default>
 <068d2c84-d865-3c78-1b53-eab3f3af660d@oracle.com>
Message-ID: <3306a3c9-c26c-d72e-983d-3f7e4c890abe@oracle.com>

+1

On 12/16/19 8:35 AM, Kevin Walls wrote:
> Great! 8-)
>
> On 16/12/2019 15:36, Fairoz Matte wrote:
>> Oh yes,
>> Thanks Kevin for the review.
>>
>> Corrected the same - 
>> http://cr.openjdk.java.net/~fmatte/8235637/webrev.02
>>
>> Thanks,
>> Fairoz
>>


From coleen.phillimore at oracle.com  Mon Dec 16 20:21:31 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 16 Dec 2019 15:21:31 -0500
Subject: RFR(s): 8235912: JvmtiBreakpoint remove oops_do and metadata_do
In-Reply-To: <e803bda4-c57c-6eec-45e7-c2a6028d3ae6@oracle.com>
References: <ae042006-8aa7-e7c8-0c35-7336eda849cd@oracle.com>
 <c0094eb6-710f-19b6-981c-e83511ad0461@oracle.com>
 <e803bda4-c57c-6eec-45e7-c2a6028d3ae6@oracle.com>
Message-ID: <e07b16a0-a2ca-6683-8a69-529fc41fd842@oracle.com>


On 12/16/19 8:13 AM, Robbin Ehn wrote:
> Hi Coleen, in VM_RedefineClasses::doit:
>
> This updates the breakpoints:
>   MetadataOnStackMark md_on_stack(/*walk_all_metadata*/true,
>                                   /*redefinition_walk*/true);
>
> And this removes breakpoints:
>   for (int i = 0; i < _class_count; i++) {
>     redefine_single_class(_class_defs[i].klass, _scratch_classes[i], thread);
>   }
>
> So we can skip the update, since the breakpoints are removed right after
> they are updated.
> But you are the expert here. Let me know if there is something I missed.
>

No, you are correct. The JVMTI spec says that the breakpoints are all
deleted.  I'm remembering code that sets/clears breakpoints that has to
walk emcp methods, and set them there also.  But redefinition does clear
them.

If the old Method* is still executing or referenced somehow, the other
metadata walking would find it anyway.  So maybe this was never needed.

> OopHandle just adds more code.
>

It doesn't.  And if we want to make all native memory never point
directly to oops and point to oopStorage instead, having some
encapsulation makes that easier.  It also means that we don't have
to stare at oop* in data structures and wonder if we're going to miss
the mumble-fratz access and decorators that we need.  For example:

http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.udiff.html

+ NativeAccess<>::oop_store(_class_holder, class_holder_oop);


This should probably be:

+ NativeAccess<IS_DEST_UNINITIALIZED>::oop_store(handle, obj);


You can leave out using OopHandle.  I have a patch to add the missing
functionality and add it to your code.  Actually, I was looking to see
how much OopHandle is used to see if it's helping anything and there is
a lot of code using it.  Most of it is to hide oop* in ClassLoaderData.

This change otherwise looks great.
Thanks,
Coleen


> Thanks for having a look, Robbin
>
> On 12/16/19 1:32 PM, coleen.phillimore at oracle.com wrote:
>>
>> I have to think about this.  Could there be breakpoints in old emcp
>> methods that we do not remove?  The metadata_do function is trying
>> to keep old Methods from being deleted while there are still
>> references to them.
>>
>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/src/hotspot/share/prims/jvmtiImpl.hpp.udiff.html 
>>
>>
>> + oop* _class_holder; // keeps _method memory from being deallocated
>>
>>
>> We created the class OopHandle to encapsulate strong oopStorage
>> references, although it's missing oop_store.  Can you use that?
>
>
>>
>> Coleen
>>
>> On 12/16/19 4:47 AM, Robbin Ehn wrote:
>>> Hi all, please review.
>>>
>>> From issue, https://bugs.openjdk.java.net/browse/JDK-8235912:
>>>
>>> JvmtiBreakpoints are walked via VMThread oops_do (the breakpoint is
>>> in a vm operation) before they are installed in the safepoint and,
>>> after they have been installed, walked with
>>> JvmtiCurrentBreakpoints::oops_do().
>>> By putting the class holder inside oopStorage there is no need for
>>> this.
>>>
>>> JvmtiCurrentBreakpoints::metadata_do is not needed because redefine
>>> classes actually removes the breakpoints before updating them (so
>>> there are no breakpoints to update).
>>> We can just remove metadata_do.
>>>
>>>
>>> I also removed some unused code.
>>>
>>> Changeset:
>>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/
>>>
>>> Passes several runs of nsk jvmti/jdi and t1-7.
>>>
>>> Thanks, Robbin
>>


From david.holmes at oracle.com  Mon Dec 16 22:51:00 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 17 Dec 2019 08:51:00 +1000
Subject: [14] RFR 8235829: graal crashes with Zombie.java test
In-Reply-To: <975b6e46-95c1-8a37-b160-b6a57a9633a8@oracle.com>
References: <c78fce63-6dda-c190-0346-981804a10493@oracle.com>
 <447990f9-d276-56c3-aebe-fa84d0cfcc27@oracle.com>
 <975b6e46-95c1-8a37-b160-b6a57a9633a8@oracle.com>
Message-ID: <06d6e9f4-b919-3062-7d18-1245e66b27b2@oracle.com>

Hi Coleen,

A quick initial response ...

On 16/12/2019 11:26 pm, coleen.phillimore at oracle.com wrote:
> 
> 
> On 12/16/19 8:04 AM, David Holmes wrote:
>> Hi Coleen,
>>
>> On 16/12/2019 9:41 pm, coleen.phillimore at oracle.com wrote:
>>> Summary: Start ServiceThread before compiler threads, and run nmethod 
>>> barriers for zgc before adding to the service thread queue, or 
>>> posting the events on the java thread queue.
>>
>> I can't comment on most of this but the earlier starting of the 
>> service thread has some concerns:
>>
>> - there is a lot of JDK level initialization which now will not have 
>> happened before the service thread is started and it is far from 
>> obvious that all possible initialization dependencies will be satisfied
> 
> I agree that the order of initialization is very sensitive.  From the
> actions that the service thread does, the one that I found was a problem
> was that events were posted before the LIVE phase (see comment in
> has_events()), which could have happened with the existing code, but the
> window for the race is a lot smaller.  The other actions can be run if
> there's a GC before initialization but would be a bug in the
> initialization code, and I didn't find these bugs in all my testing.
> There are some ordering dependencies that do have odd side effects
> (between the compiler thread startup and initialization of jsr292
> classes) which have comments.  This patch doesn't touch those.
> 
>>
>> - current starting of the service thread in Management::initialize is 
>> guarded by "#if INCLUDE_MANAGEMENT", but now you are starting the 
>> service thread unconditionally for all builds. Hmm just saw your 
>> latest comment to the bug report - so the service thread is now (for 
>> quite some time?) being used for other than management tasks and so 
>> should always be present even if INCLUDE_MANAGEMENT is not enabled. Is 
>> that sufficient or are there likely to be other changes needed to 
>> actually ensure that all works correctly? e.g. any code the service 
>> thread executes that is only defined for INCLUDE_MANAGEMENT will need 
>> to be compiled out explicitly.
>>
> 
> I asked Jie offline to check the minimal build.  I don't think there are
> other INCLUDE_MANAGEMENT actions in the service thread and I'm not sure
> why it was initialized there in the first place.  The minimal vm would
> have been broken, i.e. hashtables would not have been cleaned up, etc.,
> but I'm not sure how well that is tested or if one would notice.
>> - the service thread and the notification thread are (were?) closely 
>> related but now started at completely different times
> 
> The notification thread is limited to "services" so it makes sense where
> it is.  The ServiceThread does lots of other things.  Maybe it needs
> renaming in 15.
>>
>> The bug report states the problem as:
>>
>> "The graal crash is because compiled_method_load events are added to 
>> the ServiceThread's deferred event queue before the ServiceThread is 
>> created so are not walked to keep them from being zombied."
>>
>> so why isn't the solution to ensure the deferred event queue is 
>> walked? I'm not clear how starting the service thread relates to 
>> walking the queue.
>>
> 
> The service thread is responsible for walking the deferred event
> queue.  See ServiceThread::oops_do/nmethods_do.  The design could be
> changed to have some global walk somewhere of this queue, but
> essentially this queue is processed by the service thread.

Sorry I don't follow. I thought "oops_do" and friends are for the GC 
threads and/or VMThread to call to process oops when GC updates them.

David
-----

> I had an additional change to make the queue non-static but want to 
> limit the change at this point.
> 
> Thanks,
> Coleen
>> Thanks,
>> David
>>
>>> See bug for description of the problems found with the new 
>>> Zombie.java test.
>>>
>>> open webrev at 
>>> http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8235829
>>>
>>> Ran tier1 all platforms, and tier2-8 testing, as well as rerunning 
>>> original test failure from bug 
>>> https://bugs.openjdk.java.net/browse/JDK-8173361.
>>>
>>> Thanks,
>>> Coleen
> 

From serguei.spitsyn at oracle.com  Mon Dec 16 23:33:31 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 16 Dec 2019 15:33:31 -0800
Subject: RFR(S) 8235970 [TESTBUG] Remove dependency of sun.tools.jar from
 RedefineClassHelper
In-Reply-To: <6e7b0a33-ab1b-6c9c-4e73-b0df4f7e2bc3@oracle.com>
References: <faba6e1f-bfe1-46d7-d4b3-c594135d4026@oracle.com>
 <6e7b0a33-ab1b-6c9c-4e73-b0df4f7e2bc3@oracle.com>
Message-ID: <6d09f46e-d253-c37d-dad7-e048f6aafb47@oracle.com>

Hi Ioi,

It looks good.
It is nice to get rid of an unneeded dependency.

Thanks,
Serguei


On 12/15/19 23:22, Alan Bateman wrote:
> On 16/12/2019 06:21, Ioi Lam wrote:
>> :
>>
>> The fix is to rewrite RedefineClassHelper to use ClassFileInstaller 
>> instead.
> This looks okay but just to point out that the jar tool can be 
> obtained via ToolProvider, e.g.
>    ToolProvider jarTool = ToolProvider.findFirst("jar").orElseThrow();
>
> so RedefineClassHelper, or better still ClassFileInstaller, could use 
> that for cases where JAR files need to be created or updated in ways 
> that would be easier if the jar tool could be used in the test. Avoids 
> using some of the prickly APIs in java.util.zip|jar.
>
> -Alan
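
As a rough illustration of Alan's suggestion, the sketch below obtains the jar tool through ToolProvider and uses it to create a JAR from a directory. The class name JarToolSketch, the createJar helper, and the temporary paths are hypothetical and are not taken from RedefineClassHelper or ClassFileInstaller; it assumes a JDK that ships the jar tool.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.spi.ToolProvider;

class JarToolSketch {
    /** Packs everything under classesDir into jarFile; returns the jar tool's exit code. */
    static int createJar(Path jarFile, Path classesDir) {
        // Look up the in-process "jar" tool instead of depending on sun.tools.jar.
        ToolProvider jarTool = ToolProvider.findFirst("jar")
                .orElseThrow(() -> new IllegalStateException("jar tool not available"));
        return jarTool.run(System.out, System.err,
                "--create", "--file", jarFile.toString(),
                "-C", classesDir.toString(), ".");
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical input: a temp directory with a single file in it.
        Path dir = Files.createTempDirectory("jar-sketch-in");
        Files.writeString(dir.resolve("hello.txt"), "hello");
        Path jar = Files.createTempDirectory("jar-sketch-out").resolve("sketch.jar");
        System.out.println("jar exit code: " + createJar(jar, dir));
    }
}
```

A zero exit code indicates the archive was written; a test helper would likely want to fail fast on any other value.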


From coleen.phillimore at oracle.com  Tue Dec 17 02:40:57 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 16 Dec 2019 21:40:57 -0500
Subject: [14] RFR 8235829: graal crashes with Zombie.java test
In-Reply-To: <06d6e9f4-b919-3062-7d18-1245e66b27b2@oracle.com>
References: <c78fce63-6dda-c190-0346-981804a10493@oracle.com>
 <447990f9-d276-56c3-aebe-fa84d0cfcc27@oracle.com>
 <975b6e46-95c1-8a37-b160-b6a57a9633a8@oracle.com>
 <06d6e9f4-b919-3062-7d18-1245e66b27b2@oracle.com>
Message-ID: <39716b03-dd5e-580e-2c91-004da64bcc19@oracle.com>


Short answer below.

On 12/16/19 5:51 PM, David Holmes wrote:
> Hi Coleen,
>
> A quick initial response ...
>
> On 16/12/2019 11:26 pm, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 12/16/19 8:04 AM, David Holmes wrote:
>>> Hi Coleen,
>>>
>>> On 16/12/2019 9:41 pm, coleen.phillimore at oracle.com wrote:
>>>> Summary: Start ServiceThread before compiler threads, and run 
>>>> nmethod barriers for zgc before adding to the service thread queue, 
>>>> or posting the events on the java thread queue.
>>>
>>> I can't comment on most of this but the earlier starting of the 
>>> service thread has some concerns:
>>>
>>> - there is a lot of JDK level initialization which now will not have 
>>> happened before the service thread is started and it is far from 
>>> obvious that all possible initialization dependencies will be satisfied
>>
>> I agree that the order of initialization is very sensitive. From the
>> actions that the service thread does, the one that I found was a
>> problem was that events were posted before the LIVE phase (see
>> comment in has_events()), which could have happened with the existing
>> code, but the window for the race is a lot smaller.  The other
>> actions can be run if there's a GC before initialization but would be
>> a bug in the initialization code, and I didn't find these bugs in all
>> my testing. There are some ordering dependencies that do have odd
>> side effects (between the compiler thread startup and initialization
>> of jsr292 classes) which have comments.  This patch doesn't touch those.
>>
>>>
>>> - current starting of the service thread in Management::initialize 
>>> is guarded by "#if INCLUDE_MANAGEMENT", but now you are starting the 
>>> service thread unconditionally for all builds. Hmm just saw your 
>>> latest comment to the bug report - so the service thread is now (for 
>>> quite some time?) being used for other than management tasks and so 
>>> should always be present even if INCLUDE_MANAGEMENT is not enabled. 
>>> Is that sufficient or are there likely to be other changes needed to 
>>> actually ensure that all works correctly? e.g. any code the service 
>>> thread executes that is only defined for INCLUDE_MANAGEMENT will 
>>> need to be compiled out explicitly.
>>>
>>
>> I asked Jie offline to check the minimal build.  I don't think there
>> are other INCLUDE_MANAGEMENT actions in the service thread and I'm
>> not sure why it was initialized there in the first place.  The
>> minimal vm would have been broken, i.e. hashtables would not have been
>> cleaned up, etc., but I'm not sure how well that is tested or if one
>> would notice.
>>> - the service thread and the notification thread are (were?) closely 
>>> related but now started at completely different times
>>
>> The notification thread is limited to "services" so it makes sense
>> where it is.  The ServiceThread does lots of other things.  Maybe it
>> needs renaming in 15.
>>>
>>> The bug report states the problem as:
>>>
>>> "The graal crash is because compiled_method_load events are added to 
>>> the ServiceThread's deferred event queue before the ServiceThread is 
>>> created so are not walked to keep them from being zombied."
>>>
>>> so why isn't the solution to ensure the deferred event queue is 
>>> walked? I'm not clear how starting the service thread relates to 
>>> walking the queue.
>>>
>>
>> The service thread is responsible for walking the deferred event
>> queue.  See ServiceThread::oops_do/nmethods_do.  The design could
>> be changed to have some global walk somewhere of this queue, but
>> essentially this queue is processed by the service thread.
>
> Sorry I don't follow. I thought "oops_do" and friends are for the GC 
> threads and/or VMThread to call to process oops when GC updates them.

The oops_do and nmethods_do() can be called by a thread walk in
handshakes (by the sweeper thread) and by parallel GC thread walks.
There isn't a single entry to do the thread-specific closures that we
need to do for these deferred event queues.  I tried a version that
walked the queues with a static call but missed some places where it
would be needed to make this call (didn't work).  Keeping this
associated with the ServiceThread simplifies a lot.

thanks,
Coleen

>
> David
> -----
>
>> I had an additional change to make the queue non-static but want to 
>> limit the change at this point.
>>
>> Thanks,
>> Coleen
>>> Thanks,
>>> David
>>>
>>>> See bug for description of the problems found with the new 
>>>> Zombie.java test.
>>>>
>>>> open webrev at 
>>>> http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8235829
>>>>
>>>> Ran tier1 all platforms, and tier2-8 testing, as well as rerunning 
>>>> original test failure from bug 
>>>> https://bugs.openjdk.java.net/browse/JDK-8173361.
>>>>
>>>> Thanks,
>>>> Coleen
>>


From linzang at tencent.com  Tue Dec 17 02:57:08 2019
From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=)
Date: Tue, 17 Dec 2019 02:57:08 +0000
Subject: Discuss the design of parallel and incremental jmap histo.
In-Reply-To: <8b179abc-cda8-0ec7-88f3-5bfe1af78eb3@oracle.com>
References: <72276240-61D3-4101-A435-717F16CC6FC3@tencent.com>
 <ebf1c04a-1b99-7b37-251f-c9c0cb5a16c4@oracle.com>
 <F01472E6-8A77-4677-8D50-75E7B980A88B@tencent.com>
 <8b179abc-cda8-0ec7-88f3-5bfe1af78eb3@oracle.com>
Message-ID: <C00B6677-2BD8-42E0-A6F3-7AEC00744891@tencent.com>

Dear Chris,
    I think I can first make the patch for parallel jmap. It seems to me that if the parallel version is fast enough, there is no need for the incremental one. So I will not work on it until I find new cases that show it is necessary; then we can discuss it again.
    Thanks!

BRs,
Lin


On Dec 17, 2019, at 3:09 AM, Chris Plummer <chris.plummer at oracle.com> wrote:

On 12/15/19 5:38 PM, linzang(臧琳) wrote:

Dear Chris,

>> why jmap is getting stuck or "killed by timer" as you mention below. Shouldn't this be considered a bug and addressed directly.

This "timer" is usually another process. In my experience with HDFS and ZKFC, the ZKFC pings its NameNode periodically, and when the NameNode's heap is large (~180GB in my case), the heap iteration by jmap can cause the process to get stuck, so the ZKFC cannot get a response from the NameNode, and the NameNode gets killed.

This is the first I've heard any of this mentioned. I assume NameNode is the process you are getting the heap dump from, and while dumping the heap it can't respond to ZKFC? That still sounds to me like something that should be addressed directly, and not worked around with the incremental solution. The parallel solution is ok because it also has a performance benefit, so if as a side effect it helps prevent the timeout issue, then that's ok.

>>  How useful are intermediate results? How often can users come to reasonable conclusions about heap usage when the data is incomplete.

From my experience, I usually use jmap -histo to get information about the distribution of objects, and I have found that the object distribution of part of the heap is usually similar to the distribution of the whole heap. I agree that this does not hold in all cases, but since jmap -histo currently gives results only when it exits normally, I think information about a partial heap is better than nothing, especially for memory leak analysis.

Ok, but I still think avoiding the need for incremental dumps would be a better approach.

thanks,

Chris

>> Is there even an indication given of how much of the heap is accounted for in the output?

Yes, the incremental dump information shows the number of objects and the total bytes that have been iterated.

Thanks!

BRs,

Lin

*From: *Chris Plummer <chris.plummer at oracle.com>
*Date: *Saturday, December 14, 2019 at 2:46 AM
*To: *"linzang(臧琳)" <linzang at tencent.com>, "serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
*Subject: *Re: Discuss the design of parallel and incremental jmap histo.(Internet mail)

Hi Lin,

I have a question regarding the need for incremental support. The CSR states:

Problem: Now, the "JMap -histo" tool can not dump intermediate result, which is useful if the heap is large and dumping the whole heap can be stuck.

Two questions. The first is why jmap is getting stuck or "killed by timer" as you mention below. Shouldn't this be considered a bug and addressed directly. Second question is how useful are intermediate results? How often can users come to reasonable conclusions about heap usage when the data is incomplete. Is there even an indication given of how much of the heap is accounted for in the output?

thanks,

Chris

On 12/12/19 10:22 PM, linzang(臧琳) wrote:

   Dear All,

      I want to re-activate the thread of discussion about the
   implementation of parallel and incremental "jmap -histo".

      The goal of these changes is to solve the problem that "jmap
   -histo" may "time out or be killed by a timer" when the heap is
   large. And the result of "jmap -histo" is "all or nothing", which
   means that if it gets killed before it exits, the user gets no
   information about the heap.

      "Incremental" means that jmap -histo dumps intermediate
   results while it is iterating the heap, so if it is interrupted,
   the user can still get some meaningful information.

      "Parallel" aims to speed up the heap iteration with
   multi-threading.

   Originally I implemented the "incremental dump" so that it dumps the
   intermediate data into a separate file such as
   <IncrementalHisto.dump>, while the final result is saved to
   another file, <HistoResult.dump>. So when jmap -histo gets
   interrupted, the user can get information from
   <IncrementalHisto.dump>, and if jmap -histo works fine, the final
   result is in <HistoResult.dump>.

      The parallel dump has multiple threads working on heap
   iteration, each of which periodically generates intermediate data.

      The main reason for using a separate file for the incremental dump
   is the parallel incremental dump implementation: every
   heap-iteration thread can dump its own data to a separate file,
   which avoids using a file lock.

   However, it seems that the original design might confuse the user by
   producing two or more result files (intermediate result and final
   result).  So I would like to ask for your help in discussing it:

    1. For incremental dump without parallel, the intermediate result and
       the final result are dumped to the same file:

   In this case, the intermediate data are generated in the middle of
   heap iteration and written to the file <HistoResult.dump> at the
   same time. If jmap -histo exits normally, the final result
   is also dumped to <HistoResult.dump>, and all intermediate
   data are flushed.

    2. For parallel dump without incremental:

   Every thread generates its own thread-local dump buffer, and all
   thread-local dumps are merged and written to the <HistoResult.dump>
   file at the end.

   There is no incremental support, so the result is "all or nothing".

    3. For parallel + incremental dump, I think it's a little
       complicated because of the intermediate data processing:

        1. Every thread has its own thread-local intermediate data
           buffer, and all the thread-local buffers are written
           to the <HistoResult.dump> file while holding a file lock. So
           only one data file is generated, and if jmap -histo
           is interrupted, the intermediate data are saved in the
           same file.

   The problem is that the file write lock can be heavy, which may
   make the parallel heap dump slow.

        2. Every thread has its own thread-local intermediate data
           buffer, and every thread saves its result in a temp file
           named <IntermediatedResult_[tid].dump>.

   So there is no file lock, and the parallel dump can be fast. But the
   problem is that multiple files are generated to hold the
   thread-local intermediate results, which might confuse the user.

        3. Every thread has its own thread-local intermediate data
           buffer, and a separate "data-merging-thread" is created.

   Each parallel thread writes data to its thread-local buffer and
   enqueues the buffer when the data reaches some threshold. The
   "data-merging-thread" consumes the queue, merges the data from the
   different threads, and saves the merged data to the result file.

   In this case, only one <HistoResult.dump> file is generated.
   No file lock is needed, but there is a queue lock and a
   separate "merging thread" implementation. Do you think this is a
   reasonable solution?

   So may I ask for your suggestions?

   Details of previous discussion can be found at
   https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-June/028276.html

   Thanks!

   BRs,

   Lin
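
The last option in the proposal (thread-local buffers plus a dedicated merging thread) could be sketched roughly as follows. This is only a toy model in Java, not HotSpot code: the class name HistoMergeSketch, the use of class-name strings in place of heap objects, the threshold parameter, and the empty-map end marker are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class HistoMergeSketch {
    // Hypothetical end-of-stream marker: each worker enqueues it once when done.
    private static final Map<String, Long> DONE = new HashMap<>();

    /** Counts class names across slices; each slice is scanned by its own worker thread. */
    static Map<String, Long> histogram(List<List<String>> heapSlices, int threshold)
            throws InterruptedException {
        BlockingQueue<Map<String, Long>> queue = new LinkedBlockingQueue<>();
        Map<String, Long> merged = new HashMap<>();
        int workerCount = heapSlices.size();

        // The merging thread is the only one that touches 'merged'.
        Thread merger = new Thread(() -> {
            int finished = 0;
            try {
                while (finished < workerCount) {
                    Map<String, Long> buffer = queue.take();
                    if (buffer == DONE) {
                        finished++;
                    } else {
                        buffer.forEach((name, count) -> merged.merge(name, count, Long::sum));
                        // A real tool would write the merged snapshot to the result file here.
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        merger.start();

        List<Thread> workers = new ArrayList<>();
        for (List<String> slice : heapSlices) {
            Thread worker = new Thread(() -> {
                Map<String, Long> local = new HashMap<>();
                int pending = 0;
                for (String className : slice) {
                    local.merge(className, 1L, Long::sum);
                    if (++pending >= threshold) {   // hand the buffer over at the threshold
                        queue.add(local);
                        local = new HashMap<>();
                        pending = 0;
                    }
                }
                if (!local.isEmpty()) {
                    queue.add(local);
                }
                queue.add(DONE);
            });
            workers.add(worker);
            worker.start();
        }

        for (Thread worker : workers) {
            worker.join();
        }
        merger.join();
        return merged;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, Long> result = histogram(
                List.of(List.of("A", "B", "A"), List.of("B", "C")), 2);
        System.out.println(new TreeMap<>(result)); // prints {A=2, B=2, C=1}
    }
}
```

The only shared structure is the queue, so the workers never contend on the output; the cost is the extra thread and the queue's own locking, which is the trade-off raised in the proposal.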


From linzang at tencent.com  Tue Dec 17 03:18:38 2019
From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=)
Date: Tue, 17 Dec 2019 03:18:38 +0000
Subject: Jhsdb jmap --heap print large value of MaxMetaspaceSize 
Message-ID: <6EDC6A6E-668A-4E7B-847C-E9155F6E6598@tencent.com>

Dear All, 
     I found that jhsdb jmap --heap prints the maximum unsigned 64-bit value in megabytes (17592186044415 MB) when MaxMetaspaceSize is not set by the user. This number confused me a little.
     I also found that jcmd VM.metaspace prints "unlimited" if MaxMetaspaceSize is not set, which seems more reasonable.
     So do you think it is OK if I make "jhsdb jmap" print the same "unlimited" value as jcmd does for MaxMetaspaceSize?

BRs,
Lin 
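
To make the number concrete: 17592186044415 is what results from dividing an all-ones 64-bit value by 1 MB. The sketch below reproduces that arithmetic and shows the kind of special-casing being proposed. The class and method names are hypothetical and this is not the SA code; it also assumes the unset flag is represented as all 64 bits set.

```java
class MetaspaceLimitFormat {
    static final long MB = 1024L * 1024L;

    /** Formats a raw MaxMetaspaceSize value, treating all-bits-set as "no limit". */
    static String formatMaxMetaspaceSize(long rawBytes) {
        if (rawBytes == -1L) {              // 0xFFFFFFFFFFFFFFFF when read as unsigned
            return "unlimited";
        }
        return Long.toUnsignedString(Long.divideUnsigned(rawBytes, MB)) + " MB";
    }

    public static void main(String[] args) {
        // What a naive conversion of the unset value looks like:
        System.out.println(Long.toUnsignedString(Long.divideUnsigned(-1L, MB)) + " MB");
        // With the special case:
        System.out.println(formatMaxMetaspaceSize(-1L));
        System.out.println(formatMaxMetaspaceSize(256 * MB));
    }
}
```

The first line printed is 17592186044415 MB, matching the value reported above; the other two are "unlimited" and "256 MB".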

From david.holmes at oracle.com  Tue Dec 17 04:04:16 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 17 Dec 2019 14:04:16 +1000
Subject: [14] RFR 8235829: graal crashes with Zombie.java test
In-Reply-To: <39716b03-dd5e-580e-2c91-004da64bcc19@oracle.com>
References: <c78fce63-6dda-c190-0346-981804a10493@oracle.com>
 <447990f9-d276-56c3-aebe-fa84d0cfcc27@oracle.com>
 <975b6e46-95c1-8a37-b160-b6a57a9633a8@oracle.com>
 <06d6e9f4-b919-3062-7d18-1245e66b27b2@oracle.com>
 <39716b03-dd5e-580e-2c91-004da64bcc19@oracle.com>
Message-ID: <c92a9b6e-71b6-c97e-fed5-5b2616052032@oracle.com>

Clarification ...

On 17/12/2019 12:40 pm, coleen.phillimore at oracle.com wrote:
> 
> Short answer below.
> 
> On 12/16/19 5:51 PM, David Holmes wrote:
>> Hi Coleen,
>>
>> A quick initial response ...
>>
>> On 16/12/2019 11:26 pm, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>> On 12/16/19 8:04 AM, David Holmes wrote:
>>>> Hi Coleen,
>>>>
>>>> On 16/12/2019 9:41 pm, coleen.phillimore at oracle.com wrote:
>>>>> Summary: Start ServiceThread before compiler threads, and run 
>>>>> nmethod barriers for zgc before adding to the service thread queue, 
>>>>> or posting the events on the java thread queue.
>>>>
>>>> I can't comment on most of this but the earlier starting of the 
>>>> service thread has some concerns:
>>>>
>>>> - there is a lot of JDK level initialization which now will not have 
>>>> happened before the service thread is started and it is far from 
>>>> obvious that all possible initialization dependencies will be satisfied
>>>
>>> I agree that the order of initialization is very sensitive. From the
>>> actions that the service thread does, the one that I found was a
>>> problem was that events were posted before the LIVE phase (see
>>> comment in has_events()), which could have happened with the existing
>>> code, but the window for the race is a lot smaller.  The other
>>> actions can be run if there's a GC before initialization but would be
>>> a bug in the initialization code, and I didn't find these bugs in all
>>> my testing. There are some ordering dependencies that do have odd
>>> side effects (between the compiler thread startup and initialization
>>> of jsr292 classes) which have comments.  This patch doesn't touch those.
>>>
>>>>
>>>> - current starting of the service thread in Management::initialize 
>>>> is guarded by "#if INCLUDE_MANAGEMENT", but now you are starting the 
>>>> service thread unconditionally for all builds. Hmm just saw your 
>>>> latest comment to the bug report - so the service thread is now (for 
>>>> quite some time?) being used for other than management tasks and so 
>>>> should always be present even if INCLUDE_MANAGEMENT is not enabled. 
>>>> Is that sufficient or are there likely to be other changes needed to 
>>>> actually ensure that all works correctly? e.g. any code the service 
>>>> thread executes that is only defined for INCLUDE_MANAGEMENT will 
>>>> need to be compiled out explicitly.
>>>>
>>>
>>> I asked Jie offline to check the minimal build.  I don't think there
>>> are other INCLUDE_MANAGEMENT actions in the service thread and I'm
>>> not sure why it was initialized there in the first place.  The
>>> minimal vm would have been broken, i.e. hashtables would not have been
>>> cleaned up, etc., but I'm not sure how well that is tested or if one
>>> would notice.
>>>> - the service thread and the notification thread are (were?) closely 
>>>> related but now started at completely different times
>>>
>>> The notification thread is limited to "services" so it makes sense
>>> where it is.  The ServiceThread does lots of other things.  Maybe it
>>> needs renaming in 15.
>>>>
>>>> The bug report states the problem as:
>>>>
>>>> "The graal crash is because compiled_method_load events are added to 
>>>> the ServiceThread's deferred event queue before the ServiceThread is 
>>>> created so are not walked to keep them from being zombied."
>>>>
>>>> so why isn't the solution to ensure the deferred event queue is 
>>>> walked? I'm not clear how starting the service thread relates to 
>>>> walking the queue.
>>>>
>>>
>>> The service thread is responsible for walking the deferred event
>>> queue.  See ServiceThread::oops_do/nmethods_do.  The design could
>>> be changed to have some global walk somewhere of this queue, but
>>> essentially this queue is processed by the service thread.
>>
>> Sorry I don't follow. I thought "oops_do" and friends are for the GC 
>> threads and/or VMThread to call to process oops when GC updates them.
> 
> The oops_do and nmethods_do() can be called by a thread walk in
> handshakes (by the sweeper thread) and by parallel GC thread walks.
> There isn't a single entry to do the thread-specific closures that we
> need to do for these deferred event queues.  I tried a version that
> walked the queues with a static call but missed some places where it
> would be needed to make this call (didn't work).  Keeping this
> associated with the ServiceThread simplifies a lot.

Just to clarify that further, the thread walk requires that the thread
appear in ALL_JAVA_THREADS, but that only happens after the
ServiceThread has been started. So in essence we don't really need the
ServiceThread to have commenced execution earlier, but we need it to
have been created. Those two steps are combined in practice.

Cheers,
David
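
The point about the thread walk can be illustrated with a toy model. The Java sketch below is not HotSpot code: ToyThread, threadList, and walk() are hypothetical stand-ins for a VM thread with a deferred event queue, the list of threads a walk iterates over, and the walk itself. It only shows that entries queued on a thread are invisible to a walk until that thread is on the list.

```java
import java.util.ArrayList;
import java.util.List;

class ThreadWalkSketch {
    /** Toy stand-in for a VM thread that owns a deferred event queue. */
    static class ToyThread {
        final List<String> deferredEvents = new ArrayList<>();
    }

    /** Toy stand-in for the list of threads that a walk iterates over. */
    static final List<ToyThread> threadList = new ArrayList<>();

    /** Returns every queued event reachable through the registered threads. */
    static List<String> walk() {
        List<String> visited = new ArrayList<>();
        for (ToyThread t : threadList) {
            visited.addAll(t.deferredEvents);
        }
        return visited;
    }

    public static void main(String[] args) {
        ToyThread service = new ToyThread();
        service.deferredEvents.add("compiled_method_load #1"); // queued before registration
        System.out.println("before registration: " + walk());  // the event is not visited
        threadList.add(service);                                // creation registers the thread
        System.out.println("after registration:  " + walk());  // now the event is visited
    }
}
```

In this model it is registration, not execution, that makes the queue reachable, which mirrors the observation that the ServiceThread needs to have been created rather than to have started running.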

> thanks,
> Coleen
> 
>>
>> David
>> -----
>>
>>> I had an additional change to make the queue non-static but want to 
>>> limit the change at this point.
>>>
>>> Thanks,
>>> Coleen
>>>> Thanks,
>>>> David
>>>>
>>>>> See bug for description of the problems found with the new 
>>>>> Zombie.java test.
>>>>>
>>>>> open webrev at 
>>>>> http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev
>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8235829
>>>>>
>>>>> Ran tier1 all platforms, and tier2-8 testing, as well as rerunning 
>>>>> original test failure from bug 
>>>>> https://bugs.openjdk.java.net/browse/JDK-8173361.
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>
> 

From chris.plummer at oracle.com  Tue Dec 17 05:36:44 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 16 Dec 2019 21:36:44 -0800
Subject: [14]RFR(XS): 8236062: Disable clhsdb initialization of SA javascript
 support since it will always fail, and will likely be removed soon
Message-ID: <d3fefe95-d109-4f71-ea2d-c1257d6b93b6@oracle.com>

Hello,

Please review the following:

https://bugs.openjdk.java.net/browse/JDK-8236062
http://cr.openjdk.java.net/~cjplummer/8236062/webrev.00/

Since SA javascript support is broken as described in [1] JDK-8235594, 
and we'll likely remove it, I'd like to at least get it disabled now. 
I'd like to get this into 14 mostly because I really want to get [2] 
JDK-8234048 fixed in 14 because we'll start seeing the clhsdb test 
failures on macos 10.14 and 10.15 more often over the coming months as 
we deploy more macosx test hosts with those versions. However, [2] 
JDK-8234048 is blocked by [3] JDK-8234277 (which improves error checking 
and failure output for the clhsdb tests), and [3] JDK-8234277 is blocked 
by this CR because the exceptions produced when javascript fails to 
initialize end up cluttering the clhsdb test logs, even when the test 
passes (and is misleading when the test fails).

Sorry about all the bug references and inter-dependencies. It's taken a 
while myself to get my head wrapped around how I wanted to approach 
fixing them all in a meaningful order.
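
For anyone trying to follow the ordering, the blocking chain above is a small dependency graph. The sketch below is only an illustration: fix_order is a made-up helper, not part of any JDK tooling, and the edges are just the "is blocked by" statements from this mail (this CR blocks JDK-8234277, which blocks JDK-8234048).

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: each pair is (blocker, blocked), meaning the first
// bug has to be fixed before the second one can be.
std::vector<std::string> fix_order(
        const std::vector<std::pair<std::string, std::string>>& blocks) {
    std::map<std::string, std::vector<std::string>> unblocks;
    std::map<std::string, int> open_blockers;
    for (const auto& edge : blocks) {
        unblocks[edge.first].push_back(edge.second);
        open_blockers[edge.first] += 0;   // make sure every bug has an entry
        open_blockers[edge.second] += 1;
    }

    // Kahn's algorithm: start with the bugs that nothing blocks.
    std::queue<std::string> ready;
    for (const auto& entry : open_blockers) {
        if (entry.second == 0) ready.push(entry.first);
    }

    std::vector<std::string> order;
    while (!ready.empty()) {
        std::string bug = ready.front();
        ready.pop();
        order.push_back(bug);
        for (const auto& next : unblocks[bug]) {
            if (--open_blockers[next] == 0) ready.push(next);
        }
    }

    // If some bug never became ready, the "blocked by" links form a cycle.
    assert(order.size() == open_blockers.size());
    return order;
}
```

Fed the two edges above, this yields this CR first, then JDK-8234277, then JDK-8234048, which is the order proposed here.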

thanks,

Chris

[1] https://bugs.openjdk.java.net/browse/JDK-8235594
[2] https://bugs.openjdk.java.net/browse/JDK-8234048
[3] https://bugs.openjdk.java.net/browse/JDK-8234277

From david.holmes at oracle.com  Tue Dec 17 07:03:00 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 17 Dec 2019 17:03:00 +1000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
Message-ID: <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>

<resend as my mailer crashed during last send>

David

On 17/12/2019 4:57 pm, David Holmes wrote:
> Hi Richard,
> 
> On 14/12/2019 5:01 am, Reingruber, Richard wrote:
>> Hi David,
>>
>>    > Some further queries/concerns:
>>    >
>>    > src/hotspot/share/runtime/objectMonitor.cpp
>>    >
>>    > Can you please explain the changes to ObjectMonitor::wait:
>>    >
>>    > !   _recursions = save      // restore the old recursion count
>>    > !                 + jt->get_and_reset_relock_count_after_wait(); //
>>    > increased by the deferred relock count
>>    >
>>    > what is the "deferred relock count"? I gather it relates to
>>    >
>>    > "The code was extended to be able to deoptimize objects of a frame that
>>    > is not the top frame and to let another thread than the owning thread do
>>    > it."
>>
>> Yes, these relate. Currently EA based optimizations are reverted, when 
>> a compiled frame is replaced
>> with corresponding interpreter frames. Part of this is relocking 
>> objects with eliminated
>> locking. New with the enhancement is that we do this also just before 
>> object references are acquired
>> through JVMTI. In this case we deoptimize also the owning compiled 
>> frame C and we register
>> deoptimized objects as deferred updates. When control returns to C it 
>> gets deoptimized, we notice
>> that objects are already deoptimized (reallocated and relocked), so we 
>> don't do it again (relocking
>> twice would be incorrect of course). Deferred updates are copied into 
>> the new interpreter frames.
>>
>> Problem: relocking is not possible if the target thread T is waiting 
>> on the monitor that needs to be
>> relocked. This happens only with non-local objects with 
>> EliminateNestedLocks. Instead relocking is
>> deferred until T owns the monitor again. This is what the piece of 
>> code above does.
> 
> Sorry I need some more detail here. How can you wait() on an object 
> monitor if the object allocation and/or locking was optimised away? And 
> what is a "non-local object" in this context? Isn't EA restricted to 
> thread-confined objects?
> 
> Is it just that some of the locking gets optimized away e.g.
> 
> synchronized (obj) {
>   synchronized (obj) {
>     synchronized (obj) {
>       obj.wait();
>     }
>   }
> }
> 
> If this is reduced to a form as-if it were a single lock of the monitor 
> (due to EA) and the wait() triggers a JVM TI event which leads to the 
> escape of "obj" then we need to reconstruct the true lock state, and so 
> when the wait() internally unblocks and reacquires the monitor it has to 
> set the true recursion count to 3, not the 1 that it appeared to be when 
> wait() was initially called. Is that the scenario?
> 
> If so I find this truly awful. Anyone using wait() in a realistic form 
> requires a notification and so the object cannot be thread confined. In 
> which case I would strongly argue that upon hitting the wait() the deopt 
> should occur unconditionally and so the lock state is correct before we 
> wait and so we don't need to mess with the recursion count internally 
> when we reacquire the monitor.
> 
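
To make the arithmetic in that scenario concrete, here is a toy model in plain C++. It is a sketch under stated assumptions, not HotSpot code: ToyThread, ToyMonitor and holds_after_wait are invented names, only get_and_reset_relock_count_after_wait is borrowed from the webrev snippet quoted above, and "holds" is a total hold count, which is not necessarily how _recursions is counted inside ObjectMonitor.

```cpp
#include <cassert>

// Toy stand-in for the per-thread bookkeeping; not HotSpot's JavaThread.
struct ToyThread {
    int relock_count_after_wait = 0;  // relocks deferred while the thread waits

    // Return the deferred count and clear it, so it is applied only once.
    int get_and_reset_relock_count_after_wait() {
        int n = relock_count_after_wait;
        relock_count_after_wait = 0;
        return n;
    }
};

// Toy monitor that only tracks how many times its owner holds it.
struct ToyMonitor {
    int holds = 0;  // 0 means unowned

    void enter() { ++holds; }

    void exit() {
        assert(holds > 0);
        --holds;
    }

    // Begin a wait: remember the hold count and release the monitor fully.
    int wait_release() {
        assert(holds > 0);
        int save = holds;
        holds = 0;
        return save;
    }

    // Finish a wait: restore the saved count plus relocks deferred meanwhile.
    void wait_reacquire(int save, ToyThread& t) {
        assert(holds == 0);
        holds = save + t.get_and_reset_relock_count_after_wait();
    }
};

// Replays the scenario: some locks are really taken, the rest were eliminated
// by the compiler and have to be re-established once the object escapes.
int holds_after_wait(int visible_locks, int eliminated_locks) {
    ToyThread t;
    ToyMonitor m;
    for (int i = 0; i < visible_locks; ++i) m.enter();

    int save = m.wait_release();
    // While the thread waits it does not own the monitor, so the eliminated
    // locks cannot be re-taken now; they are recorded on the thread instead.
    t.relock_count_after_wait += eliminated_locks;
    m.wait_reacquire(save, t);

    return m.holds;
}
```

With one visible lock and two eliminated ones, holds_after_wait(1, 2) ends at 3 rather than 1, which is the adjustment being questioned; with nothing deferred the count is restored unchanged.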
>>
>>    > which I don't like the sound of at all when it comes to ObjectMonitor
>>    > state. So I'd like to understand in detail exactly what is going on here
>>    > and why. This is a very intrusive change that seems to badly break
>>    > encapsulation and impacts future changes to ObjectMonitor that are under
>>    > investigation.
>>
>> I would not regard this as breaking encapsulation. Certainly not badly.
>>
>> I've added a property relock_count_after_wait to JavaThread. The 
>> property is well
>> encapsulated. Future ObjectMonitor implementations have to deal with 
>> recursion too. They are free in
>> choosing a way to do that as long as that property is taken into 
>> account. This is hardly a
>> limitation.
> 
> I do think this badly breaks encapsulation as you have to add a callout 
> from the guts of the ObjectMonitor code to reach into the thread to get 
> this lock count adjustment. I understand why you have had to do this but 
> I would much rather see a change to the EA optimisation strategy so that 
> this is not needed.
> 
>> Note also that the property is a straightforward extension of the 
>> existing concept of deferred
>> local updates. It is embedded into the structure holding them. So not 
>> even the footprint of a
>> JavaThread is enlarged if no deferred updates are generated.
>>
>>    > ---
>>    >
>>    > src/hotspot/share/runtime/thread.cpp
>>    >
>>    > Can you please explain why JavaThread::wait_for_object_deoptimization
>>    > has to be handcrafted in this way rather than using proper transitions.
>>    >
>>
>> I wrote wait_for_object_deoptimization taking 
>> JavaThread::java_suspend_self_with_safepoint_check
>> as template. So in short: for the same reasons :)
>>
>> Threads reach both methods as part of thread state transitions, 
>> therefore special handling is
>> required to change thread state on top of ongoing transitions.
>>
>>    > We got rid of "deopt suspend" some time ago and it is disturbing to see
>>    > it being added back (effectively). This seems like it may be something
>>    > that handshakes could be used for.
>>
>> Deopt suspend used to be something rather different with a similar 
>> name[1]. It is not being added back.
> 
> I stand corrected. Despite comments in the code to the contrary 
> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of 
> cleanup in this area 13 years ago :)
> 
>>
>> I'm actually duplicating the existing external suspend mechanism, 
>> because a thread can be suspended
>> at most once. And hey, I don't like that either! But it seems not 
>> unlikely that the duplicate can
>> be removed together with the original and the new type of handshakes 
>> that will be used for
>> thread suspend can be used for object deoptimization too. See today's 
>> discussion in JDK-8227745 [2].
> 
> I hope that discussion bears some fruit, at the moment it seems not to 
> be possible to use handshakes here. :(
> 
> The external suspend mechanism is a royal pain in the proverbial that we 
> have to carefully live with. The idea that we're duplicating that for 
> use in another fringe area of functionality does not thrill me at all.
> 
> To be clear, I understand the problem that exists and that you wish to 
> solve, but for the runtime parts I balk at the complexity cost of 
> solving it.
> 
> Thanks,
> David
> -----
> 
>> Thanks, Richard.
>>
>> [1] Deopt suspend was something like an async. handshake for architectures with register windows,
>>      where patching the return pc for deoptimization of a compiled frame was racy if the owner thread
>>      was in native code. Instead a "deopt" suspend flag was set on which the thread patched its own
>>      frame upon return from native. So no thread was suspended. It got its name only from the name of
>>      the flags.
>>
>> [2] Discussion about using handshakes to sync. with the target thread:
>>      
>> https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727 
>>
>>
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Freitag, 13. Dezember 2019 00:56
>> To: Reingruber, Richard <richard.reingruber at sap.com>; 
>> serviceability-dev at openjdk.java.net; 
>> hotspot-compiler-dev at openjdk.java.net; 
>> hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better 
>> Performance in the Presence of JVMTI Agents
>>
>> Hi Richard,
>>
>> Some further queries/concerns:
>>
>> src/hotspot/share/runtime/objectMonitor.cpp
>>
>> Can you please explain the changes to ObjectMonitor::wait:
>>
>> !   _recursions = save      // restore the old recursion count
>> !                 + jt->get_and_reset_relock_count_after_wait(); //
>> increased by the deferred relock count
>>
>> what is the "deferred relock count"? I gather it relates to
>>
>> "The code was extended to be able to deoptimize objects of a frame that
>> is not the top frame and to let another thread than the owning thread do
>> it."
>>
>> which I don't like the sound of at all when it comes to ObjectMonitor
>> state. So I'd like to understand in detail exactly what is going on here
>> and why. This is a very intrusive change that seems to badly break
>> encapsulation and impacts future changes to ObjectMonitor that are under
>> investigation.
>>
>> ---
>>
>> src/hotspot/share/runtime/thread.cpp
>>
>> Can you please explain why JavaThread::wait_for_object_deoptimization
>> has to be handcrafted in this way rather than using proper transitions.
>>
>> We got rid of "deopt suspend" some time ago and it is disturbing to see
>> it being added back (effectively). This seems like it may be something
>> that handshakes could be used for.
>>
>> Thanks,
>> David
>> -----
>>
>> On 12/12/2019 7:02 am, David Holmes wrote:
>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
>>>> Hi David,
>>>>
>>>>     > Most of the details here are in areas I can comment on in detail, but I
>>>>     > did take an initial general look at things.
>>>>
>>>> Thanks for taking the time!
>>>
>>> Apologies the above should read:
>>>
>>> "Most of the details here are in areas I *can't* comment on in detail 
>>> ..."
>>>
>>> David
>>>
>>>>     > The only thing that jumped out at me is that I think the
>>>>     > DeoptimizeObjectsALotThread should be a hidden thread.
>>>>     >
>>>>     > +  bool is_hidden_from_external_view() const { return true; }
>>>>
>>>> Yes, it should. Will add the method like above.
>>>>
>>>>     > Also I don't see any testing of the DeoptimizeObjectsALotThread. Without
>>>>     > active testing this will just bit-rot.
>>>>
>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
>>>> workload. I will add a minimal test
>>>> to keep it fresh.
>>>>
>>>>     > Also on the tests I don't understand your @requires clause:
>>>>     >
>>>>     >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>     > (vm.opt.TieredCompilation != true))
>>>>     >
>>>>     > This seems to require that TieredCompilation is disabled, but tiered is
>>>>     > our normal mode of operation. ??
>>>>     >
>>>>
>>>> I removed the clause. I guess I wanted to target the tests towards the
>>>> code they are supposed to
>>>> test, and it's easier to analyze failures w/o tiered compilation and
>>>> with just one compiler thread.
>>>>
>>>> Additionally I will make use of
>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
>>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>> -----Original Message-----
>>>> From: David Holmes <david.holmes at oracle.com>
>>>> Sent: Mittwoch, 11. Dezember 2019 08:03
>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
>>>> serviceability-dev at openjdk.java.net;
>>>> hotspot-compiler-dev at openjdk.java.net;
>>>> hotspot-runtime-dev at openjdk.java.net
>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>>> Performance in the Presence of JVMTI Agents
>>>>
>>>> Hi Richard,
>>>>
>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
>>>>> Hi,
>>>>>
>>>>> I would like to get reviews please for
>>>>>
>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>>>>
>>>>> Corresponding RFE:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>>>>
>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>>>>
>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
>>>>> issues (thanks!). In addition the
>>>>> change is being tested at SAP since I posted the first RFR some
>>>>> months ago.
>>>>>
>>>>> The intention of this enhancement is to benefit performance wise from
>>>>> escape analysis even if JVMTI
>>>>> agents request capabilities that allow them to access local variable
>>>>> values. E.g. if you start-up
>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then
>>>>> escape analysis is disabled right
>>>>> from the beginning, well before a debugger attaches -- if ever one
>>>>> should do so. With the
>>>>> enhancement, escape analysis will remain enabled until and after a
>>>>> debugger attaches. EA based
>>>>> optimizations are reverted just before an agent acquires the
>>>>> reference to an object. In the JBS item
>>>>> you'll find more details.
>>>>
>>>> Most of the details here are in areas I can comment on in detail, but I
>>>> did take an initial general look at things.
>>>>
>>>> The only thing that jumped out at me is that I think the
>>>> DeoptimizeObjectsALotThread should be a hidden thread.
>>>>
>>>> +  bool is_hidden_from_external_view() const { return true; }
>>>>
>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. 
>>>> Without
>>>> active testing this will just bit-rot.
>>>>
>>>> Also on the tests I don't understand your @requires clause:
>>>>
>>>>     @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>> (vm.opt.TieredCompilation != true))
>>>>
>>>> This seems to require that TieredCompilation is disabled, but tiered is
>>>> our normal mode of operation. ??
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> Thanks,
>>>>> Richard.
>>>>>
>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch 
>>>>>
>>>>>
>>>>>

From suenaga at oss.nttdata.com  Tue Dec 17 07:36:12 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Tue, 17 Dec 2019 16:36:12 +0900
Subject: [14]RFR(XS): 8236062: Disable clhsdb initialization of SA
 javascript support since it will always fail, and will likely be removed soon
In-Reply-To: <d3fefe95-d109-4f71-ea2d-c1257d6b93b6@oracle.com>
References: <d3fefe95-d109-4f71-ea2d-c1257d6b93b6@oracle.com>
Message-ID: <286e679f-e887-dee3-afe0-b98cd67d9f0a@oss.nttdata.com>

Hi Chris,

Looks good.

BTW do you have any plan to provide alternative(s) on CLHSDB?


Thanks,

Yasumasa


On 2019/12/17 14:36, Chris Plummer wrote:
> Hello,
> 
> Please review the following:
> 
> https://bugs.openjdk.java.net/browse/JDK-8236062
> http://cr.openjdk.java.net/~cjplummer/8236062/webrev.00/
> 
> Since SA javascript support is broken as described in [1] JDK-8235594, and we'll likely remove it, I'd like to at least get it disabled now. I'd like to get this into 14 mostly because I really want to get [2] JDK-8234048 fixed in 14 because we'll start seeing the clhsdb test failures on macos 10.14 and 10.15 more often over the coming months as we deploy more macosx test hosts with those versions. However, [2] JDK-8234048 is blocked by [3] JDK-8234277 (which improves error checking and failure output for the clhsdb tests), and [3] JDK-8234277 is blocked by this CR because the exceptions produced when javascript fails to initialize end up cluttering the clhsdb test logs, even when the test passes (and is misleading when the test fails).
> 
> Sorry about all the bug references and inter-dependencies. It's taken a while myself to get my head wrapped around how I wanted to approach fixing them all in a meaningful order.
> 
> thanks,
> 
> Chris
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8235594
> [2] https://bugs.openjdk.java.net/browse/JDK-8234048
> [3] https://bugs.openjdk.java.net/browse/JDK-8234277

From chris.plummer at oracle.com  Tue Dec 17 07:55:06 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 16 Dec 2019 23:55:06 -0800
Subject: [14]RFR(XS): 8236062: Disable clhsdb initialization of SA
 javascript support since it will always fail, and will likely be removed soon
In-Reply-To: <286e679f-e887-dee3-afe0-b98cd67d9f0a@oss.nttdata.com>
References: <d3fefe95-d109-4f71-ea2d-c1257d6b93b6@oracle.com>
 <286e679f-e887-dee3-afe0-b98cd67d9f0a@oss.nttdata.com>
Message-ID: <f47995b2-9b61-9361-1562-4922ffe4294c@oracle.com>

Hi Yasumasa,

Thanks for the review. If you mean plans for an alternate extension 
mechanism, we have none at the moment. We are of course open to 
suggestions. Some have been discussed on the recent thread regarding 
this topic, but there doesn't seem to be much consensus on how to 
approach this.

thanks,

Chris

On 12/16/19 11:36 PM, Yasumasa Suenaga wrote:
> Hi Chris,
>
> Looks good.
>
> BTW do you have any plan to provide alternative(s) on CLHSDB?
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2019/12/17 14:36, Chris Plummer wrote:
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8236062
>> http://cr.openjdk.java.net/~cjplummer/8236062/webrev.00/
>>
>> Since SA javascript support is broken as described in [1] 
>> JDK-8235594, and we'll likely remove it, I'd like to at least get it 
>> disabled now. I'd like to get this into 14 mostly because I really 
>> want to get [2] JDK-8234048 fixed in 14 because we'll start seeing 
>> the clhsdb test failures on macos 10.14 and 10.15 more often over the 
>> coming months as we deploy more macosx test hosts with those 
>> versions. However, [2] JDK-8234048 is blocked by [3] JDK-8234277 
>> (which improves error checking and failure output for the clhsdb 
>> tests), and [3] JDK-8234277 is blocked by this CR because the 
>> exceptions produced when javascript fails to initialize end up 
>> cluttering the clhsdb test logs, even when the test passes (and is 
>> misleading when the test fails).
>>
>> Sorry about all the bug references and inter-dependencies. It's taken 
>> a while myself to get my head wrapped around how I wanted to approach 
>> fixing them all in a meaningful order.
>>
>> thanks,
>>
>> Chris
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8235594
>> [2] https://bugs.openjdk.java.net/browse/JDK-8234048
>> [3] https://bugs.openjdk.java.net/browse/JDK-8234277


From per.liden at oracle.com  Tue Dec 17 08:14:37 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 17 Dec 2019 09:14:37 +0100
Subject: [14] RFR 8235829: graal crashes with Zombie.java test
In-Reply-To: <c78fce63-6dda-c190-0346-981804a10493@oracle.com>
References: <c78fce63-6dda-c190-0346-981804a10493@oracle.com>
Message-ID: <7535d8d8-f245-78a0-fe58-f0625af43b3a@oracle.com>

Hi Coleen,

The "nmethod entry barrier"-part looks good to me. Just one minor nit, 
maybe JvmtiDeferredEventQueue::run_nmethod_entry_barrier should have an 
"s" on it (i.e. JvmtiDeferredEventQueue::run_nmethod_entry_barriers) 
since it loops over all entries in the queue?

But I don't dare to comment on the ServiceThread initialization order.

cheers,
Per

On 12/16/19 12:41 PM, coleen.phillimore at oracle.com wrote:
> Summary: Start ServiceThread before compiler threads, and run nmethod 
> barriers for zgc before adding to the service thread queue, or posting 
> the events on the java thread queue.
> 
> See bug for description of the problems found with the new Zombie.java 
> test.
> 
> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8235829
> 
> Ran tier1 all platforms, and tier2-8 testing, as well as rerunning 
> original test failure from bug 
> https://bugs.openjdk.java.net/browse/JDK-8173361.
> 
> Thanks,
> Coleen

From robbin.ehn at oracle.com  Tue Dec 17 09:21:32 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Tue, 17 Dec 2019 10:21:32 +0100
Subject: RFR(s): 8235912: JvmtiBreakpoint remove oops_do and metadata_do
In-Reply-To: <e07b16a0-a2ca-6683-8a69-529fc41fd842@oracle.com>
References: <ae042006-8aa7-e7c8-0c35-7336eda849cd@oracle.com>
 <c0094eb6-710f-19b6-981c-e83511ad0461@oracle.com>
 <e803bda4-c57c-6eec-45e7-c2a6028d3ae6@oracle.com>
 <e07b16a0-a2ca-6683-8a69-529fc41fd842@oracle.com>
Message-ID: <05aa6993-e1e0-c43e-0040-8fee71401f4c@oracle.com>

Hi Coleen,

On 12/16/19 9:21 PM, coleen.phillimore at oracle.com wrote:
> 
> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.udiff.html
> 
> + NativeAccess<>::oop_store(_class_holder, class_holder_oop);
> 
> 
> This should probably be:
> 
>    41   NativeAccess<IS_DEST_UNINITIALIZED>::oop_store(handle, obj);
> 

I have not seen any stores to oopStorage that use that?
oopStorage should be 'initialized'.
So I prefer not adding another decorator if it's not needed.
That would just be confusing.

> 
> You can leave out using OopHandle. I have a patch to add the missing 
> functionality and add it to your code. Actually, I was looking to see how much 
> OopHandle is used to see if it's helping anything and there is a lot of code 
> using it. Most of it is to hide oop* in ClassLoaderData.
> 
> This change otherwise looks great.

Thanks, Robbin

> Thanks,
> Coleen
> 
> 
>> Thanks for having a look, Robbin
>>
>> On 12/16/19 1:32 PM, coleen.phillimore at oracle.com wrote:
>>>
>>> I have to think about this. Could there be breakpoints in old emcp methods 
>>> that we do not remove? The metadata_do function is trying to keep old 
>>> Methods from being deleted while there are still references to them.
>>>
>>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/src/hotspot/share/prims/jvmtiImpl.hpp.udiff.html 
>>>
>>>
>>> + oop* _class_holder; // keeps _method memory from being deallocated
>>>
>>>
>>> We created the class OopHandle to encapsulate strong oopStorage references, 
>>> although it's missing oop_store. Can you use that?
>>
>>
>>>
>>> Coleen
>>>
>>> On 12/16/19 4:47 AM, Robbin Ehn wrote:
>>>> Hi all, please review.
>>>>
>>>> From issue, https://bugs.openjdk.java.net/browse/JDK-8235912:
>>>>
>>>> JvmtiBreakpoints are walked via VMThread oops_do (the breakpoint is in a vm 
>>>> operation) before they are installed in the safepoint and after they have 
>>>> been installed, walked with JvmtiCurrentBreakpoints::oops_do().
>>>> By putting the class holder inside oopStorage there is no need for this.
>>>>
>>>> JvmtiCurrentBreakpoints::metadata_do is not needed because redefine classes 
>>>> actually removes the breakpoints before updating them (so there are no 
>>>> breakpoints to update).
>>>> We can just remove metadata_do.
>>>>
>>>>
>>>> I also removed some unused code.
>>>>
>>>> Changeset:
>>>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/
>>>>
>>>> Passes several runs of nsk jvmti/jdi and t1-7.
>>>>
>>>> Thanks, Robbin
>>>
> 

From richard.reingruber at sap.com  Tue Dec 17 10:24:53 2019
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Tue, 17 Dec 2019 10:24:53 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <9f24ec2c-d737-f9b7-8821-5905264971a7@oracle.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <2fbd9ac0-1cb0-98c4-c2c4-becd1cf49fef@oracle.com>
 <DB7PR02MB3612D26A0522C4B17B924E9A9B510@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <9f24ec2c-d737-f9b7-8821-5905264971a7@oracle.com>
Message-ID: <DB7PR02MB3612E86F4A19099A16FCB9029B500@DB7PR02MB3612.eurprd02.prod.outlook.com>

Hi Robbin,

  > Sorry I don't immediately see what issue there is in doing a handshake 
  > instead of:
  > VM_GetOwnedMonitorInfo op(this, calling_thread, java_thread, 
  > owned_monitors_list);

VM_GetOwnedMonitorInfo /can/ be replaced by a handshake, but the calling_thread T1 needs to walk
java_thread T2's stack /before/ to reallocate and relock objects, because the GC interface does not
allow the VMThread to allocate from the java heap.

T1:
1. reallocate scalar replaced objects of T2   //  not possible as part of handshake/vmop,
                                              //  because GC interface does not allow VMThread
                                              //  to allocate from heap
2. execute VM_GetOwnedMonitorInfo() or equivalent handshake

while T2 is /not/ pushing new frames with EA based optimizations.

  > > 
  > > 1. L13: is wait_until_eb_stopped to be called by T1 to wait until T2 cannot move anymore?
  > > 
  > > 2. Handshakes between two threads are synchronous, correct? If so, then T1 will block handshaking
  > >     T2, because either T2 or the VMThread will block in L10.
  > 
  > Yes, sorry, I forgot/confused myself about asynch handshake.
  > (I have a test prototype for that, which removes suspend flag)
  > 
  > > 
  > > I cannot figure out, how you mean this. Only if a helper thread H would handshake T2 then T1 could
  > > continue and call wait_until_eb_stopped(). But returning from there T1 would block if reallocating
  > > objects triggers GC or attempting to execute the vm operation in
  > > JvmtiEnv::GetOwnedMonitorStackDepthInfo().
  > > 
  > > It might be impossible to replace my suspend flag with handshakes that are available today, because
  > > if it was you could replace all the suspend flags right away, couldn't you?
  > 
  > So adding asynch handshakes and a per thread handshake queue, we can. 
  > (which this test prototype does)

Yes, should work for my use case too.
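
For reference, the helper-thread variant discussed above can be modeled outside the VM with standard-library primitives only. This is a rough sketch, not the drafted HandshakeClosure itself: ToySemaphore and ToyEscapeBarrierSuspend are invented stand-ins, a std::thread plays the helper that executes the closure, and nothing here models safepoints or GC.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Minimal counting semaphore (assumption: starts at 0 unless told otherwise).
class ToySemaphore {
    std::mutex mu_;
    std::condition_variable cv_;
    int count_;
public:
    explicit ToySemaphore(int initial = 0) : count_(initial) {}

    void signal() {
        {
            std::lock_guard<std::mutex> guard(mu_);
            ++count_;
        }
        cv_.notify_one();
    }

    void wait() {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
        assert(count_ >= 0);
    }
};

// Same protocol as the drafted closure: park in do_thread() until released.
class ToyEscapeBarrierSuspend {
    ToySemaphore is_waiting_;
    ToySemaphore wait_;
    std::atomic<bool> started_{false};
public:
    void do_thread() {
        is_waiting_.signal();  // tell the requester the target is parked
        wait_.wait();          // stay parked until start_thread()
        started_.store(true);
    }
    void wait_until_eb_stopped() { is_waiting_.wait(); }
    void start_thread() {
        wait_.signal();
        while (!started_.load()) {
            std::this_thread::yield();
        }
    }
    bool started() const { return started_.load(); }
};

// Returns true if the target was still parked while the requester was free
// to do its own work, and was running again after being released.
bool target_stays_parked_until_released() {
    ToyEscapeBarrierSuspend hs;
    std::thread executor([&hs] { hs.do_thread(); });  // whoever runs the closure
    hs.wait_until_eb_stopped();
    bool parked_during_work = !hs.started();  // requester's work would go here
    hs.start_thread();
    executor.join();
    return parked_during_work && hs.started();
}
```

In this model the requester is free between wait_until_eb_stopped() and start_thread() only because another thread executes do_thread(); with a synchronous handshake the requester itself would sit inside the handshake for that whole time.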

  > The issue I'm thinking of is if we need selective polling first.
  > Suspend flags are not checked in every transition, e.g. vm->native.
  > A JVM TI agent doesn't expect to suspend its own thread when suspending
  > all threads.
  > (that thread would be suspended when trying to get back to agent code
  > when it does vm->native transition)

Note that JVM TI doesn't offer "suspending all threads" directly. It offers SuspendThreadList [1]
which can be used to self-suspend: "If the calling thread is specified in the request_list array,
this function will not return until some other thread resumes it"

Thanks, Richard.

[1] https://docs.oracle.com/en/java/javase/13/docs/specs/jvmti.html#SuspendThreadList

-----Original Message-----
From: Robbin Ehn <robbin.ehn at oracle.com> 
Sent: Montag, 16. Dezember 2019 18:21
To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard,

On 2019-12-16 14:41, Reingruber, Richard wrote:
> Hi Robbin,
> 
> first of all: thanks a lot for providing feedback. I do appreciate it.
> 
> I am absolutely willing to move this to handshakes. Only I still can't see how to achieve it.
> 
> Could you explain the drafted class EscapeBarrierSuspendHandshake a little bit? [1]
> 
> I'd like to look at it by example of JvmtiEnv::GetOwnedMonitorStackDepthInfo() where calling_thread
> T1 would apply it on another thread T2.

Sorry I don't immediately see what issue there is in doing a handshake 
instead of:
VM_GetOwnedMonitorInfo op(this, calling_thread, java_thread, 
owned_monitors_list);

> 
> 1. L13: is wait_until_eb_stopped to be called by T1 to wait until T2 cannot move anymore?
> 
> 2. Handshakes between two threads are synchronous, correct? If so, then T1 will block handshaking
>     T2, because either T2 or the VMThread will block in L10.

Yes, sorry, I forgot/confused myself about asynch handshake.
(I have a test prototype for that, which removes suspend flag)

> 
> I cannot figure out, how you mean this. Only if a helper thread H would handshake T2 then T1 could
> continue and call wait_until_eb_stopped(). But returning from there T1 would block if reallocating
> objects triggers GC or attempting to execute the vm operation in
> JvmtiEnv::GetOwnedMonitorStackDepthInfo().
> 
> It might be impossible to replace my suspend flag with handshakes that are available today, because
> if it was you could replace all the suspend flags right away, couldn't you?

So adding asynch handshakes and a per thread handshake queue, we can. 
(which this test prototype does)
The issue I'm thinking of is if we need selective polling first.
Suspend flags are not checked in every transition, e.g. vm->native.
A JVM TI agent doesn't expect to suspend its own thread when suspending
all threads.
(that thread would be suspended when trying to get back to agent code
when it does vm->native transition)

> 
> Or I'm simply missing something... quite possible... :)

No I think you got it right.

Thanks, Robbin

> 
> Thanks, Richard.
> 
> [1] Drafted by Robbin (thanks!)
> 
>       1	class EscapeBarrierSuspendHandshake : public HandshakeClosure {
>       2	  Semaphore _is_waiting;
>       3	  Semaphore _wait;
>       4	  bool _started;
>       5	public:
>       6	  EscapeBarrierSuspendHandshake() : HandshakeClosure("EscapeBarrierSuspend"),
>       7	  _wait(0), _started(false) { }
>       8	  void do_thread(Thread* th) {
>       9	    _is_waiting.signal();
>      10	    _wait.wait();
>      11	    Atomic::store(&_started, true);
>      12	  }
>      13	  void wait_until_eb_stopped() { _is_waiting.wait(); }
>      14	  void start_thread() {
>      15	    _wait.signal();
>      16	    while(!Atomic::load(&_started)) {
>      17	      os::naked_yield();
>      18	    }
>      19	  }
>      20	};
> 
> -----Original Message-----
> From: Robbin Ehn <robbin.ehn at oracle.com>
> Sent: Montag, 16. Dezember 2019 11:21
> To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
> 
> Hi Richard, as mentioned it would be better if you could do this with
> handshakes, instead of using _suspend_flag (since they are going away).
> But I can't think of a way doing it without blocking safepoints, so we need to
> add some more features in handshakes first.
> When possible I hope you are willing to move this code to handshakes instead.
> 
> You could stop one thread with, e.g.:
> class EscapeBarrierSuspendHandshake : public HandshakeClosure {
>     Semaphore _is_waiting;
>     Semaphore _wait;
>     bool _started;
>    public:
>     EscapeBarrierSuspendHandshake() : HandshakeClosure("EscapeBarrierSuspend"),
> _wait(0), _started(false) { }
>     void do_thread(Thread* th) {
>       _is_waiting.signal();
>       _wait.wait();
>       Atomic::store(&_started, true);
>     }
>     void wait_until_eb_stopped() { _is_waiting.wait(); }
>     void start_thread() {
>       _wait.signal();
>       while(!Atomic::load(&_started)) {
>         os::naked_yield();
>       }
>     }
> };
> 
> But it would block safepoints.
> 
> Thanks, Robbin
> 
> On 12/10/19 10:45 PM, Reingruber, Richard wrote:
>> Hi,
>>
>> I would like to get reviews please for
>>
>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>
>> Corresponding RFE:
>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>
>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>
>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues (thanks!). In addition the
>> change is being tested at SAP since I posted the first RFR some months ago.
>>
>> The intention of this enhancement is to benefit performance wise from escape analysis even if JVMTI
>> agents request capabilities that allow them to access local variable values. E.g. if you start-up
>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape analysis is disabled right
>> from the beginning, well before a debugger attaches -- if ever one should do so. With the
>> enhancement, escape analysis will remain enabled until and after a debugger attaches. EA based
>> optimizations are reverted just before an agent acquires the reference to an object. In the JBS item
>> you'll find more details.
>>
>> Thanks,
>> Richard.
>>
>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>       http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>>

From robbin.ehn at oracle.com  Tue Dec 17 11:01:03 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Tue, 17 Dec 2019 12:01:03 +0100
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <DB7PR02MB3612E86F4A19099A16FCB9029B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <2fbd9ac0-1cb0-98c4-c2c4-becd1cf49fef@oracle.com>
 <DB7PR02MB3612D26A0522C4B17B924E9A9B510@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <9f24ec2c-d737-f9b7-8821-5905264971a7@oracle.com>
 <DB7PR02MB3612E86F4A19099A16FCB9029B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
Message-ID: <08d4f482-36a0-6499-0546-2a888da2a094@oracle.com>

Hi Richard,

On 12/17/19 11:24 AM, Reingruber, Richard wrote:

>    > So adding asynch handshakes and a per thread handshake queue, we can.
>    > (which this test prototype does)
> 
> Yes, should work for my use case too.

Great.

> 
>    > The issue I'm thinking of is if we need selective polling first.
>    > Suspend flags are not checked in every transition, e.g. vm->native.
>    > A JVM TI agent doesn't expect to suspend its own thread when suspending
>    > all threads.
>    > (that thread would be suspended when trying to get back to agent code
>    > when it does vm->native transition)
> 
> Note that JVM TI doesn't offer "suspending all threads" directly. It offers SuspendThreadList [1]
> which can be used to self-suspend: "If the calling thread is specified in the request_list array,
> this function will not return until some other thread resumes it"

Maybe there is a test bug here, or it was a more complicated scenario.
I have to investigate, but suspending threads in all transitions causes a chunk 
of test failures in jdi/jvmti.
The issue was suspending threads going vm->native (back to agent code).

Thanks, Robbin

> 
> Thanks, Richard.
> 
> [1] https://docs.oracle.com/en/java/javase/13/docs/specs/jvmti.html#SuspendThreadList
> 
> -----Original Message-----
> From: Robbin Ehn <robbin.ehn at oracle.com>
> Sent: Montag, 16. Dezember 2019 18:21
> To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
> 
> Hi Richard,
> 
> On 2019-12-16 14:41, Reingruber, Richard wrote:
>> Hi Robbin,
>>
>> first of all: thanks a lot for providing feedback. I do appreciate it.
>>
>> I am absolutely willing to move this to handshakes. Only I still can't see how to achieve it.
>>
>> Could you explain the drafted class EscapeBarrierSuspendHandshake a little bit? [1]
>>
>> I'd like to look at it by example of JvmtiEnv::GetOwnedMonitorStackDepthInfo() where calling_thread
>> T1 would apply it on another thread T2.
> 
> Sorry I don't immediately see what issue there is in doing a handshake
> instead of:
> VM_GetOwnedMonitorInfo op(this, calling_thread, java_thread,
> owned_monitors_list);
> 
>>
>> 1. L13: is wait_until_eb_stopped to be called by T1 to wait until T2 cannot move anymore?
>>
>> 2. Handshakes between two threads are synchronous, correct? If so, then T1 will block handshaking
>>      T2, because either T2 or the VMThread will block in L10.
> 
> Yes, sorry, I forgot/confused myself about asynch handshake.
> (I have a test prototype for that, which removes suspend flag)
> 
>>
>> I cannot figure out, how you mean this. Only if a helper thread H would handshake T2 then T1 could
>> continue and call wait_until_eb_stopped(). But returning from there T1 would block if reallocating
>> objects triggers GC or attempting to execute the vm operation in
>> JvmtiEnv::GetOwnedMonitorStackDepthInfo().
>>
>> It might be impossible to replace my suspend flag with handshakes that are available today, because
>> if it was you could replace all the suspend flags right away, couldn't you?
> 
> So adding asynch handshakes and a per thread handshake queue, we can.
> (which this test prototype does)
> The issue I'm thinking of is if we need selective polling first.
> Suspend flags are not checked in every transition, e.g. vm->native.
> A JVM TI agent doesn't expect to suspend its own thread when suspending
> all threads.
> (that thread would be suspended when trying to get back to agent code
> when it does vm->native transition)
> 
>>
>> Or I'm simply missing something... quite possible... :)
> 
> No I think you got it right.
> 
> Thanks, Robbin
> 
>>
>> Thanks, Richard.
>>
>> [1] Drafted by Robbin (thanks!)
>>
>>        1	class EscapeBarrierSuspendHandshake : public HandshakeClosure {
>>        2	  Semaphore _is_waiting;
>>        3	  Semaphore _wait;
>>        4	  bool _started;
>>        5	public:
>>        6	  EscapeBarrierSuspendHandshake() : HandshakeClosure("EscapeBarrierSuspend"),
>>        7	  _wait(0), _started(false) { }
>>        8	  void do_thread(Thread* th) {
>>        9	    _is_waiting.signal();
>>       10	    _wait.wait();
>>       11	    Atomic::store(&_started, true);
>>       12	  }
>>       13	  void wait_until_eb_stopped() { _is_waiting.wait(); }
>>       14	  void start_thread() {
>>       15	    _wait.signal();
>>       16	    while(!Atomic::load(&_started)) {
>>       17	      os::naked_yield();
>>       18	    }
>>       19	  }
>>       20	};
>>
>> -----Original Message-----
>> From: Robbin Ehn <robbin.ehn at oracle.com>
>> Sent: Montag, 16. Dezember 2019 11:21
>> To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
>>
>> Hi Richard, as mentioned it would be better if you could do this with
>> handshakes, instead of using _suspend_flag (since they are going away).
>> But I can't think of a way doing it without blocking safepoints, so we need to
>> add some more features in handshakes first.
>> When possible I hope you are willing to move this code to handshakes instead.
>>
>> You could stop one thread with, e.g.:
>> class EscapeBarrierSuspendHandshake : public HandshakeClosure {
>>      Semaphore _is_waiting;
>>      Semaphore _wait;
>>      bool _started;
>>     public:
>>      EscapeBarrierSuspendHandshake() : HandshakeClosure("EscapeBarrierSuspend"),
>> _wait(0), _started(false) { }
>>      void do_thread(Thread* th) {
>>        _is_waiting.signal();
>>        _wait.wait();
>>        Atomic::store(&_started, true);
>>      }
>>      void wait_until_eb_stopped() { _is_waiting.wait(); }
>>      void start_thread() {
>>        _wait.signal();
>>        while(!Atomic::load(&_started)) {
>>          os::naked_yield();
>>        }
>>      }
>> };
>>
>> But it would block safepoints.
>>
>> Thanks, Robbin
>>
>> On 12/10/19 10:45 PM, Reingruber, Richard wrote:
>>> Hi,
>>>
>>> I would like to get reviews please for
>>>
>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>>
>>> Corresponding RFE:
>>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>>
>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>>
>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues (thanks!). In addition the
>>> change is being tested at SAP since I posted the first RFR some months ago.
>>>
>>> The intention of this enhancement is to benefit performance wise from escape analysis even if JVMTI
>>> agents request capabilities that allow them to access local variable values. E.g. if you start-up
>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape analysis is disabled right
>>> from the beginning, well before a debugger attaches -- if ever one should do so. With the
>>> enhancement, escape analysis will remain enabled until and after a debugger attaches. EA based
>>> optimizations are reverted just before an agent acquires the reference to an object. In the JBS item
>>> you'll find more details.
>>>
>>> Thanks,
>>> Richard.
>>>
>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>        http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>>>

From coleen.phillimore at oracle.com  Tue Dec 17 13:59:00 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 17 Dec 2019 08:59:00 -0500
Subject: RFR(s): 8235912: JvmtiBreakpoint remove oops_do and metadata_do
In-Reply-To: <05aa6993-e1e0-c43e-0040-8fee71401f4c@oracle.com>
References: <ae042006-8aa7-e7c8-0c35-7336eda849cd@oracle.com>
 <c0094eb6-710f-19b6-981c-e83511ad0461@oracle.com>
 <e803bda4-c57c-6eec-45e7-c2a6028d3ae6@oracle.com>
 <e07b16a0-a2ca-6683-8a69-529fc41fd842@oracle.com>
 <05aa6993-e1e0-c43e-0040-8fee71401f4c@oracle.com>
Message-ID: <624f3db0-351e-7f57-61f2-26323a75854a@oracle.com>


On 12/17/19 4:21 AM, Robbin Ehn wrote:
> Hi Coleen,
>
> On 12/16/19 9:21 PM, coleen.phillimore at oracle.com wrote:
>>
>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.udiff.html 
>>
>>
>> + NativeAccess<>::oop_store(_class_holder, class_holder_oop);
>>
>>
>> This should probably be:
>>
>>    41 NativeAccess<IS_DEST_UNINITIALIZED>::oop_store(handle, obj);
>>
>
> I have not seen any stores to oopStorage that use that?
> oopStorage should be 'initialized'.
> So I prefer not adding another decorator if it's not needed.
> That would just be confusing.

It's in the ClassLoaderData initialization.  I don't know what it means, 
you'd have to ask someone who knows.  I don't think it matters though.

Coleen
>
>>
>> You can leave out using OopHandle.  I have a patch to add the missing 
>> functionality and add it to your code.  Actually, I was looking to 
>> see how much OopHandle is used to see if it's helping anything and 
>> there is a lot of code using it.  Most of it is to hide oop* in 
>> ClassLoaderData.
>>
>> This change otherwise looks great.
>
> Thanks, Robbin
>
>> Thanks,
>> Coleen
>>
>>
>>> Thanks for having a look, Robbin
>>>
>>> On 12/16/19 1:32 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> I have to think about this.  Could there be breakpoints in old 
>>>> emcp methods that we do not remove?  The metadata_do function is 
>>>> trying to keep old Methods from being deleted while there are still 
>>>> references to them.
>>>>
>>>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/src/hotspot/share/prims/jvmtiImpl.hpp.udiff.html 
>>>>
>>>>
>>>> + oop* _class_holder; // keeps _method memory from being deallocated
>>>>
>>>>
>>>> We created the class OopHandle to encapsulate strong oopStorage 
>>>> references, although it's missing oop_store.  Can you use that?
>>>
>>>
>>>>
>>>> Coleen
>>>>
>>>> On 12/16/19 4:47 AM, Robbin Ehn wrote:
>>>>> Hi all, please review.
>>>>>
>>>>> From issue, https://bugs.openjdk.java.net/browse/JDK-8235912:
>>>>>
>>>>> JvmtiBreakpoints are walked via VMThread oops_do (the breakpoint 
>>>>> is in a vm operation) before they are installed in the safepoint 
>>>>> and after they have been installed, walked with 
>>>>> JvmtiCurrentBreakpoints::oops_do().
>>>>> By putting the class holder inside oopStorage there is no need for 
>>>>> this.
>>>>>
>>>>> JvmtiCurrentBreakpoints::metadata_do is not needed because 
>>>>> redefine classes actually removes the breakpoints before updating 
>>>>> them (so there are no breakpoints to update).
>>>>> We can just remove metadata_do.
>>>>>
>>>>>
>>>>> I also removed some unused code.
>>>>>
>>>>> Changeset:
>>>>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/
>>>>>
>>>>> Passes several runs of nsk jvmti/jdi and t1-7.
>>>>>
>>>>> Thanks, Robbin
>>>>
>>


From coleen.phillimore at oracle.com  Tue Dec 17 14:08:21 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 17 Dec 2019 09:08:21 -0500
Subject: [14] RFR 8235829: graal crashes with Zombie.java test
In-Reply-To: <7535d8d8-f245-78a0-fe58-f0625af43b3a@oracle.com>
References: <c78fce63-6dda-c190-0346-981804a10493@oracle.com>
 <7535d8d8-f245-78a0-fe58-f0625af43b3a@oracle.com>
Message-ID: <a4661b00-7e9e-1367-b2fd-3821e63fb2d3@oracle.com>


On 12/17/19 3:14 AM, Per Liden wrote:
> Hi Coleen,
>
> The "nmethod entry barrier"-part looks good to me. Just one minor nit, 
> maybe JvmtiDeferredEventQueue::run_nmethod_entry_barrier should have 
> an "s" on it (i.e. 
> JvmtiDeferredEventQueue::run_nmethod_entry_barriers) since it loops 
> over all entries in the queue?

I changed both entries in jvmtiImpl.hpp to run_nmethod_entry_barriers to 
avoid confusion on my part.  It then calls the nmethod version that is 
singular.  Thanks for the suggestion of names.
>
> But I don't dare to comment on the ServiceThread initialization order.

I moved it before the compiler and jvmti initialization, and jvmti won't 
post events until the LIVE phase.  I tried to be very careful!

Thanks,
Coleen
>
> cheers,
> Per
>
> On 12/16/19 12:41 PM, coleen.phillimore at oracle.com wrote:
>> Summary: Start ServiceThread before compiler threads, and run nmethod 
>> barriers for zgc before adding to the service thread queue, or 
>> posting the events on the java thread queue.
>>
>> See bug for description of the problems found with the new 
>> Zombie.java test.
>>
>> open webrev at 
>> http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8235829
>>
>> Ran tier1 all platforms, and tier2-8 testing, as well as rerunning 
>> original test failure from bug 
>> https://bugs.openjdk.java.net/browse/JDK-8173361.
>>
>> Thanks,
>> Coleen


From richard.reingruber at sap.com  Tue Dec 17 14:47:37 2019
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Tue, 17 Dec 2019 14:47:37 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
Message-ID: <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>

Hi David,

  > >    > Some further queries/concerns:
  > >    >
  > >    > src/hotspot/share/runtime/objectMonitor.cpp
  > >    >
  > >    > Can you please explain the changes to ObjectMonitor::wait:
  > >    >
  > >    > !   _recursions = save      // restore the old recursion count
  > >    > !                 + jt->get_and_reset_relock_count_after_wait(); //
  > >    > increased by the deferred relock count
  > >    >
  > >    > what is the "deferred relock count"? I gather it relates to
  > >    >
  > >    > "The code was extended to be able to deoptimize objects of a 
  > > frame that
  > >    > is not the top frame and to let another thread than the owning 
  > > thread do
  > >    > it."
  > >
  > > Yes, these relate. Currently EA based optimizations are reverted, when a compiled frame is
  > > replaced with corresponding interpreter frames. Part of this is relocking objects with eliminated
  > > locking. New with the enhancement is that we do this also just before object references are
  > > acquired through JVMTI. In this case we deoptimize also the owning compiled frame C and we
  > > register deoptimized objects as deferred updates. When control returns to C it gets deoptimized,
  > > we notice that objects are already deoptimized (reallocated and relocked), so we don't do it again
  > > (relocking twice would be incorrect of course). Deferred updates are copied into the new
  > > interpreter frames.
  > >
  > > Problem: relocking is not possible if the target thread T is waiting on the monitor that needs to
  > > be relocked. This happens only with non-local objects with EliminateNestedLocks. Instead relocking
  > > is deferred until T owns the monitor again. This is what the piece of code above does.
  >  
  >  Sorry I need some more detail here. How can you wait() on an object 
  >  monitor if the object allocation and/or locking was optimised away? And 
  >  what is a "non-local object" in this context? Isn't EA restricted to 
  >  thread-confined objects?

A "non-local object" is an object that escapes its thread. The issue I'm addressing with the changes
in ObjectMonitor::wait is almost unrelated to EA. It is caused by EliminateNestedLocks, where C2
eliminates recursive locking of an already owned lock. The lock owning object exists on the heap, it
is locked and you can call wait() on it.

EliminateLocks is the C2 option that controls lock elimination based on EA.  Both optimizations have
in common that objects with eliminated locking need to be relocked when deoptimizing a frame,
i.e. when replacing a compiled frame with equivalent interpreter
frames. Deoptimization::relock_objects does that job for /all/ eliminated locks in scope. /All/ can
be a mix of eliminated nested locks and locks of not-escaping objects.

New with the enhancement: I call relock_objects earlier, just before objects potentially
escape. But then later when the owning compiled frame gets deoptimized, I must not do it again:

See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:

 373   if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks))
 374       && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
 375     bool unused;
 376     eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused);
 377   }

Now when calling relock_objects early it is quite possible that I have to relock an object the
target thread currently waits for. Obviously I cannot relock in this case, instead I chose to
introduce relock_count_after_wait to JavaThread.
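
As a rough illustration of the deferred relock count, here is a toy model in Java. It is not the
HotSpot code: ToyThread, ToyMonitor and all of their methods are made-up names, and the only thing
taken from the webrev is the shape of the statement "_recursions = save + relock count" quoted
above. The model assumes that the recursion count holds the number of re-entries beyond the first
lock.

```java
public class DeferredRelockSketch {
    // Hypothetical stand-in for the per-thread state; not HotSpot's JavaThread.
    static final class ToyThread {
        private int relockCountAfterWait;

        // Called by whoever reverts a lock elimination while this thread is waiting.
        void deferRelock() {
            relockCountAfterWait++;
        }

        int getAndResetRelockCountAfterWait() {
            int n = relockCountAfterWait;
            relockCountAfterWait = 0;
            return n;
        }
    }

    // Hypothetical stand-in for a monitor; recursions counts re-entries beyond the first lock.
    static final class ToyMonitor {
        ToyThread owner;
        int recursions;

        void enter(ToyThread t) {
            if (owner == t) {
                recursions++;
            } else {
                owner = t;
                recursions = 0;
            }
        }

        // wait() gives up ownership and remembers the old recursion count.
        int beginWait() {
            int save = recursions;
            owner = null;
            recursions = 0;
            return save;
        }

        // On reacquire: restore the old count, increased by the deferred relock count.
        void endWait(ToyThread t, int save) {
            owner = t;
            recursions = save + t.getAndResetRelockCountAfterWait();
        }
    }

    // One real lock plus 'eliminated' nested locks that the compiler optimized away.
    static int recursionsAfterWait(int eliminated) {
        ToyThread t = new ToyThread();
        ToyMonitor m = new ToyMonitor();
        m.enter(t);                        // the only lock operation that really happened
        int save = m.beginWait();          // thread now waits and does not own the monitor
        for (int i = 0; i < eliminated; i++) {
            t.deferRelock();               // relocking is impossible now, so it is deferred
        }
        m.endWait(t, save);                // thread owns the monitor again
        return m.recursions;
    }

    public static void main(String[] args) {
        System.out.println(recursionsAfterWait(2));
    }
}
```

With two eliminated nested locks the toy monitor ends up with a recursion count of 2 after the
wait, which is what the interpreter frames would expect to find.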

  >  Is it just that some of the locking gets optimized away e.g.
  >  
  >  synchronized(obj) {
  >     synchronized(obj) {
  >       synchronized(obj) {
  >         obj.wait();
  >       }
  >     }
  >  }
  >  
  >  If this is reduced to a form as-if it were a single lock of the monitor 
  >  (due to EA) and the wait() triggers a JVM TI event which leads to the 
  >  escape of "obj" then we need to reconstruct the true lock state, and so 
  >  when the wait() internally unblocks and reacquires the monitor it has to 
  >  set the true recursion count to 3, not the 1 that it appeared to be when 
  >  wait() was initially called. Is that the scenario?

Kind of... except that the locking is not eliminated due to EA and there is no JVM TI event
triggered by wait.

Add

LocalObject l1 = new LocalObject();

in front of the synchronized blocks and assume a JVM TI agent acquires l1. This triggers the code in
question.
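
The Java-level expectation behind this scenario can be checked with a small stand-alone program. It
is only a sketch: the class and method names are invented, no JVM TI agent is involved, and it does
not show whether C2 actually eliminates the nested locks in a given run. It only shows what must
hold in either case: after wait() returns inside three nested synchronized blocks, the thread still
owns the monitor after leaving two of them.

```java
public class NestedLockWaitDemo {
    // Counts how many of the two inner blocks could be left while the monitor was still held.
    static int levelsStillHeldAfterWait() {
        final Object obj = new Object();
        final boolean[] notified = {false};
        Thread notifier = new Thread(() -> {
            synchronized (obj) {
                notified[0] = true;
                obj.notifyAll();
            }
        });
        int held = 0;
        try {
            synchronized (obj) {
                synchronized (obj) {
                    synchronized (obj) {
                        // The notifier blocks on obj until wait() releases the monitor.
                        notifier.start();
                        while (!notified[0]) {
                            obj.wait();
                        }
                    }
                    if (Thread.holdsLock(obj)) {
                        held++;   // innermost block left, monitor still owned
                    }
                }
                if (Thread.holdsLock(obj)) {
                    held++;       // middle block left, monitor still owned
                }
            }
            notifier.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return held;
    }

    public static void main(String[] args) {
        System.out.println(levelsStillHeldAfterWait());
    }
}
```

If the recursion count were not restored to its full value after the wait, the monitor would be
released too early and the second holdsLock check would fail.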

Note that relocking/reallocating is transactional: if it is done, it is done for /all/ objects in scope, and it is
done at most once. It wouldn't be easy to split this into separate relocking of nested and EA-based
eliminated locks.
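
The "all objects in scope, at most once" property can be sketched as a guard keyed by frame id. This
is a simplified model with invented names (OnceOnlyRelockSketch, deoptimizeObjects); it only mirrors
the role that the EscapeBarrier::objs_are_deoptimized check plays in the snippet quoted above and is
not how the webrev implements it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class OnceOnlyRelockSketch {
    // Frames whose objects have already been reallocated and relocked.
    private final Set<Long> deoptimizedFrames = new HashSet<>();
    private int relockOperations;

    // Returns true if this call did the work, false if it was already done for frameId.
    synchronized boolean deoptimizeObjects(long frameId, List<String> eliminatedLocks) {
        if (!deoptimizedFrames.add(frameId)) {
            return false;                           // already done, relocking twice would be wrong
        }
        relockOperations += eliminatedLocks.size(); // all eliminated locks in scope, in one step
        return true;
    }

    synchronized int relockOperations() {
        return relockOperations;
    }

    // First call comes from the agent-side barrier, second from the later frame deoptimization.
    static int demo() {
        OnceOnlyRelockSketch s = new OnceOnlyRelockSketch();
        List<String> locks = Arrays.asList("obj (nested)", "obj (nested)", "l1 (EA)");
        s.deoptimizeObjects(42L, locks);
        s.deoptimizeObjects(42L, locks);
        return s.relockOperations();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Both kinds of eliminated locks are handled by the same call, so the second request for the same
frame is a no-op.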

  >  If so I find this truly awful. Anyone using wait() in a realistic form 
  >  requires a notification and so the object cannot be thread confined. In

It is not thread confined.

  >  which case I would strongly argue that upon hitting the wait() the deopt 
  >  should occur unconditionally and so the lock state is correct before we 
  >  wait and so we don't need to mess with the recursion count internally 
  >  when we reacquire the monitor.
  >  
  > >
  > >    > which I don't like the sound of at all when it comes to ObjectMonitor
  > >    > state. So I'd like to understand in detail exactly what is going on here
  > >    > and why.  This is a very intrusive change that seems to badly break
  > >    > encapsulation and impacts future changes to ObjectMonitor that are under
  > >    > investigation.
  > >
  > > I would not regard this as breaking encapsulation. Certainly not badly.
  > >
  > > I've added a property relock_count_after_wait to JavaThread. The property is well
  > > encapsulated. Future ObjectMonitor implementations have to deal with recursion too. They are free
  > > in choosing a way to do that as long as that property is taken into account. This is hardly a
  > > limitation.
  >  
  >  I do think this badly breaks encapsulation as you have to add a callout 
  >  from the guts of the ObjectMonitor code to reach into the thread to get 
  >  this lock count adjustment. I understand why you have had to do this but 
  >  I would much rather see a change to the EA optimisation strategy so that 
  >  this is not needed.
  >  
  > > Note also that the property is a straight forward extension of the existing concept of deferred
  > > local updates. It is embedded into the structure holding them. So not even the footprint of a
  > > JavaThread is enlarged if no deferred updates are generated.
  > 
  > [...]
  >  
  > >
  > > I'm actually duplicating the existing external suspend mechanism, because a thread can be
  > > suspended at most once. And hey, I don't like that either! But it seems not unlikely that the
  > > duplicate can be removed together with the original and the new type of handshakes that will be
  > > used for thread suspend can be used for object deoptimization too. See today's discussion in
  > > JDK-8227745 [2].
  >  
  >  I hope that discussion bears some fruit, at the moment it seems not to 
  >  be possible to use handshakes here. :(
  >  
  >  The external suspend mechanism is a royal pain in the proverbial that we 
  >  have to carefully live with. The idea that we're duplicating that for 
  >  use in another fringe area of functionality does not thrill me at all.
  >  
  >  To be clear, I understand the problem that exists and that you wish to 
  >  solve, but for the runtime parts I balk at the complexity cost of 
  >  solving it.

I know it's complex, but it is far from rocket science.

Also I find it hard to imagine another fix for JDK-8233915 besides changing the JVM TI specification.
 
Thanks, Richard.

-----Original Message-----
From: David Holmes <david.holmes at oracle.com> 
Sent: Dienstag, 17. Dezember 2019 08:03
To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

<resend as my mailer crashed during last send>

David

On 17/12/2019 4:57 pm, David Holmes wrote:
> Hi Richard,
> 
> On 14/12/2019 5:01 am, Reingruber, Richard wrote:
>> Hi David,
>>
>>    > Some further queries/concerns:
>>    >
>>    > src/hotspot/share/runtime/objectMonitor.cpp
>>    >
>>    > Can you please explain the changes to ObjectMonitor::wait:
>>    >
>>    > !   _recursions = save      // restore the old recursion count
>>    > !                 + jt->get_and_reset_relock_count_after_wait(); //
>>    > increased by the deferred relock count
>>    >
>>    > what is the "deferred relock count"? I gather it relates to
>>    >
>>    > "The code was extended to be able to deoptimize objects of a 
>> frame that
>>    > is not the top frame and to let another thread than the owning 
>> thread do
>>    > it."
>>
>> Yes, these relate. Currently EA based optimizations are reverted, when 
>> a compiled frame is replaced
>> with corresponding interpreter frames. Part of this is relocking 
>> objects with eliminated
>> locking. New with the enhancement is that we do this also just before 
>> object references are acquired
>> through JVMTI. In this case we deoptimize also the owning compiled 
>> frame C and we register
>> deoptimized objects as deferred updates. When control returns to C it 
>> gets deoptimized, we notice
>> that objects are already deoptimized (reallocated and relocked), so we 
>> don't do it again (relocking
>> twice would be incorrect of course). Deferred updates are copied into 
>> the new interpreter frames.
>>
>> Problem: relocking is not possible if the target thread T is waiting 
>> on the monitor that needs to be
>> relocked. This happens only with non-local objects with 
>> EliminateNestedLocks. Instead relocking is
>> deferred until T owns the monitor again. This is what the piece of 
>> code above does.
> 
> Sorry I need some more detail here. How can you wait() on an object 
> monitor if the object allocation and/or locking was optimised away? And 
> what is a "non-local object" in this context? Isn't EA restricted to 
> thread-confined objects?
> 
> Is it just that some of the locking gets optimized away e.g.
> 
> synchronized(obj) {
>    synchronized(obj) {
>      synchronized(obj) {
>        obj.wait();
>      }
>    }
> }
> 
> If this is reduced to a form as-if it were a single lock of the monitor 
> (due to EA) and the wait() triggers a JVM TI event which leads to the 
> escape of "obj" then we need to reconstruct the true lock state, and so 
> when the wait() internally unblocks and reacquires the monitor it has to 
> set the true recursion count to 3, not the 1 that it appeared to be when 
> wait() was initially called. Is that the scenario?
> 
> If so I find this truly awful. Anyone using wait() in a realistic form 
> requires a notification and so the object cannot be thread confined. In 
> which case I would strongly argue that upon hitting the wait() the deopt 
> should occur unconditionally and so the lock state is correct before we 
> wait and so we don't need to mess with the recursion count internally 
> when we reacquire the monitor.
> 
>>
>>    > which I don't like the sound of at all when it comes to 
>> ObjectMonitor
>>    > state. So I'd like to understand in detail exactly what is going 
>> on here
>>    > and why.  This is a very intrusive change that seems to badly break
>>    > encapsulation and impacts future changes to ObjectMonitor that 
>> are under
>>    > investigation.
>>
>> I would not regard this as breaking encapsulation. Certainly not badly.
>>
>> I've added a property relock_count_after_wait to JavaThread. The 
>> property is well
>> encapsulated. Future ObjectMonitor implementations have to deal with 
>> recursion too. They are free in
>> choosing a way to do that as long as that property is taken into 
>> account. This is hardly a
>> limitation.
> 
> I do think this badly breaks encapsulation as you have to add a callout 
> from the guts of the ObjectMonitor code to reach into the thread to get 
> this lock count adjustment. I understand why you have had to do this but 
> I would much rather see a change to the EA optimisation strategy so that 
> this is not needed.
> 
>> Note also that the property is a straight forward extension of the 
>> existing concept of deferred
>> local updates. It is embedded into the structure holding them. So not 
>> even the footprint of a
>> JavaThread is enlarged if no deferred updates are generated.
>>
>>    > ---
>>    >
>>    > src/hotspot/share/runtime/thread.cpp
>>    >
>>    > Can you please explain why 
>> JavaThread::wait_for_object_deoptimization
>>    > has to be handcrafted in this way rather than using proper 
>> transitions.
>>    >
>>
>> I wrote wait_for_object_deoptimization taking 
>> JavaThread::java_suspend_self_with_safepoint_check
>> as template. So in short: for the same reasons :)
>>
>> Threads reach both methods as part of thread state transitions, 
>> therefore special handling is
>> required to change thread state on top of ongoing transitions.
>>
>>    > We got rid of "deopt suspend" some time ago and it is disturbing 
>> to see
>>    > it being added back (effectively). This seems like it may be 
>> something
>>    > that handshakes could be used for.
>>
>> Deopt suspend used to be something rather different with a similar 
>> name[1]. It is not being added back.
> 
> I stand corrected. Despite comments in the code to the contrary 
> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of 
> cleanup in this area 13 years ago :)
> 
>>
>> I'm actually duplicating the existing external suspend mechanism, 
>> because a thread can be suspended
>> at most once. And hey, I don't like that either! But it seems not 
>> unlikely that the duplicate can
>> be removed together with the original and the new type of handshakes 
>> that will be used for
>> thread suspend can be used for object deoptimization too. See today's 
>> discussion in JDK-8227745 [2].
> 
> I hope that discussion bears some fruit, at the moment it seems not to 
> be possible to use handshakes here. :(
> 
> The external suspend mechanism is a royal pain in the proverbial that we 
> have to carefully live with. The idea that we're duplicating that for 
> use in another fringe area of functionality does not thrill me at all.
> 
> To be clear, I understand the problem that exists and that you wish to 
> solve, but for the runtime parts I balk at the complexity cost of 
> solving it.
> 
> Thanks,
> David
> -----
> 
>> Thanks, Richard.
>>
>> [1] Deopt suspend was something like an async. handshake for 
>> architectures with register windows,
>>      where patching the return pc for deoptimization of a compiled 
>> frame was racy if the owner thread
>>      was in native code. Instead a "deopt" suspend flag was set on 
>> which the thread patched its own
>>      frame upon return from native. So no thread was suspended. It got 
>> its name only from the name of
>>      the flags.
>>
>> [2] Discussion about using handshakes to sync. with the target thread:
>>      
>> https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727 
>>
>>
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Freitag, 13. Dezember 2019 00:56
>> To: Reingruber, Richard <richard.reingruber at sap.com>; 
>> serviceability-dev at openjdk.java.net; 
>> hotspot-compiler-dev at openjdk.java.net; 
>> hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better 
>> Performance in the Presence of JVMTI Agents
>>
>> Hi Richard,
>>
>> Some further queries/concerns:
>>
>> src/hotspot/share/runtime/objectMonitor.cpp
>>
>> Can you please explain the changes to ObjectMonitor::wait:
>>
>> !   _recursions = save      // restore the old recursion count
>> !                 + jt->get_and_reset_relock_count_after_wait(); //
>> increased by the deferred relock count
>>
>> what is the "deferred relock count"? I gather it relates to
>>
>> "The code was extended to be able to deoptimize objects of a frame that
>> is not the top frame and to let another thread than the owning thread do
>> it."
>>
>> which I don't like the sound of at all when it comes to ObjectMonitor
>> state. So I'd like to understand in detail exactly what is going on here
>> and why. This is a very intrusive change that seems to badly break
>> encapsulation and impacts future changes to ObjectMonitor that are under
>> investigation.
>>
>> ---
>>
>> src/hotspot/share/runtime/thread.cpp
>>
>> Can you please explain why JavaThread::wait_for_object_deoptimization
>> has to be handcrafted in this way rather than using proper transitions.
>>
>> We got rid of "deopt suspend" some time ago and it is disturbing to see
>> it being added back (effectively). This seems like it may be something
>> that handshakes could be used for.
>>
>> Thanks,
>> David
>> -----
>>
>> On 12/12/2019 7:02 am, David Holmes wrote:
>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
>>>> Hi David,
>>>>
>>>>     > Most of the details here are in areas I can comment on in detail,
>>>>     > but I did take an initial general look at things.
>>>>
>>>> Thanks for taking the time!
>>>
>>> Apologies the above should read:
>>>
>>> "Most of the details here are in areas I *can't* comment on in detail 
>>> ..."
>>>
>>> David
>>>
>>>>     > The only thing that jumped out at me is that I think the
>>>>     > DeoptimizeObjectsALotThread should be a hidden thread.
>>>>     >
>>>>     > +  bool is_hidden_from_external_view() const { return true; }
>>>>
>>>> Yes, it should. Will add the method like above.
>>>>
>>>>     > Also I don't see any testing of the DeoptimizeObjectsALotThread.
>>>>     > Without active testing this will just bit-rot.
>>>>
>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
>>>> workload. I will add a minimal test
>>>> to keep it fresh.
>>>>
>>>>     > Also on the tests I don't understand your @requires clause:
>>>>     >
>>>>     >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>     > (vm.opt.TieredCompilation != true))
>>>>     >
>>>>     > This seems to require that TieredCompilation is disabled, but
>>>>     > tiered is our normal mode of operation. ??
>>>>     >
>>>>
>>>> I removed the clause. I guess I wanted to target the tests towards the
>>>> code they are supposed to
>>>> test, and it's easier to analyze failures w/o tiered compilation and
>>>> with just one compiler thread.
>>>>
>>>> Additionally I will make use of
>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
>>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>> -----Original Message-----
>>>> From: David Holmes <david.holmes at oracle.com>
>>>> Sent: Wednesday, 11 December 2019 08:03
>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
>>>> serviceability-dev at openjdk.java.net;
>>>> hotspot-compiler-dev at openjdk.java.net;
>>>> hotspot-runtime-dev at openjdk.java.net
>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>>> Performance in the Presence of JVMTI Agents
>>>>
>>>> Hi Richard,
>>>>
>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
>>>>> Hi,
>>>>>
>>>>> I would like to get reviews please for
>>>>>
>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>>>>
>>>>> Corresponding RFE:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>>>>
>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>>>>
>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
>>>>> issues (thanks!). In addition the
>>>>> change is being tested at SAP since I posted the first RFR some
>>>>> months ago.
>>>>>
>>>>> The intention of this enhancement is to benefit performance wise from
>>>>> escape analysis even if JVMTI
>>>>> agents request capabilities that allow them to access local variable
>>>>> values. E.g. if you start-up
>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then
>>>>> escape analysis is disabled right
>>>>> from the beginning, well before a debugger attaches -- if ever one
>>>>> should do so. With the
>>>>> enhancement, escape analysis will remain enabled until and after a
>>>>> debugger attaches. EA based
>>>>> optimizations are reverted just before an agent acquires the
>>>>> reference to an object. In the JBS item
>>>>> you'll find more details.
>>>>
>>>> Most of the details here are in areas I can comment on in detail, but I
>>>> did take an initial general look at things.
>>>>
>>>> The only thing that jumped out at me is that I think the
>>>> DeoptimizeObjectsALotThread should be a hidden thread.
>>>>
>>>> +  bool is_hidden_from_external_view() const { return true; }
>>>>
>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. 
>>>> Without
>>>> active testing this will just bit-rot.
>>>>
>>>> Also on the tests I don't understand your @requires clause:
>>>>
>>>>     @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>> (vm.opt.TieredCompilation != true))
>>>>
>>>> This seems to require that TieredCompilation is disabled, but tiered is
>>>> our normal mode of operation. ??
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> Thanks,
>>>>> Richard.
>>>>>
>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch 
>>>>>
>>>>>
>>>>>

From coleen.phillimore at oracle.com  Tue Dec 17 15:27:08 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 17 Dec 2019 10:27:08 -0500
Subject: [14] RFR 8235829: graal crashes with Zombie.java test
In-Reply-To: <c92a9b6e-71b6-c97e-fed5-5b2616052032@oracle.com>
References: <c78fce63-6dda-c190-0346-981804a10493@oracle.com>
 <447990f9-d276-56c3-aebe-fa84d0cfcc27@oracle.com>
 <975b6e46-95c1-8a37-b160-b6a57a9633a8@oracle.com>
 <06d6e9f4-b919-3062-7d18-1245e66b27b2@oracle.com>
 <39716b03-dd5e-580e-2c91-004da64bcc19@oracle.com>
 <c92a9b6e-71b6-c97e-fed5-5b2616052032@oracle.com>
Message-ID: <53208d1a-b969-d2e7-8657-d95f2d2d22e8@oracle.com>


On 12/16/19 11:04 PM, David Holmes wrote:
> Clarification ...
>
> On 17/12/2019 12:40 pm, coleen.phillimore at oracle.com wrote:
>>
>> Short answer below.
>>
>> On 12/16/19 5:51 PM, David Holmes wrote:
>>> Hi Coleen,
>>>
>>> A quick initial response ...
>>>
>>> On 16/12/2019 11:26 pm, coleen.phillimore at oracle.com wrote:
>>>>
>>>>
>>>> On 12/16/19 8:04 AM, David Holmes wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> On 16/12/2019 9:41 pm, coleen.phillimore at oracle.com wrote:
>>>>>> Summary: Start ServiceThread before compiler threads, and run 
>>>>>> nmethod barriers for zgc before adding to the service thread 
>>>>>> queue, or posting the events on the java thread queue.
>>>>>
>>>>> I can't comment on most of this but the earlier starting of the 
>>>>> service thread has some concerns:
>>>>>
>>>>> - there is a lot of JDK level initialization which now will not 
>>>>> have happened before the service thread is started and it is far 
>>>>> from obvious that all possible initialization dependencies will be 
>>>>> satisfied
>>>>
>>>> I agree that the order of initialization is very sensitive. From 
>>>> the actions that the service thread does, the one that I found was 
>>>> a problem was that events were posted before the LIVE phase (see 
>>>> comment in has_events()), which could have happened with the 
>>>> existing code, but the window for the race is a lot smaller. The 
>>>> other actions can be run if there's a GC before initialization but 
>>>> would be a bug in the initialization code, and I didn't find these 
>>>> bugs in all my testing. There are some ordering dependencies that 
>>>> do have odd side effects (between the compiler thread startup and 
>>>> initialization jsr292 classes) which have comments. This patch 
>>>> doesn't touch those.
>>>>
>>>>>
>>>>> - current starting of the service thread in Management::initialize 
>>>>> is guarded by "#if INCLUDE_MANAGEMENT", but now you are starting 
>>>>> the service thread unconditionally for all builds. Hmm just saw 
>>>>> your latest comment to the bug report - so the service thread is 
>>>>> now (for quite some time?) being used for other than management 
>>>>> tasks and so should always be present even if INCLUDE_MANAGEMENT 
>>>>> is not enabled. Is that sufficient or are there likely to be other 
>>>>> changes needed to actually ensure that all works correctly? e.g. 
>>>>> any code the service thread executes that is only defined for 
>>>>> INCLUDE_MANAGEMENT will need to be compiled out explicitly.
>>>>>
>>>>
>>>> I asked Jie offline to check the minimal build. I don't think 
>>>> there are other INCLUDE_MANAGEMENT actions in the service thread 
>>>> and I'm not sure why it was initialized there in the first place. 
>>>> The minimal vm would have been broken, i.e. hashtables would not have 
>>>> been cleaned up, etc., but I'm not sure how well that is tested or 
>>>> if one would notice.
>>>>> - the service thread and the notification thread are (were?) 
>>>>> closely related but now started at completely different times
>>>>
>>>> The notification thread is limited to "services" so it makes sense 
>>>> where it is. The ServiceThread does lots of other things. Maybe 
>>>> it needs renaming in 15.
>>>>>
>>>>> The bug report states the problem as:
>>>>>
>>>>> "The graal crash is because compiled_method_load events are added 
>>>>> to the ServiceThread's deferred event queue before the 
>>>>> ServiceThread is created so are not walked to keep them from being 
>>>>> zombied."
>>>>>
>>>>> so why isn't the solution to ensure the deferred event queue is 
>>>>> walked? I'm not clear how starting the service thread relates to 
>>>>> walking the queue.
>>>>>
>>>>
>>>> The service thread is responsible for walking the deferred event 
>>>> queue. See ServiceThread::oops_do/nmethods_do. The design could 
>>>> be changed to have some global walk somewhere of this queue, but 
>>>> essentially this queue is processed by the service thread.
>>>
>>> Sorry I don't follow. I thought "oops_do" and friends are for the GC 
>>> threads and/or VMThread to call to process oops when GC updates them.
>>
>> The oops_do and nmethods_do() can be called by a thread walk in 
>> handshakes (by the sweeper thread) and by parallel GC thread walks. 
>> There isn't a single entry to do the thread-specific closures that we 
>> need to do for these deferred event queues. I tried a version that 
>> walked the queues with a static call but missed some places where it 
>> would be needed to make this call (didn't work). Keeping this 
>> associated with the ServiceThread simplifies a lot.
>
> Just to clarify that further, the thread walk requires that the thread 
> appear in ALL_JAVA_THREADS, but that only happens after the 
> ServiceThread has been started. So in essence we don't really need the 
> ServiceThread to have commenced execution earlier, but we need it to 
> have been created. Those two steps are combined in practice.

Yes. Then the ServiceThread waits on the Service_lock until notified by 
these events:

      while (((sensors_changed = (!UseNotificationThread &&
                                  LowMemoryDetector::has_pending_requests())) |
              (has_jvmti_events = _jvmti_service_queue.has_events()) |
              (has_gc_notification_event = (!UseNotificationThread &&
                                            GCNotifier::has_event())) |
              (has_dcmd_notification_event = (!UseNotificationThread &&
                                              DCmdFactory::has_pending_jmx_notification())) |
              (stringtable_work = StringTable::has_work()) |
              (symboltable_work = SymbolTable::has_work()) |
              (resolved_method_table_work = ResolvedMethodTable::has_work()) |
              (thread_id_table_work = ThreadIdTable::has_work()) |
              (protection_domain_table_work = SystemDictionary::pd_cache_table()->has_work()) |
              (oopstorage_work = OopStorage::has_cleanup_work_and_reset())
             ) == 0) {

The first, third and fourth events are from management.cpp events that 
were initialized after the ServiceThread was started.
The second event I have changed to wait until the LIVE phase before 
returning true.
The stringtable, symboltable, resolved_method_table, thread_id and pd 
tables have static _has_work variables initialized to false.
The oopstorage_work check is similar, but it may update a time-based 
counter a bit earlier now that the service thread starts earlier. I think 
this is harmless.
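
A side note on the loop condition above: it combines the checks with `|` 
rather than `||`, which, as I read it, means every work source is polled 
and its flag recorded on each pass instead of stopping at the first one 
that reports work. A minimal standalone sketch of that pattern follows. 
The names WorkSource, PollResult, poll_all and poll_short_circuit are 
hypothetical stand-ins for illustration, not HotSpot code.

```cpp
#include <cassert>

// Hypothetical stand-in for a work source such as StringTable::has_work().
// It counts polls so the difference between `|` and `||` is observable.
struct WorkSource {
  bool pending;
  int polls;
  bool has_work() { ++polls; return pending; }
};

struct PollResult {
  bool first_work;
  bool second_work;
  bool any;
};

// Arithmetic-or: both operands are always evaluated, so both flags are
// refreshed on every pass.
inline PollResult poll_all(WorkSource& first, WorkSource& second) {
  PollResult r{false, false, false};
  r.any = (static_cast<int>(r.first_work = first.has_work()) |
           static_cast<int>(r.second_work = second.has_work())) != 0;
  return r;
}

// Logical-or: evaluation stops at the first true operand, so `second` is
// not polled and its flag keeps its initial value.
inline PollResult poll_short_circuit(WorkSource& first, WorkSource& second) {
  PollResult r{false, false, false};
  r.any = (r.first_work = first.has_work()) ||
          (r.second_work = second.has_work());
  return r;
}

// Usage: with both sources pending, only poll_all() sees the second one.
inline void poll_example() {
  WorkSource a{true, 0}, b{true, 0};
  PollResult all = poll_all(a, b);
  assert(all.first_work && all.second_work && b.polls == 1);

  WorkSource c{true, 0}, d{true, 0};
  PollResult sc = poll_short_circuit(c, d);
  assert(sc.first_work && !sc.second_work && d.polls == 0);
}
```

With the short-circuit form a frequently busy first source could keep 
later sources from ever being noticed, which is presumably why the loop 
is written the way it is.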

It is possible that after the service thread starts and before the 
compiler thread starts, there could be a GC that notifies the 
stringtable to clean up. It seems like a good thing that the GC would 
clean up these tables with this order. I ran the tier4 graal tests and 
there were no failures.

Thanks,
Coleen
>
> Cheers,
> David
>
>> thanks,
>> Coleen
>>
>>>
>>> David
>>> -----
>>>
>>>> I had an additional change to make the queue non-static but want to 
>>>> limit the change at this point.
>>>>
>>>> Thanks,
>>>> Coleen
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> See bug for description of the problems found with the new 
>>>>>> Zombie.java test.
>>>>>>
>>>>>> open webrev at 
>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev
>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8235829
>>>>>>
>>>>>> Ran tier1 all platforms, and tier2-8 testing, as well as 
>>>>>> rerunning original test failure from bug 
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8173361.
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>
>>


From alexey.menkov at oracle.com  Tue Dec 17 21:30:27 2019
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Tue, 17 Dec 2019 13:30:27 -0800
Subject: [14]RFR(XS): 8236062: Disable clhsdb initialization of SA
 javascript support since it will always fail, and will likely be removed soon
In-Reply-To: <d3fefe95-d109-4f71-ea2d-c1257d6b93b6@oracle.com>
References: <d3fefe95-d109-4f71-ea2d-c1257d6b93b6@oracle.com>
Message-ID: <0b5b53e5-5045-6d13-6107-c11c7621d53d@oracle.com>

LGTM

--alex

On 12/16/2019 21:36, Chris Plummer wrote:
> Hello,
> 
> Please review the following:
> 
> https://bugs.openjdk.java.net/browse/JDK-8236062
> http://cr.openjdk.java.net/~cjplummer/8236062/webrev.00/
> 
> Since SA javascript support is broken as described in [1] JDK-8235594, 
> and we'll likely remove it, I'd like to at least get it disabled now. 
> I'd like to get this into 14 mostly because I really want to get [2] 
> JDK-8234048 fixed in 14 because we'll start seeing the clhsdb test 
> failures on macos 10.14 and 10.15 more often over the coming months as 
> we deploy more macosx test hosts with those versions. However, [2] 
> JDK-8234048 is blocked by [3] JDK-8234277 (which improves error checking 
> and failure output for the clhsdb tests), and [3] JDK-8234277 is blocked 
> by this CR because the exceptions produced when javascript fails to 
> initialize end up cluttering the clhsdb test logs, even when the test 
> passes (and is misleading when the test fails).
> 
> Sorry about all the bug references and inter-dependencies. It's taken a 
> while myself to get my head wrapped around how I wanted to approach 
> fixing them all in a meaningful order.
> 
> thanks,
> 
> Chris
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8235594
> [2] https://bugs.openjdk.java.net/browse/JDK-8234048
> [3] https://bugs.openjdk.java.net/browse/JDK-8234277

From serguei.spitsyn at oracle.com  Wed Dec 18 01:17:52 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 17 Dec 2019 17:17:52 -0800
Subject: RFR(s): 8235912: JvmtiBreakpoint remove oops_do and metadata_do
In-Reply-To: <624f3db0-351e-7f57-61f2-26323a75854a@oracle.com>
References: <ae042006-8aa7-e7c8-0c35-7336eda849cd@oracle.com>
 <c0094eb6-710f-19b6-981c-e83511ad0461@oracle.com>
 <e803bda4-c57c-6eec-45e7-c2a6028d3ae6@oracle.com>
 <e07b16a0-a2ca-6683-8a69-529fc41fd842@oracle.com>
 <05aa6993-e1e0-c43e-0040-8fee71401f4c@oracle.com>
 <624f3db0-351e-7f57-61f2-26323a75854a@oracle.com>
Message-ID: <fca3e87b-d469-9fe3-3fd8-5868fba2614f@oracle.com>

Hi Coleen,

Is this webrev v2 right to look at? :
  http://cr.openjdk.java.net/~coleenp/2019/8235829.02/webrev/

It looks good to me.
Just one nit (sorry, if it is a duplicated comment):

http://cr.openjdk.java.net/~coleenp/2019/8235829.02/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html

void JvmtiDeferredEventQueue::run_nmethod_entry_barrier() {
  for (QueueNode* node = _queue_head; node != NULL; node = node->next()) {
    node->event().run_nmethod_entry_barrier();
  }
}

The function run_nmethod_entry_barrier() should have a plural form as it 
iterates over the queue.


Thanks
Serguei


On 12/17/19 5:59 AM, coleen.phillimore at oracle.com wrote:
>
>
> On 12/17/19 4:21 AM, Robbin Ehn wrote:
>> Hi Coleen,
>>
>> On 12/16/19 9:21 PM, coleen.phillimore at oracle.com wrote:
>>>
>>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.udiff.html 
>>>
>>>
>>> + NativeAccess<>::oop_store(_class_holder, class_holder_oop);
>>>
>>>
>>> This should probably be:
>>>
>>>    NativeAccess<IS_DEST_UNINITIALIZED>::oop_store(handle, obj);
>>>
>>
>> I have not seen any stores to oopStorage that use that?
>> oopStorage should be 'initialized'.
>> So I prefer not adding another decorator if it's not needed.
>> That would just be confusing.
>
> It's in the ClassLoaderData initialization. I don't know what it 
> means, you'd have to ask someone who knows. I don't think it matters 
> though.
>
> Coleen
>>
>>>
>>> You can leave out using OopHandle. I have a patch to add the 
>>> missing functionality and add it to your code. Actually, I was 
>>> looking to see how much OopHandle is used to see if it's helping 
>>> anything and there is a lot of code using it. Most of it is to hide 
>>> oop* in ClassLoaderData.
>>>
>>> This change otherwise looks great.
>>
>> Thanks, Robbin
>>
>>> Thanks,
>>> Coleen
>>>
>>>
>>>> Thanks for having a look, Robbin
>>>>
>>>> On 12/16/19 1:32 PM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> I have to think about this. Could there be breakpoints in old 
>>>>> emcp methods that we do not remove? The metadata_do function is 
>>>>> trying to keep old Methods from being deleted while there are 
>>>>> still references to them.
>>>>>
>>>>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/src/hotspot/share/prims/jvmtiImpl.hpp.udiff.html 
>>>>>
>>>>>
>>>>> + oop* _class_holder; // keeps _method memory from being deallocated
>>>>>
>>>>>
>>>>> We created the class OopHandle to encapsulate strong oopStorage 
>>>>> references, although it's missing oop_store. Can you use that?
>>>>
>>>>
>>>>>
>>>>> Coleen
>>>>>
>>>>> On 12/16/19 4:47 AM, Robbin Ehn wrote:
>>>>>> Hi all, please review.
>>>>>>
>>>>>> From issue, https://bugs.openjdk.java.net/browse/JDK-8235912:
>>>>>>
>>>>>> JvmtiBreakpoints are walked via VMThread oops_do (the breakpoint 
>>>>>> is in a vm operation) before they are installed in the safepoint 
>>>>>> and after they have been installed, walked with 
>>>>>> JvmtiCurrentBreakpoints::oops_do().
>>>>>> By putting the class holder inside oopStorage there is no need 
>>>>>> for this.
>>>>>>
>>>>>> JvmtiCurrentBreakpoints::metadata_do is not needed because 
>>>>>> redefine classes actually removes the breakpoints before updating 
>>>>>> them (so there are no breakpoints to update).
>>>>>> We can just remove metadata_do.
>>>>>>
>>>>>>
>>>>>> I also removed some unused code.
>>>>>>
>>>>>> Changeset:
>>>>>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/
>>>>>>
>>>>>> Passes several runs of nsk jvmti/jdi and t1-7.
>>>>>>
>>>>>> Thanks, Robbin
>>>>>
>>>
>


From coleen.phillimore at oracle.com  Wed Dec 18 02:09:55 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 17 Dec 2019 21:09:55 -0500
Subject: RFR(s): 8235912: JvmtiBreakpoint remove oops_do and metadata_do
In-Reply-To: <fca3e87b-d469-9fe3-3fd8-5868fba2614f@oracle.com>
References: <ae042006-8aa7-e7c8-0c35-7336eda849cd@oracle.com>
 <c0094eb6-710f-19b6-981c-e83511ad0461@oracle.com>
 <e803bda4-c57c-6eec-45e7-c2a6028d3ae6@oracle.com>
 <e07b16a0-a2ca-6683-8a69-529fc41fd842@oracle.com>
 <05aa6993-e1e0-c43e-0040-8fee71401f4c@oracle.com>
 <624f3db0-351e-7f57-61f2-26323a75854a@oracle.com>
 <fca3e87b-d469-9fe3-3fd8-5868fba2614f@oracle.com>
Message-ID: <10582410-4904-5aa8-dcf5-00cf72e0eebe@oracle.com>

Hi Serguei,
The review thread should be "RFR 8235829: graal crashes with Zombie.java 
test".

On 12/17/19 8:17 PM, serguei.spitsyn at oracle.com wrote:
> Hi Coleen,
>
> Is this webrev v2 right to look at? :
> http://cr.openjdk.java.net/~coleenp/2019/8235829.02/webrev/
>
Actually, this one was something I tried that didn't work, so the review 
was 01. I created 02 with the change to make 
run_nmethod_entry_barriers plural:

open webrev at http://cr.openjdk.java.net/~coleenp/2019/8235829.02/webrev

Can you review this one on the other thread?

Thanks,
Coleen

> It looks good to me.
> Just one nit (sorry, if it is a duplicated comment):
>
> http://cr.openjdk.java.net/~coleenp/2019/8235829.02/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.frames.html
> void JvmtiDeferredEventQueue::run_nmethod_entry_barrier() {
>   for (QueueNode* node = _queue_head; node != NULL; node = node->next()) {
>     node->event().run_nmethod_entry_barrier();
>   }
> }
> The function run_nmethod_entry_barrier() should have a plural form as 
> it iterates over the queue.
>
>
>
> Thanks
> Serguei
>
>
> On 12/17/19 5:59 AM, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 12/17/19 4:21 AM, Robbin Ehn wrote:
>>> Hi Coleen,
>>>
>>> On 12/16/19 9:21 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.udiff.html 
>>>>
>>>>
>>>> + NativeAccess<>::oop_store(_class_holder, class_holder_oop);
>>>>
>>>>
>>>> This should probably be:
>>>>
>>>>    NativeAccess<IS_DEST_UNINITIALIZED>::oop_store(handle, obj);
>>>>
>>>
>>> I have not seen any stores to oopStorage that use that?
>>> oopStorage should be 'initialized'.
>>> So I prefer not adding another decorator if it's not needed.
>>> That would just be confusing.
>>
>> It's in the ClassLoaderData initialization. I don't know what it 
>> means, you'd have to ask someone who knows. I don't think it matters 
>> though.
>>
>> Coleen
>>>
>>>>
>>>> You can leave out using OopHandle. I have a patch to add the 
>>>> missing functionality and add it to your code. Actually, I was 
>>>> looking to see how much OopHandle is used to see if it's helping 
>>>> anything and there is a lot of code using it. Most of it is to 
>>>> hide oop* in ClassLoaderData.
>>>>
>>>> This change otherwise looks great.
>>>
>>> Thanks, Robbin
>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>>
>>>>> Thanks for having a look, Robbin
>>>>>
>>>>> On 12/16/19 1:32 PM, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>> I have to think about this. Could there be breakpoints in old 
>>>>>> emcp methods that we do not remove? The metadata_do function is 
>>>>>> trying to keep old Methods from being deleted while there are 
>>>>>> still references to them.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/src/hotspot/share/prims/jvmtiImpl.hpp.udiff.html 
>>>>>>
>>>>>>
>>>>>> + oop* _class_holder; // keeps _method memory from being deallocated
>>>>>>
>>>>>>
>>>>>> We created the class OopHandle to encapsulate strong oopStorage 
>>>>>> references, although it's missing oop_store. Can you use that?
>>>>>
>>>>>
>>>>>>
>>>>>> Coleen
>>>>>>
>>>>>> On 12/16/19 4:47 AM, Robbin Ehn wrote:
>>>>>>> Hi all, please review.
>>>>>>>
>>>>>>> From issue, https://bugs.openjdk.java.net/browse/JDK-8235912:
>>>>>>>
>>>>>>> JvmtiBreakpoints are walked via VMThread oops_do (the breakpoint 
>>>>>>> is in a vm operation) before they are installed in the safepoint 
>>>>>>> and after they have been installed, walked with 
>>>>>>> JvmtiCurrentBreakpoints::oops_do().
>>>>>>> By putting the class holder inside oopStorage there is no need 
>>>>>>> for this.
>>>>>>>
>>>>>>> JvmtiCurrentBreakpoints::metadata_do is not needed because 
>>>>>>> redefine classes actually removes the breakpoints before 
>>>>>>> updating them (so there are no breakpoints to update).
>>>>>>> We can just remove metadata_do.
>>>>>>>
>>>>>>>
>>>>>>> I also removed some unused code.
>>>>>>>
>>>>>>> Changeset:
>>>>>>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/
>>>>>>>
>>>>>>> Passes several runs of nsk jvmti/jdi and t1-7.
>>>>>>>
>>>>>>> Thanks, Robbin
>>>>>>
>>>>
>>
>


From chris.plummer at oracle.com  Wed Dec 18 03:15:37 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 17 Dec 2019 19:15:37 -0800
Subject: [14]RFR(XS): 8236062: Disable clhsdb initialization of SA
 javascript support since it will always fail, and will likely be removed soon
In-Reply-To: <0b5b53e5-5045-6d13-6107-c11c7621d53d@oracle.com>
References: <d3fefe95-d109-4f71-ea2d-c1257d6b93b6@oracle.com>
 <0b5b53e5-5045-6d13-6107-c11c7621d53d@oracle.com>
Message-ID: <3ce5c822-b8ec-04b5-523d-2616f43fcee5@oracle.com>

Thanks Alex!

Chris

On 12/17/19 1:30 PM, Alex Menkov wrote:
> LGTM
>
> --alex
>
> On 12/16/2019 21:36, Chris Plummer wrote:
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8236062
>> http://cr.openjdk.java.net/~cjplummer/8236062/webrev.00/
>>
>> Since SA javascript support is broken as described in [1] 
>> JDK-8235594, and we'll likely remove it, I'd like to at least get it 
>> disabled now. I'd like to get this into 14 mostly because I really 
>> want to get [2] JDK-8234048 fixed in 14 because we'll start seeing 
>> the clhsdb test failures on macos 10.14 and 10.15 more often over the 
>> coming months as we deploy more macosx test hosts with those 
>> versions. However, [2] JDK-8234048 is blocked by [3] JDK-8234277 
>> (which improves error checking and failure output for the clhsdb 
>> tests), and [3] JDK-8234277 is blocked by this CR because the 
>> exceptions produced when javascript fails to initialize end up 
>> cluttering the clhsdb test logs, even when the test passes (and is 
>> misleading when the test fails).
>>
>> Sorry about all the bug references and inter-dependencies. It's taken 
>> a while myself to get my head wrapped around how I wanted to approach 
>> fixing them all in a meaningful order.
>>
>> thanks,
>>
>> Chris
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8235594
>> [2] https://bugs.openjdk.java.net/browse/JDK-8234048
>> [3] https://bugs.openjdk.java.net/browse/JDK-8234277


From chris.plummer at oracle.com  Wed Dec 18 03:23:45 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 17 Dec 2019 19:23:45 -0800
Subject: Jhsdb jmap --heap print large value of MaxMetaspaceSize
In-Reply-To: <6EDC6A6E-668A-4E7B-847C-E9155F6E6598@tencent.com>
References: <6EDC6A6E-668A-4E7B-847C-E9155F6E6598@tencent.com>
Message-ID: <fe66f6ff-0027-ecd4-720b-6774b71910e0@oracle.com>

That sounds reasonable, but I'd like to hear from people that are 
actually putting the "jmap --heap print" output to use. Although the 
exact format of the output is not specified, it's possible that users 
could be parsing the output by looking for an actual numeric size. There 
might even be tests that do this. This also puts into question whether 
or not you would need a CSR.
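
For reference, the number in the report quoted below is what you get when 
the largest 64-bit unsigned value is divided by one megabyte, so the 
proposal amounts to special-casing that sentinel when formatting. A rough 
sketch of the idea follows. SA itself is Java; C++ is used here only to 
show the arithmetic, and kUnsetLimit, kMB and format_max_metaspace are 
hypothetical names, not existing SA or HotSpot code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Assumption: an unset MaxMetaspaceSize shows up as the largest 64-bit
// unsigned value, which is what yields 17592186044415 when printed in MB.
constexpr std::uint64_t kUnsetLimit = UINT64_MAX;
constexpr std::uint64_t kMB = 1024ULL * 1024ULL;

// Hypothetical formatter: "unlimited" for the sentinel, otherwise whole MB.
inline std::string format_max_metaspace(std::uint64_t bytes) {
  if (bytes == kUnsetLimit) {
    return "unlimited";
  }
  return std::to_string(bytes / kMB) + " MB";
}

// Usage: the sentinel no longer prints as a huge number.
inline void format_example() {
  assert(kUnsetLimit / kMB == 17592186044415ULL);
  assert(format_max_metaspace(kUnsetLimit) == "unlimited");
  assert(format_max_metaspace(256 * kMB) == "256 MB");
}
```

Anything that parses the current output for a numeric value would need to 
cope with the string form, which is the compatibility concern above.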

thanks,

Chris

On 12/16/19 7:18 PM, linzang(??) wrote:
> Dear All,
>       I found jhsdb jmap --heap prints the value of uint_max (17592186044415 MB) when MaxMetaspaceSize is not set by the user. This number confused me a little.
>       I also found that jcmd VM.metaspace prints "unlimited" if MaxMetaspaceSize is not set, which seems more reasonable.
>       So do you think it is OK if I make "jhsdb jmap" print the same "unlimited" value as jcmd does for MaxMetaspaceSize?
>
> BRs,
> Lin


From robbin.ehn at oracle.com  Wed Dec 18 11:41:07 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 18 Dec 2019 12:41:07 +0100
Subject: RFR(s): 8235912: JvmtiBreakpoint remove oops_do and metadata_do
In-Reply-To: <624f3db0-351e-7f57-61f2-26323a75854a@oracle.com>
References: <ae042006-8aa7-e7c8-0c35-7336eda849cd@oracle.com>
 <c0094eb6-710f-19b6-981c-e83511ad0461@oracle.com>
 <e803bda4-c57c-6eec-45e7-c2a6028d3ae6@oracle.com>
 <e07b16a0-a2ca-6683-8a69-529fc41fd842@oracle.com>
 <05aa6993-e1e0-c43e-0040-8fee71401f4c@oracle.com>
 <624f3db0-351e-7f57-61f2-26323a75854a@oracle.com>
Message-ID: <5a7930c5-f7ea-f033-07f6-d325ad57e92f@oracle.com>

Hi Coleen,

> It's in the ClassLoaderData initialization. I don't know what it means, you'd 
> have to ask someone who knows. I don't think it matters though.

It means that if G1 concurrently scans this oop and we overwrite the
uninitialized oop, the uninitialized oop would be recorded by the SATB barrier.
The oopStorage contains strong roots, which are processed in a safepoint.
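
To make that concrete, here is a toy model of a SATB pre-write barrier and 
of what a "destination is uninitialized" hint changes. It is only a 
sketch: Obj, SatbQueue and store are hypothetical names, and none of this 
is HotSpot's Access API.

```cpp
#include <cassert>
#include <vector>

// Toy model only; not HotSpot code.
struct Obj { int id; };

struct SatbQueue {
  bool marking_active = false;
  std::vector<Obj*> recorded;   // previous values kept for the marker
};

// A SATB pre-write barrier records the value being overwritten while
// concurrent marking is active. If the destination is known to be
// uninitialized, its old contents are garbage and must not be read or
// recorded, so the pre-barrier is skipped.
inline void store(SatbQueue& q, Obj** slot, Obj* new_value,
                  bool dest_uninitialized) {
  if (q.marking_active && !dest_uninitialized && *slot != nullptr) {
    q.recorded.push_back(*slot);
  }
  *slot = new_value;
}

// Usage: an initialized (null) slot needs no hint; nothing is recorded.
inline void satb_example() {
  SatbQueue q;
  q.marking_active = true;
  Obj a{1}, b{2};

  Obj* fresh = nullptr;            // initialized slot holding null
  store(q, &fresh, &a, false);
  assert(q.recorded.empty() && fresh == &a);

  store(q, &fresh, &b, false);     // overwriting a live value records it
  assert(q.recorded.size() == 1 && q.recorded[0] == &a);
}
```

In this model a slot that starts out as null never feeds garbage to the 
queue, which matches the argument that the extra decorator is not needed 
for an initialized oopStorage entry.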

Thanks, Robbin

> 
> Coleen
>>
>>>
>>> You can leave out using OopHandle. I have a patch to add the missing 
>>> functionality and add it to your code. Actually, I was looking to see how 
>>> much OopHandle is used to see if it's helping anything and there is a lot of 
>>> code using it. Most of it is to hide oop* in ClassLoaderData.
>>>
>>> This change otherwise looks great.
>>
>> Thanks, Robbin
>>
>>> Thanks,
>>> Coleen
>>>
>>>
>>>> Thanks for having a look, Robbin
>>>>
>>>> On 12/16/19 1:32 PM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> I have to think about this. Could there be breakpoints in old emcp 
>>>>> methods that we do not remove? The metadata_do function is trying to keep 
>>>>> old Methods from being deleted while there are still references to them.
>>>>>
>>>>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/src/hotspot/share/prims/jvmtiImpl.hpp.udiff.html 
>>>>>
>>>>>
>>>>> + oop* _class_holder; // keeps _method memory from being deallocated
>>>>>
>>>>>
>>>>> We created the class OopHandle to encapsulate strong oopStorage references, 
>>>>> although it's missing oop_store. Can you use that?
>>>>
>>>>
>>>>>
>>>>> Coleen
>>>>>
>>>>> On 12/16/19 4:47 AM, Robbin Ehn wrote:
>>>>>> Hi all, please review.
>>>>>>
>>>>>> From issue, https://bugs.openjdk.java.net/browse/JDK-8235912:
>>>>>>
>>>>>> JvmtiBreakpoints are walked via VMThread oops_do (the breakpoint is in a 
>>>>>> vm operation) before they are installed in the safepoint and after they 
>>>>>> have been installed, walked with JvmtiCurrentBreakpoints::oops_do().
>>>>>> By putting the class holder inside oopStorage there is no need for this.
>>>>>>
>>>>>> JvmtiCurrentBreakpoints::metadata_do is not needed because redefine 
>>>>>> classes actually removes the breakpoints before updating them (so there are 
>>>>>> no breakpoints to update).
>>>>>> We can just remove metadata_do.
>>>>>>
>>>>>>
>>>>>> I also removed some unused code.
>>>>>>
>>>>>> Changeset:
>>>>>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/
>>>>>>
>>>>>> Passes several runs of nsk jvmti/jdi and t1-7.
>>>>>>
>>>>>> Thanks, Robbin
>>>>>
>>>
> 

From rkennke at redhat.com  Wed Dec 18 13:05:24 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 18 Dec 2019 14:05:24 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
Message-ID: <8870829e-c558-c956-2184-00204632abb6@redhat.com>

Hello all,

Issue:
https://bugs.openjdk.java.net/browse/JDK-8227269

I am proposing what amounts to a rewrite of classTrack.c. It avoids 
throwing away the class cache on GC, and instead keeps track of 
loaded/unloaded classes one-by-one.

In addition to that, it avoids this whole dance until an agent registers 
interest in EI_GC_FINISH.
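
As a rough illustration of the one-by-one bookkeeping described above, here is a minimal C++ sketch (classTrack.c itself is C, and this is not the code in the webrev). The class name ClassTracker, its methods, and the use of an integer tag per class are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical tracker: names and structure are illustrative only and are
// not taken from classTrack.c.
class ClassTracker {
public:
    // Record a class when it is loaded; returns the tag assigned to it.
    std::int64_t on_class_loaded(const std::string& signature) {
        std::int64_t tag = next_tag_++;
        loaded_.emplace(tag, signature);
        return tag;
    }

    // Record that the class with this tag was unloaded. Only this one entry
    // is touched; the rest of the table survives the GC untouched.
    void on_class_unloaded(std::int64_t tag) {
        auto it = loaded_.find(tag);
        if (it == loaded_.end()) return;  // unknown tag: nothing to do
        pending_unloads_.push_back(std::move(it->second));
        loaded_.erase(it);
    }

    // Hand over the signatures unloaded since the last call and clear the list.
    std::vector<std::string> take_unloaded() {
        std::vector<std::string> out;
        out.swap(pending_unloads_);
        return out;
    }

    std::size_t loaded_count() const { return loaded_.size(); }

private:
    std::int64_t next_tag_ = 1;
    std::unordered_map<std::int64_t, std::string> loaded_;
    std::vector<std::string> pending_unloads_;
};
```

The point of the sketch is that an unload costs work proportional to the classes actually unloaded, rather than a rebuild of the whole table after every GC.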

Webrev:
http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/

Testing: manual testing of provided test scenarios and timing.

Eg with the testcase provided here:
https://bugzilla.redhat.com/show_bug.cgi?id=1751985

I am getting those numbers:
unpatched: no debug: 84s with debug: 225s
patched:   no debug: 85s with debug: 95s

I also tested successfully through jdk/submit repo

Can I please get a review?

Thanks,
Roman


From david.holmes at oracle.com  Wed Dec 18 13:45:25 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 18 Dec 2019 23:45:25 +1000
Subject: [14] RFR 8235829: graal crashes with Zombie.java test
In-Reply-To: <53208d1a-b969-d2e7-8657-d95f2d2d22e8@oracle.com>
References: <c78fce63-6dda-c190-0346-981804a10493@oracle.com>
 <447990f9-d276-56c3-aebe-fa84d0cfcc27@oracle.com>
 <975b6e46-95c1-8a37-b160-b6a57a9633a8@oracle.com>
 <06d6e9f4-b919-3062-7d18-1245e66b27b2@oracle.com>
 <39716b03-dd5e-580e-2c91-004da64bcc19@oracle.com>
 <c92a9b6e-71b6-c97e-fed5-5b2616052032@oracle.com>
 <53208d1a-b969-d2e7-8657-d95f2d2d22e8@oracle.com>
Message-ID: <f14525b5-870c-1f09-21d3-a218deb962c0@oracle.com>

Thanks for the additional info Coleen!

Just to add a bit more to the initialization history. The ServiceThread 
is a generalization of the LowMemoryDetectorThread that was part of the 
management API, and so it was initialized in Management::initialize. 
When it turned into the ServiceThread - to process JVMTI deferred events 
in addition to the low-memory-detector events - the initialization 
placement remained the same. Then later the INCLUDE_MANAGEMENT guards 
were added (JDK-7189254, October 2012). Later still we started adding 
other items of work for the ServiceThread. The earliest was the 
AllocationContextService notification in September 2014 but as that no 
longer exists I can't tell if that was the first non-management related 
use. Then the StringTable use was added 18 months ago - which definitely 
was outside the realm of the management API. So that is when the 
MinimalVM was first "broken". So it is good that is fixed.

With regard to the placement in the initialization order, my remaining 
concern was with JVMTI event processing that might happen via events 
generated very early in the init sequence. But you have now modified 
things so that we will only process events in the LIVE phase, which only 
activates after all the class library initialization is complete.
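
To make the LIVE-phase gating concrete, here is a minimal C++ sketch. It is not the HotSpot code: the Phase enum only mirrors the JVMTI phase names, and DeferredEventQueue with a phase parameter on has_events() is a hypothetical simplification.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Mirrors the JVMTI phase names; the enum itself is an illustrative stand-in.
enum class Phase { OnLoad, Primordial, Start, Live, Dead };

// Hypothetical deferred event queue, simplified to hold event names only.
class DeferredEventQueue {
public:
    void enqueue(std::string event) { events_.push_back(std::move(event)); }

    // Report pending work only once the VM has reached the LIVE phase.
    // Events enqueued earlier stay in the queue until then.
    bool has_events(Phase phase) const {
        return phase == Phase::Live && !events_.empty();
    }

    std::size_t size() const { return events_.size(); }

private:
    std::deque<std::string> events_;
};
```

With this shape, an event enqueued during START is retained but does not wake the consumer until the phase becomes LIVE.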

So overall I'm no longer significantly concerned about the change to the 
initialization order as I think you have it all covered. Thanks for 
bearing with me and all the off-list discussion.

Cheers,
David
-----

On 18/12/2019 1:27 am, coleen.phillimore at oracle.com wrote:
> 
> 
> On 12/16/19 11:04 PM, David Holmes wrote:
>> Clarification ...
>>
>> On 17/12/2019 12:40 pm, coleen.phillimore at oracle.com wrote:
>>>
>>> Short answer below.
>>>
>>> On 12/16/19 5:51 PM, David Holmes wrote:
>>>> Hi Coleen,
>>>>
>>>> A quick initial response ...
>>>>
>>>> On 16/12/2019 11:26 pm, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>>
>>>>> On 12/16/19 8:04 AM, David Holmes wrote:
>>>>>> Hi Coleen,
>>>>>>
>>>>>> On 16/12/2019 9:41 pm, coleen.phillimore at oracle.com wrote:
>>>>>>> Summary: Start ServiceThread before compiler threads, and run 
>>>>>>> nmethod barriers for zgc before adding to the service thread 
>>>>>>> queue, or posting the events on the java thread queue.
>>>>>>
>>>>>> I can't comment on most of this but the earlier starting of the 
>>>>>> service thread has some concerns:
>>>>>>
>>>>>> - there is a lot of JDK level initialization which now will not 
>>>>>> have happened before the service thread is started and it is far 
>>>>>> from obvious that all possible initialization dependencies will be 
>>>>>> satisfied
>>>>>
>>>>> I agree that the order of initialization is very sensitive. From 
>>>>> the actions that the service thread does, the one that I found was 
>>>>> a problem was that events were posted before the LIVE phase (see 
>>>>> comment in has_events()), which could have happened with the 
>>>>> existing code, but the window for the race is a lot smaller. The 
>>>>> other actions can be run if there's a GC before initialization but 
>>>>> would be a bug in the initialization code, and I didn't find these 
>>>>> bugs in all my testing. There are some ordering dependencies that 
>>>>> do have odd side effects (between the compiler thread startup and 
>>>>> initialization jsr292 classes) which have comments. This patch 
>>>>> doesn't touch those.
>>>>>
>>>>>>
>>>>>> - current starting of the service thread in Management::initialize 
>>>>>> is guarded by "#if INCLUDE_MANAGEMENT", but now you are starting 
>>>>>> the service thread unconditionally for all builds. Hmm just saw 
>>>>>> your latest comment to the bug report - so the service thread is 
>>>>>> now (for quite some time?) being used for other than management 
>>>>>> tasks and so should always be present even if INCLUDE_MANAGEMENT 
>>>>>> is not enabled. Is that sufficient or are there likely to be other 
>>>>>> changes needed to actually ensure that all works correctly? e.g. 
>>>>>> any code the service thread executes that is only defined for 
>>>>>> INCLUDE_MANAGEMENT will need to be compiled out explicitly.
>>>>>>
>>>>>
>>>>> I asked Jie offline to check the minimal build. I don't think 
>>>>> there are other INCLUDE_MANAGEMENT actions in the service thread 
>>>>> and I'm not sure why it was initialized there in the first place. 
>>>>> The minimal vm would have been broken ie. hashtables would not have 
>>>>> been cleaned up, etc, but I'm not sure how well that is tested or 
>>>>> if one would notice.
>>>>>> - the service thread and the notification thread are (were?) 
>>>>>> closely related but now started at completely different times
>>>>>
>>>>> The notification thread is limited to "services" so it makes sense 
>>>>> where it is. The ServiceThread does lots of other things. Maybe 
>>>>> it needs renaming in 15.
>>>>>>
>>>>>> The bug report states the problem as:
>>>>>>
>>>>>> "The graal crash is because compiled_method_load events are added 
>>>>>> to the ServiceThread's deferred event queue before the 
>>>>>> ServiceThread is created so are not walked to keep them from being 
>>>>>> zombied."
>>>>>>
>>>>>> so why isn't the solution to ensure the deferred event queue is 
>>>>>> walked? I'm not clear how starting the service thread relates to 
>>>>>> walking the queue.
>>>>>>
>>>>>
>>>>> The service thread is responsible for walking the deferred event 
>>>>> queue. See ServiceThread::oops_do/nmethods_do. The design could 
>>>>> be changed to have some global walk somewhere of this queue, but 
>>>>> essentially this queue is processed by the service thread.
>>>>
>>>> Sorry I don't follow. I thought "oops_do" and friends are for the GC 
>>>> threads and/or VMThread to call to process oops when GC updates them.
>>>
>>> The oops_do and nmethods_do() can be called by a thread walk in 
>>> handshakes (by the sweeper thread) and by parallel GC thread walks. 
>>> There isn't a single entry to do the thread-specific closures that we 
>>> need to do for these deferred event queues. I tried a version that 
>>> walked the queues with a static call but missed some places where it 
>>> would be needed to make this call (didn't work). Keeping this 
>>> associated with the ServiceThread simplifies a lot.
>>
>> Just to clarify that further, the thread walk requires the thread 
>> appears in ALL_JAVA_THREADS but that only happens after the 
>> ServiceThread has been started. So in essence we don't really need the 
>> ServiceThread to have commenced execution earlier, but we need it to 
>> have been created. Those two steps are combined in practice.
> 
> Yes. Then the ServiceThread waits on the Service_lock until notified by 
> these events:
> 
>       while (((sensors_changed = (!UseNotificationThread && LowMemoryDetector::has_pending_requests())) |
>               (has_jvmti_events = _jvmti_service_queue.has_events()) |
>               (has_gc_notification_event = (!UseNotificationThread && GCNotifier::has_event())) |
>               (has_dcmd_notification_event = (!UseNotificationThread && DCmdFactory::has_pending_jmx_notification())) |
>               (stringtable_work = StringTable::has_work()) |
>               (symboltable_work = SymbolTable::has_work()) |
>               (resolved_method_table_work = ResolvedMethodTable::has_work()) |
>               (thread_id_table_work = ThreadIdTable::has_work()) |
>               (protection_domain_table_work = SystemDictionary::pd_cache_table()->has_work()) |
>               (oopstorage_work = OopStorage::has_cleanup_work_and_reset())
>              ) == 0) {
> 
> The first, third and fourth events are from management.cpp events that 
> were initialized after the ServiceThread was started.
> The second event I have changed, to wait until LIVE phase to return true.
> The stringtable, symboltable, resolved_method_table, thread_id and pd 
> table have static _has_work variables initialized to false.
> The oopstorage_work has similar, but may update a time-based counter a 
> bit earlier with the service thread starting earlier. I think this is 
> harmless.
> 
> It is possible that after the service thread starts and before the 
> compiler thread starts, there could be a GC that notifies the 
> stringtable to clean up. This seems like a good thing that the GC would 
> clean up these tables with this order. I ran the tier4 graal tests and 
> there were no failures.
> 
> Thanks,
> Coleen
>>
>> Cheers,
>> David
>>
>>> thanks,
>>> Coleen
>>>
>>>>
>>>> David
>>>> -----
>>>>
>>>>> I had an additional change to make the queue non-static but want to 
>>>>> limit the change at this point.
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>> See bug for description of the problems found with the new 
>>>>>>> Zombie.java test.
>>>>>>>
>>>>>>> open webrev at 
>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev
>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8235829
>>>>>>>
>>>>>>> Ran tier1 all platforms, and tier2-8 testing, as well as 
>>>>>>> rerunning original test failure from bug 
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8173361.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>
>>>
> 

From robbin.ehn at oracle.com  Wed Dec 18 15:00:10 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 18 Dec 2019 16:00:10 +0100
Subject: [14] RFR 8235829: graal crashes with Zombie.java test
In-Reply-To: <c78fce63-6dda-c190-0346-981804a10493@oracle.com>
References: <c78fce63-6dda-c190-0346-981804a10493@oracle.com>
Message-ID: <86019ad1-39fa-cb1a-d34f-6b2b521d9dba@oracle.com>

Hi Coleen,

I looked at v2:
http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev

Seems good.

But we do a lot of work to keep the nmethod alive while in queue.
Instead we should try to copy the data from the nmethod and enqueue this copy.
Thus not having to keep the nmethod alive.
Not for this change-set, but a potential future simplification.

Thanks, Robbin

On 12/16/19 12:41 PM, coleen.phillimore at oracle.com wrote:
> Summary: Start ServiceThread before compiler threads, and run nmethod barriers 
> for zgc before adding to the service thread queue, or posting the events on the 
> java thread queue.
> 
> See bug for description of the problems found with the new Zombie.java test.
> 
> open webrev at http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8235829
> 
> Ran tier1 all platforms, and tier2-8 testing, as well as rerunning original test 
> failure from bug https://bugs.openjdk.java.net/browse/JDK-8173361.
> 
> Thanks,
> Coleen

From coleen.phillimore at oracle.com  Wed Dec 18 16:36:29 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 18 Dec 2019 11:36:29 -0500
Subject: [14] RFR 8235829: graal crashes with Zombie.java test
In-Reply-To: <86019ad1-39fa-cb1a-d34f-6b2b521d9dba@oracle.com>
References: <c78fce63-6dda-c190-0346-981804a10493@oracle.com>
 <86019ad1-39fa-cb1a-d34f-6b2b521d9dba@oracle.com>
Message-ID: <aa2a0c42-c177-4649-476a-90afdf15f7ca@oracle.com>


Robbin, Thank you for looking at it.

On 12/18/19 10:00 AM, Robbin Ehn wrote:
> Hi Coleen,
>
> I looked at v2:
> http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev
>
> Seems good.
>
> But we do a lot of work to keep the nmethod alive while in queue.
> Instead we should try to copy the data from the nmethod and enqueue this copy.
> Thus not having to keep the nmethod alive.
> Not for this change-set, but a potential future simplification.

I agree. Thank you for thinking of this idea. This might work a lot 
better, and I'll try to write this simplification and see if it works 
for 15.

Thanks for the code review.
Coleen

>
> Thanks, Robbin
>
> On 12/16/19 12:41 PM, coleen.phillimore at oracle.com wrote:
>> Summary: Start ServiceThread before compiler threads, and run nmethod 
>> barriers for zgc before adding to the service thread queue, or 
>> posting the events on the java thread queue.
>>
>> See bug for description of the problems found with the new 
>> Zombie.java test.
>>
>> open webrev at 
>> http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8235829
>>
>> Ran tier1 all platforms, and tier2-8 testing, as well as rerunning 
>> original test failure from bug 
>> https://bugs.openjdk.java.net/browse/JDK-8173361.
>>
>> Thanks,
>> Coleen


From coleen.phillimore at oracle.com  Wed Dec 18 16:42:45 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 18 Dec 2019 11:42:45 -0500
Subject: [14] RFR 8235829: graal crashes with Zombie.java test
In-Reply-To: <f14525b5-870c-1f09-21d3-a218deb962c0@oracle.com>
References: <c78fce63-6dda-c190-0346-981804a10493@oracle.com>
 <447990f9-d276-56c3-aebe-fa84d0cfcc27@oracle.com>
 <975b6e46-95c1-8a37-b160-b6a57a9633a8@oracle.com>
 <06d6e9f4-b919-3062-7d18-1245e66b27b2@oracle.com>
 <39716b03-dd5e-580e-2c91-004da64bcc19@oracle.com>
 <c92a9b6e-71b6-c97e-fed5-5b2616052032@oracle.com>
 <53208d1a-b969-d2e7-8657-d95f2d2d22e8@oracle.com>
 <f14525b5-870c-1f09-21d3-a218deb962c0@oracle.com>
Message-ID: <07b6370a-ff4f-641c-7661-e5b1231a7d8c@oracle.com>


On 12/18/19 8:45 AM, David Holmes wrote:
> Thanks for the additional info Coleen!
>
> Just to add a bit more to the initialization history. The 
> ServiceThread is a generalization of the LowMemoryDetectorThread that 
> was part of the management API, and so it was initialized in 
> Management::initialize. When it turned into the ServiceThread - to 
> process JVMTI deferred events in addition to the low-memory-detector 
> events - the initialization placement remained the same. Then later 
> the INCLUDE_MANAGEMENT guards were added (JDK-7189254, October 2012). 
> Later still we started adding other items of work for the 
> ServiceThread. The earliest was the AllocationContextService 
> notification in September 2014 but as that no longer exists I can't 
> tell if that was the first non-management related use. Then the 
> StringTable use was added 18 months ago - which definitely was outside 
> the realm of the management API. So that is when the MinimalVM was 
> first "broken". So it is good that is fixed.
>
> With regard to the placement in the initialization order, my remaining 
> concern was with JVMTI event processing that might happen via events 
> generated very early in the init sequence. But you have now modified 
> things so that we will only process events in the LIVE phase, which 
> only activates after all the class library initialization is complete.
>
> So overall I'm no longer significantly concerned about the change to 
> the initialization order as I think you have it all covered. Thanks 
> for bearing with me and all the off-list discussion.

Thank you for having this discussion with me and provoking me to recheck 
the ServiceThread. I think we can do further work to future-proof 
initialization order but the design needs to be improved.

Thanks for reviewing,
Coleen
>
> Cheers,
> David
> -----
>
> On 18/12/2019 1:27 am, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 12/16/19 11:04 PM, David Holmes wrote:
>>> Clarification ...
>>>
>>> On 17/12/2019 12:40 pm, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Short answer below.
>>>>
>>>> On 12/16/19 5:51 PM, David Holmes wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> A quick initial response ...
>>>>>
>>>>> On 16/12/2019 11:26 pm, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>>
>>>>>> On 12/16/19 8:04 AM, David Holmes wrote:
>>>>>>> Hi Coleen,
>>>>>>>
>>>>>>> On 16/12/2019 9:41 pm, coleen.phillimore at oracle.com wrote:
>>>>>>>> Summary: Start ServiceThread before compiler threads, and run 
>>>>>>>> nmethod barriers for zgc before adding to the service thread 
>>>>>>>> queue, or posting the events on the java thread queue.
>>>>>>>
>>>>>>> I can't comment on most of this but the earlier starting of the 
>>>>>>> service thread has some concerns:
>>>>>>>
>>>>>>> - there is a lot of JDK level initialization which now will not 
>>>>>>> have happened before the service thread is started and it is far 
>>>>>>> from obvious that all possible initialization dependencies will 
>>>>>>> be satisfied
>>>>>>
>>>>>> I agree that the order of initialization is very sensitive. From 
>>>>>> the actions that the service thread does, the one that I found 
>>>>>> was a problem was that events were posted before the LIVE phase 
>>>>>> (see comment in has_events()), which could have happened with the 
>>>>>> existing code, but the window for the race is a lot smaller. 
>>>>>> The other actions can be run if there's a GC before 
>>>>>> initialization but would be a bug in the initialization code, and 
>>>>>> I didn't find these bugs in all my testing. There are some 
>>>>>> ordering dependencies that do have odd side effects (between the 
>>>>>> compiler thread startup and initialization jsr292 classes) which 
>>>>>> have comments. This patch doesn't touch those.
>>>>>>
>>>>>>>
>>>>>>> - current starting of the service thread in 
>>>>>>> Management::initialize is guarded by "#if INCLUDE_MANAGEMENT", 
>>>>>>> but now you are starting the service thread unconditionally for 
>>>>>>> all builds. Hmm just saw your latest comment to the bug report - 
>>>>>>> so the service thread is now (for quite some time?) being used 
>>>>>>> for other than management tasks and so should always be present 
>>>>>>> even if INCLUDE_MANAGEMENT is not enabled. Is that sufficient or 
>>>>>>> are there likely to be other changes needed to actually ensure 
>>>>>>> that all works correctly? e.g. any code the service thread 
>>>>>>> executes that is only defined for INCLUDE_MANAGEMENT will need 
>>>>>>> to be compiled out explicitly.
>>>>>>>
>>>>>>
>>>>>> I asked Jie offline to check the minimal build. I don't think 
>>>>>> there are other INCLUDE_MANAGEMENT actions in the service thread 
>>>>>> and I'm not sure why it was initialized there in the first place. 
>>>>>> The minimal vm would have been broken ie. hashtables would not 
>>>>>> have been cleaned up, etc, but I'm not sure how well that is 
>>>>>> tested or if one would notice.
>>>>>>> - the service thread and the notification thread are (were?) 
>>>>>>> closely related but now started at completely different times
>>>>>>
>>>>>> The notification thread is limited to "services" so it makes 
>>>>>> sense where it is. The ServiceThread does lots of other things. 
>>>>>> Maybe it needs renaming in 15.
>>>>>>>
>>>>>>> The bug report states the problem as:
>>>>>>>
>>>>>>> "The graal crash is because compiled_method_load events are 
>>>>>>> added to the ServiceThread's deferred event queue before the 
>>>>>>> ServiceThread is created so are not walked to keep them from 
>>>>>>> being zombied."
>>>>>>>
>>>>>>> so why isn't the solution to ensure the deferred event queue is 
>>>>>>> walked? I'm not clear how starting the service thread relates to 
>>>>>>> walking the queue.
>>>>>>>
>>>>>>
>>>>>> The service thread is responsible for walking the deferred event 
>>>>>> queue. See ServiceThread::oops_do/nmethods_do. The design 
>>>>>> could be changed to have some global walk somewhere of this 
>>>>>> queue, but essentially this queue is processed by the service 
>>>>>> thread.
>>>>>
>>>>> Sorry I don't follow. I thought "oops_do" and friends are for the 
>>>>> GC threads and/or VMThread to call to process oops when GC updates 
>>>>> them.
>>>>
>>>> The oops_do and nmethods_do() can be called by a thread walk in 
>>>> handshakes (by the sweeper thread) and by parallel GC thread walks. 
>>>> There isn't a single entry to do the thread-specific closures that 
>>>> we need to do for these deferred event queues. I tried a version 
>>>> that walked the queues with a static call but missed some places 
>>>> where it would be needed to make this call (didn't work). Keeping 
>>>> this associated with the ServiceThread simplifies a lot.
>>>
>>> Just to clarify that further, the thread walk requires the thread 
>>> appears in ALL_JAVA_THREADS but that only happens after the 
>>> ServiceThread has been started. So in essence we don't really need 
>>> the ServiceThread to have commenced execution earlier, but we need 
>>> it to have been created. Those two steps are combined in practice.
>>
>> Yes. Then the ServiceThread waits on the Service_lock until notified 
>> by these events:
>>
>>       while (((sensors_changed = (!UseNotificationThread && LowMemoryDetector::has_pending_requests())) |
>>               (has_jvmti_events = _jvmti_service_queue.has_events()) |
>>               (has_gc_notification_event = (!UseNotificationThread && GCNotifier::has_event())) |
>>               (has_dcmd_notification_event = (!UseNotificationThread && DCmdFactory::has_pending_jmx_notification())) |
>>               (stringtable_work = StringTable::has_work()) |
>>               (symboltable_work = SymbolTable::has_work()) |
>>               (resolved_method_table_work = ResolvedMethodTable::has_work()) |
>>               (thread_id_table_work = ThreadIdTable::has_work()) |
>>               (protection_domain_table_work = SystemDictionary::pd_cache_table()->has_work()) |
>>               (oopstorage_work = OopStorage::has_cleanup_work_and_reset())
>>              ) == 0) {
>>
>> The first, third and fourth events are from management.cpp events 
>> that were initialized after the ServiceThread was started.
>> The second event I have changed, to wait until LIVE phase to return 
>> true.
>> The stringtable, symboltable, resolved_method_table, thread_id and pd 
>> table have static _has_work variables initialized to false.
>> The oopstorage_work has similar, but may update a time-based counter 
>> a bit earlier with the service thread starting earlier. I think this 
>> is harmless.
>>
>> It is possible that after the service thread starts and before the 
>> compiler thread starts, there could be a GC that notifies the 
>> stringtable to clean up. This seems like a good thing that the GC 
>> would clean up these tables with this order. I ran the tier4 graal 
>> tests and there were no failures.
>>
>> Thanks,
>> Coleen
>>>
>>> Cheers,
>>> David
>>>
>>>> thanks,
>>>> Coleen
>>>>
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>> I had an additional change to make the queue non-static but want 
>>>>>> to limit the change at this point.
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> See bug for description of the problems found with the new 
>>>>>>>> Zombie.java test.
>>>>>>>>
>>>>>>>> open webrev at 
>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev
>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8235829
>>>>>>>>
>>>>>>>> Ran tier1 all platforms, and tier2-8 testing, as well as 
>>>>>>>> rerunning original test failure from bug 
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8173361.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>
>>>>
>>


From serguei.spitsyn at oracle.com  Wed Dec 18 16:57:38 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 18 Dec 2019 08:57:38 -0800
Subject: RFR(s): 8235912: JvmtiBreakpoint remove oops_do and metadata_do
In-Reply-To: <10582410-4904-5aa8-dcf5-00cf72e0eebe@oracle.com>
References: <ae042006-8aa7-e7c8-0c35-7336eda849cd@oracle.com>
 <c0094eb6-710f-19b6-981c-e83511ad0461@oracle.com>
 <e803bda4-c57c-6eec-45e7-c2a6028d3ae6@oracle.com>
 <e07b16a0-a2ca-6683-8a69-529fc41fd842@oracle.com>
 <05aa6993-e1e0-c43e-0040-8fee71401f4c@oracle.com>
 <624f3db0-351e-7f57-61f2-26323a75854a@oracle.com>
 <fca3e87b-d469-9fe3-3fd8-5868fba2614f@oracle.com>
 <10582410-4904-5aa8-dcf5-00cf72e0eebe@oracle.com>
Message-ID: <48e68565-1774-9022-ce55-ba15c84b0cf5@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20191218/2c9752d4/attachment.htm>

From serguei.spitsyn at oracle.com  Wed Dec 18 18:33:08 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 18 Dec 2019 10:33:08 -0800
Subject: [14] RFR 8235829: graal crashes with Zombie.java test
In-Reply-To: <07b6370a-ff4f-641c-7661-e5b1231a7d8c@oracle.com>
References: <c78fce63-6dda-c190-0346-981804a10493@oracle.com>
 <447990f9-d276-56c3-aebe-fa84d0cfcc27@oracle.com>
 <975b6e46-95c1-8a37-b160-b6a57a9633a8@oracle.com>
 <06d6e9f4-b919-3062-7d18-1245e66b27b2@oracle.com>
 <39716b03-dd5e-580e-2c91-004da64bcc19@oracle.com>
 <c92a9b6e-71b6-c97e-fed5-5b2616052032@oracle.com>
 <53208d1a-b969-d2e7-8657-d95f2d2d22e8@oracle.com>
 <f14525b5-870c-1f09-21d3-a218deb962c0@oracle.com>
 <07b6370a-ff4f-641c-7661-e5b1231a7d8c@oracle.com>
Message-ID: <102b841c-6797-dcda-4e33-e7282a2336b0@oracle.com>

Hi Coleen,

Just wanted to confirm the webrev V2 version looks okay to me.
Sorry for replying on the wrong mailing thread.

Thanks,
Serguei


On 12/18/19 08:42, coleen.phillimore at oracle.com wrote:
>
>
> On 12/18/19 8:45 AM, David Holmes wrote:
>> Thanks for the additional info Coleen!
>>
>> Just to add a bit more to the initialization history. The 
>> ServiceThread is a generalization of the LowMemoryDetectorThread that 
>> was part of the management API, and so it was initialized in 
>> Management::initialize. When it turned into the ServiceThread - to 
>> process JVMTI deferred events in addition to the low-memory-detector 
>> events - the initialization placement remained the same. Then later 
>> the INCLUDE_MANAGEMENT guards were added (JDK-7189254, October 2012). 
>> Later still we started adding other items of work for the 
>> ServiceThread. The earliest was the AllocationContextService 
>> notification in September 2014 but as that no longer exists I can't 
>> tell if that was the first non-management related use. Then the 
>> StringTable use was added 18 months ago - which definitely was 
>> outside the realm of the management API. So that is when the 
>> MinimalVM was first "broken". So it is good that is fixed.
>>
>> With regard to the placement in the initialization order, my 
>> remaining concern was with JVMTI event processing that might happen 
>> via events generated very early in the init sequence. But you have 
>> now modified things so that we will only process events in the LIVE 
>> phase, which only activates after all the class library 
>> initialization is complete.
>>
>> So overall I'm no longer significantly concerned about the change to 
>> the initialization order as I think you have it all covered. Thanks 
>> for bearing with me and all the off-list discussion.
>
> Thank you for having this discussion with me and provoking me to 
> recheck the ServiceThread. I think we can do further work to 
> future-proof initialization order but the design needs to be improved.
>
> Thanks for reviewing,
> Coleen
>>
>> Cheers,
>> David
>> -----
>>
>> On 18/12/2019 1:27 am, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>> On 12/16/19 11:04 PM, David Holmes wrote:
>>>> Clarification ...
>>>>
>>>> On 17/12/2019 12:40 pm, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> Short answer below.
>>>>>
>>>>> On 12/16/19 5:51 PM, David Holmes wrote:
>>>>>> Hi Coleen,
>>>>>>
>>>>>> A quick initial response ...
>>>>>>
>>>>>> On 16/12/2019 11:26 pm, coleen.phillimore at oracle.com wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 12/16/19 8:04 AM, David Holmes wrote:
>>>>>>>> Hi Coleen,
>>>>>>>>
>>>>>>>> On 16/12/2019 9:41 pm, coleen.phillimore at oracle.com wrote:
>>>>>>>>> Summary: Start ServiceThread before compiler threads, and run 
>>>>>>>>> nmethod barriers for zgc before adding to the service thread 
>>>>>>>>> queue, or posting the events on the java thread queue.
>>>>>>>>
>>>>>>>> I can't comment on most of this but the earlier starting of the 
>>>>>>>> service thread has some concerns:
>>>>>>>>
>>>>>>>> - there is a lot of JDK level initialization which now will not 
>>>>>>>> have happened before the service thread is started and it is 
>>>>>>>> far from obvious that all possible initialization dependencies 
>>>>>>>> will be satisfied
>>>>>>>
>>>>>>> I agree that the order of initialization is very sensitive. From 
>>>>>>> the actions that the service thread does, the one that I found 
>>>>>>> was a problem was that events were posted before the LIVE phase 
>>>>>>> (see comment in has_events()), which could have happened with 
>>>>>>> the existing code, but the window for the race is a lot smaller. 
>>>>>>> The other actions can be run if there's a GC before 
>>>>>>> initialization but would be a bug in the initialization code, 
>>>>>>> and I didn't find these bugs in all my testing. There are some 
>>>>>>> ordering dependencies that do have odd side effects (between the 
>>>>>>> compiler thread startup and initialization jsr292 classes) which 
>>>>>>> have comments. This patch doesn't touch those.
>>>>>>>
>>>>>>>>
>>>>>>>> - current starting of the service thread in 
>>>>>>>> Management::initialize is guarded by "#if INCLUDE_MANAGEMENT", 
>>>>>>>> but now you are starting the service thread unconditionally for 
>>>>>>>> all builds. Hmm just saw your latest comment to the bug report 
>>>>>>>> - so the service thread is now (for quite some time?) being 
>>>>>>>> used for other than management tasks and so should always be 
>>>>>>>> present even if INCLUDE_MANAGEMENT is not enabled. Is that 
>>>>>>>> sufficient or are there likely to be other changes needed to 
>>>>>>>> actually ensure that all works correctly? e.g. any code the 
>>>>>>>> service thread executes that is only defined for 
>>>>>>>> INCLUDE_MANAGEMENT will need to be compiled out explicitly.
>>>>>>>>
>>>>>>>
>>>>>>> I asked Jie offline to check the minimal build. I don't think 
>>>>>>> there are other INCLUDE_MANAGEMENT actions in the service thread 
>>>>>>> and I'm not sure why it was initialized there in the first 
>>>>>>> place. The minimal vm would have been broken ie. hashtables 
>>>>>>> would not have been cleaned up, etc, but I'm not sure how well 
>>>>>>> that is tested or if one would notice.
>>>>>>>> - the service thread and the notification thread are (were?) 
>>>>>>>> closely related but now started at completely different times
>>>>>>>
>>>>>>> The notification thread is limited to "services" so it makes 
>>>>>>> sense where it is. The ServiceThread does lots of other 
>>>>>>> things. Maybe it needs renaming in 15.
>>>>>>>>
>>>>>>>> The bug report states the problem as:
>>>>>>>>
>>>>>>>> "The graal crash is because compiled_method_load events are 
>>>>>>>> added to the ServiceThread's deferred event queue before the 
>>>>>>>> ServiceThread is created so are not walked to keep them from 
>>>>>>>> being zombied."
>>>>>>>>
>>>>>>>> so why isn't the solution to ensure the deferred event queue is 
>>>>>>>> walked? I'm not clear how starting the service thread relates 
>>>>>>>> to walking the queue.
>>>>>>>>
>>>>>>>
>>>>>>> The service thread is responsible for walking the deferred event 
>>>>>>> queue. See ServiceThread::oops_do/nmethods_do. The design
>>>>>>> could be changed to have some global walk somewhere of this 
>>>>>>> queue, but essentially this queue is processed by the service 
>>>>>>> thread.
>>>>>>
>>>>>> Sorry I don't follow. I thought "oops_do" and friends are for the 
>>>>>> GC threads and/or VMThread to call to process oops when GC 
>>>>>> updates them.
>>>>>
>>>>> The oops_do and nmethods_do() can be called by a thread walk in 
>>>>> handshakes (by the sweeper thread) and by parallel GC thread 
>>>>> walks. There isn't a single entry to do the thread-specific 
>>>>> closures that we need to do for these deferred event queues. I
>>>>> tried a version that walked the queues with a static call but
>>>>> missed some places where it would be needed to make this call
>>>>> (didn't work). Keeping this associated with the ServiceThread
>>>>> simplifies a lot.
>>>>
>>>> Just to clarify that further, the thread walk requires the thread 
>>>> appears in ALL_JAVA_THREADS but that only happens after the 
>>>> ServiceThread has been started. So in essence we don't really need 
>>>> the ServiceThread to have commenced execution earlier, but we need 
>>>> it to have been created. Those two steps are combined in practice.
>>>
>>> Yes. Then the ServiceThread waits on the Service_lock until
>>> notified by these events:
>>>
>>>     while (((sensors_changed = (!UseNotificationThread &&
>>>               LowMemoryDetector::has_pending_requests())) |
>>>             (has_jvmti_events = _jvmti_service_queue.has_events()) |
>>>             (has_gc_notification_event = (!UseNotificationThread &&
>>>               GCNotifier::has_event())) |
>>>             (has_dcmd_notification_event = (!UseNotificationThread &&
>>>               DCmdFactory::has_pending_jmx_notification())) |
>>>             (stringtable_work = StringTable::has_work()) |
>>>             (symboltable_work = SymbolTable::has_work()) |
>>>             (resolved_method_table_work = ResolvedMethodTable::has_work()) |
>>>             (thread_id_table_work = ThreadIdTable::has_work()) |
>>>             (protection_domain_table_work =
>>>               SystemDictionary::pd_cache_table()->has_work()) |
>>>             (oopstorage_work = OopStorage::has_cleanup_work_and_reset())
>>>            ) == 0) {
>>>
>>> The first, third and fourth events are from management.cpp events 
>>> that were initialized after the ServiceThread was started.
>>> The second event I have changed, to wait until LIVE phase to return 
>>> true.
>>> The stringtable, symboltable, resolved_method_table, thread_id and 
>>> pd table have static _has_work variables initialized to false.
>>> The oopstorage_work has similar, but may update a time-based counter 
>>> a bit earlier with the service thread starting earlier. I think this 
>>> is harmless.
>>>
>>> It is possible that after the service thread starts and before the 
>>> compiler thread starts, there could be a GC that notifies the 
>>> stringtable to clean up. This seems like a good thing that the GC
>>> would clean up these tables with this order. I ran the tier4 graal
>>> tests and there were no failures.
>>>
>>> Thanks,
>>> Coleen
>>>>
>>>> Cheers,
>>>> David
>>>>
>>>>> thanks,
>>>>> Coleen
>>>>>
>>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>> I had an additional change to make the queue non-static but want 
>>>>>>> to limit the change at this point.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>>
>>>>>>>>> See bug for description of the problems found with the new 
>>>>>>>>> Zombie.java test.
>>>>>>>>>
>>>>>>>>> open webrev at 
>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev
>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8235829
>>>>>>>>>
>>>>>>>>> Ran tier1 all platforms, and tier2-8 testing, as well as 
>>>>>>>>> rerunning original test failure from bug 
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8173361.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Coleen
>>>>>>>
>>>>>
>>>
>


From coleen.phillimore at oracle.com  Wed Dec 18 22:06:08 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 18 Dec 2019 17:06:08 -0500
Subject: [14] RFR 8235829: graal crashes with Zombie.java test
In-Reply-To: <102b841c-6797-dcda-4e33-e7282a2336b0@oracle.com>
References: <c78fce63-6dda-c190-0346-981804a10493@oracle.com>
 <447990f9-d276-56c3-aebe-fa84d0cfcc27@oracle.com>
 <975b6e46-95c1-8a37-b160-b6a57a9633a8@oracle.com>
 <06d6e9f4-b919-3062-7d18-1245e66b27b2@oracle.com>
 <39716b03-dd5e-580e-2c91-004da64bcc19@oracle.com>
 <c92a9b6e-71b6-c97e-fed5-5b2616052032@oracle.com>
 <53208d1a-b969-d2e7-8657-d95f2d2d22e8@oracle.com>
 <f14525b5-870c-1f09-21d3-a218deb962c0@oracle.com>
 <07b6370a-ff4f-641c-7661-e5b1231a7d8c@oracle.com>
 <102b841c-6797-dcda-4e33-e7282a2336b0@oracle.com>
Message-ID: <7803e51d-42d0-fe15-41fb-70a7d8a630ab@oracle.com>

Thanks Serguei!
Coleen

On 12/18/19 1:33 PM, serguei.spitsyn at oracle.com wrote:
> Hi Coleen,
>
> Just wanted to confirm the webrev V2 version looks okay to me.
> Sorry for replying on the wrong mailing thread.
>
> Thanks,
> Serguei
>
>
> On 12/18/19 08:42, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 12/18/19 8:45 AM, David Holmes wrote:
>>> Thanks for the additional info Coleen!
>>>
>>> Just to add a bit more to the initialization history. The 
>>> ServiceThread is a generalization of the LowMemoryDetectorThread 
>>> that was part of the management API, and so it was initialized in 
>>> Management::initialize. When it turned into the ServiceThread - to 
>>> process JVMTI deferred events in addition to the low-memory-detector 
>>> events - the initialization placement remained the same. Then later 
>>> the INCLUDE_MANAGEMENT guards were added (JDK-7189254, October 
>>> 2012). Later still we started adding other items of work for the 
>>> ServiceThread. The earliest was the AllocationContextService 
>>> notification in September 2014 but as that no longer exists I can't 
>>> tell if that was the first non-management related use. Then the 
>>> StringTable use was added 18 months ago - which definitely was 
>>> outside the realm of the management API. So that is when the 
>>> MinimalVM was first "broken". So it is good that is fixed.
>>>
>>> With regard to the placement in the initialization order, my 
>>> remaining concern was with JVMTI event processing that might happen 
>>> via events generated very early in the init sequence. But you have 
>>> now modified things so that we will only process events in the LIVE 
>>> phase, which only activates after all the class library 
>>> initialization is complete.
>>>
>>> So overall I'm no longer significantly concerned about the change to 
>>> the initialization order as I think you have it all covered. Thanks 
>>> for bearing with me and all the off-list discussion.
>>
>> Thank you for having this discussion with me and provoking me to 
>> recheck the ServiceThread. I think we can do further work to
>> future-proof initialization order but the design needs to be improved.
>>
>> Thanks for reviewing,
>> Coleen
>>>
>>> Cheers,
>>> David
>>> -----
>>>
>>> On 18/12/2019 1:27 am, coleen.phillimore at oracle.com wrote:
>>>>
>>>>
>>>> On 12/16/19 11:04 PM, David Holmes wrote:
>>>>> Clarification ...
>>>>>
>>>>> On 17/12/2019 12:40 pm, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>> Short answer below.
>>>>>>
>>>>>> On 12/16/19 5:51 PM, David Holmes wrote:
>>>>>>> Hi Coleen,
>>>>>>>
>>>>>>> A quick initial response ...
>>>>>>>
>>>>>>> On 16/12/2019 11:26 pm, coleen.phillimore at oracle.com wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/16/19 8:04 AM, David Holmes wrote:
>>>>>>>>> Hi Coleen,
>>>>>>>>>
>>>>>>>>> On 16/12/2019 9:41 pm, coleen.phillimore at oracle.com wrote:
>>>>>>>>>> Summary: Start ServiceThread before compiler threads, and run 
>>>>>>>>>> nmethod barriers for zgc before adding to the service thread 
>>>>>>>>>> queue, or posting the events on the java thread queue.
>>>>>>>>>
>>>>>>>>> I can't comment on most of this but the earlier starting of 
>>>>>>>>> the service thread has some concerns:
>>>>>>>>>
>>>>>>>>> - there is a lot of JDK level initialization which now will 
>>>>>>>>> not have happened before the service thread is started and it 
>>>>>>>>> is far from obvious that all possible initialization 
>>>>>>>>> dependencies will be satisfied
>>>>>>>>
>>>>>>>> I agree that the order of initialization is very sensitive. 
>>>>>>>> From the actions that the service thread does, the one that I 
>>>>>>>> found was a problem was that events were posted before the LIVE 
>>>>>>>> phase (see comment in has_events()), which could have happened 
>>>>>>>> with the existing code, but the window for the race is a lot 
>>>>>>>> smaller. The other actions can be run if there's a GC before
>>>>>>>> initialization but would be a bug in the initialization code, 
>>>>>>>> and I didn't find these bugs in all my testing. There are some 
>>>>>>>> ordering dependencies that do have odd side effects (between 
>>>>>>>> the compiler thread startup and initialization jsr292 classes) 
>>>>>>>> which have comments. This patch doesn't touch those.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> - current starting of the service thread in 
>>>>>>>>> Management::initialize is guarded by "#if INCLUDE_MANAGEMENT", 
>>>>>>>>> but now you are starting the service thread unconditionally 
>>>>>>>>> for all builds. Hmm just saw your latest comment to the bug 
>>>>>>>>> report - so the service thread is now (for quite some time?) 
>>>>>>>>> being used for other than management tasks and so should 
>>>>>>>>> always be present even if INCLUDE_MANAGEMENT is not enabled. 
>>>>>>>>> Is that sufficient or are there likely to be other changes 
>>>>>>>>> needed to actually ensure that all works correctly? e.g. any 
>>>>>>>>> code the service thread executes that is only defined for 
>>>>>>>>> INCLUDE_MANAGEMENT will need to be compiled out explicitly.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I asked Jie offline to check the minimal build. I don't think
>>>>>>>> there are other INCLUDE_MANAGEMENT actions in the service
>>>>>>>> thread and I'm not sure why it was initialized there in the
>>>>>>>> first place. The minimal vm would have been broken, i.e.
>>>>>>>> hashtables would not have been cleaned up, etc., but I'm not
>>>>>>>> sure how well that is tested or if one would notice.
>>>>>>>>> - the service thread and the notification thread are (were?) 
>>>>>>>>> closely related but now started at completely different times
>>>>>>>>
>>>>>>>> The notification thread is limited to "services" so it makes 
>>>>>>>> sense where it is. The ServiceThread does lots of other
>>>>>>>> things. Maybe it needs renaming in 15.
>>>>>>>>>
>>>>>>>>> The bug report states the problem as:
>>>>>>>>>
>>>>>>>>> "The graal crash is because compiled_method_load events are 
>>>>>>>>> added to the ServiceThread's deferred event queue before the 
>>>>>>>>> ServiceThread is created so are not walked to keep them from 
>>>>>>>>> being zombied."
>>>>>>>>>
>>>>>>>>> so why isn't the solution to ensure the deferred event queue 
>>>>>>>>> is walked? I'm not clear how starting the service thread 
>>>>>>>>> relates to walking the queue.
>>>>>>>>>
>>>>>>>>
>>>>>>>> The service thread is responsible for walking the deferred 
>>>>>>>> event queue. See ServiceThread::oops_do/nmethods_do. The
>>>>>>>> design could be changed to have some global walk somewhere of 
>>>>>>>> this queue, but essentially this queue is processed by the 
>>>>>>>> service thread.
>>>>>>>
>>>>>>> Sorry I don't follow. I thought "oops_do" and friends are for 
>>>>>>> the GC threads and/or VMThread to call to process oops when GC 
>>>>>>> updates them.
>>>>>>
>>>>>> The oops_do and nmethods_do() can be called by a thread walk in 
>>>>>> handshakes (by the sweeper thread) and by parallel GC thread 
>>>>>> walks. There isn't a single entry to do the thread-specific 
>>>>>> closures that we need to do for these deferred event queues. I
>>>>>> tried a version that walked the queues with a static call but
>>>>>> missed some places where it would be needed to make this call
>>>>>> (didn't work). Keeping this associated with the ServiceThread
>>>>>> simplifies a lot.
>>>>>
>>>>> Just to clarify that further, the thread walk requires the thread 
>>>>> appears in ALL_JAVA_THREADS but that only happens after the 
>>>>> ServiceThread has been started. So in essence we don't really need 
>>>>> the ServiceThread to have commenced execution earlier, but we need 
>>>>> it to have been created. Those two steps are combined in practice.
>>>>
>>>> Yes. Then the ServiceThread waits on the Service_lock until
>>>> notified by these events:
>>>>
>>>>     while (((sensors_changed = (!UseNotificationThread &&
>>>>               LowMemoryDetector::has_pending_requests())) |
>>>>             (has_jvmti_events = _jvmti_service_queue.has_events()) |
>>>>             (has_gc_notification_event = (!UseNotificationThread &&
>>>>               GCNotifier::has_event())) |
>>>>             (has_dcmd_notification_event = (!UseNotificationThread &&
>>>>               DCmdFactory::has_pending_jmx_notification())) |
>>>>             (stringtable_work = StringTable::has_work()) |
>>>>             (symboltable_work = SymbolTable::has_work()) |
>>>>             (resolved_method_table_work = ResolvedMethodTable::has_work()) |
>>>>             (thread_id_table_work = ThreadIdTable::has_work()) |
>>>>             (protection_domain_table_work =
>>>>               SystemDictionary::pd_cache_table()->has_work()) |
>>>>             (oopstorage_work = OopStorage::has_cleanup_work_and_reset())
>>>>            ) == 0) {
>>>>
>>>> The first, third and fourth events are from management.cpp events 
>>>> that were initialized after the ServiceThread was started.
>>>> The second event I have changed, to wait until LIVE phase to return 
>>>> true.
>>>> The stringtable, symboltable, resolved_method_table, thread_id and 
>>>> pd table have static _has_work variables initialized to false.
>>>> The oopstorage_work has similar, but may update a time-based 
>>>> counter a bit earlier with the service thread starting earlier. I 
>>>> think this is harmless.
>>>>
>>>> It is possible that after the service thread starts and before the 
>>>> compiler thread starts, there could be a GC that notifies the 
>>>> stringtable to clean up. This seems like a good thing that the GC
>>>> would clean up these tables with this order. I ran the tier4 graal
>>>> tests and there were no failures.
>>>>
>>>> Thanks,
>>>> Coleen
>>>>>
>>>>> Cheers,
>>>>> David
>>>>>
>>>>>> thanks,
>>>>>> Coleen
>>>>>>
>>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>> I had an additional change to make the queue non-static but 
>>>>>>>> want to limit the change at this point.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> See bug for description of the problems found with the new 
>>>>>>>>>> Zombie.java test.
>>>>>>>>>>
>>>>>>>>>> open webrev at 
>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8235829.01/webrev
>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8235829
>>>>>>>>>>
>>>>>>>>>> Ran tier1 all platforms, and tier2-8 testing, as well as 
>>>>>>>>>> rerunning original test failure from bug 
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8173361.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Coleen
>>>>>>>>
>>>>>>
>>>>
>>
>
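
Note on the wait loop quoted above: its conditions are combined with the
non-short-circuit '|' operator, so every work source is polled and its flag
recorded on each pass before the thread decides whether to sleep. The
following is a minimal Java sketch of that pattern, not HotSpot code. The
class ServiceLoopSketch and all of its members are hypothetical names chosen
for illustration, and only two of the work sources are modelled.

```java
// Hypothetical stand-in for a service-thread wait loop; none of these names are HotSpot APIs.
final class ServiceLoopSketch {
    private final Object serviceLock = new Object();
    private boolean jvmtiEvents;     // guarded by serviceLock
    private boolean stringTableWork; // guarded by serviceLock

    // Results of the most recent poll, written by the consumer thread.
    boolean sawJvmtiEvents;
    boolean sawStringTableWork;

    // Producer side: record pending work and wake the consumer.
    void postJvmtiEvent() {
        synchronized (serviceLock) {
            jvmtiEvents = true;
            serviceLock.notifyAll();
        }
    }

    void postStringTableWork() {
        synchronized (serviceLock) {
            stringTableWork = true;
            serviceLock.notifyAll();
        }
    }

    // Read-and-clear helpers; callers must hold serviceLock.
    private boolean takeJvmtiEvents() {
        boolean pending = jvmtiEvents;
        jvmtiEvents = false;
        return pending;
    }

    private boolean takeStringTableWork() {
        boolean pending = stringTableWork;
        stringTableWork = false;
        return pending;
    }

    // Consumer side: blocks until at least one source reports work.
    void waitForWork() {
        synchronized (serviceLock) {
            // '|' (not '||') evaluates both operands, so every flag is polled on each pass.
            while (!((sawJvmtiEvents = takeJvmtiEvents())
                    | (sawStringTableWork = takeStringTableWork()))) {
                try {
                    serviceLock.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and give up
                    return;
                }
            }
        }
    }

    public static void main(String[] args) {
        ServiceLoopSketch loop = new ServiceLoopSketch();
        loop.postJvmtiEvent();
        loop.postStringTableWork();
        loop.waitForWork(); // does not block: work was posted before the wait
        System.out.println(loop.sawJvmtiEvents + " " + loop.sawStringTableWork);
    }
}
```

With '||' in place of '|', the second poll would be skipped whenever the
first one reports work, and its pending flag would only be noticed on a
later pass.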


From david.holmes at oracle.com  Thu Dec 19 02:11:59 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 19 Dec 2019 12:11:59 +1000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
Message-ID: <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>

Hi Richard,

I think my issue is with the way EliminateNestedLocks works so I'm going 
to look into that more deeply.

Thanks for the explanations.

David
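
For readers following the ObjectMonitor::wait discussion quoted below: a
wait() on a recursively held monitor has to give up every acquisition and
restore the recursion count when it reacquires. The patch under review adds
a deferred relock count to the saved value at that point. The same
save-and-restore behaviour can be observed from Java with
java.util.concurrent.locks.ReentrantLock, whose Condition waits release the
lock fully and restore the hold count on return. The sketch below is only an
analogy. HoldCountSketch and its method are hypothetical names, and it does
not model EliminateNestedLocks or the deferred relock count.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative analogy only; this is not how HotSpot implements ObjectMonitor::wait.
final class HoldCountSketch {

    // Returns {hold count before the wait, hold count after the wait}.
    static int[] holdCountsAroundAwait() {
        ReentrantLock lock = new ReentrantLock();
        Condition cond = lock.newCondition();
        boolean[] signalled = {false}; // guarded by lock
        int[] counts = new int[2];

        Thread signaller = new Thread(() -> {
            lock.lock(); // succeeds only once the waiter has released all of its holds
            try {
                signalled[0] = true;
                cond.signal();
            } finally {
                lock.unlock();
            }
        });

        // Three nested acquisitions, like three nested synchronized blocks.
        lock.lock();
        lock.lock();
        lock.lock();
        try {
            counts[0] = lock.getHoldCount();
            signaller.start();
            while (!signalled[0]) {
                // Releases all holds while waiting and reacquires them before returning.
                cond.awaitUninterruptibly();
            }
            counts[1] = lock.getHoldCount();
        } finally {
            lock.unlock();
            lock.unlock();
            lock.unlock();
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = holdCountsAroundAwait();
        System.out.println("before=" + counts[0] + " after=" + counts[1]);
    }
}
```

The signaller can only take the lock because the wait released all three
holds, and the waiter sees the same hold count again once the wait returns.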

On 18/12/2019 12:47 am, Reingruber, Richard wrote:
> Hi David,
> 
>    > >    > Some further queries/concerns:
>    > >    >
>    > >    > src/hotspot/share/runtime/objectMonitor.cpp
>    > >    >
>    > >    > Can you please explain the changes to ObjectMonitor::wait:
>    > >    >
>    > >    > !   _recursions = save      // restore the old recursion count
>    > >    > !                 + jt->get_and_reset_relock_count_after_wait(); //
>    > >    > increased by the deferred relock count
>    > >    >
>    > >    > what is the "deferred relock count"? I gather it relates to
>    > >    >
>    > >    > "The code was extended to be able to deoptimize objects of a
>    > > frame that
>    > >    > is not the top frame and to let another thread than the owning
>    > > thread do
>    > >    > it."
>    > >
>    > > Yes, these relate. Currently EA based optimizations are reverted, when a compiled frame is
>    > > replaced with corresponding interpreter frames. Part of this is relocking objects with eliminated
>    > > locking. New with the enhancement is that we do this also just before object references are
>    > > acquired through JVMTI. In this case we deoptimize also the owning compiled frame C and we
>    > > register deoptimized objects as deferred updates. When control returns to C it gets deoptimized,
>    > > we notice that objects are already deoptimized (reallocated and relocked), so we don't do it again
>    > > (relocking twice would be incorrect of course). Deferred updates are copied into the new
>    > > interpreter frames.
>    > >
>    > > Problem: relocking is not possible if the target thread T is waiting on the monitor that needs to
>    > > be relocked. This happens only with non-local objects with EliminateNestedLocks. Instead relocking
>    > > is deferred until T owns the monitor again. This is what the piece of code above does.
>    >
>    >  Sorry I need some more detail here. How can you wait() on an object
>    >  monitor if the object allocation and/or locking was optimised away? And
>    >  what is a "non-local object" in this context? Isn't EA restricted to
>    >  thread-confined objects?
> 
> "Non-local object" is an object that escapes its thread. The issue I'm addressing with the changes
> in ObjectMonitor::wait is almost unrelated to EA. It is caused by EliminateNestedLocks, where C2
> eliminates recursive locking of an already owned lock. The lock owning object exists on the heap, it
> is locked and you can call wait() on it.
> 
> EliminateLocks is the C2 option that controls lock elimination based on EA.  Both optimizations have
> in common that objects with eliminated locking need to be relocked when deoptimizing a frame,
> i.e. when replacing a compiled frame with equivalent interpreter
> frames. Deoptimization::relock_objects does that job for /all/ eliminated locks in scope. /All/ can
> be a mix of eliminated nested locks and locks of not-escaping objects.
> 
> New with the enhancement: I call relock_objects earlier, just before objects potentially
> escape. But then later when the owning compiled frame gets deoptimized, I must not do it again:
> 
> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:
> 
>   if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks))
>       && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
>     bool unused;
>     eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused);
>   }
> 
> Now when calling relock_objects early it is quite possible that I have to relock an object the
> target thread currently waits for. Obviously I cannot relock in this case, instead I chose to
> introduce relock_count_after_wait to JavaThread.
> 
>    >  Is it just that some of the locking gets optimized away e.g.
>    >
>    >  synchronized (obj) {
>    >    synchronized (obj) {
>    >      synchronized (obj) {
>    >        obj.wait();
>    >      }
>    >    }
>    >  }
>    >
>    >  If this is reduced to a form as-if it were a single lock of the monitor
>    >  (due to EA) and the wait() triggers a JVM TI event which leads to the
>    >  escape of "obj" then we need to reconstruct the true lock state, and so
>    >  when the wait() internally unblocks and reacquires the monitor it has to
>    >  set the true recursion count to 3, not the 1 that it appeared to be when
>    >  wait() was initially called. Is that the scenario?
> 
> Kind of... except that the locking is not eliminated due to EA and there is no JVM TI event
> triggered by wait.
> 
> Add
> 
> LocalObject l1 = new LocalObject();
> 
> in front of the synchronized blocks and assume a JVM TI agent acquires l1. This triggers the code in
> question.
> 
> Note that relocking/reallocating is transactional: if it is done, then it is done for /all/ objects
> in scope, and it is done at most once. It wouldn't be quite so easy to split this into separate
> relocking of nested and EA-based eliminated locks.
> 
>    >  If so I find this truly awful. Anyone using wait() in a realistic form
>    >  requires a notification and so the object cannot be thread confined. In
> 
> It is not thread confined.
> 
>    >  which case I would strongly argue that upon hitting the wait() the deopt
>    >  should occur unconditionally and so the lock state is correct before we
>    >  wait and so we don't need to mess with the recursion count internally
>    >  when we reacquire the monitor.
>    >
>    > >
>    > >    > which I don't like the sound of at all when it comes to ObjectMonitor
>    > >    > state. So I'd like to understand in detail exactly what is going on here
>    > >    > and why.  This is a very intrusive change that seems to badly break
>    > >    > encapsulation and impacts future changes to ObjectMonitor that are under
>    > >    > investigation.
>    > >
>    > > I would not regard this as breaking encapsulation. Certainly not badly.
>    > >
>    > > I've added a property relock_count_after_wait to JavaThread. The property is well
>    > > encapsulated. Future ObjectMonitor implementations have to deal with recursion too. They are free
>    > > in choosing a way to do that as long as that property is taken into account. This is hardly a
>    > > limitation.
>    >
>    >  I do think this badly breaks encapsulation as you have to add a callout
>    >  from the guts of the ObjectMonitor code to reach into the thread to get
>    >  this lock count adjustment. I understand why you have had to do this but
>    >  I would much rather see a change to the EA optimisation strategy so that
>    >  this is not needed.
>    >
>    > > Note also that the property is a straightforward extension of the existing concept of deferred
>    > > local updates. It is embedded into the structure holding them. So not even the footprint of a
>    > > JavaThread is enlarged if no deferred updates are generated.
>    >
>    > [...]
>    >
>    > >
>    > > I'm actually duplicating the existing external suspend mechanism, because a thread can be
>    > > suspended at most once. And hey, I don't like that either! But it seems not unlikely that the
>    > > duplicate can be removed together with the original and the new type of handshakes that will be
>    > > used for thread suspend can be used for object deoptimization too. See today's discussion in
>    > > JDK-8227745 [2].
>    >
>    >  I hope that discussion bears some fruit, at the moment it seems not to
>    >  be possible to use handshakes here. :(
>    >
>    >  The external suspend mechanism is a royal pain in the proverbial that we
>    >  have to carefully live with. The idea that we're duplicating that for
>    >  use in another fringe area of functionality does not thrill me at all.
>    >
>    >  To be clear, I understand the problem that exists and that you wish to
>    >  solve, but for the runtime parts I balk at the complexity cost of
>    >  solving it.
> 
> I know it's complex, but it is far from rocket science.
> 
> Also I find it hard to imagine another fix for JDK-8233915 besides changing the JVM TI specification.
>   
> Thanks, Richard.
> 
> -----Original Message-----
> From: David Holmes <david.holmes at oracle.com>
> Sent: Dienstag, 17. Dezember 2019 08:03
> To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
> 
> <resend as my mailer crashed during last send>
> 
> David
> 
> On 17/12/2019 4:57 pm, David Holmes wrote:
>> Hi Richard,
>>
>> On 14/12/2019 5:01 am, Reingruber, Richard wrote:
>>> Hi David,
>>>
>>>    > Some further queries/concerns:
>>>    >
>>>    > src/hotspot/share/runtime/objectMonitor.cpp
>>>    >
>>>    > Can you please explain the changes to ObjectMonitor::wait:
>>>    >
>>>    > !   _recursions = save      // restore the old recursion count
>>>    > !                 + jt->get_and_reset_relock_count_after_wait(); //
>>>    > increased by the deferred relock count
>>>    >
>>>    > what is the "deferred relock count"? I gather it relates to
>>>    >
>>>    > "The code was extended to be able to deoptimize objects of a frame that
>>>    > is not the top frame and to let another thread than the owning thread do
>>>    > it."
>>>
>>> Yes, these relate. Currently EA based optimizations are reverted, when
>>> a compiled frame is replaced
>>> with corresponding interpreter frames. Part of this is relocking
>>> objects with eliminated
>>> locking. New with the enhancement is that we do this also just before
>>> object references are acquired
>>> through JVMTI. In this case we deoptimize also the owning compiled
>>> frame C and we register
>>> deoptimized objects as deferred updates. When control returns to C it
>>> gets deoptimized, we notice
>>> that objects are already deoptimized (reallocated and relocked), so we
>>> don't do it again (relocking
>>> twice would be incorrect of course). Deferred updates are copied into
>>> the new interpreter frames.
>>>
>>> Problem: relocking is not possible if the target thread T is waiting
>>> on the monitor that needs to be
>>> relocked. This happens only with non-local objects with
>>> EliminateNestedLocks. Instead relocking is
>>> deferred until T owns the monitor again. This is what the piece of
>>> code above does.
>>
>> Sorry I need some more detail here. How can you wait() on an object
>> monitor if the object allocation and/or locking was optimised away? And
>> what is a "non-local object" in this context? Isn't EA restricted to
>> thread-confined objects?
>>
>> Is it just that some of the locking gets optimized away e.g.
>>
>> synchronized (obj) {
>>   synchronized (obj) {
>>     synchronized (obj) {
>>       obj.wait();
>>     }
>>   }
>> }
>>
>> If this is reduced to a form as-if it were a single lock of the monitor
>> (due to EA) and the wait() triggers a JVM TI event which leads to the
>> escape of "obj" then we need to reconstruct the true lock state, and so
>> when the wait() internally unblocks and reacquires the monitor it has to
>> set the true recursion count to 3, not the 1 that it appeared to be when
>> wait() was initially called. Is that the scenario?
>>
>> If so I find this truly awful. Anyone using wait() in a realistic form
>> requires a notification and so the object cannot be thread confined. In
>> which case I would strongly argue that upon hitting the wait() the deopt
>> should occur unconditionally and so the lock state is correct before we
>> wait and so we don't need to mess with the recursion count internally
>> when we reacquire the monitor.
>>
>>>
>>>    > which I don't like the sound of at all when it comes to ObjectMonitor
>>>    > state. So I'd like to understand in detail exactly what is going on here
>>>    > and why. This is a very intrusive change that seems to badly break
>>>    > encapsulation and impacts future changes to ObjectMonitor that are under
>>>    > investigation.
>>>
>>> I would not regard this as breaking encapsulation. Certainly not badly.
>>>
>>> I've added a property relock_count_after_wait to JavaThread. The
>>> property is well
>>> encapsulated. Future ObjectMonitor implementations have to deal with
>>> recursion too. They are free in
>>> choosing a way to do that as long as that property is taken into
>>> account. This is hardly a
>>> limitation.
>>
>> I do think this badly breaks encapsulation as you have to add a callout
>> from the guts of the ObjectMonitor code to reach into the thread to get
>> this lock count adjustment. I understand why you have had to do this but
>> I would much rather see a change to the EA optimisation strategy so that
>> this is not needed.
>>
>>> Note also that the property is a straightforward extension of the
>>> existing concept of deferred
>>> local updates. It is embedded into the structure holding them. So not
>>> even the footprint of a
>>> JavaThread is enlarged if no deferred updates are generated.
>>>
>>>    > ---
>>>    >
>>>    > src/hotspot/share/runtime/thread.cpp
>>>    >
>>>    > Can you please explain why JavaThread::wait_for_object_deoptimization
>>>    > has to be handcrafted in this way rather than using proper transitions.
>>>    >
>>>
>>> I wrote wait_for_object_deoptimization taking
>>> JavaThread::java_suspend_self_with_safepoint_check
>>> as template. So in short: for the same reasons :)
>>>
>>> Threads reach both methods as part of thread state transitions,
>>> therefore special handling is
>>> required to change thread state on top of ongoing transitions.
>>>
>>>    > We got rid of "deopt suspend" some time ago and it is disturbing to see
>>>    > it being added back (effectively). This seems like it may be something
>>>    > that handshakes could be used for.
>>>
>>> Deopt suspend used to be something rather different with a similar
>>> name[1]. It is not being added back.
>>
>> I stand corrected. Despite comments in the code to the contrary
>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of
>> cleanup in this area 13 years ago :)
>>
>>>
>>> I'm actually duplicating the existing external suspend mechanism,
>>> because a thread can be suspended
>>> at most once. And hey, I don't like that either! But it seems not
>>> unlikely that the duplicate can
>>> be removed together with the original and the new type of handshakes
>>> that will be used for
>>> thread suspend can be used for object deoptimization too. See today's
>>> discussion in JDK-8227745 [2].
>>
>> I hope that discussion bears some fruit, at the moment it seems not to
>> be possible to use handshakes here. :(
>>
>> The external suspend mechanism is a royal pain in the proverbial that we
>> have to carefully live with. The idea that we're duplicating that for
>> use in another fringe area of functionality does not thrill me at all.
>>
>> To be clear, I understand the problem that exists and that you wish to
>> solve, but for the runtime parts I balk at the complexity cost of
>> solving it.
>>
>> Thanks,
>> David
>> -----
>>
>>> Thanks, Richard.
>>>
>>> [1] Deopt suspend was something like an async. handshake for
>>> architectures with register windows,
>>>       where patching the return pc for deoptimization of a compiled
>>> frame was racy if the owner thread
>>>       was in native code. Instead a "deopt" suspend flag was set on
>>> which the thread patched its own
>>>       frame upon return from native. So no thread was suspended. It got
>>> its name only from the name of
>>>       the flags.
>>>
>>> [2] Discussion about using handshakes to sync. with the target thread:
>>>       
>>> https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727
>>>
>>>
>>> -----Original Message-----
>>> From: David Holmes <david.holmes at oracle.com>
>>> Sent: Friday, 13 December 2019 00:56
>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
>>> serviceability-dev at openjdk.java.net;
>>> hotspot-compiler-dev at openjdk.java.net;
>>> hotspot-runtime-dev at openjdk.java.net
>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>> Performance in the Presence of JVMTI Agents
>>>
>>> Hi Richard,
>>>
>>> Some further queries/concerns:
>>>
>>> src/hotspot/share/runtime/objectMonitor.cpp
>>>
>>> Can you please explain the changes to ObjectMonitor::wait:
>>>
>>> !   _recursions = save      // restore the old recursion count
>>> !                 + jt->get_and_reset_relock_count_after_wait(); //
>>> increased by the deferred relock count
>>>
>>> what is the "deferred relock count"? I gather it relates to
>>>
>>> "The code was extended to be able to deoptimize objects of a frame that
>>> is not the top frame and to let another thread than the owning thread do
>>> it."
>>>
>>> which I don't like the sound of at all when it comes to ObjectMonitor
>>> state. So I'd like to understand in detail exactly what is going on here
>>> and why. This is a very intrusive change that seems to badly break
>>> encapsulation and impacts future changes to ObjectMonitor that are under
>>> investigation.
>>>
>>> ---
>>>
>>> src/hotspot/share/runtime/thread.cpp
>>>
>>> Can you please explain why JavaThread::wait_for_object_deoptimization
>>> has to be handcrafted in this way rather than using proper transitions.
>>>
>>> We got rid of "deopt suspend" some time ago and it is disturbing to see
>>> it being added back (effectively). This seems like it may be something
>>> that handshakes could be used for.
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>> On 12/12/2019 7:02 am, David Holmes wrote:
>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
>>>>> Hi David,
>>>>>
>>>>>      > Most of the details here are in areas I can comment on in detail,
>>>>> but I
>>>>>      > did take an initial general look at things.
>>>>>
>>>>> Thanks for taking the time!
>>>>
>>>> Apologies the above should read:
>>>>
>>>> "Most of the details here are in areas I *can't* comment on in detail
>>>> ..."
>>>>
>>>> David
>>>>
>>>>>      > The only thing that jumped out at me is that I think the
>>>>>      > DeoptimizeObjectsALotThread should be a hidden thread.
>>>>>      >
>>>>>      > +  bool is_hidden_from_external_view() const { return true; }
>>>>>
>>>>> Yes, it should. Will add the method like above.
>>>>>
>>>>>      > Also I don't see any testing of the DeoptimizeObjectsALotThread.
>>>>> Without
>>>>>      > active testing this will just bit-rot.
>>>>>
>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
>>>>> workload. I will add a minimal test
>>>>> to keep it fresh.
>>>>>
>>>>>      > Also on the tests I don't understand your @requires clause:
>>>>>      >
>>>>>      >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>>      > (vm.opt.TieredCompilation != true))
>>>>>      >
>>>>>      > This seems to require that TieredCompilation is disabled, but
>>>>> tiered is
>>>>>      > our normal mode of operation. ??
>>>>>      >
>>>>>
>>>>> I removed the clause. I guess I wanted to target the tests towards the
>>>>> code they are supposed to
>>>>> test, and it's easier to analyze failures w/o tiered compilation and
>>>>> with just one compiler thread.
>>>>>
>>>>> Additionally I will make use of
>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
>>>>>
>>>>> Thanks,
>>>>> Richard.
>>>>>
>>>>> -----Original Message-----
>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>> Sent: Wednesday, 11 December 2019 08:03
>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
>>>>> serviceability-dev at openjdk.java.net;
>>>>> hotspot-compiler-dev at openjdk.java.net;
>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
>>>>> Performance in the Presence of JVMTI Agents
>>>>>
>>>>> Hi Richard,
>>>>>
>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I would like to get reviews please for
>>>>>>
>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
>>>>>>
>>>>>> Corresponding RFE:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
>>>>>>
>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
>>>>>>
>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
>>>>>> issues (thanks!). In addition the
>>>>>> change is being tested at SAP since I posted the first RFR some
>>>>>> months ago.
>>>>>>
>>>>>> The intention of this enhancement is to benefit performance wise from
>>>>>> escape analysis even if JVMTI
>>>>>> agents request capabilities that allow them to access local variable
>>>>>> values. E.g. if you start-up
>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then
>>>>>> escape analysis is disabled right
>>>>>> from the beginning, well before a debugger attaches -- if ever one
>>>>>> should do so. With the
>>>>>> enhancement, escape analysis will remain enabled until and after a
>>>>>> debugger attaches. EA based
>>>>>> optimizations are reverted just before an agent acquires the
>>>>>> reference to an object. In the JBS item
>>>>>> you'll find more details.
>>>>>
>>>>> Most of the details here are in areas I can comment on in detail, but I
>>>>> did take an initial general look at things.
>>>>>
>>>>> The only thing that jumped out at me is that I think the
>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
>>>>>
>>>>> +  bool is_hidden_from_external_view() const { return true; }
>>>>>
>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread.
>>>>> Without
>>>>> active testing this will just bit-rot.
>>>>>
>>>>> Also on the tests I don't understand your @requires clause:
>>>>>
>>>>>      @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
>>>>> (vm.opt.TieredCompilation != true))
>>>>>
>>>>> This seems to require that TieredCompilation is disabled, but tiered is
>>>>> our normal mode of operation. ??
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> Thanks,
>>>>>> Richard.
>>>>>>
>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
>>>>>>
>>>>>>
>>>>>>

From serguei.spitsyn at oracle.com  Thu Dec 19 04:33:28 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 18 Dec 2019 20:33:28 -0800
Subject: RFR(s): 8235912: JvmtiBreakpoint remove oops_do and metadata_do
In-Reply-To: <ae042006-8aa7-e7c8-0c35-7336eda849cd@oracle.com>
References: <ae042006-8aa7-e7c8-0c35-7336eda849cd@oracle.com>
Message-ID: <baf2c6e0-2abc-c267-278b-03133d510ea1@oracle.com>

Hi Robbin,

The fix looks good to me.
At least, I do not see any issues with it.
Thank you for removing the unused code!

Could you be more precise about which jvmti/jdi tests you ran?
For good test coverage we need these test suites:
vmTestbase_nsk_jvmti, vmTestbase_nsk_jdi, vmTestbase_nsk_jdb, jdk_jdi

They have to be present in t1-7.
I list them in case you want to run them with some specific options.

Thanks,
Serguei


On 12/16/19 01:47, Robbin Ehn wrote:
> Hi all, please review.
>
> From issue, https://bugs.openjdk.java.net/browse/JDK-8235912:
>
> JvmtiBreakpoints are walked via VMThread oops_do (the breakpoint is in 
> a vm operation) before they are installed in the safepoint; after 
> they have been installed, they are walked with
> JvmtiCurrentBreakpoints::oops_do().
> By putting the class holder inside oopStorage there is no need for this.
>
> JvmtiCurrentBreakpoints::metadata_do is not needed because redefine 
> classes actually removes the breakpoints before updating them (so 
> there are no breakpoints to update).
> We can just remove metadata_do.
>
>
> I also removed some unused code.
>
> Changeset:
> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/
>
> Passes several runs of nsk jvmti/jdi and t1-7.
>
> Thanks, Robbin


From serguei.spitsyn at oracle.com  Thu Dec 19 04:40:01 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 18 Dec 2019 20:40:01 -0800
Subject: [14]RFR(XS): 8236062: Disable clhsdb initialization of SA
 javascript support since it will always fail, and will likely be removed soon
In-Reply-To: <d3fefe95-d109-4f71-ea2d-c1257d6b93b6@oracle.com>
References: <d3fefe95-d109-4f71-ea2d-c1257d6b93b6@oracle.com>
Message-ID: <9dd98e5c-7dc7-82d1-7a40-a0e928bc095b@oracle.com>

An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20191218/6e40ca03/attachment.htm>

From chris.plummer at oracle.com  Thu Dec 19 04:43:57 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Dec 2019 20:43:57 -0800
Subject: [14]RFR(XS): 8236062: Disable clhsdb initialization of SA
 javascript support since it will always fail, and will likely be removed soon
In-Reply-To: <9dd98e5c-7dc7-82d1-7a40-a0e928bc095b@oracle.com>
References: <d3fefe95-d109-4f71-ea2d-c1257d6b93b6@oracle.com>
 <9dd98e5c-7dc7-82d1-7a40-a0e928bc095b@oracle.com>
Message-ID: <b7e6f257-2408-eb70-3275-8a7168db8681@oracle.com>

We don't document the javascript support anywhere that I'm aware of. The 
only way you would become aware of it is if after launching clhsdb you 
did a "help" and saw the jsload and jseval commands, but that was only 
when javascript initialization was working. When it's broken you don't 
see these commands.

Chris

On 12/18/19 8:40 PM, serguei.spitsyn at oracle.com wrote:
> Hi Chris,
>
> It looks okay.
> I wonder if we have it documented anywhere.
> Do we also need any doc cleanup?
>
> Thanks,
> Serguei
>
>
> On 12/16/19 21:36, Chris Plummer wrote:
>> Hello,
>>
>> Please review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8236062
>> http://cr.openjdk.java.net/~cjplummer/8236062/webrev.00/
>>
>> Since SA javascript support is broken as described in [1] 
>> JDK-8235594, and we'll likely remove it, I'd like to at least get it 
>> disabled now. I'd like to get this into 14 mostly because I really 
>> want to get [2] JDK-8234048 fixed in 14 because we'll start seeing 
>> the clhsdb test failures on macos 10.14 and 10.15 more often over the 
>> coming months as we deploy more macosx test hosts with those 
>> versions. However, [2] JDK-8234048 is blocked by [3] JDK-8234277 
>> (which improves error checking and failure output for the clhsdb 
>> tests), and [3] JDK-8234277 is blocked by this CR because the 
>> exceptions produced when javascript fails to initialize end up 
>> cluttering the clhsdb test logs, even when the test passes (and is 
>> misleading when the test fails).
>>
>> Sorry about all the bug references and inter-dependencies. It's taken 
>> a while myself to get my head wrapped around how I wanted to approach 
>> fixing them all in a meaningful order.
>>
>> thanks,
>>
>> Chris
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8235594
>> [2] https://bugs.openjdk.java.net/browse/JDK-8234048
>> [3] https://bugs.openjdk.java.net/browse/JDK-8234277
>


From serguei.spitsyn at oracle.com  Thu Dec 19 05:09:01 2019
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 18 Dec 2019 21:09:01 -0800
Subject: [14]RFR(XS): 8236062: Disable clhsdb initialization of SA
 javascript support since it will always fail, and will likely be removed soon
In-Reply-To: <b7e6f257-2408-eb70-3275-8a7168db8681@oracle.com>
References: <d3fefe95-d109-4f71-ea2d-c1257d6b93b6@oracle.com>
 <9dd98e5c-7dc7-82d1-7a40-a0e928bc095b@oracle.com>
 <b7e6f257-2408-eb70-3275-8a7168db8681@oracle.com>
Message-ID: <c2139b2c-45f7-11a3-56f1-d6e5404fc978@oracle.com>

Okay, thanks!
Serguei


On 12/18/19 20:43, Chris Plummer wrote:
> We don't document the javascript support anywhere that I'm aware of. 
> The only way you would become aware of it is if after launching clhsdb 
> you did a "help" and saw the jsload and jseval commands, but that was 
> only when javascript initialization was working. When it's broken you 
> don't see these commands.
>
> Chris
>
> On 12/18/19 8:40 PM, serguei.spitsyn at oracle.com wrote:
>> Hi Chris,
>>
>> It looks okay.
>> I wonder if we have it documented anywhere.
>> Do we also need any doc cleanup?
>>
>> Thanks,
>> Serguei
>>
>>
>> On 12/16/19 21:36, Chris Plummer wrote:
>>> Hello,
>>>
>>> Please review the following:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8236062
>>> http://cr.openjdk.java.net/~cjplummer/8236062/webrev.00/
>>>
>>> Since SA javascript support is broken as described in [1] 
>>> JDK-8235594, and we'll likely remove it, I'd like to at least get it 
>>> disabled now. I'd like to get this into 14 mostly because I really 
>>> want to get [2] JDK-8234048 fixed in 14 because we'll start seeing 
>>> the clhsdb test failures on macos 10.14 and 10.15 more often over 
>>> the coming months as we deploy more macosx test hosts with those 
>>> versions. However, [2] JDK-8234048 is blocked by [3] JDK-8234277 
>>> (which improves error checking and failure output for the clhsdb 
>>> tests), and [3] JDK-8234277 is blocked by this CR because the 
>>> exceptions produced when javascript fails to initialize end up 
>>> cluttering the clhsdb test logs, even when the test passes (and is 
>>> misleading when the test fails).
>>>
>>> Sorry about all the bug references and inter-dependencies. It's 
>>> taken a while myself to get my head wrapped around how I wanted to 
>>> approach fixing them all in a meaningful order.
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>> [1] https://bugs.openjdk.java.net/browse/JDK-8235594
>>> [2] https://bugs.openjdk.java.net/browse/JDK-8234048
>>> [3] https://bugs.openjdk.java.net/browse/JDK-8234277
>>
>


From chris.plummer at oracle.com  Thu Dec 19 07:04:30 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Dec 2019 23:04:30 -0800
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
Message-ID: <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>

Hi Roman,

I'll have a look at this, although it might not be for a few days. In 
the meantime, maybe you can describe your new implementation in 
classTrack.c so it's easier to look through the changes.

thanks,

Chris

On 12/18/19 5:05 AM, Roman Kennke wrote:
> Hello all,
>
> Issue:
> https://bugs.openjdk.java.net/browse/JDK-8227269
>
> I am proposing what amounts to a rewrite of classTrack.c. It avoids 
> throwing away the class cache on GC, and instead keeps track of 
> loaded/unloaded classes one-by-one.
>
> In addition to that, it avoids this whole dance until an agent 
> registers interest in EI_GC_FINISH.
>
> Webrev:
> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>
> Testing: manual testing of provided test scenarios and timing.
>
> Eg with the testcase provided here:
> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>
> I am getting those numbers:
> unpatched: no debug: 84s with debug: 225s
> patched:   no debug: 85s with debug: 95s
>
> I also tested successfully through jdk/submit repo
>
> Can I please get a review?
>
> Thanks,
> Roman
>


From suenaga at oss.nttdata.com  Thu Dec 19 08:17:06 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Thu, 19 Dec 2019 17:17:06 +0900
Subject: Removal of SA javascript support
In-Reply-To: <0944c2b1-a0b4-2631-24de-3789e6aeec7b@oracle.com>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
 <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
 <CA+cQ+tRjiThKZJHnjDSDnM2ZvXyjjLpXBgc2_yQxDs5UWsBtsA@mail.gmail.com>
 <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com>
 <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com>
 <a62d8f4c-a1b0-6b15-646f-3ba2c77db5a4@oss.nttdata.com>
 <0944c2b1-a0b4-2631-24de-3789e6aeec7b@oracle.com>
Message-ID: <997585a0-e257-7152-d4cb-3177b8337bd4@oss.nttdata.com>

Hi,

I think we can provide an API for SA as follows:

   Patch: http://cr.openjdk.java.net/~ysuenaga/sa-api/webrev/
   Plugin examples:
     browse: http://cr.openjdk.java.net/~ysuenaga/sa-api/plugin-examples/
     download: http://cr.openjdk.java.net/~ysuenaga/sa-api/plugin-examples.tar.gz

I think JS plugins (loaded via the `jsload` CLHSDB command) were supported "AS IS".
If HotSpot and/or SA code changes, the user should adapt to it as needed.
SA is not part of Java SE, so IMHO we need not maintain the SA API when that happens.

Users who want to extend SA features (including me!) should be responsible for that.
So I did not expose the jdk.hotspot.agent module; the user needs to build with --add-exports.

With my proposal, SA plugins can be written in pure Java, so we don't need to depend on a script engine.


Comments are welcome.

Thanks,

Yasumasa


On 2019/12/11 21:47, sundararajan.athijegannathan at oracle.com wrote:
> Effectively you're asking for SA as an API. I don't think that is a good idea. That implies supporting hotspot data structures as a Java *API*. That will be a maintainability nightmare: we have to keep tracking hotspot data structures in SA code. That itself is problematic. An API would be a next-level nightmare.
> 
> -Sundar
> 
> On 11/12/19 11:57 am, Yasumasa Suenaga wrote:
>> Hi,
>>
>> IMHO we need to export all packages in SA if we do not provide new API for SA.
>> sa.js in jdk.hotspot.agent could access all SA classes until JDK 8 (before Jigsaw), so we could make various functions if we need.
>>
>> OTOH we cannot know what classes are needed by the SA users. All packages in the jdk.hotspot.agent module provide features, and they require other packages. For example, sun.jvm.hotspot.oops.Oop requires sun.jvm.hotspot.types, and it requires sun.jvm.hotspot.debugger.
>> It is difficult to track and to export minimally.
>> (I worked for it in JDK-8157947, but I gave up...)
>>
>> Thus I guess it is a big challenge to export SA classes without refactoring.
>> If we provide new API for SA plugin, I guess we need to work some refactoring.
>>
>>
>> Yasumasa
>>
>>
>> On 2019/12/11 15:00, Chris Plummer wrote:
>>> On 12/10/19 9:56 PM, Yasumasa Suenaga wrote:
>>>> On 2019/12/11 14:39, Krystal Mok wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> That's a very nice idea. Basically what you're asking for is exposing the Command interface [1] so that plugins can implement it and get dynamically loaded / registered into CLHSDB / HSDB, right?
>>>>
>>>> Yes, but we also need proxy API to access internal SA objects e.g. CodeCache, JavaThread, TypeDataBase, etc...
>>>>
>>> Yes, or export them. I should have read this email before posting my previous one.
>>>
>>> Chris
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>>> [1]: http://hg.openjdk.java.net/jdk/jdk/file/c71ec1f09f21/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/CommandProcessor.java#l246
>>>>>
>>>>> - Kris
>>>>>
>>>>> On Tue, Dec 10, 2019 at 9:33 PM Yasumasa Suenaga <suenaga at oss.nttdata.com <mailto:suenaga at oss.nttdata.com>> wrote:
>>>>>
>>>>>     Hi Chris,
>>>>>
>>>>>     It's a sad proposal, but I agree with you. Maintaining SA in JS has been difficult since Jigsaw.
>>>>>     However, I want SA to have a pluggable feature.
>>>>>     I use a custom script to list compiled code in the CodeCache.
>>>>>
>>>>>     I guess other troubleshooters will also want a similar feature (via jsload) in the future if they encounter a JVM crash.
>>>>>
>>>>>
>>>>>     Thanks,
>>>>>
>>>>>     Yasumasa
>>>>>
>>>>>
>>>>>     On 2019/12/11 11:52, Chris Plummer wrote:
>>>>>      > Hi,
>>>>>      >
>>>>>      > I'd like to propose the removal of SA javascript support. Few people even realize this support exists, and hopefully even fewer are using it since I'd like to remove it. Since I'm new to this myself, let me first explain what I know about its existence, and then explain why I want to remove it.
>>>>>      >
>>>>>      > If you run "jhsdb clhsdb", there are jsload and jseval commands. Don't look for them in anything post JDK 8. I'll explain why later. jsload is used to load a javascript file. In that file you can register new clhsdb commands that are written in javascript. You can also evaluate javascript using the jseval command. Some of this is explained in [1], which is the only place I can find any reference to this support. It does not appear to be officially supported, nor is there any oracle provided documentation.
>>>>>      >
>>>>>      > There also appear to be a few clhsdb commands that are written in javascript. Doing a grep for "registerCommand" in sa.js shows the following:
>>>>>      >
>>>>>      >   registerCommand("class", "class name", "jclass");
>>>>>      >   registerCommand("classes", "classes", "jclasses");
>>>>>      >   registerCommand("dumpclass", "dumpclass { address | name } [ directory ]", "dclass");
>>>>>      >   registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap");
>>>>>      >   registerCommand("mem", "mem address [ length ]", "printMem");
>>>>>      >   registerCommand("sysprops", "sysprops", "sysProps");
>>>>>      >   registerCommand("whatis", "whatis address", "printWhatis");
>>>>>      >
>>>>>      > Once again, don't go looking for these in anything newer than JDK8. You won't find them. Again the only documentation I can find is [1].
>>>>>      >
>>>>>      > The other use of Javascript is the SOQL command (Simple Object Query Language), a tool used to query the heap, and also the JSDB command. The only SOQL documentation I could find is the blog reference [2]. I could not find HSDB documentation, but I believe it is javascript support for looking at hotspot. So once again, neither of these seem to be officially supported or documented.
>>>>>      >
>>>>>      > The real purpose of the email is to propose removal of this support. Here are the reasons:
>>>>>      >
>>>>>      > (1) It's broken, and has been since 9. See [3]. This is why you don't see the javascript related commands in clhsdb. Javascript fails to initialize, so none of the javascript related commands are registered.
>>>>>      > (2) Nashorn is deprecated and will be removed eventually.
>>>>>      > (3) We have very little understanding of the javascript support.
>>>>>      > (4) No resources to work on it (unless there is a community volunteer).
>>>>>      > (5) Very questionable value (lack of users). The fact this support has been broken since JDK 9 and no bug was filed until I did so this week is a good indication of that. Another is that there are no other SA Javascript related bugs filed. Lastly, the lack of any official documentation and only minimal mention of it on the web is another good indication of its (lack of) value.
>>>>>      >
>>>>>      > Also, regarding the 7 commands listed above that would be lost (but currently don't work now anyway), if they are really wanted, they could be implemented in java instead of javascript.
>>>>>      >
>>>>>      > I'd like to remove javascript support in two steps. The first is to simply disable the clhsdb code that tries to initialize the javascript support. I'd like to do this in 14 (actually as soon as possible). I'd like to actually do this now even if we decide to keep javascript support and eventually fix it because it will get rid of the warning you see whenever you attach from clhsdb:
>>>>>      >
>>>>>      >       Warning! JS Engine can't start, some commands will not be available.
>>>>>      >
>>>>>      > This warning will become more of an issue for the clhsdb tests after I push [4] because then you will also see the full stacktrace for the underlying exception that caused the Javascript to fail to start. Besides being unnecessary noise in passing test cases, it can also be misleading in any test that fails because the exception will be unrelated to the failure. This is actually what got me going down this path of what the javascript support is all about.
>>>>>      >
>>>>>      > The next step would be to strip out all Javascript related code, including the SOQL and JSDB tools. This would be done in 15.
>>>>>      >
>>>>>      > Please let me know what you think.
>>>>>      >
>>>>>      > thanks,
>>>>>      >
>>>>>      > Chris
>>>>>      >
>>>>>      > [1] https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
>>>>>      > [2] http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
>>>>>      > [3] https://bugs.openjdk.java.net/browse/JDK-8235594
>>>>>      > [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>>>>>      >
>>>>>
>>>
>>>

From rkennke at redhat.com  Thu Dec 19 10:45:48 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 19 Dec 2019 11:45:48 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
Message-ID: <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>

Hi Chris,

> I'll have a look at this, although it might not be for a few days. In
> the meantime, maybe you can describe your new implementation in
> classTrack.c so it's easier to look through the changes.

Sure.

The purpose of this class-tracking is to be able to determine the
signatures of unloaded classes when GC/class-unloading happened, so that
we can generate the appropriate JDWP event.

The current implementation does so by maintaining a table of currently
prepared classes: it builds that table when classTrack is initialized,
and then adds new classes whenever a class gets loaded. When unloading
occurs, that cache is rebuilt into a new table and compared with the
old table, and whatever is in the old table but not in the new one gets
returned. The problem is that when GCs happen frequently and/or many
classes get loaded+unloaded, this amounts to O(classCount*gcCount)
complexity.
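
A minimal sketch of this rebuild-and-compare scheme, written in Python
for brevity rather than in the C of classTrack.c. It is only a model:
the function name, the integer class ids and the dict-based table are
assumptions made for illustration and do not come from the webrev.

```python
# Illustrative model only; the real code is C in classTrack.c.
# All names here are hypothetical.

def unloaded_by_rebuild(old_table, live_classes):
    """Rebuild the table from the currently live classes and diff it
    against the old one. Every call touches every class, which is where
    the O(classCount * gcCount) total cost comes from."""
    new_table = {class_id: sig for class_id, sig in live_classes}
    unloaded = [sig for class_id, sig in old_table.items()
                if class_id not in new_table]
    return new_table, unloaded


# Three prepared classes; class 2 has been unloaded by the time of the GC.
old = {1: "LFoo;", 2: "LBar;", 3: "LBaz;"}
table, gone = unloaded_by_rebuild(old, [(1, "LFoo;"), (3, "LBaz;")])
```

Here `gone` holds the signature of the one class missing from the
rebuilt table, and `table` becomes the cache used until the next GC.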

The new implementation keeps a linked list of prepared classes, and also
tracks unloads via the listener cbTrackingObjectFree(). Whenever an
unload/GC occurs, the list of prepared classes is scanned, and classes
that are also in the deletedTagBag are unlinked (thus maintaining the
prepared-classes list) and their signatures are put in the list that gets
returned.

The implementation is not perfect. In order to determine whether or not
a class is unloaded, it needs to scan the deletedTagBag. That process is
therefore still O(unloadedClassCount). The assumption here is that
unloadedClassCount << classCount. In my experiments this seems to be
true, and also reasonable to expect.
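
The list-based scheme can be modelled roughly as follows. Again this is
a Python sketch under stated assumptions, not the patch itself: the
KlassNode fields, the function names and the use of a plain list for the
deleted-tag bag are hypothetical.

```python
# Illustrative model only; all names are hypothetical.

class KlassNode:
    """Stand-in for a node in the prepared-classes linked list."""
    def __init__(self, tag, signature, next_node=None):
        self.tag = tag
        self.signature = signature
        self.next = next_node


def process_unloads(head, deleted_tags):
    """Unlink every node whose tag was reported as freed and collect
    its signature. Returns (new_head, unloaded_signatures)."""
    unloaded = []
    prev, node = None, head
    while node is not None:
        # Linear scan of the bag: the O(unloadedClassCount) check per class.
        if any(tag == node.tag for tag in deleted_tags):
            unloaded.append(node.signature)
            if prev is None:
                head = node.next
            else:
                prev.next = node.next
        else:
            prev = node
        node = node.next
    return head, unloaded


def signatures(head):
    """Collect the signatures still present in the list."""
    out = []
    while head is not None:
        out.append(head.signature)
        head = head.next
    return out


head = KlassNode(1, "LFoo;", KlassNode(2, "LBar;", KlassNode(3, "LBaz;")))
head, gone = process_unloads(head, deleted_tags=[2, 3])
```

After the call, `gone` lists the signatures of the two unloaded classes
and only the node for class 1 remains linked.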

(I have some ideas how to improve the implementation to ~O(1) but it
would be considerably more complex: have to maintain a (hash)table that
maps tags -> KlassNode*, unlink them directly upon unload, and build the
unloaded-signatures list there, but I don't currently see that it's
worth the effort).
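
For comparison, the table-based idea in the parenthesis above might look
something like this. It is a sketch of the idea only, assuming a Python
dict in place of the hash table and hypothetical callback names; it says
nothing about how the eventual C code is structured.

```python
# Illustrative model only; all names are hypothetical.

class ClassTracker:
    """Maps tag -> signature and is updated directly from the
    object-free callback, so no scan over all classes is needed."""

    def __init__(self):
        self._by_tag = {}
        self._unloaded = []

    def on_class_prepare(self, tag, signature):
        self._by_tag[tag] = signature

    def on_object_free(self, tag):
        # Expected O(1): remove the entry for the freed tag, if tracked.
        signature = self._by_tag.pop(tag, None)
        if signature is not None:
            self._unloaded.append(signature)

    def take_unloaded(self):
        """Hand over the signatures collected since the last call."""
        unloaded, self._unloaded = self._unloaded, []
        return unloaded


tracker = ClassTracker()
tracker.on_class_prepare(1, "LFoo;")
tracker.on_class_prepare(2, "LBar;")
tracker.on_object_free(2)
tracker.on_object_free(99)   # unknown tag, ignored
first = tracker.take_unloaded()
second = tracker.take_unloaded()
```

The trade-off is the one described above: each unload becomes cheap, at
the cost of maintaining the extra table.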

In addition to all that, this process is only activated when there's an
actual listener registered for EI_GC_FINISH.

Thanks,
Roman


> Chris
> 
> On 12/18/19 5:05 AM, Roman Kennke wrote:
>> Hello all,
>>
>> Issue:
>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>
>> I am proposing what amounts to a rewrite of classTrack.c. It avoids
>> throwing away the class cache on GC, and instead keeps track of
>> loaded/unloaded classes one-by-one.
>>
>> In addition to that, it avoids this whole dance until an agent
>> registers interest in EI_GC_FINISH.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>
>> Testing: manual testing of provided test scenarios and timing.
>>
>> Eg with the testcase provided here:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>
>> I am getting those numbers:
>> unpatched: no debug: 84s with debug: 225s
>> patched:   no debug: 85s with debug: 95s
>>
>> I also tested successfully through jdk/submit repo
>>
>> Can I please get a review?
>>
>> Thanks,
>> Roman
>>
> 
> 


From robbin.ehn at oracle.com  Thu Dec 19 12:08:57 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Thu, 19 Dec 2019 13:08:57 +0100
Subject: RFR(s): 8235912: JvmtiBreakpoint remove oops_do and metadata_do
In-Reply-To: <baf2c6e0-2abc-c267-278b-03133d510ea1@oracle.com>
References: <ae042006-8aa7-e7c8-0c35-7336eda849cd@oracle.com>
 <baf2c6e0-2abc-c267-278b-03133d510ea1@oracle.com>
Message-ID: <3475ad31-bc61-bd49-6b38-1c008adf7094@oracle.com>

Hi Serguei,

On 12/19/19 5:33 AM, serguei.spitsyn at oracle.com wrote:
> Hi Robbin,
> 
> The fix looks good to me.
> At least, I do not see any issues with it.
> Thank you for removing the unused code!

Thanks!

> 
> Could you be more precise about on what jvmti/jdi tests you run?
> For good test coverage we need these test suites:
> vmTestbase_nsk_jvmti, vmTestbase_nsk_jdi, vmTestbase_nsk_jdb, jdk_jdi

It's these two: vmTestbase_nsk_jvmti and vmTestbase_nsk_jdi.

> 
> They have to be present in the t1-7.
> I list them in a case if you want to run them with some specific options.

I took a pass on jdk_jdi, no issues, thanks!

/Robbin

> 
> Thanks,
> Serguei
> 
> 
> On 12/16/19 01:47, Robbin Ehn wrote:
>> Hi all, please review.
>>
>> From issue, https://bugs.openjdk.java.net/browse/JDK-8235912:
>>
>> JvmtiBreakpoints are walked via VMThread oops_do (the breakpoint is in a vm 
>> operation) before they are installed in the safepoint; after they have been 
>> installed, they are walked with JvmtiCurrentBreakpoints::oops_do().
>> By putting the class holder inside oopStorage there is no need for this.
>>
>> JvmtiCurrentBreakpoints::metadata_do is not needed because redefine classes 
>> actually removes the breakpoints before updating them (so there are no 
>> breakpoints to update).
>> We can just remove metadata_do.
>>
>>
>> I also removed some unused code.
>>
>> Changeset:
>> http://cr.openjdk.java.net/~rehn/8235912/v1/webrev/
>>
>> Passes several runs of nsk jvmti/jdi and t1-7.
>>
>> Thanks, Robbin
> 

From rkennke at redhat.com  Thu Dec 19 13:11:56 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 19 Dec 2019 14:11:56 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
Message-ID: <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>

Alright, the perfectionist in me got me. I am implementing the even more
efficient ~O(1) class tracking. Please hold off reviewing for now.

Thanks, Roman

> Hi Chris,
> 
>> I'll have a look at this, although it might not be for a few days. In
>> the meantime, maybe you can describe your new implementation in
>> classTrack.c so it's easier to look through the changes.
> 
> Sure.
> 
> The purpose of this class-tracking is to be able to determine the
> signatures of unloaded classes when GC/class-unloading happened, so that
> we can generate the appropriate JDWP event.
> 
> The current implementation does so by maintaining a table of currently
> prepared classes: it builds that table when classTrack is initialized,
> and then adds new classes whenever a class gets loaded. When unloading
> occurs, that cache is rebuilt into a new table and compared with the
> old table, and whatever is in the old, but not in the new table gets
> returned. The problem is that when GCs happen frequently and/or many
> classes get loaded+unloaded, this amounts to O(classCount*gcCount)
> complexity.
> 
> The new implementation keeps a linked-list of prepared classes, and also
> tracks unloads via the listener cbTrackingObjectFree(). Whenever an
> unload/GC occurs, the list of prepared classes is scanned, and classes
> that are also in the deletedTagBag are unlinked (thus maintaining the
> prepared-classes-list) and their signatures put in the list that gets returned.
> 
> The implementation is not perfect. In order to determine whether or not
> a class is unloaded, it needs to scan the deletedTagBag. That process is
> therefore still O(unloadedClassCount). The assumption here is that
> unloadedClassCount << classCount. In my experiments this seems to be
> true, and also reasonable to expect.
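For readers following along without the webrev open, the list-based scheme described above can be modelled roughly as below. This is only a sketch in Java, not the C code in classTrack.c: the class name ListClassTracker, the method names, and the use of a plain list for the deletedTagBag are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical Java model of the list-based tracking; the real code is C
// in classTrack.c and every name here is illustrative only.
final class ListClassTracker {
    // One entry per prepared class: its tag and its signature.
    private static final class KlassNode {
        final long tag;
        final String signature;

        KlassNode(long tag, String signature) {
            this.tag = tag;
            this.signature = signature;
        }
    }

    private final List<KlassNode> prepared = new LinkedList<>();
    // Stands in for the deletedTagBag filled by the object-free listener.
    private final List<Long> deletedTags = new ArrayList<>();

    // Called when a class is prepared.
    void classPrepared(long tag, String signature) {
        prepared.add(new KlassNode(tag, signature));
    }

    // Stands in for the object-free callback: only records the tag.
    void objectFreed(long tag) {
        deletedTags.add(tag);
    }

    // On unload/GC: unlink nodes whose tag was freed and return their signatures.
    List<String> processUnloads() {
        List<String> unloaded = new ArrayList<>();
        Iterator<KlassNode> it = prepared.iterator();
        while (it.hasNext()) {
            KlassNode node = it.next();
            if (deletedTags.contains(node.tag)) { // linear scan of the bag
                unloaded.add(node.signature);
                it.remove();
            }
        }
        deletedTags.clear();
        return unloaded;
    }

    int preparedCount() {
        return prepared.size();
    }
}
```

The membership test scans the bag for every prepared class, which is why the cost depends on how many classes were unloaded; synchronization between the callback and the scan is deliberately left out of the sketch.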
> 
> (I have some ideas how to improve the implementation to ~O(1) but it
> would be considerably more complex: have to maintain a (hash)table that
> maps tags -> KlassNode*, unlink them directly upon unload, and build the
> unloaded-signatures list there, but I don't currently see that it's
> worth the effort).
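The table-based idea in the parenthetical above might look roughly like this. Again this is a Java sketch under assumptions, not the planned C implementation: MapClassTracker and its methods are hypothetical names, and a HashMap from tag to signature stands in for the "tags -> KlassNode*" table.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical Java model of the tag-table idea; names are illustrative only.
final class MapClassTracker {
    // Maps a class tag to its signature (stand-in for tags -> KlassNode*).
    private final Map<Long, String> signatureByTag = new HashMap<>();
    // Signatures of classes unloaded since the last call to processUnloads().
    private List<String> unloaded = new ArrayList<>();

    // Called when a class is prepared.
    void classPrepared(long tag, String signature) {
        signatureByTag.put(tag, signature);
    }

    // Stands in for the object-free callback: unlink directly and
    // build the unloaded-signatures list right here.
    void objectFreed(long tag) {
        String signature = signatureByTag.remove(tag);
        if (signature != null) {
            unloaded.add(signature);
        }
    }

    // On unload/GC: hand over the accumulated list without scanning anything.
    List<String> processUnloads() {
        List<String> result = unloaded;
        unloaded = new ArrayList<>();
        return result;
    }
}
```

Each unload is a single table lookup, so no per-GC scan of the prepared classes is needed; the price is the extra table and, in a real agent, locking around the callback, which the sketch omits.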
> 
> In addition to all that, this process is only activated when there's an
> actual listener registered for EI_GC_FINISH.
> 
> Thanks,
> Roman
> 
> 
>> Chris
>>
>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>> Hello all,
>>>
>>> Issue:
>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>
>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids
>>> throwing away the class cache on GC, and instead keeps track of
>>> loaded/unloaded classes one-by-one.
>>>
>>> In addition to that, it avoids this whole dance until an agent
>>> registers interest in EI_GC_FINISH.
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>
>>> Testing: manual testing of provided test scenarios and timing.
>>>
>>> Eg with the testcase provided here:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>
>>> I am getting those numbers:
>>> unpatched: no debug: 84s with debug: 225s
>>> patched:   no debug: 85s with debug: 95s
>>>
>>> I also tested successfully through jdk/submit repo
>>>
>>> Can I please get a review?
>>>
>>> Thanks,
>>> Roman
>>>
>>
>>
> 

From chris.plummer at oracle.com  Thu Dec 19 16:28:26 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 19 Dec 2019 08:28:26 -0800
Subject: Removal of SA javascript support
In-Reply-To: <997585a0-e257-7152-d4cb-3177b8337bd4@oss.nttdata.com>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
 <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
 <CA+cQ+tRjiThKZJHnjDSDnM2ZvXyjjLpXBgc2_yQxDs5UWsBtsA@mail.gmail.com>
 <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com>
 <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com>
 <a62d8f4c-a1b0-6b15-646f-3ba2c77db5a4@oss.nttdata.com>
 <0944c2b1-a0b4-2631-24de-3789e6aeec7b@oracle.com>
 <997585a0-e257-7152-d4cb-3177b8337bd4@oss.nttdata.com>
Message-ID: <6a8913b6-0e5d-d514-12c5-326663fa3ba8@oracle.com>

Hi Yasumasa,

I've had similar thoughts about how to extend clhsdb. Why not export 
everything since what we would export is not part of a spec, and the 
javascript support had the same issue of potentially breaking when the 
SA API changed. But maybe this type of unspec'd API exporting is 
considered bad policy. I'm not sure. I'll let the API spec gurus
comment on that aspect of it.

thanks,

Chris

On 12/19/19 12:17 AM, Yasumasa Suenaga wrote:
> Hi,
>
> I think we can provide an API for SA as follows:
>
>   Patch: http://cr.openjdk.java.net/~ysuenaga/sa-api/webrev/
>   Plugin examples:
>     browse: http://cr.openjdk.java.net/~ysuenaga/sa-api/plugin-examples/
>     download:
> http://cr.openjdk.java.net/~ysuenaga/sa-api/plugin-examples.tar.gz
>
> I think JS plugins (loaded via the `jsload` CLHSDB command) were supported
> "AS IS".
> If HotSpot and/or SA code changes, the user should follow it as needed.
> SA is not part of Java SE. We need not maintain the SA API when that
> happens, IMHO.
>
> Users who want to extend SA features (including me!) should take
> responsibility for it.
> So I did not expose the jdk.hotspot.agent module - the user needs to build
> with --add-exports.
>
> With my proposal, SA plugins can be written in pure Java, so we don't need to
> depend on a script engine.
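To make the pure-Java plugin idea a bit more concrete, here is a minimal sketch of a command contract plus a registry that dispatches a command line to it. Every name in it (SaPluginCommand, PluginRegistry, EchoCommand) is hypothetical: it is not the interface from the webrev above, nor the Command class inside CommandProcessor, and the example command does not touch any SA classes.

```java
import java.io.PrintStream;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical plugin contract; NOT the API from the webrev or CommandProcessor.
interface SaPluginCommand {
    String name();
    String usage();
    void execute(String[] args, PrintStream out);
}

// Hypothetical registry that maps a command name to its implementation.
final class PluginRegistry {
    private final Map<String, SaPluginCommand> commands = new LinkedHashMap<>();

    void register(SaPluginCommand command) {
        commands.put(command.name(), command);
    }

    // Splits the line on whitespace and dispatches on the first token.
    boolean run(String line, PrintStream out) {
        String[] tokens = line.trim().split("\\s+");
        SaPluginCommand command = commands.get(tokens[0]);
        if (command == null) {
            out.println("Unrecognized command: " + tokens[0]);
            return false;
        }
        command.execute(Arrays.copyOfRange(tokens, 1, tokens.length), out);
        return true;
    }
}

// Trivial example command that just prints its arguments back.
final class EchoCommand implements SaPluginCommand {
    public String name() { return "echo"; }
    public String usage() { return "echo text..."; }
    public void execute(String[] args, PrintStream out) {
        out.println(String.join(" ", args));
    }
}
```

A real plugin would reach into sun.jvm.hotspot classes, so it would presumably have to be compiled and run with an option along the lines of `--add-exports jdk.hotspot.agent/sun.jvm.hotspot=ALL-UNNAMED`, repeated for each SA package it uses.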
>
>
> Comments are welcome.
>
> Thanks,
>
> Yasumasa
>
>
> On 2019/12/11 21:47, sundararajan.athijegannathan at oracle.com wrote:
>> Effectively you're asking for SA as API. I don't think that is a good
>> idea. That implies supporting hotspot data structures as Java *API*.
>> That will be a maintainability nightmare - we have to keep tracking
>> hotspot data structures in SA code. That itself is problematic. An API
>> would be a next-level nightmare.
>>
>> -Sundar
>>
>> On 11/12/19 11:57 am, Yasumasa Suenaga wrote:
>>> Hi,
>>>
>>> IMHO we need to export all packages in SA if we do not provide a new
>>> API for SA.
>>> sa.js in jdk.hotspot.agent could access all SA classes until JDK 8
>>> (before Jigsaw), so we could write various functions as needed.
>>>
>>> OTOH we cannot know what classes are needed by SA users. All
>>> packages in the jdk.hotspot.agent module provide features, and they
>>> require other packages. For example, sun.jvm.hotspot.oops.Oop
>>> requires sun.jvm.hotspot.types, and it requires
>>> sun.jvm.hotspot.debugger.
>>> It is difficult to track and to export minimally.
>>> (I worked on it in JDK-8157947, but I gave up...)
>>>
>>> Thus I guess it is a big challenge to export SA classes without
>>> refactoring.
>>> If we provide a new API for SA plugins, I guess we need to do some
>>> refactoring.
>>>
>>>
>>> Yasumasa
>>>
>>>
>>> On 2019/12/11 15:00, Chris Plummer wrote:
>>>> On 12/10/19 9:56 PM, Yasumasa Suenaga wrote:
>>>>> On 2019/12/11 14:39, Krystal Mok wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> That's a very nice idea. Basically what you're asking for is 
>>>>>> exposing the Command interface [1] so that plugins can implement 
>>>>>> it and get dynamically loaded / registered into CLHSDB / HSDB, 
>>>>>> right?
>>>>>
>>>>> Yes, but we also need a proxy API to access internal SA objects, e.g.
>>>>> CodeCache, JavaThread, TypeDataBase, etc...
>>>>>
>>>> Yes, or export them. I should have read this email before posting 
>>>> my previous one.
>>>>
>>>> Chris
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>> [1]: 
>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/c71ec1f09f21/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/CommandProcessor.java#l246
>>>>>>
>>>>>> - Kris
>>>>>>
>>>>>> On Tue, Dec 10, 2019 at 9:33 PM Yasumasa Suenaga 
>>>>>> <suenaga at oss.nttdata.com <mailto:suenaga at oss.nttdata.com>> wrote:
>>>>>>
>>>>>>     Hi Chris,
>>>>>>
>>>>>>     It's a sad proposal, but I agree with you. Maintaining SA in
>>>>>> JS has been difficult since Jigsaw.
>>>>>>     However, I want SA to provide a pluggable feature.
>>>>>>     I use a custom script to list compiled code in the CodeCache.
>>>>>>
>>>>>>     I guess other troubleshooters will also want a similar feature (via
>>>>>> jsload) in future if they encounter a JVM crash.
>>>>>>
>>>>>>
>>>>>>     Thanks,
>>>>>>
>>>>>>     Yasumasa
>>>>>>
>>>>>>
>>>>>>     On 2019/12/11 11:52, Chris Plummer wrote:
>>>>>>      > Hi,
>>>>>>      >
>>>>>>      > I'd like to propose the removal of SA javascript support.
>>>>>> Few people even realize this support exists, and hopefully even
>>>>>> fewer are using it since I'd like to remove it. Since I'm new to
>>>>>> this myself, let me first explain what I know about its
>>>>>> existence, and then explain why I want to remove it.
>>>>>>      >
>>>>>>      > If you run "jhsdb clhsdb", there are jsload and jseval
>>>>>> commands. Don't look for them in anything post JDK 8. I'll
>>>>>> explain why later. jsload is used to load a javascript file. In
>>>>>> that file you can register new clhsdb commands that are written
>>>>>> in javascript. You can also evaluate javascript using the jseval
>>>>>> command. Some of this is explained in [1], which is the only
>>>>>> place I can find any reference to this support. It does not
>>>>>> appear to be officially supported, nor is there any Oracle
>>>>>> provided documentation.
>>>>>>      >
>>>>>>      > There also appear to be a few clhsdb commands that are
>>>>>> written in javascript. Doing a grep for "registerCommand" in
>>>>>> sa.js shows the following:
>>>>>>      >
>>>>>>      >   registerCommand("class", "class name", "jclass");
>>>>>>      >   registerCommand("classes", "classes", "jclasses");
>>>>>>      >   registerCommand("dumpclass", "dumpclass { address | name
>>>>>> } [ directory ]", "dclass");
>>>>>>      >   registerCommand("dumpheap", "dumpheap [ file ]",
>>>>>> "dumpHeap");
>>>>>>      >   registerCommand("mem", "mem address [ length ]",
>>>>>> "printMem");
>>>>>>      >   registerCommand("sysprops", "sysprops", "sysProps");
>>>>>>      >   registerCommand("whatis", "whatis address", "printWhatis");
>>>>>>      >
>>>>>>      > Once again, don't go looking for these in anything newer
>>>>>> than JDK8. You won't find them. Again the only documentation I
>>>>>> can find is [1].
>>>>>>      >
>>>>>>      > The other use of Javascript is the SOQL command (Simple
>>>>>> Object Query Language), a tool used to query the heap, and also
>>>>>> the JSDB command. The only SOQL documentation I could find is the
>>>>>> blog reference [2]. I could not find HSDB documentation, but I
>>>>>> believe it is javascript support for looking at hotspot. So
>>>>>> once again, neither of these seem to be officially supported or
>>>>>> documented.
>>>>>>      >
>>>>>>      > The real purpose of the email is to propose removal of
>>>>>> this support. Here are the reasons:
>>>>>>      >
>>>>>>      > (1) It's broken, and has been since 9. See [3]. This is
>>>>>> why you don't see the javascript related commands in clhsdb.
>>>>>> Javascript fails to initialize, so none of the javascript related
>>>>>> commands are registered.
>>>>>>      > (2) Nashorn is deprecated and will be removed eventually.
>>>>>>      > (3) We have very little understanding of the javascript
>>>>>> support.
>>>>>>      > (4) No resources to work on it (unless there is a
>>>>>> community volunteer).
>>>>>>      > (5) Very questionable value (lack of users). The fact this
>>>>>> support has been broken since JDK 9 and no bug was filed until I
>>>>>> did so this week is a good indication of that. Another is that
>>>>>> there are no other SA Javascript related bugs filed. Lastly, the
>>>>>> lack of any official documentation and only minimal mention of it
>>>>>> on the web is another good indication of its (lack of) value.
>>>>>>      >
>>>>>>      > Also, regarding the 7 commands listed above that would be
>>>>>> lost (but don't work now anyway), if they are really
>>>>>> wanted, they could be implemented in java instead of javascript.
>>>>>>      >
>>>>>>      > I'd like to remove javascript support in two steps. The
>>>>>> first is to simply disable the clhsdb code that tries to initialize
>>>>>> the javascript support. I'd like to do this in 14 (actually as
>>>>>> soon as possible). I'd like to actually do this now even if we
>>>>>> decide to keep javascript support and eventually fix it because
>>>>>> it will get rid of the warning you see whenever you attach from
>>>>>> clhsdb:
>>>>>>      >
>>>>>>      >        Warning! JS Engine can't start, some commands will
>>>>>> not be available.
>>>>>>      >
>>>>>>      > This warning will become more of an issue for the clhsdb
>>>>>> tests after I push [4] because then you will also see the full
>>>>>> stacktrace for the underlying exception that caused the
>>>>>> Javascript to fail to start. Besides being unnecessary noise in
>>>>>> passing test cases, it can also be misleading in any test that
>>>>>> fails because the exception will be unrelated to the failure.
>>>>>> This is actually what got me going down this path of what the
>>>>>> javascript support is all about.
>>>>>>      >
>>>>>>      > The next step would be to strip out all Javascript related
>>>>>> code, including the SOQL and JSDB tools. This would be done in 15.
>>>>>>      >
>>>>>>      > Please let me know what you think.
>>>>>>      >
>>>>>>      > thanks,
>>>>>>      >
>>>>>>      > Chris
>>>>>>      >
>>>>>>      > [1]
>>>>>> https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
>>>>>>      > [2]
>>>>>> http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
>>>>>>      > [3] https://bugs.openjdk.java.net/browse/JDK-8235594
>>>>>>      > [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>>>>>>      >
>>>>>>
>>>>
>>>>


From coleen.phillimore at oracle.com  Thu Dec 19 19:46:34 2019
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 19 Dec 2019 14:46:34 -0500
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
Message-ID: <fc342687-0a6e-fcbb-76ce-3d86cdc41201@oracle.com>


On 12/19/19 5:45 AM, Roman Kennke wrote:
> Hi Chris,
>
>> I'll have a look at this, although it might not be for a few days. In
>> the meantime, maybe you can describe your new implementation in
>> classTrack.c so it's easier to look through the changes.
> Sure.
>
> The purpose of this class-tracking is to be able to determine the
> signatures of unloaded classes when GC/class-unloading happened, so that
> we can generate the appropriate JDWP event.
>
> The current implementation does so by maintaining a table of currently
> prepared classes: it builds that table when classTrack is initialized,
> and then adds new classes whenever a class gets loaded. When unloading
> occurs, that cache is rebuilt into a new table and compared with the
> old table, and whatever is in the old, but not in the new table gets
> returned. The problem is that when GCs happen frequently and/or many
> classes get loaded+unloaded, this amounts to O(classCount*gcCount)
> complexity.
>
> The new implementation keeps a linked-list of prepared classes, and also
> tracks unloads via the listener cbTrackingObjectFree(). Whenever an
> unload/GC occurs, the list of prepared classes is scanned, and classes
> that are also in the deletedTagBag are unlinked (thus maintaining the
> prepared-classes-list) and their signatures put in the list that gets returned.
>
> The implementation is not perfect. In order to determine whether or not
> a class is unloaded, it needs to scan the deletedTagBag. That process is
> therefore still O(unloadedClassCount). The assumption here is that
> unloadedClassCount << classCount. In my experiments this seems to be
> true, and also reasonable to expect.

I don't know if you should use this, but I recently fixed the 
ClassUnload Jvmti Extension event to return the name of a class that's 
unloaded.

Coleen
>
> (I have some ideas how to improve the implementation to ~O(1) but it
> would be considerably more complex: have to maintain a (hash)table that
> maps tags -> KlassNode*, unlink them directly upon unload, and build the
> unloaded-signatures list there, but I don't currently see that it's
> worth the effort).
>
> In addition to all that, this process is only activated when there's an
> actual listener registered for EI_GC_FINISH.
>
> Thanks,
> Roman
>
>
>> Chris
>>
>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>> Hello all,
>>>
>>> Issue:
>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>
>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids
>>> throwing away the class cache on GC, and instead keeps track of
>>> loaded/unloaded classes one-by-one.
>>>
>>> In addition to that, it avoids this whole dance until an agent
>>> registers interest in EI_GC_FINISH.
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>
>>> Testing: manual testing of provided test scenarios and timing.
>>>
>>> Eg with the testcase provided here:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>
>>> I am getting those numbers:
>>> unpatched: no debug: 84s with debug: 225s
>>> patched:   no debug: 85s with debug: 95s
>>>
>>> I also tested successfully through jdk/submit repo
>>>
>>> Can I please get a review?
>>>
>>> Thanks,
>>> Roman
>>>
>>


From alexey.menkov at oracle.com  Thu Dec 19 23:34:02 2019
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Thu, 19 Dec 2019 15:34:02 -0800
Subject: RFR: JDK-8235846: Improve WindbgDebuggerLocal implementation
Message-ID: <8ca073bf-0f82-46d5-7fd1-0a5b5757bfa4@oracle.com>

Hi all,

Please review a fix for
https://bugs.openjdk.java.net/browse/JDK-8235846
webrev:
http://cr.openjdk.java.net/~amenkov/jdk15/WinDbg_improve/webrev.01/

The main goal of the change is to improve error reporting (we have several
bugs and need at least COM error codes for WinDbg calls).
The fix also improves/rearranges this quite old code.

--alex

From suenaga at oss.nttdata.com  Fri Dec 20 01:08:28 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Fri, 20 Dec 2019 10:08:28 +0900
Subject: Removal of SA javascript support
In-Reply-To: <6a8913b6-0e5d-d514-12c5-326663fa3ba8@oracle.com>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
 <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
 <CA+cQ+tRjiThKZJHnjDSDnM2ZvXyjjLpXBgc2_yQxDs5UWsBtsA@mail.gmail.com>
 <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com>
 <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com>
 <a62d8f4c-a1b0-6b15-646f-3ba2c77db5a4@oss.nttdata.com>
 <0944c2b1-a0b4-2631-24de-3789e6aeec7b@oracle.com>
 <997585a0-e257-7152-d4cb-3177b8337bd4@oss.nttdata.com>
 <6a8913b6-0e5d-d514-12c5-326663fa3ba8@oracle.com>
Message-ID: <082cbc39-6af8-bd61-3354-e1396872a870@oss.nttdata.com>

Hi Chris,

Can we treat (part of) jdk.hotspot.agent like the jdk.unsupported module?
jdk.unsupported exports unspec'd API like Unsafe.

If we do so, we might need to separate the SA API into exported classes and internal classes.

I proposed exporting all SA packages in JDK-8157947, but it was rejected.


Thanks,

Yasumasa


On 2019/12/20 1:28, Chris Plummer wrote:
> Hi Yasumasa,
> 
> I've had similar thoughts about how to extend clhsdb. Why not export everything since what we would export is not part of a spec, and the javascript support had the same issue of potentially breaking when the SA API changed. But maybe this type of unspec'd API exporting is considered bad policy. I'm not sure. I'll let the API spec gurus comment on that aspect of it.
> 
> thanks,
> 
> Chris
> 
> On 12/19/19 12:17 AM, Yasumasa Suenaga wrote:
>> Hi,
>>
>> I think we can provide an API for SA as follows:
>>
>>   Patch: http://cr.openjdk.java.net/~ysuenaga/sa-api/webrev/
>>   Plugin examples:
>>     browse: http://cr.openjdk.java.net/~ysuenaga/sa-api/plugin-examples/
>>     download: http://cr.openjdk.java.net/~ysuenaga/sa-api/plugin-examples.tar.gz
>>
>> I think JS plugins (loaded via the `jsload` CLHSDB command) were supported "AS IS".
>> If HotSpot and/or SA code changes, the user should follow it as needed.
>> SA is not part of Java SE. We need not maintain the SA API when that happens, IMHO.
>>
>> Users who want to extend SA features (including me!) should take responsibility for it.
>> So I did not expose the jdk.hotspot.agent module - the user needs to build with --add-exports.
>>
>> With my proposal, SA plugins can be written in pure Java, so we don't need to depend on a script engine.
>>
>>
>> Comments are welcome.
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2019/12/11 21:47, sundararajan.athijegannathan at oracle.com wrote:
>>> Effectively you're asking for SA as API. I don't think that is a good idea. That implies supporting hotspot data structures as Java *API*. That will be a maintainability nightmare - we have to keep tracking hotspot data structures in SA code. That itself is problematic. An API would be a next-level nightmare.
>>>
>>> -Sundar
>>>
>>> On 11/12/19 11:57 am, Yasumasa Suenaga wrote:
>>>> Hi,
>>>>
>>>> IMHO we need to export all packages in SA if we do not provide a new API for SA.
>>>> sa.js in jdk.hotspot.agent could access all SA classes until JDK 8 (before Jigsaw), so we could write various functions as needed.
>>>>
>>>> OTOH we cannot know what classes are needed by SA users. All packages in the jdk.hotspot.agent module provide features, and they require other packages. For example, sun.jvm.hotspot.oops.Oop requires sun.jvm.hotspot.types, and it requires sun.jvm.hotspot.debugger.
>>>> It is difficult to track and to export minimally.
>>>> (I worked on it in JDK-8157947, but I gave up...)
>>>>
>>>> Thus I guess it is a big challenge to export SA classes without refactoring.
>>>> If we provide a new API for SA plugins, I guess we need to do some refactoring.
>>>>
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2019/12/11 15:00, Chris Plummer wrote:
>>>>> On 12/10/19 9:56 PM, Yasumasa Suenaga wrote:
>>>>>> On 2019/12/11 14:39, Krystal Mok wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> That's a very nice idea. Basically what you're asking for is exposing the Command interface [1] so that plugins can implement it and get dynamically loaded / registered into CLHSDB / HSDB, right?
>>>>>>
>>>>>> Yes, but we also need a proxy API to access internal SA objects, e.g. CodeCache, JavaThread, TypeDataBase, etc...
>>>>>>
>>>>> Yes, or export them. I should have read this email before posting my previous one.
>>>>>
>>>>> Chris
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>>> [1]: http://hg.openjdk.java.net/jdk/jdk/file/c71ec1f09f21/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/CommandProcessor.java#l246
>>>>>>>
>>>>>>> - Kris
>>>>>>>
>>>>>>> On Tue, Dec 10, 2019 at 9:33 PM Yasumasa Suenaga <suenaga at oss.nttdata.com <mailto:suenaga at oss.nttdata.com>> wrote:
>>>>>>>
>>>>>>>     Hi Chris,
>>>>>>>
>>>>>>>     It's a sad proposal, but I agree with you. Maintaining SA in JS has been difficult since Jigsaw.
>>>>>>>     However, I want SA to provide a pluggable feature.
>>>>>>>     I use a custom script to list compiled code in the CodeCache.
>>>>>>>
>>>>>>>     I guess other troubleshooters will also want a similar feature (via jsload) in future if they encounter a JVM crash.
>>>>>>>
>>>>>>>
>>>>>>>     Thanks,
>>>>>>>
>>>>>>>     Yasumasa
>>>>>>>
>>>>>>>
>>>>>>>     On 2019/12/11 11:52, Chris Plummer wrote:
>>>>>>>      > Hi,
>>>>>>>      >
>>>>>>>      > I'd like to propose the removal of SA javascript support. Few people even realize this support exists, and hopefully even fewer are using it since I'd like to remove it. Since I'm new to this myself, let me first explain what I know about its existence, and then explain why I want to remove it.
>>>>>>>      >
>>>>>>>      > If you run "jhsdb clhsdb", there are jsload and jseval commands. Don't look for them in anything post JDK 8. I'll explain why later. jsload is used to load a javascript file. In that file you can register new clhsdb commands that are written in javascript. You can also evaluate javascript using the jseval command. Some of this is explained in [1], which is the only place I can find any reference to this support. It does not appear to be officially supported, nor is there any Oracle provided documentation.
>>>>>>>      >
>>>>>>>      > There also appear to be a few clhsdb commands that are written in javascript. Doing a grep for "registerCommand" in sa.js shows the following:
>>>>>>>      >
>>>>>>>      >   registerCommand("class", "class name", "jclass");
>>>>>>>      >   registerCommand("classes", "classes", "jclasses");
>>>>>>>      >   registerCommand("dumpclass", "dumpclass { address | name } [ directory ]", "dclass");
>>>>>>>      >   registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap");
>>>>>>>      >   registerCommand("mem", "mem address [ length ]", "printMem");
>>>>>>>      >   registerCommand("sysprops", "sysprops", "sysProps");
>>>>>>>      >   registerCommand("whatis", "whatis address", "printWhatis");
>>>>>>>      >
>>>>>>>      > Once again, don't go looking for these in anything newer than JDK8. You won't find them. Again the only documentation I can find is [1].
>>>>>>>      >
>>>>>>>      > The other use of Javascript is the SOQL command (Simple Object Query Language), a tool used to query the heap, and also the JSDB command. The only SOQL documentation I could find is the blog reference [2]. I could not find HSDB documentation, but I believe it is javascript support for looking at hotspot. So once again, neither of these seem to be officially supported or documented.
>>>>>>>      >
>>>>>>>      > The real purpose of the email is to propose removal of this support. Here are the reasons:
>>>>>>>      >
>>>>>>>      > (1) It's broken, and has been since 9. See [3]. This is why you don't see the javascript related commands in clhsdb. Javascript fails to initialize, so none of the javascript related commands are registered.
>>>>>>>      > (2) Nashorn is deprecated and will be removed eventually.
>>>>>>>      > (3) We have very little understanding of the javascript support.
>>>>>>>      > (4) No resources to work on it (unless there is a community volunteer).
>>>>>>>      > (5) Very questionable value (lack of users). The fact this support has been broken since JDK 9 and no bug was filed until I did so this week is a good indication of that. Another is that there are no other SA Javascript related bugs filed. Lastly, the lack of any official documentation and only minimal mention of it on the web is another good indication of its (lack of) value.
>>>>>>>      >
>>>>>>>      > Also, regarding the 7 commands listed above that would be lost (but don't work now anyway), if they are really wanted, they could be implemented in java instead of javascript.
>>>>>>>      >
>>>>>>>      > I'd like to remove javascript support in two steps. The first is to simply disable the clhsdb code that tries to initialize the javascript support. I'd like to do this in 14 (actually as soon as possible). I'd like to actually do this now even if we decide to keep javascript support and eventually fix it because it will get rid of the warning you see whenever you attach from clhsdb:
>>>>>>>      >
>>>>>>>      >        Warning! JS Engine can't start, some commands will not be available.
>>>>>>>      >
>>>>>>>      > This warning will become more of an issue for the clhsdb tests after I push [4] because then you will also see the full stacktrace for the underlying exception that caused the Javascript to fail to start. Besides being unnecessary noise in passing test cases, it can also be misleading in any test that fails because the exception will be unrelated to the failure. This is actually what got me going down this path of what the javascript support is all about.
>>>>>>>      >
>>>>>>>      > The next step would be to strip out all Javascript related code, including the SOQL and JSDB tools. This would be done in 15.
>>>>>>>      >
>>>>>>>      > Please let me know what you think.
>>>>>>>      >
>>>>>>>      > thanks,
>>>>>>>      >
>>>>>>>      > Chris
>>>>>>>      >
>>>>>>>      > [1] https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
>>>>>>>      > [2] http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
>>>>>>>      > [3] https://bugs.openjdk.java.net/browse/JDK-8235594
>>>>>>>      > [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>>>>>>>      >
>>>>>>>
>>>>>
>>>>>
> 
> 

From denghui.ddh at alibaba-inc.com  Fri Dec 20 02:56:28 2019
From: denghui.ddh at alibaba-inc.com (Denghui Dong)
Date: Fri, 20 Dec 2019 10:56:28 +0800
Subject: RFR: CSR: VM_HeapDumper hits assert with bad dump_len
Message-ID: <9e846cb4-a06f-4dbc-83ea-fc0dbae940f7.denghui.ddh@alibaba-inc.com>

Hi,

Please review this draft of the CSR for jdk8u: VM_HeapDumper hits assert with bad dump_len

CSR: https://bugs.openjdk.java.net/browse/JDK-8235300

Thanks,
Denghui Dong

From sundararajan.athijegannathan at oracle.com  Fri Dec 20 03:25:44 2019
From: sundararajan.athijegannathan at oracle.com (sundararajan.athijegannathan at oracle.com)
Date: Fri, 20 Dec 2019 08:55:44 +0530
Subject: Removal of SA javascript support
In-Reply-To: <082cbc39-6af8-bd61-3354-e1396872a870@oss.nttdata.com>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
 <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
 <CA+cQ+tRjiThKZJHnjDSDnM2ZvXyjjLpXBgc2_yQxDs5UWsBtsA@mail.gmail.com>
 <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com>
 <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com>
 <a62d8f4c-a1b0-6b15-646f-3ba2c77db5a4@oss.nttdata.com>
 <0944c2b1-a0b4-2631-24de-3789e6aeec7b@oracle.com>
 <997585a0-e257-7152-d4cb-3177b8337bd4@oss.nttdata.com>
 <6a8913b6-0e5d-d514-12c5-326663fa3ba8@oracle.com>
 <082cbc39-6af8-bd61-3354-e1396872a870@oss.nttdata.com>
Message-ID: <e96bad3d-ddc4-8de6-0ae2-5a4f13512d63@oracle.com>

Hi,

I am going to reiterate. This will lead to a maintenance nightmare! Any
"Unsupported Java API" is still an API! Remember Unsafe? Once blessed in
any form, it is very difficult to remove. Exposing hotspot VM internals
as a Java API is a very bad idea. No, not even as "unsupported API".

Exposing SA to scripts was a different beast. Scripting languages are
dynamically typed. It is quite common in the scripting world to check the
existence of an attribute & then use it (if (!obj["foo"]) / if (typeof
obj["foo"] == 'function') kind of code). That'd be extremely painful in
a statically typed language like Java (reflection or method/var
handles?). Platform debuggers like dbx support a scripting language
along with access to the data structures from the target process. SA
scripting was inspired by that model. Also scripts were never meant to
be written & maintained for a long time! Most scripts were expected to be
written for a specific debugging exercise/session (thrown away).
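To illustrate the contrast, this is roughly what the "check that an attribute exists, then use it" idiom turns into in Java with core reflection. It is only a sketch: DuckCheck, hasMethod and callIfPresent are made-up names, and only the standard java.lang.reflect calls are real.

```java
import java.lang.reflect.Method;

// Hypothetical helper showing the Java analogue of the script idiom
// `typeof obj["foo"] == 'function'`; class and method names are made up.
final class DuckCheck {
    // True if the target's class has a public method with this name and parameter types.
    static boolean hasMethod(Object target, String name, Class<?>... paramTypes) {
        try {
            target.getClass().getMethod(name, paramTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Calls a public no-arg method if it exists, otherwise returns null.
    static Object callIfPresent(Object target, String name) {
        if (!hasMethod(target, name)) {
            return null;
        }
        try {
            Method method = target.getClass().getMethod(name);
            return method.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflective call failed: " + name, e);
        }
    }
}
```

Every such probe needs its own lookup and exception handling, and nothing is checked at compile time, which is a large part of why this is painful compared with the one-line script check.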

-Sundar

On 20/12/19 6:38 am, Yasumasa Suenaga wrote:
> Hi Chris,
>
> Can we treat (part of) jdk.hotspot.agent like the jdk.unsupported module?
> jdk.unsupported exports unspec'd API like Unsafe.
>
> If we do so, we might need to separate the SA API into exported classes and
> internal classes.
>
> I proposed exporting all SA packages in JDK-8157947, but it was
> rejected.
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2019/12/20 1:28, Chris Plummer wrote:
>> Hi Yasumasa,
>>
>> I've had similar thoughts about how to extend clhsdb. Why not export 
>> everything since what we would export is not part of a spec, and the 
>> javascript support had the same issue of potentially breaking when 
>> the SA API changed. But maybe this type of unspec'd API exporting is 
>> considered bad policy. I'm not sure. I'll let the API spec guru's 
>> comment on that aspect of it.
>>
>> thanks,
>>
>> Chris
>>
>> On 12/19/19 12:17 AM, Yasumasa Suenaga wrote:
>>> Hi,
>>>
>>> I think we can provide API for SA as following:
>>>
>>>   Patch: http://cr.openjdk.java.net/~ysuenaga/sa-api/webrev/
>>>   Plugin examples:
>>>     browse: http://cr.openjdk.java.net/~ysuenaga/sa-api/plugin-examples/
>>>     download: http://cr.openjdk.java.net/~ysuenaga/sa-api/plugin-examples.tar.gz
>>>
>>> I think JS plugin (loading via `jsload` CLHSDB command) was 
>>> supported "AS IS".
>>> If HotSpot and/or SA code is changed, the user should follow it if 
>>> needed.
>>> SA is not part of Java SE. We need not maintain the SA API when that 
>>> happens, IMHO.
>>>
>>> The user who wants to expand SA features (including me!) should be 
>>> responsible for it.
>>> So I did not expose the jdk.hotspot.agent module - the user needs to 
>>> build with --add-exports.
>>>
>>> With my proposal, SA plugins can be written in pure Java, so we don't 
>>> need to depend on a script engine.
>>>
>>>
>>> Comments are welcome.
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2019/12/11 21:47, sundararajan.athijegannathan at oracle.com wrote:
>>>> Effectively you're asking for SA as API. I don't think that is a 
>>>> good idea. That implies supporting hotspot data structures as Java 
>>>> *API*. That will be maintainability nightmare - we've to keep 
>>>> tracking hotspot data structures in SA code. That itself is 
>>>> problematic. API would be next level nightmare.
>>>>
>>>> -Sundar
>>>>
>>>> On 11/12/19 11:57 am, Yasumasa Suenaga wrote:
>>>>> Hi,
>>>>>
>>>>> IMHO we need to export all packages in SA if we do not provide new 
>>>>> API for SA.
>>>>> sa.js in jdk.hotspot.agent could access all SA classes until JDK 8 
>>>>> (before Jigsaw), so we could make various functions if we need.
>>>>>
>>>>> OTOH we cannot know what classes are needed by the SA users. All 
>>>>> packages in jdk.hotspot.agent module provides features, and they 
>>>>> require other packages. For example, sun.jvm.hotspot.oops.Oop 
>>>>> requires sun.jvm.hotspot.types, and it requires 
>>>>> sun.jvm.hotspot.debugger .
>>>>> It is difficult to track and to export minimally.
>>>>> (I worked for it in JDK-8157947, but I gave up...)
>>>>>
>>>>> Thus I guess it is a big challenge to export SA classes without 
>>>>> refactoring.
>>>>> If we provide new API for SA plugin, I guess we need to work some 
>>>>> refactoring.
>>>>>
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2019/12/11 15:00, Chris Plummer wrote:
>>>>>> On 12/10/19 9:56 PM, Yasumasa Suenaga wrote:
>>>>>>> On 2019/12/11 14:39, Krystal Mok wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> That's a very nice idea. Basically what you're asking for is 
>>>>>>>> exposing the Command interface [1] so that plugins can 
>>>>>>>> implement it and get dynamically loaded / registered into 
>>>>>>>> CLHSDB / HSDB, right?
>>>>>>>
>>>>>>> Yes, but we also need proxy API to access internal SA objects 
>>>>>>> e.g. CodeCache, JavaThread, TypeDataBase, etc...
>>>>>>>
>>>>>> Yes, or export them. I should have read this email before posting 
>>>>>> my previous one.
>>>>>>
>>>>>> Chris
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>>> [1]: 
>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/c71ec1f09f21/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/CommandProcessor.java#l246
>>>>>>>>
>>>>>>>> - Kris
>>>>>>>>
>>>>>>>> On Tue, Dec 10, 2019 at 9:33 PM Yasumasa Suenaga 
>>>>>>>> <suenaga at oss.nttdata.com> wrote:
>>>>>>>>
>>>>>>>>     Hi Chris,
>>>>>>>>
>>>>>>>>     It's a sad proposal, but I agree with you. To maintain SA
>>>>>>>>     in JS is difficult since Jigsaw.
>>>>>>>>     However I want SA to implement pluggable feature.
>>>>>>>>     I use custom script to list compiled codes in CodeCache.
>>>>>>>>
>>>>>>>>     I guess other troubleshooters also want similar feature
>>>>>>>>     (via jsload) in future if they encounter JVM crash.
>>>>>>>>
>>>>>>>>
>>>>>>>>     Thanks,
>>>>>>>>
>>>>>>>>     Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>>     On 2019/12/11 11:52, Chris Plummer wrote:
>>>>>>>>      > Hi,
>>>>>>>>      >
>>>>>>>>      > I'd like to propose the removal of SA javascript support.
>>>>>>>>      > Few people even realize this support exists, and hopefully even
>>>>>>>>      > fewer are using it since I'd like to remove it. Since I'm new
>>>>>>>>      > to this myself, let me first explain what I know about its
>>>>>>>>      > existence, and then explain why I want to remove it.
>>>>>>>>      >
>>>>>>>>      > If you run "jhsdb clhsdb", there are jsload and jseval
>>>>>>>>      > commands. Don't look for them in anything post JDK 8. I'll
>>>>>>>>      > explain why later. jsload is used to load a javascript file. In
>>>>>>>>      > that file you can register new clhsdb commands that are written
>>>>>>>>      > in javascript. You can also evaluate javascript using the
>>>>>>>>      > jseval command. Some of this is explained in [1], which is the
>>>>>>>>      > only place I can find any reference to this support. It does
>>>>>>>>      > not appear to be officially supported, nor is there any oracle
>>>>>>>>      > provided documentation.
>>>>>>>>      >
>>>>>>>>      > There also appear to be a few clhsdb commands that are
>>>>>>>>      > written in javascript. Doing a grep for "registerCommand" in
>>>>>>>>      > sa.js shows the following:
>>>>>>>>      >
>>>>>>>>      >   registerCommand("class", "class name", "jclass");
>>>>>>>>      >   registerCommand("classes", "classes", "jclasses");
>>>>>>>>      >   registerCommand("dumpclass", "dumpclass { address | name } [ directory ]", "dclass");
>>>>>>>>      >   registerCommand("dumpheap", "dumpheap [ file ]", "dumpHeap");
>>>>>>>>      >   registerCommand("mem", "mem address [ length ]", "printMem");
>>>>>>>>      >   registerCommand("sysprops", "sysprops", "sysProps");
>>>>>>>>      >   registerCommand("whatis", "whatis address", "printWhatis");
>>>>>>>>      >
>>>>>>>>      > Once again, don't go looking for these in anything newer
>>>>>>>>      > than JDK8. You won't find them. Again the only documentation I
>>>>>>>>      > can find is [1].
>>>>>>>>      >
>>>>>>>>      > The other use of Javascript is the SOQL command (Simple
>>>>>>>>      > Object Query Language), a tool used to query the heap, and also
>>>>>>>>      > the JSDB command. The only SOQL documentation I could find is
>>>>>>>>      > the blog reference [2]. I could not find HSDB documentation,
>>>>>>>>      > but I believe it is javascript support for looking at
>>>>>>>>      > hotspot. So once again, neither of these seem to be officially
>>>>>>>>      > supported or documented.
>>>>>>>>      >
>>>>>>>>      > The real purpose of the email is to propose removal of
>>>>>>>>      > this support. Here are the reasons:
>>>>>>>>      >
>>>>>>>>      > (1) It's broken, and has been since 9. See [3]. This is
>>>>>>>>      > why you don't see the javascript related commands in clhsdb.
>>>>>>>>      > Javascript fails to initialize, so none of the javascript
>>>>>>>>      > related commands are registered.
>>>>>>>>      > (2) Nashorn is deprecated and will be removed eventually.
>>>>>>>>      > (3) We have very little understanding of the javascript
>>>>>>>>      > support.
>>>>>>>>      > (4) No resources to work on it (unless there is a
>>>>>>>>      > community volunteer).
>>>>>>>>      > (5) Very questionable value (lack of users). The fact
>>>>>>>>      > this support has been broken since JDK 9 and no bug was filed
>>>>>>>>      > until I did so this week is a good indication of that. Another
>>>>>>>>      > is that there are no other SA Javascript related bugs filed.
>>>>>>>>      > Lastly, the lack of any official documentation and only minimal
>>>>>>>>      > mention of it on the web is another good indication of its
>>>>>>>>      > (lack of) value.
>>>>>>>>      >
>>>>>>>>      > Also, regarding the 7 commands listed above that would
>>>>>>>>      > be lost (but currently don't work now anyway), if they are
>>>>>>>>      > really wanted, they could be implemented in java instead of
>>>>>>>>      > javascript.
>>>>>>>>      >
>>>>>>>>      > I'd like to remove javascript support in two steps. The
>>>>>>>>      > first is to simply disable the clhsdb code that tries to
>>>>>>>>      > initialize the javascript support. I'd like to do this in 14
>>>>>>>>      > (actually as soon as possible). I'd like to actually do this
>>>>>>>>      > now even if we decide to keep javascript support and eventually
>>>>>>>>      > fix it because it will get rid of the warning you see whenever
>>>>>>>>      > you attach from clhsdb:
>>>>>>>>      >
>>>>>>>>      >       Warning! JS Engine can't start, some commands will
>>>>>>>>      >       not be available.
>>>>>>>>      >
>>>>>>>>      > This warning will become more of an issue for the clhsdb
>>>>>>>>      > tests after I push [4] because then you will also see the full
>>>>>>>>      > stacktrace for the underlying exception that caused the
>>>>>>>>      > Javascript to fail to start. Besides being unnecessary noise in
>>>>>>>>      > passing test cases, it can also be misleading in any test that
>>>>>>>>      > fails because the exception will be unrelated to the failure.
>>>>>>>>      > This is actually what got me going down this path of what the
>>>>>>>>      > javascript support is all about.
>>>>>>>>      >
>>>>>>>>      > The next step would be to strip out all Javascript
>>>>>>>>      > related code, including the SOQL and JSDB tools. This would be
>>>>>>>>      > done in 15.
>>>>>>>>      >
>>>>>>>>      > Please let me know what you think.
>>>>>>>>      >
>>>>>>>>      > thanks,
>>>>>>>>      >
>>>>>>>>      > Chris
>>>>>>>>      >
>>>>>>>>      > [1] https://cr.openjdk.java.net/~minqi/6830717/raw_files/new/agent/doc/clhsdb.html
>>>>>>>>      > [2] http://javatroubleshooting.blogspot.com/2015/12/serviceability-agent-part-3.html
>>>>>>>>      > [3] https://bugs.openjdk.java.net/browse/JDK-8235594
>>>>>>>>      > [4] https://bugs.openjdk.java.net/browse/JDK-8234277
>>>>>>>>      >
>>>>>>>>
>>>>>>
>>>>>>
>>
>>

From suenaga at oss.nttdata.com  Fri Dec 20 04:17:27 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Fri, 20 Dec 2019 13:17:27 +0900
Subject: Removal of SA javascript support
In-Reply-To: <e96bad3d-ddc4-8de6-0ae2-5a4f13512d63@oracle.com>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
 <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
 <CA+cQ+tRjiThKZJHnjDSDnM2ZvXyjjLpXBgc2_yQxDs5UWsBtsA@mail.gmail.com>
 <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com>
 <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com>
 <a62d8f4c-a1b0-6b15-646f-3ba2c77db5a4@oss.nttdata.com>
 <0944c2b1-a0b4-2631-24de-3789e6aeec7b@oracle.com>
 <997585a0-e257-7152-d4cb-3177b8337bd4@oss.nttdata.com>
 <6a8913b6-0e5d-d514-12c5-326663fa3ba8@oracle.com>
 <082cbc39-6af8-bd61-3354-e1396872a870@oss.nttdata.com>
 <e96bad3d-ddc4-8de6-0ae2-5a4f13512d63@oracle.com>
Message-ID: <1bd21276-45a1-3ebd-46e4-dc845897f06b@oss.nttdata.com>

Hi, Sundar,

I agree with you in general, but I want to extend SA in some cases.
I just want to add a gateway to SA.
So I proposed adding some classes to jdk.hotspot.agent for exporting.

That does not mean all SA classes have to be maintained for it.
I think users who want to use it should do so at their own risk.
Thus my proposal requires --add-exports to build an SA plugin.

As another approach, we could support custom CLHSDB commands through a @FunctionalInterface.
For example, if CLHSDB provides "add_function [jar] [FQCN] [command name]" for it,
the user can add new command(s) through it at their own risk.
It does not affect SA code. I believe this does not expose an "Unsupported Java API".
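
A rough sketch of how the "add_function" idea could look. Everything here is hypothetical: CustomCommand, CustomCommandRegistry and EchoCommand do not exist in jdk.hotspot.agent, and the jar argument is simply handled with a plain URLClassLoader.

```java
import java.io.PrintStream;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical plugin contract: a single method, so a lambda or a class both work. */
@FunctionalInterface
interface CustomCommand {
    void run(List<String> args, PrintStream out);
}

/** Hypothetical registry behind "add_function [jar] [FQCN] [command name]". */
class CustomCommandRegistry {
    private final Map<String, CustomCommand> commands = new ConcurrentHashMap<>();

    /** Registers an already constructed command under the given name. */
    void register(String name, CustomCommand command) {
        if (commands.putIfAbsent(name, command) != null) {
            throw new IllegalArgumentException("command already exists: " + name);
        }
    }

    /** Loads class `fqcn` from `jar` (or from this class's loader if jar is null). */
    void addFunction(Path jar, String fqcn, String name) {
        try {
            ClassLoader parent = CustomCommandRegistry.class.getClassLoader();
            ClassLoader loader = (jar == null)
                    ? parent
                    : new URLClassLoader(new URL[] { jar.toUri().toURL() }, parent);
            Class<? extends CustomCommand> type =
                    Class.forName(fqcn, true, loader).asSubclass(CustomCommand.class);
            // The plugin class needs an accessible no-arg constructor.
            register(name, type.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException | MalformedURLException e) {
            throw new IllegalArgumentException("cannot load command " + fqcn, e);
        }
    }

    /** Runs a registered command; returns false if the name is unknown. */
    boolean execute(String name, List<String> args, PrintStream out) {
        CustomCommand command = commands.get(name);
        if (command == null) {
            return false;
        }
        command.run(args, out);
        return true;
    }
}

/** Example plugin; in practice it would live in the user's own jar. */
class EchoCommand implements CustomCommand {
    @Override
    public void run(List<String> args, PrintStream out) {
        out.println(String.join(" ", args));
    }
}
```

The registry itself only knows about the one-method interface. A plugin that reaches into SA internals would still have to be compiled and run with --add-exports, as described above.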


Yasumasa


On 2019/12/20 12:25, sundararajan.athijegannathan at oracle.com wrote:
> Hi,
> 
> I am going to reiterate. This will lead to maintenance nightmare! Any "Unsupported Java API" is still an API! Remember Unsafe? Once blessed in any form, it is very difficult to remove. Exposing hotspot VM internals as a Java API is a very bad idea. No, not even as "unsupported API".
> 
> Exposing SA to scripts was a different beast. Scripting languages are dynamically typed. It is quite common in the scripting world to check the existence of an attribute & then use it (if (!obj["foo"]) / if (typeof obj["foo"] == 'function') kind of code). That'd be extremely painful in a statically typed language like Java (reflection or method/var handles?). Platform debuggers like dbx support a scripting language along with access to the data structures from the target process. SA scripting was inspired by that model. Also, scripts were never meant to be written & maintained for a long time! Most scripts were expected to be written for a specific debugging exercise/session (and then thrown away).
> 
> -Sundar
> 

From rkennke at redhat.com  Fri Dec 20 07:48:46 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 20 Dec 2019 08:48:46 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <fc342687-0a6e-fcbb-76ce-3d86cdc41201@oracle.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <fc342687-0a6e-fcbb-76ce-3d86cdc41201@oracle.com>
Message-ID: <a45f1588-2c8d-d38b-2a11-012385f9b79b@redhat.com>

Hi Coleen,

>>> I'll have a look at this, although it might not be for a few days. In
>>> the meantime, maybe you can describe your new implementation in
>>> classTrack.c so it's easier to look through the changes.
>> Sure.
>>
>> The purpose of this class-tracking is to be able to determine the
>> signatures of unloaded classes when GC/class-unloading happened, so that
>> we can generate the appropriate JDWP event.
>>
>> The current implementation does so by maintaining a table of currently
>> prepared classes by building that table when classTrack is initialized,
>> and then add new classes whenever a class gets loaded. When unloading
>> occurs, that cache is rebuilt into a new table, and compared with the
>> old table, and whatever is in the old, but not in the new table gets
>> returned. The problem is that when GCs happen frequently and/or many
>> classes get loaded+unloaded, this amounts to O(classCount*gcCount)
>> complexity.
>>
>> The new implementation keeps a linked-list of prepared classes, and also
>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an
>> unload/GC occurs, the list of prepared classes is scanned, and classes
>> that are also in the deletedTagBag are unlinked (thus maintaining the
>> prepared-classes-list) and its signature put in the list that gets
>> returned.
>>
>> The implementation is not perfect. In order to determine whether or not
>> a class is unloaded, it needs to scan the deletedTagBag. That process is
>> therefore still O(unloadedClassCount). The assumption here is that
>> unloadedClassCount << classCount. In my experiments this seems to be
>> true, and also reasonable to expect.
> 
> I don't know if you should use this, but I recently fixed the
> ClassUnload Jvmti Extension event to return the name of a class that's
> unloaded.

That sounds interesting. Which change is that? How would I use it?

Thanks,
Roman


From rkennke at redhat.com  Fri Dec 20 09:26:28 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 20 Dec 2019 10:26:28 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
Message-ID: <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>

So, here comes the O(1) implementation:

- Whenever a class is 'prepared', it is registered with a tag, and we
set up a listener to get notified when it is unloaded.
- Prepared classes are kept in a data structure that is a table, with
each entry being the head of a linked list of KlassNode*. The table is
indexed by tag % slot-count, and we simply prepend the new KlassNode*.
This is an O(1) operation.
- When we get notified of a class being unloaded, we look up the signature
of the reported tag in that table, and remember it in a bag. The KlassNode*
is then unlinked from the table and deallocated. This is a ~O(1) operation
too, depending on the depth of the table. In my testcase, which hammered
the code with class loads and unloads, I usually see depths of 2-3,
but not usually more. It should be ok.
- When processUnloads() gets called, we simply hand out that bag, and
allocate a new one.
- I also added cleanup code in classTrack_reset() to avoid leaking the
signatures and KlassNode* etc. when the debug agent gets detached and/or
re-attached (was missing before).
- I also added locks around data-structure manipulation (was missing
before).
- Also, I only activate this whole process when an actual listener gets
registered on EI_GC_FINISH. This seems to happen right when attaching a
jdb, not sure why jdb does that though. This may be something to improve
in the future?
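
For readers who want the shape of the data structure without opening the webrev, the table described in the list above can be sketched roughly as follows. This is an illustrative Java rendering only: the real code is C in classTrack.c, and all names except KlassNode and processUnloads are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a tag-indexed table of prepared classes (illustrative names). */
class ClassTrackSketch {
    /** One prepared class: its tag, its signature, and the next node in the slot. */
    private static final class KlassNode {
        final long tag;
        final String signature;
        KlassNode next;

        KlassNode(long tag, String signature, KlassNode next) {
            this.tag = tag;
            this.signature = signature;
            this.next = next;
        }
    }

    private final KlassNode[] table;
    private List<String> unloaded = new ArrayList<>();

    ClassTrackSketch(int slotCount) {
        table = new KlassNode[slotCount];
    }

    /** Slot index is tag % slot-count (floorMod keeps it non-negative). */
    private int slot(long tag) {
        return (int) Math.floorMod(tag, (long) table.length);
    }

    /** Class prepared: prepend a node to its slot, O(1). */
    synchronized void addPreparedClass(long tag, String signature) {
        int i = slot(tag);
        table[i] = new KlassNode(tag, signature, table[i]);
    }

    /** Unload notification: unlink the node and remember its signature. */
    synchronized void objectFree(long tag) {
        int i = slot(tag);
        KlassNode prev = null;
        for (KlassNode n = table[i]; n != null; prev = n, n = n.next) {
            if (n.tag == tag) {
                if (prev == null) {
                    table[i] = n.next;
                } else {
                    prev.next = n.next;
                }
                unloaded.add(n.signature);
                return;
            }
        }
    }

    /** Hands out the collected signatures and starts a fresh bag. */
    synchronized List<String> processUnloads() {
        List<String> result = unloaded;
        unloaded = new ArrayList<>();
        return result;
    }
}
```

The cost of an unload is the length of one slot's chain, which is why the depth of 2-3 mentioned above matters more than the total number of classes.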

In my tests, the performance of class-tracking itself looks really good.
The bottleneck now is clearly the actual synthesizing of the class-unload
events. I don't see how this can be helped when the debug agent asks for it.

Updated webrev:
http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/

Please let me know what you think of it.

Thanks,
Roman


> Alright, the perfectionist in me got me. I am implementing the even more
> efficient ~O(1) class tracking. Please hold off reviewing for now.
> 
> Thanks, Roman
> 
>> Hi Chris,
>>
>>> I'll have a look at this, although it might not be for a few days. In
>>> the meantime, maybe you can describe your new implementation in
>>> classTrack.c so it's easier to look through the changes.
>>
>> Sure.
>>
>> The purpose of this class-tracking is to be able to determine the
>> signatures of unloaded classes when GC/class-unloading happened, so that
>> we can generate the appropriate JDWP event.
>>
>> The current implementation does so by maintaining a table of currently
>> prepared classes by building that table when classTrack is initialized,
>> and then add new classes whenever a class gets loaded. When unloading
>> occurs, that cache is rebuilt into a new table, and compared with the
>> old table, and whatever is in the old, but not in the new table gets
>> returned. The problem is that when GCs happen frequently and/or many
>> classes get loaded+unloaded, this amounts to O(classCount*gcCount)
>> complexity.
>>
>> The new implementation keeps a linked-list of prepared classes, and also
>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an
>> unload/GC occurs, the list of prepared classes is scanned, and classes
>> that are also in the deletedTagBag are unlinked (thus maintaining the
>> prepared-classes-list) and its signature put in the list that gets returned.
>>
>> The implementation is not perfect. In order to determine whether or not
>> a class is unloaded, it needs to scan the deletedTagBag. That process is
>> therefore still O(unloadedClassCount). The assumption here is that
>> unloadedClassCount << classCount. In my experiments this seems to be
>> true, and also reasonable to expect.
>>
>> (I have some ideas how to improve the implementation to ~O(1) but it
>> would be considerably more complex: have to maintain a (hash)table that
>> maps tags -> KlassNode*, unlink them directly upon unload, and build the
>> unloaded-signatures list there, but I don't currently see that it's
>> worth the effort).
>>
>> In addition to all that, this process is only activated when there's an
>> actual listener registered for EI_GC_FINISH.
>>
>> Thanks,
>> Roman
>>
>>
>>> Chris
>>>
>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>> Hello all,
>>>>
>>>> Issue:
>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>
>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids
>>>> throwing away the class cache on GC, and instead keeps track of
>>>> loaded/unloaded classes one-by-one.
>>>>
>>>> In addition to that, it avoids this whole dance until an agent
>>>> registers interest in EI_GC_FINISH.
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>
>>>> Testing: manual testing of provided test scenarios and timing.
>>>>
>>>> Eg with the testcase provided here:
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>
>>>> I am getting those numbers:
>>>> unpatched: no debug: 84s with debug: 225s
>>>> patched:   no debug: 85s with debug: 95s
>>>>
>>>> I also tested successfully through jdk/submit repo
>>>>
>>>> Can I please get a review?
>>>>
>>>> Thanks,
>>>> Roman
>>>>
>>>
>>>
>>
> 


From Alan.Bateman at oracle.com  Fri Dec 20 09:49:55 2019
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Fri, 20 Dec 2019 09:49:55 +0000
Subject: Removal of SA javascript support
In-Reply-To: <082cbc39-6af8-bd61-3354-e1396872a870@oss.nttdata.com>
References: <adf48c71-a7e9-68f8-0873-60b95b78ed6d@oracle.com>
 <9025f755-0edc-9751-5f89-086471cb645e@oss.nttdata.com>
 <CA+cQ+tRjiThKZJHnjDSDnM2ZvXyjjLpXBgc2_yQxDs5UWsBtsA@mail.gmail.com>
 <063375c0-5b88-49ac-9bae-3922baa3386e@oss.nttdata.com>
 <893aee11-accd-dc18-ad27-98ec2081447d@oracle.com>
 <a62d8f4c-a1b0-6b15-646f-3ba2c77db5a4@oss.nttdata.com>
 <0944c2b1-a0b4-2631-24de-3789e6aeec7b@oracle.com>
 <997585a0-e257-7152-d4cb-3177b8337bd4@oss.nttdata.com>
 <6a8913b6-0e5d-d514-12c5-326663fa3ba8@oracle.com>
 <082cbc39-6af8-bd61-3354-e1396872a870@oss.nttdata.com>
Message-ID: <5e4f75df-7e39-5f29-42b6-7c80188acf92@oracle.com>

On 20/12/2019 01:08, Yasumasa Suenaga wrote:
> Hi Chris,
>
> Can we treat (part of) jdk.hotspot.agent like jdk.unsupported module?
> jdk.unsupported exports unspec'd API like Unsafe.
>
> If we do so, we might need to separate SA API into exported class and 
> internal class.
>
> I've proposed to export all SA packages in JDK-8157947, but it was 
> rejected.
JEP 260 describes the rationale for the jdk.unsupported module. I don't 
think it is possible to argue that SA is a critical internal API.

You've brought this up a few times and I think the main issue is that SA 
is tightly tied to the HotSpot VM implementation, so SA or anything 
using it directly will need to be updated continuously, maybe every build 
if there are data structures changing. I suspect any code using the SA API 
directly really needs to be in the jdk repo; I don't think it would be 
workable to export it unconditionally for tools that are maintained 
outside of the jdk repo.

-Alan

From richard.reingruber at sap.com  Fri Dec 20 14:29:05 2019
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Fri, 20 Dec 2019 14:29:05 +0000
Subject: RFR (M) 8234510: Remove file seeking requirement for writing a
 heap dump
In-Reply-To: <AM0PR02MB450094D67F68E9168B6293D39F510@AM0PR02MB4500.eurprd02.prod.outlook.com>
References: <AM0PR02MB45008C66EC315E9836F7FF7A9F4A0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <CAA-vtUxV40m+-qQ0N7SZmOe9FotFfyr1snQGe0T5pK50LKmJXg@mail.gmail.com>
 <AM0PR02MB4500B68C26D844003956EC339F580@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <AM0PR02MB450094D67F68E9168B6293D39F510@AM0PR02MB4500.eurprd02.prod.outlook.com>
Message-ID: <DB7PR02MB36127EE8018712C005BF9DA49B2D0@DB7PR02MB3612.eurprd02.prod.outlook.com>

Hi Ralf,

the enhancement you're proposing is useful, I'd say. The enhancements it enables (streaming,
compression) even more so.

Your change looks good to me. Note that I'm a JDK Committer not a Reviewer.

Best regards,
Richard.

-----Original Message-----
From: serviceability-dev <serviceability-dev-bounces at openjdk.java.net> On Behalf Of Schmelter, Ralf
Sent: Montag, 16. Dezember 2019 13:28
To: Schmelter, Ralf <ralf.schmelter at sap.com>; Thomas Stüfe <thomas.stuefe at gmail.com>
Cc: OpenJDK Serviceability <serviceability-dev at openjdk.java.net>; hotspot-runtime-dev at openjdk.java.net
Subject: [CAUTION] RE: RFR (M) 8234510: Remove file seeking requirement for writing a heap dump

I forgot to post the updated webrev:
http://cr.openjdk.java.net/~rschmelter/webrevs/8234510/webrev.2/

In addition to the changes requested by Thomas, I also renamed the entries in the heap dump segment from entries to sub-records, since that is what they are called in the comment describing the format.

Best regards,
Ralf


From christoph.langer at sap.com  Fri Dec 20 14:54:53 2019
From: christoph.langer at sap.com (Langer, Christoph)
Date: Fri, 20 Dec 2019 14:54:53 +0000
Subject: RFR (M) 8234510: Remove file seeking requirement for writing a
 heap dump
In-Reply-To: <AM0PR02MB450094D67F68E9168B6293D39F510@AM0PR02MB4500.eurprd02.prod.outlook.com>
References: <AM0PR02MB45008C66EC315E9836F7FF7A9F4A0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <CAA-vtUxV40m+-qQ0N7SZmOe9FotFfyr1snQGe0T5pK50LKmJXg@mail.gmail.com>
 <AM0PR02MB4500B68C26D844003956EC339F580@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <AM0PR02MB450094D67F68E9168B6293D39F510@AM0PR02MB4500.eurprd02.prod.outlook.com>
Message-ID: <DB8PR02MB55478E37515779FAEADA33548A2D0@DB8PR02MB5547.eurprd02.prod.outlook.com>

Hi Ralf,

I've spent some time with your change now as well.

Overall, looks good. I only have some minor findings:

There are 3 new methods:
void DumpWriter::finish_dump_segment()
void DumpWriter::start_sub_record()
void DumpWriter::end_sub_record()

I think it would ease understanding of the code if you would also create a method DumpWriter::start_dump_segment(). The work that should be done in there is currently done implicitly in DumpWriter::start_sub_record().

Also, everything DumpWriter::end_sub_record() does is assertions and debug-only code. So maybe the whole method and its calls could be enclosed in debug_only?
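
To make the two points above concrete, here is a minimal sketch in C. It is not the code from the webrev: the names (dump_writer, dw_*), the buffer sizes and the in-memory `out` array standing in for a non-seekable stream are all assumptions made for illustration, and the 9-byte record header is a simplified HPROF-style layout. The segment header is reserved implicitly by the first dw_start_sub_record() call, and its length field is patched inside the buffer before flushing, so the output is only ever appended to. dw_end_sub_record() does nothing but check invariants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SEG_TAG = 0x1C, SEG_HEADER = 9, BUF_SIZE = 256, OUT_SIZE = 1024 };

typedef struct {
    uint8_t buf[BUF_SIZE];  /* current segment, header included */
    size_t  pos;            /* bytes used in buf */
    int     in_segment;     /* segment header already reserved? */
    int     in_sub_record;  /* only read by assertions */
    size_t  sub_start;      /* only read by assertions */
    size_t  sub_len;        /* only read by assertions */
    uint8_t out[OUT_SIZE];  /* stand-in for a non-seekable stream */
    size_t  out_len;
} dump_writer;

static void put_u4(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v >> 24); p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);  p[3] = (uint8_t)v;
}

void dw_init(dump_writer *w) { memset(w, 0, sizeof *w); }

/* Patch the segment length inside the buffer, then append to the stream. */
void dw_finish_dump_segment(dump_writer *w) {
    assert(!w->in_sub_record);
    if (!w->in_segment) return;
    put_u4(w->buf + 5, (uint32_t)(w->pos - SEG_HEADER));
    assert(w->out_len + w->pos <= OUT_SIZE);
    memcpy(w->out + w->out_len, w->buf, w->pos);
    w->out_len += w->pos;
    w->pos = 0;
    w->in_segment = 0;
}

/* len is the full sub-record length, including its one-byte tag. */
void dw_start_sub_record(dump_writer *w, uint8_t tag, size_t len) {
    assert(!w->in_sub_record && len >= 1 && SEG_HEADER + len <= BUF_SIZE);
    if (w->in_segment && w->pos + len > BUF_SIZE)
        dw_finish_dump_segment(w);          /* no room left: flush first */
    if (!w->in_segment) {                   /* implicit "start_dump_segment" */
        w->buf[0] = SEG_TAG;
        put_u4(w->buf + 1, 0);              /* timestamp, unused here */
        put_u4(w->buf + 5, 0);              /* length, patched on finish */
        w->pos = SEG_HEADER;
        w->in_segment = 1;
    }
    w->in_sub_record = 1;
    w->sub_start = w->pos;
    w->sub_len = len;
    w->buf[w->pos++] = tag;
}

void dw_write(dump_writer *w, const void *data, size_t n) {
    assert(w->in_sub_record && w->pos + n <= BUF_SIZE);
    memcpy(w->buf + w->pos, data, n);
    w->pos += n;
}

/* Pure bookkeeping check: declared length must match bytes written. */
void dw_end_sub_record(dump_writer *w) {
    assert(w->in_sub_record);
    assert(w->pos - w->sub_start == w->sub_len);
    w->in_sub_record = 0;
}
```

In this sketch the `if (!w->in_segment)` branch is what an explicit start_dump_segment() would contain, and the three fields marked "only read by assertions" are the state that could become debug-only together with dw_end_sub_record().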

Then, I've spotted a little spelling error: line 623 - segement should be segment.

Then, _sub_record_ended could be changed to _in_sub_record, with the semantics adapted accordingly. I found that to be more understandable - but maybe it's just a matter of personal taste.

A last point is that there are many places where sizes are calculated, e.g. lines 1002, 1032-1036, 1081, 1116, 1174, 1175. Here, I think code comments could be added or improved to help people who are new to this code understand it more quickly.

Cheers
Christoph

> -----Original Message-----
> From: serviceability-dev <serviceability-dev-bounces at openjdk.java.net> On
> Behalf Of Schmelter, Ralf
> Sent: Montag, 16. Dezember 2019 13:28
> To: Schmelter, Ralf <ralf.schmelter at sap.com>; Thomas Stüfe
> <thomas.stuefe at gmail.com>
> Cc: OpenJDK Serviceability <serviceability-dev at openjdk.java.net>; hotspot-
> runtime-dev at openjdk.java.net
> Subject: [CAUTION] RE: RFR (M) 8234510: Remove file seeking requirement
> for writing a heap dump
> 
> I forgot to post the updated webrev:
> http://cr.openjdk.java.net/~rschmelter/webrevs/8234510/webrev.2/
> 
> In addition to the changes requested by Thomas, I also renamed the entries
> in the heap dump segment from entries to sub-records, since that is what
> they are called in the comment describing the format.
> 
> Best regards,
> Ralf


From ralf.schmelter at sap.com  Fri Dec 20 14:58:11 2019
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Fri, 20 Dec 2019 14:58:11 +0000
Subject: RFR (M) 8234510: Remove file seeking requirement for writing a
 heap dump
In-Reply-To: <DB7PR02MB36127EE8018712C005BF9DA49B2D0@DB7PR02MB3612.eurprd02.prod.outlook.com>
References: <AM0PR02MB45008C66EC315E9836F7FF7A9F4A0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <CAA-vtUxV40m+-qQ0N7SZmOe9FotFfyr1snQGe0T5pK50LKmJXg@mail.gmail.com>
 <AM0PR02MB4500B68C26D844003956EC339F580@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <AM0PR02MB450094D67F68E9168B6293D39F510@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <DB7PR02MB36127EE8018712C005BF9DA49B2D0@DB7PR02MB3612.eurprd02.prod.outlook.com>
Message-ID: <AM0PR02MB45001503159860863EADCFA29F2D0@AM0PR02MB4500.eurprd02.prod.outlook.com>

Hi Richard,

thanks for the review.

Best regards,
Ralf

From ralf.schmelter at sap.com  Fri Dec 20 15:23:37 2019
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Fri, 20 Dec 2019 15:23:37 +0000
Subject: RFR (M) 8234510: Remove file seeking requirement for writing a
 heap dump
In-Reply-To: <DB8PR02MB55478E37515779FAEADA33548A2D0@DB8PR02MB5547.eurprd02.prod.outlook.com>
References: <AM0PR02MB45008C66EC315E9836F7FF7A9F4A0@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <CAA-vtUxV40m+-qQ0N7SZmOe9FotFfyr1snQGe0T5pK50LKmJXg@mail.gmail.com>
 <AM0PR02MB4500B68C26D844003956EC339F580@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <AM0PR02MB450094D67F68E9168B6293D39F510@AM0PR02MB4500.eurprd02.prod.outlook.com>
 <DB8PR02MB55478E37515779FAEADA33548A2D0@DB8PR02MB5547.eurprd02.prod.outlook.com>
Message-ID: <AM0PR02MB45005A0CA4DFEC3ADF9329159F2D0@AM0PR02MB4500.eurprd02.prod.outlook.com>

Hi Christoph,

thanks for the review.

> I think it would ease understanding of the code if you would also
> create a method DumpWriter::start_dump_segment().

That's a feature, so you only have to check in one place.

> Also, all that DumpWriter::end_sub_record() does is asserting and
> debug only. So, maybe the whole method and its calls could be
> enclosed in debug_only?

I would hope the compiler is smart enough to eliminate the call.

> Then, I've spotted a little spelling error: line 623 - segement should be segment.

Will fix it

> Then, _sub_record_ended could be changed to _in_sub_record and the
> according semantics be adapted. I found that to be more understandable - 
> but maybe it's just a personal taste thing.

I think you're right. I will rename it accordingly.

> And a last point is that there are many places where sizes are calculated,
> e.g. lines 1002, 1032-1036, 1081, 1116, 1174, 1175. Here, I think code
> comments could be added/improved to facilitate quicker understanding
> for the folks that are ingenuous to this code.

I always did the size calculations in the same order the entries are written in, so it should be clear which term corresponds to which write statement. I will have to try out how long the comments would get and whether they make the code easier to understand.
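
As an illustration of that convention, here is a small sketch in C. It assumes an HPROF-style instance sub-record (tag, object ID, stack trace serial, class ID, field byte count, field data) with 8-byte IDs. The function names and the fixed ID size are hypothetical and not taken from the webrev. The size function has one term per write, in the same order, which is also where a per-term comment would go.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { ID_SIZE = 8, TAG_INSTANCE_DUMP = 0x21 };

/* One term per write in write_instance_dump(), in the same order. */
size_t instance_dump_size(size_t field_bytes) {
    return 1            /* u1 sub-record tag       */
         + ID_SIZE      /* object ID               */
         + 4            /* u4 stack trace serial   */
         + ID_SIZE      /* class object ID         */
         + 4            /* u4 length of field data */
         + field_bytes; /* raw field values        */
}

/* Store the low nbytes of v in big-endian order; return the next position. */
static uint8_t *put_be(uint8_t *p, uint64_t v, int nbytes) {
    for (int i = nbytes - 1; i >= 0; i--)
        *p++ = (uint8_t)(v >> (8 * i));
    return p;
}

/* Writes the sub-record into out and returns the number of bytes written. */
size_t write_instance_dump(uint8_t *out, uint64_t obj_id, uint64_t class_id,
                           const uint8_t *fields, size_t field_bytes) {
    assert(field_bytes <= UINT32_MAX);
    uint8_t *p = out;
    *p++ = TAG_INSTANCE_DUMP;              /* u1 sub-record tag       */
    p = put_be(p, obj_id, ID_SIZE);        /* object ID               */
    p = put_be(p, 1, 4);                   /* u4 stack trace serial   */
    p = put_be(p, class_id, ID_SIZE);      /* class object ID         */
    p = put_be(p, field_bytes, 4);         /* u4 length of field data */
    memcpy(p, fields, field_bytes);        /* raw field values        */
    p += field_bytes;
    return (size_t)(p - out);
}
```

A check such as `write_instance_dump(...) == instance_dump_size(...)` is the kind of invariant an end-of-sub-record assertion can verify.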

Best regards,
Ralf


From daniil.x.titov at oracle.com  Sat Dec 21 00:42:02 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Fri, 20 Dec 2019 16:42:02 -0800
Subject: RFR: 8236190 : Unproblem list
 vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t002/TestDescription.java
Message-ID: <E96FAD27-330E-450F-87A4-46417AB0ABE4@oracle.com>

Please review the changeset below that removes the vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t002/TestDescription.java
and runtime/appcds/cacheObject/RedefineClassTest.java tests from test/hotspot/jtreg/ProblemList-graal.txt.


--- a/test/hotspot/jtreg/ProblemList-graal.txt  Wed Dec 11 19:20:57 2019 -0800
+++ b/test/hotspot/jtreg/ProblemList-graal.txt  Fri Dec 20 11:53:31 2019 -0800
@@ -147,9 +147,7 @@
 vmTestbase/nsk/jvmti/ForceEarlyReturn/ForceEarlyReturn001/TestDescription.java     8195674,8195635   generic-all
 vmTestbase/nsk/jvmti/ForceEarlyReturn/ForceEarlyReturn002/TestDescription.java     8195674,8195635   generic-all
 
-runtime/appcds/cacheObject/RedefineClassTest.java                                  8204506   macosx-all
-vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t001/TestDescription.java        8204506,8195635   macosx-all,generic-all
-vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t002/TestDescription.java        8204506   macosx-all
+vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t001/TestDescription.java        8195635   generic-all


These tests were added to ProblemList-graal.txt in [1] and [2], but issue [1] is no longer reproducible in JDK 15 and the tests run fine in Mach5.

The third test, vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t001/TestDescription.java, appears to run fine as well, but it was problem-listed
due to several issues and issue [3] is not resolved yet. Thus I decided to keep this test on the problem list.

Mach5 tier1, tier2, and tier3 tests passed successfully.

[1] https://bugs.openjdk.java.net/browse/JDK-8204506 
[2] https://bugs.openjdk.java.net/browse/JDK-8209587 
[3] https://bugs.openjdk.java.net/browse/JDK-8195635 
[4] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8236190 

Thanks.
Daniil


From chris.plummer at oracle.com  Sat Dec 21 01:12:27 2019
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 20 Dec 2019 17:12:27 -0800
Subject: RFR: 8236190 : Unproblem list
 vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t002/TestDescription.java
In-Reply-To: <E96FAD27-330E-450F-87A4-46417AB0ABE4@oracle.com>
References: <E96FAD27-330E-450F-87A4-46417AB0ABE4@oracle.com>
Message-ID: <0250dbda-7ecf-8787-74af-b30736456767@oracle.com>

Looks good.

Chris

On 12/20/19 4:42 PM, Daniil Titov wrote:
> Please a review a changeset below that removes  vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t002/TestDescription.java
>   and runtime/appcds/cacheObject/RedefineClassTest.java tests from test/hotspot/jtreg/ProblemList-graal.txt.
>
>
> --- a/test/hotspot/jtreg/ProblemList-graal.txt  Wed Dec 11 19:20:57 2019 -0800
> +++ b/test/hotspot/jtreg/ProblemList-graal.txt  Fri Dec 20 11:53:31 2019 -0800
> @@ -147,9 +147,7 @@
>   vmTestbase/nsk/jvmti/ForceEarlyReturn/ForceEarlyReturn001/TestDescription.java     8195674,8195635   generic-all
>   vmTestbase/nsk/jvmti/ForceEarlyReturn/ForceEarlyReturn002/TestDescription.java     8195674,8195635   generic-all
>   
> -runtime/appcds/cacheObject/RedefineClassTest.java                                  8204506   macosx-all
> -vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t001/TestDescription.java        8204506,8195635   macosx-all,generic-all
> -vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t002/TestDescription.java        8204506   macosx-all
> +vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t001/TestDescription.java        8195635   generic-all
>
>
> These tests were added in ProblemList-graal.txt in [1] and [2]  but issue [1]  is no longer reproducible in JDK 15  and the tests run fine in Mach5.
>
> The third test vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t001/TestDescription.java  appears to run fine as well but it was problem-listed
> due to several issues and issue [3] is not resolved yet.  Thus I decided to keep this test on the problem list.
>
> Mach5 tests for tier1,tier2, and tier3 successfully passed.
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8204506
> [2] https://bugs.openjdk.java.net/browse/JDK-8209587
> [3] https://bugs.openjdk.java.net/browse/JDK-8195635
> [4] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8236190
>
> Thanks.
> Daniil
>
>


From igor.ignatyev at oracle.com  Sat Dec 21 01:13:16 2019
From: igor.ignatyev at oracle.com (Igor Ignatev)
Date: Fri, 20 Dec 2019 17:13:16 -0800
Subject: RFR: 8236190 : Unproblem list
 vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t002/TestDescription.java
In-Reply-To: <E96FAD27-330E-450F-87A4-46417AB0ABE4@oracle.com>
References: <E96FAD27-330E-450F-87A4-46417AB0ABE4@oracle.com>
Message-ID: <DB2D05E8-4582-41F1-81C3-F3D86ED224B0@oracle.com>

LGTM

-- Igor

>> On Dec 20, 2019, at 4:42 PM, Daniil Titov <daniil.x.titov at oracle.com> wrote:
> Please a review a changeset below that removes  vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t002/TestDescription.java
> and runtime/appcds/cacheObject/RedefineClassTest.java tests from test/hotspot/jtreg/ProblemList-graal.txt.
> 
> 
> --- a/test/hotspot/jtreg/ProblemList-graal.txt  Wed Dec 11 19:20:57 2019 -0800
> +++ b/test/hotspot/jtreg/ProblemList-graal.txt  Fri Dec 20 11:53:31 2019 -0800
> @@ -147,9 +147,7 @@
> vmTestbase/nsk/jvmti/ForceEarlyReturn/ForceEarlyReturn001/TestDescription.java     8195674,8195635   generic-all
> vmTestbase/nsk/jvmti/ForceEarlyReturn/ForceEarlyReturn002/TestDescription.java     8195674,8195635   generic-all
> 
> -runtime/appcds/cacheObject/RedefineClassTest.java                                  8204506   macosx-all
> -vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t001/TestDescription.java        8204506,8195635   macosx-all,generic-all
> -vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t002/TestDescription.java        8204506   macosx-all
> +vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t001/TestDescription.java        8195635   generic-all
> 
> 
> These tests were added in ProblemList-graal.txt in [1] and [2]  but issue [1]  is no longer reproducible in JDK 15  and the tests run fine in Mach5.
> 
> The third test vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t001/TestDescription.java  appears to run fine as well but it was problem-listed 
> due to several issues and issue [3] is not resolved yet.  Thus I decided to keep this test on the problem list. 
> 
> Mach5 tests for tier1,tier2, and tier3 successfully passed.
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8204506 
> [2] https://bugs.openjdk.java.net/browse/JDK-8209587 
> [3] https://bugs.openjdk.java.net/browse/JDK-8195635 
> [4] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8236190 
> 
> Thanks.
> Daniil


From alexey.menkov at oracle.com  Sat Dec 21 01:18:06 2019
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Fri, 20 Dec 2019 17:18:06 -0800
Subject: RFR: 8236190 : Unproblem list
 vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t002/TestDescription.java
In-Reply-To: <0250dbda-7ecf-8787-74af-b30736456767@oracle.com>
References: <E96FAD27-330E-450F-87A4-46417AB0ABE4@oracle.com>
 <0250dbda-7ecf-8787-74af-b30736456767@oracle.com>
Message-ID: <9ff5f0cd-04cc-211e-fc00-e3aa862fd196@oracle.com>

+1

--alex

On 12/20/2019 17:12, Chris Plummer wrote:
> Looks good.
> 
> Chris
> 
> On 12/20/19 4:42 PM, Daniil Titov wrote:
>> Please a review a changeset below that removes  
>> vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t002/TestDescription.java 
>>
>>   and runtime/appcds/cacheObject/RedefineClassTest.java tests from 
>> test/hotspot/jtreg/ProblemList-graal.txt.
>>
>>
>> --- a/test/hotspot/jtreg/ProblemList-graal.txt  Wed Dec 11 19:20:57 
>> 2019 -0800
>> +++ b/test/hotspot/jtreg/ProblemList-graal.txt  Fri Dec 20 11:53:31 
>> 2019 -0800
>> @@ -147,9 +147,7 @@
>>   
>> vmTestbase/nsk/jvmti/ForceEarlyReturn/ForceEarlyReturn001/TestDescription.java     
>> 8195674,8195635   generic-all
>>   
>> vmTestbase/nsk/jvmti/ForceEarlyReturn/ForceEarlyReturn002/TestDescription.java     
>> 8195674,8195635   generic-all
>> -runtime/appcds/cacheObject/RedefineClassTest.java                                  
>> 8204506   macosx-all
>> -vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t001/TestDescription.java        
>> 8204506,8195635   macosx-all,generic-all
>> -vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t002/TestDescription.java        
>> 8204506   macosx-all
>> +vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t001/TestDescription.java        
>> 8195635   generic-all
>>
>>
>> These tests were added in ProblemList-graal.txt in [1] and [2]  but 
>> issue [1]  is no longer reproducible in JDK 15  and the tests run fine 
>> in Mach5.
>>
>> The third test 
>> vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t001/TestDescription.java  
>> appears to run fine as well but it was problem-listed
>> due to several issues and issue [3] is not resolved yet.  Thus I 
>> decided to keep this test on the problem list.
>>
>> Mach5 tests for tier1,tier2, and tier3 successfully passed.
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8204506
>> [2] https://bugs.openjdk.java.net/browse/JDK-8209587
>> [3] https://bugs.openjdk.java.net/browse/JDK-8195635
>> [4] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8236190
>>
>> Thanks.
>> Daniil
>>
>>
> 

From daniil.x.titov at oracle.com  Sat Dec 21 01:31:31 2019
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Fri, 20 Dec 2019 17:31:31 -0800
Subject: 8236190 : Unproblem list
 vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t002/TestDescription.java
In-Reply-To: <9ff5f0cd-04cc-211e-fc00-e3aa862fd196@oracle.com>
References: <E96FAD27-330E-450F-87A4-46417AB0ABE4@oracle.com>
 <0250dbda-7ecf-8787-74af-b30736456767@oracle.com>
 <9ff5f0cd-04cc-211e-fc00-e3aa862fd196@oracle.com>
Message-ID: <850A67A6-588E-4E2D-9C3A-9B427BA45A1C@oracle.com>

Hi Chris, Igor, and Alex,

Thank you for reviewing this change!

Best regards,
Daniil

On 12/20/19, 5:18 PM, "Alex Menkov" <alexey.menkov at oracle.com> wrote:

    +1
    
    --alex
    
    On 12/20/2019 17:12, Chris Plummer wrote:
    > Looks good.
    > 
    > Chris
    > 
    > On 12/20/19 4:42 PM, Daniil Titov wrote:
    >> Please a review a changeset below that removes  
    >> vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t002/TestDescription.java 
    >>
    >>   and runtime/appcds/cacheObject/RedefineClassTest.java tests from 
    >> test/hotspot/jtreg/ProblemList-graal.txt.
    >>
    >>
    >> --- a/test/hotspot/jtreg/ProblemList-graal.txt  Wed Dec 11 19:20:57 
    >> 2019 -0800
    >> +++ b/test/hotspot/jtreg/ProblemList-graal.txt  Fri Dec 20 11:53:31 
    >> 2019 -0800
    >> @@ -147,9 +147,7 @@
    >>   
    >> vmTestbase/nsk/jvmti/ForceEarlyReturn/ForceEarlyReturn001/TestDescription.java     
    >> 8195674,8195635   generic-all
    >>   
    >> vmTestbase/nsk/jvmti/ForceEarlyReturn/ForceEarlyReturn002/TestDescription.java     
    >> 8195674,8195635   generic-all
    >> -runtime/appcds/cacheObject/RedefineClassTest.java                                  
    >> 8204506   macosx-all
    >> -vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t001/TestDescription.java        
    >> 8204506,8195635   macosx-all,generic-all
    >> -vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t002/TestDescription.java        
    >> 8204506   macosx-all
    >> +vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t001/TestDescription.java        
    >> 8195635   generic-all
    >>
    >>
    >> These tests were added in ProblemList-graal.txt in [1] and [2]  but 
    >> issue [1]  is no longer reproducible in JDK 15  and the tests run fine 
    >> in Mach5.
    >>
    >> The third test 
    >> vmTestbase/nsk/jvmti/scenarios/hotswap/HS102/hs102t001/TestDescription.java  
    >> appears to run fine as well but it was problem-listed
    >> due to several issues and issue [3] is not resolved yet.  Thus I 
    >> decided to keep this test on the problem list.
    >>
    >> Mach5 tests for tier1,tier2, and tier3 successfully passed.
    >>
    >> [1] https://bugs.openjdk.java.net/browse/JDK-8204506
    >> [2] https://bugs.openjdk.java.net/browse/JDK-8209587
    >> [3] https://bugs.openjdk.java.net/browse/JDK-8195635
    >> [4] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8236190
    >>
    >> Thanks.
    >> Daniil
    >>
    >>
    > 
    

From rkennke at redhat.com  Sat Dec 21 21:24:26 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Sat, 21 Dec 2019 22:24:26 +0100
Subject: RFR: 8227269: Slow class loading when running JVM in debug mode
In-Reply-To: <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
References: <8870829e-c558-c956-2184-00204632abb6@redhat.com>
 <743135a7-cade-34ee-6485-1c174376d7bd@oracle.com>
 <20e507f7-0e6e-43a8-be0b-ea7ba6a6edcb@redhat.com>
 <af412283-7c6f-f5c0-77d4-3f77f9ac9146@redhat.com>
 <48dad0bc-fbdf-08eb-4bdf-f8220742035d@redhat.com>
Message-ID: <805d3200-2f12-922a-b6d6-78381c9a9d2b@redhat.com>

Here comes an update that resolves some races that happen when
disconnecting an agent. In particular, we need to take the lock on
basically every operation, and also need to check whether or not
class-tracking is active and return an appropriate result (e.g. an empty
list) when we're not.
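
A rough sketch of that locking and the "is tracking active" check, applied to the tag-indexed table from the earlier message quoted below. This is illustrative C only: a pthread mutex stands in for the agent's own monitor, and the names (track_*, SigBag, SLOTS) and the slot count are hypothetical, not the ones used in classTrack.c. Allocation failures are merely asserted here.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

enum { SLOTS = 263 };                       /* hypothetical table size */

typedef struct KlassNode {
    long tag;                               /* tag set when the class was prepared */
    char *signature;
    struct KlassNode *next;
} KlassNode;

typedef struct { char **sigs; size_t count, cap; } SigBag;

static KlassNode *table[SLOTS];
static SigBag *unloaded;                    /* collected since the last hand-out */
static int active;                          /* is class tracking switched on? */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static SigBag *bag_new(void) { return calloc(1, sizeof(SigBag)); }

static void bag_add(SigBag *b, char *sig) {
    if (b->count == b->cap) {
        b->cap = b->cap ? b->cap * 2 : 8;
        b->sigs = realloc(b->sigs, b->cap * sizeof *b->sigs);
        assert(b->sigs != NULL);
    }
    b->sigs[b->count++] = sig;
}

void bag_free(SigBag *b) {
    if (!b) return;
    for (size_t i = 0; i < b->count; i++) free(b->sigs[i]);
    free(b->sigs);
    free(b);
}

void track_activate(void) {
    pthread_mutex_lock(&lock);
    if (!active) { unloaded = bag_new(); active = 1; }
    pthread_mutex_unlock(&lock);
}

/* Class prepared: prepend a node to its slot, O(1). */
void track_add_prepared(long tag, const char *sig) {
    pthread_mutex_lock(&lock);
    if (active) {
        KlassNode **slot = &table[(unsigned long)tag % SLOTS];
        KlassNode *n = malloc(sizeof *n);
        assert(n != NULL);
        n->tag = tag;
        n->signature = malloc(strlen(sig) + 1);
        assert(n->signature != NULL);
        strcpy(n->signature, sig);
        n->next = *slot;
        *slot = n;
    }
    pthread_mutex_unlock(&lock);
}

/* Tagged object freed: unlink its node and remember the signature. */
void track_object_free(long tag) {
    pthread_mutex_lock(&lock);
    if (active) {
        KlassNode **pp = &table[(unsigned long)tag % SLOTS];
        while (*pp && (*pp)->tag != tag) pp = &(*pp)->next;
        if (*pp) {
            KlassNode *n = *pp;
            *pp = n->next;
            bag_add(unloaded, n->signature);  /* bag takes ownership */
            free(n);
        }
    }
    pthread_mutex_unlock(&lock);
}

/* Hand out the collected signatures; an empty bag when tracking is inactive. */
SigBag *track_process_unloads(void) {
    pthread_mutex_lock(&lock);
    SigBag *result = active ? unloaded : bag_new();
    if (active) unloaded = bag_new();
    pthread_mutex_unlock(&lock);
    return result;
}

/* Agent detached: free everything and switch tracking off. */
void track_reset(void) {
    pthread_mutex_lock(&lock);
    for (int i = 0; i < SLOTS; i++) {
        while (table[i]) {
            KlassNode *n = table[i];
            table[i] = n->next;
            free(n->signature);
            free(n);
        }
    }
    bag_free(unloaded);
    unloaded = NULL;
    active = 0;
    pthread_mutex_unlock(&lock);
}
```

Every entry point takes the lock and tests `active` before touching the table, so a callback that races with a disconnect either completes before track_reset() or sees tracking switched off and does nothing.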

Updated webrev:
http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/

Thanks,
Roman


> So, here comes the O(1) implementation:
> 
> - Whenever a class is 'prepared', it is registered with a tag, and we
> set-up a listener to get notified when it is unloaded.
> - Prepared classes are kept in a data structure that is a table, with
> each entry being the head of a linked list of KlassNode*. The table is
> indexed by tag % slot-count, and a new KlassNode* is simply prepended.
> This is an O(1) operation.
> - When we get notified of unloading a class, we look up the signature of
> the reported tag in that table, and remember it in a bag. The KlassNode*
> is then unlinked from the table and deallocated. This is ~O(1) operation
> too, depending on the depth of the table. In my testcase which hammered
> the code with class-loads and unloads, I usually see depths of like 2-3,
> but not usually more. It should be ok.
> - when processUnloads() gets called, we simply hand out that bag, and
> allocate a new one.
> - I also added cleanup-code in classTrack_reset() to avoid leaking the
> signatures and KlassNode* etc when debug agent gets detached and/or
> re-attached (was missing before).
> - I also added locks around data-structure-manipulation (was missing
> before).
> - Also, I only activate this whole process when an actual listener gets
> registered on EI_GC_FINISH. This seems to happen right when attaching a
> jdb, not sure why jdb does that though. This may be something to improve
> in the future?
> 
> In my tests, the performance of class-tracking itself looks really good.
> The bottleneck now is clearly the actual synthesizing of the class-unload
> events. I don't see how this can be helped when the debug agent asks for it.
> 
> Updated webrev:
> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
> 
> Please let me know what you think of it.
> 
> Thanks,
> Roman
> 
> 
>> Alright, the perfectionist in me got me. I am implementing the even more
>> efficient ~O(1) class tracking. Please hold off reviewing for now.
>>
>> Thanks,
>> Roman
>>
>>> Hi Chris,
>>>
>>>> I'll have a look at this, although it might not be for a few days. In
>>>> the meantime, maybe you can describe your new implementation in
>>>> classTrack.c so it's easier to look through the changes.
>>>
>>> Sure.
>>>
>>> The purpose of this class-tracking is to be able to determine the
>>> signatures of unloaded classes when GC/class-unloading happened, so that
>>> we can generate the appropriate JDWP event.
>>>
>>> The current implementation does so by maintaining a table of currently
>>> prepared classes by building that table when classTrack is initialized,
>>> and then add new classes whenever a class gets loaded. When unloading
>>> occurs, that cache is rebuilt into a new table, and compared with the
>>> old table, and whatever is in the old, but not in the new table gets
>>> returned. The problem is that when GCs happen frequently and/or many
>>> classes get loaded+unloaded, this amounts to O(classCount*gcCount)
>>> complexity.
>>>
>>> The new implementation keeps a linked-list of prepared classes, and also
>>> tracks unloads via the listener cbTrackingObjectFree(). Whenever an
>>> unload/GC occurs, the list of prepared classes is scanned, and classes
>>> that are also in the deletedTagBag are unlinked (thus maintaining the
>>> prepared-classes-list) and its signature put in the list that gets returned.
>>>
>>> The implementation is not perfect. In order to determine whether or not
>>> a class is unloaded, it needs to scan the deletedTagBag. That process is
>>> therefore still O(unloadedClassCount). The assumption here is that
>>> unloadedClassCount << classCount. In my experiments this seems to be
>>> true, and also reasonable to expect.
>>>
>>> (I have some ideas how to improve the implementation to ~O(1) but it
>>> would be considerably more complex: have to maintain a (hash)table that
>>> maps tags -> KlassNode*, unlink them directly upon unload, and build the
>>> unloaded-signatures list there, but I don't currently see that it's
>>> worth the effort).
>>>
>>> In addition to all that, this process is only activated when there's an
>>> actual listener registered for EI_GC_FINISH.
>>>
>>> Thanks,
>>> Roman
>>>
>>>
>>>> Chris
>>>>
>>>> On 12/18/19 5:05 AM, Roman Kennke wrote:
>>>>> Hello all,
>>>>>
>>>>> Issue:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8227269
>>>>>
>>>>> I am proposing what amounts to a rewrite of classTrack.c. It avoids
>>>>> throwing away the class cache on GC, and instead keeps track of
>>>>> loaded/unloaded classes one-by-one.
>>>>>
>>>>> In addition to that, it avoids this whole dance until an agent
>>>>> registers interest in EI_GC_FINISH.
>>>>>
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
>>>>>
>>>>> Testing: manual testing of provided test scenarios and timing.
>>>>>
>>>>> Eg with the testcase provided here:
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1751985
>>>>>
>>>>> I am getting those numbers:
>>>>> unpatched: no debug: 84s with debug: 225s
>>>>> patched:   no debug: 85s with debug: 95s
>>>>>
>>>>> I also tested successfully through jdk/submit repo
>>>>>
>>>>> Can I please get a review?
>>>>>
>>>>> Thanks,
>>>>> Roman
>>>>>
>>>>
>>>>
>>>
>>
> 


From richard.reingruber at sap.com  Mon Dec 23 09:40:52 2019
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Mon, 23 Dec 2019 09:40:52 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
Message-ID: <DB7PR02MB361224E2C0B8932C73C8A24A9B2E0@DB7PR02MB3612.eurprd02.prod.outlook.com>

Hi,

webrev.3 didn't apply anymore after 8236000 [1]. I've rebased and updated in place:

http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/

The change was minimal.

Cheers, Richard.

[1] JDK-8236000: VM build without C2 fails

-----Original Message-----
From: Reingruber, Richard 
Sent: Dienstag, 10. Dezember 2019 22:45
To: serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi,

I would like to get reviews please for

http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/

Corresponding RFE:
https://bugs.openjdk.java.net/browse/JDK-8227745

Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]

Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without issues (thanks!). In addition the
change is being tested at SAP since I posted the first RFR some months ago.

The intention of this enhancement is to benefit performance-wise from escape analysis even if JVMTI
agents request capabilities that allow them to access local variable values. E.g. if you start up
with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then escape analysis is disabled right
from the beginning, well before a debugger attaches -- if ever one should do so. With the
enhancement, escape analysis will remain enabled until and after a debugger attaches. EA based
optimizations are reverted just before an agent acquires the reference to an object. In the JBS item
you'll find more details.

Thanks,
Richard.

[1] Experimental fix for JDK-8214584 based on JDK-8227745
    http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch

From suenaga at oss.nttdata.com  Fri Dec 27 04:10:40 2019
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Fri, 27 Dec 2019 13:10:40 +0900
Subject: RFR (trivial): 8236552: Description of jmxremote.ssl.config.file in
 ManagementAgent.start is incorrect
Message-ID: <bb685212-a513-86ba-cb98-835947795b74@oss.nttdata.com>

Hi all,

Please review this trivial fix:

   JBS: https://bugs.openjdk.java.net/browse/JDK-8236552
   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8236552/webrev.00/

The ManagementAgent.start dcmd provides jmxremote.ssl.config.file for SSL configuration. Its description currently reads:

```
jmxremote.ssl.config.file : [optional] set com.sun.management.jmxremote.ssl_config_file (STRING, no default value)
```

This option actually sets com.sun.management.jmxremote.ssl.config.file, not ssl_config_file.


Thanks,

Yasumasa