RFR (M) 8201655: Add thread-enabled support for the Heap Sampling

JC Beyler jcbeyler at google.com
Mon Nov 12 17:53:12 UTC 2018


Hi Serguei,

Thanks for the update and thanks for testing mach5. Serguei sent me that
the testing passed mach5 testing, could I get another review to be able to
push it?

Thanks!
Jc

On Tue, Nov 6, 2018 at 10:41 PM serguei.spitsyn at oracle.com <
serguei.spitsyn at oracle.com> wrote:

> Hi Jc,
>
> Thank you for the update!
> It looks good.
> It is great that testing on your side is Okay.
>
> I'll submit a mach5 job soon (today or tomorrow).
>
> Thanks,
> Serguei
>
>
> On 11/6/18 20:03, JC Beyler wrote:
>
> Hi Serguei,
>
> You are right, I should have reverted the memAllocator.cpp file, sorry
> about that.
>
> Here is the new webrev:
> http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.04/
>
> I think we are good by testing standards, like I
> said HeapMonitorThreadTest.java tests multiple threads. I did test an
> example with a thousand threads and I get the samples from 1000 threads so
> it seems to work there too.
>
> Per thread is tested via the new HeapMonitorThreadDisabledTest.java so I
> think we are good there too.
>
> I would recommend a mach-5 testing just in case for this one if you can,
> it will be better to have that reinsurance.
>
> Thanks for your help,
> Jc
>
> On Tue, Nov 6, 2018 at 4:29 PM <serguei.spitsyn at oracle.com> wrote:
>
>> Hi Jc,
>>
>> Not sure, I understand a motivation for this change:
>>
>> -  if (JvmtiExport::should_post_sampled_object_alloc()) {+  {
>>
>> Also, I'm not sure this is still needed:
>>
>> +#include "prims/jvmtiEventController.inline.hpp"+#include "prims/jvmtiThreadState.inline.hpp"
>>
>> I expected you'd just revert all the changes in the memAllocator.cpp.
>>
>> Also, it is up to you to make a decision if these performance-related fix
>> is needed or not.
>>
>> But it needs to be well tested so that both global+thread event
>> management works correctly.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 11/6/18 9:42 AM, JC Beyler wrote:
>>
>> Hi Serguei,
>>
>> Yes exactly it was an optimization. When using a 512k sampling rate, I
>> don't see a no real difference (the overhead is anyway low for that
>> sampling rate), I imagine there would be a difference if trying to sample
>> every allocation or with a low sampling interval. But because you are right
>> and it is an optimization of the system and not a functional need, I've
>> reverted it and now the webrev is updated here:
>>
>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.03/
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8201655
>>
>> The incremental webrev is here:
>> http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02_03/
>>
>> Let me know what you think,
>> Jc
>>
>> On Mon, Nov 5, 2018 at 6:51 PM serguei.spitsyn at oracle.com <
>> serguei.spitsyn at oracle.com> wrote:
>>
>>> Hi Jc,
>>>
>>> Okay, I see your point: the change in memAllocator.cpp is for
>>> performance.
>>> Do you have any measurements showing a performance difference?
>>> Also, do you need me to submit a mach5 test run?
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 11/5/18 15:14, JC Beyler wrote:
>>>
>>> Hi Serguei,
>>>
>>> First off, thanks as always for looking at this :-) Let me inline my
>>> answers:
>>>
>>> I actually "struggled" with this part of the change. My change is
>>> correct semantically and if you care about performance for when sampling a
>>> given thread.
>>> Your change will work semantically but the performance is the same as
>>> the global sampling.
>>>
>>> What I mean by working semantically is that that the tests and the code
>>> will work. However, this means that all threads will be doing the sampling
>>> work but when the code will post the event here:
>>>    ->
>>> http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/src/hotspot/share/prims/jvmtiExport.cpp.udiff.html
>>>
>>> (which is why your suggestion works, the change in jvmtiExport basically
>>> ensures only the threads requested are posting events)
>>>
>>> The code will check that we actually only post for threads we care
>>> about. The code above ensures that only threads that were requested to be
>>> sampling are the ones that are sampling internally.
>>>
>>> Note: I REALLY prefer your suggestion for two reasons:
>>>   - We do not change the runtime/GC code at all, it remains "simple"
>>>   - The overhead in the general case goes away and this is a NOP for my
>>> actual use-case from a performance point of view (sampling every thread)
>>>
>>> But:
>>>   - Then sampling per thread really is just telling the system don't
>>> pollute the callbacks, though internally you are doing all the work anyway.
>>>
>>> Let me know which you prefer :)
>>>
>>>
>>>
>>>>  Also, do you see that enabling the sampling events globally still works?
>>>>
>>>>
>>> Yes, otherwise HeapMonitorThreadTest.java would fail since it checks
>>> that.
>>>
>>>
>>>> http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html
>>>>
>>>>  A couple of places where err is declared as int instead of jvmtiError:
>>>>    714   int err; 742   int err;
>>>>
>>>>  Should not be silent in a case of JVMTI error:
>>>>  744   err = (*jvmti)->GetThreadInfo(jvmti, thread, &info);
>>>>  745   if (err != JVMTI_ERROR_NONE) { 746     return;
>>>>
>>>>
>>>>
>>> Done and done, I added a fprintf on stderr saying the GetThreadInfo
>>> failed and the test is ignoring the add count.
>>>
>>> Thanks again for looking and let me know what you think,
>>> Jc
>>>
>>> On Mon, Nov 5, 2018 at 2:25 PM serguei.spitsyn at oracle.com <
>>> serguei.spitsyn at oracle.com> wrote:
>>>
>>>> Hi Jc,
>>>>
>>>> It looks good in general but I have some comments below.
>>>>
>>>>
>>>>
>>>> http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/gc/shared/memAllocator.cpp.udiff.html
>>>>
>>>> +static bool thread_enabled_for_one_jvmti_env() {+  JavaThread *thread  = JavaThread::current();+  JvmtiThreadState *state = thread->jvmti_thread_state();+  if (state == NULL) {+    return false;+  }++  JvmtiEnvThreadStateIterator it(state);+  for (JvmtiEnvThreadState* ets = it.first(); ets != NULL; ets = it.next(ets)) {+    if (ets->is_enabled(JVMTI_EVENT_SAMPLED_OBJECT_ALLOC)) {+      return true;+    }+  }++  return false;+}+
>>>>  void MemAllocator::Allocation::notify_allocation_jvmti_sampler() {
>>>>    // support for JVMTI VMObjectAlloc event (no-op if not enabled)
>>>>    JvmtiExport::vm_object_alloc_event_collector(obj());
>>>>
>>>>    if (!JvmtiExport::should_post_sampled_object_alloc()) {
>>>>      // Sampling disabled
>>>>      return;
>>>>    }
>>>>  +  // Sampling is enabled for at least one thread, is it this one?+  if (!thread_enabled_for_one_jvmti_env()) {+    return;+  }+
>>>>
>>>>  I don't think you need this change as this condition already does it:   if (!JvmtiExport::should_post_sampled_object_alloc()) {
>>>>      // Sampling disabled
>>>>      return;
>>>>    }
>>>>
>>>>  Please, look at the following line in the jvmtiEventController.cpp:
>>>>     JvmtiExport::set_should_post_sampled_object_alloc((any_env_thread_enabled & SAMPLED_OBJECT_ALLOC_BIT) != 0);
>>>>
>>>>  I hope, testing will prove my suggestion is correct.
>>>>  Also, do you see that enabling the sampling events globally still works?
>>>>
>>>> http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html
>>>>
>>>>  A couple of places where err is declared as int instead of jvmtiError:
>>>>    714   int err; 742   int err;
>>>>
>>>>  Should not be silent in a case of JVMTI error:
>>>>  744   err = (*jvmti)->GetThreadInfo(jvmti, thread, &info);
>>>>  745   if (err != JVMTI_ERROR_NONE) { 746     return;
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 10/26/18 10:48, JC Beyler wrote:
>>>>
>>>> Hi all,
>>>>
>>>> When working on the heap sampling, I had promised to do the per thread
>>>> event so here it is!
>>>>
>>>> Could I get a review for this:
>>>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8201655
>>>>
>>>> I was thinking of adding GC-dev for the memAllocator change once I get
>>>> favorable reviews for the rest of the change.
>>>>
>>>> I've done a bit of performance testing and on the Dacapo benchmark I
>>>> see no change in performance when turned off (logical, any code change is
>>>> behind a flag check already in place) and when turned on it is comparable
>>>> to the current performance.
>>>>
>>>> (More information is: I see a very slight degradation if we are doing
>>>> 512k sampling but no degradation at 2MB).
>>>>
>>>> Thanks,
>>>> Jc
>>>>
>>>>
>>>>
>>>
>>> --
>>>
>>> Thanks,
>>> Jc
>>>
>>>
>>>
>>
>> --
>>
>> Thanks,
>> Jc
>>
>>
>>
>
> --
>
> Thanks,
> Jc
>
>
>

-- 

Thanks,
Jc
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20181112/995eb534/attachment-0001.html>


More information about the serviceability-dev mailing list