RFR (M) 8201655: Add thread-enabled support for the Heap Sampling
JC Beyler
jcbeyler at google.com
Mon Nov 12 17:53:12 UTC 2018
Hi Serguei,
Thanks for the update and thanks for testing mach5. Serguei sent me that
the testing passed mach5 testing, could I get another review to be able to
push it?
Thanks!
Jc
On Tue, Nov 6, 2018 at 10:41 PM serguei.spitsyn at oracle.com <
serguei.spitsyn at oracle.com> wrote:
> Hi Jc,
>
> Thank you for the update!
> It looks good.
> It is great that testing on your side is Okay.
>
> I'll submit a mach5 job soon (today or tomorrow).
>
> Thanks,
> Serguei
>
>
> On 11/6/18 20:03, JC Beyler wrote:
>
> Hi Serguei,
>
> You are right, I should have reverted the memAllocator.cpp file, sorry
> about that.
>
> Here is the new webrev:
> http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.04/
>
> I think we are good by testing standards, like I
> said HeapMonitorThreadTest.java tests multiple threads. I did test an
> example with a thousand threads and I get the samples from 1000 threads so
> it seems to work there too.
>
> Per thread is tested via the new HeapMonitorThreadDisabledTest.java so I
> think we are good there too.
>
> I would recommend a mach-5 testing just in case for this one if you can,
> it will be better to have that reinsurance.
>
> Thanks for your help,
> Jc
>
> On Tue, Nov 6, 2018 at 4:29 PM <serguei.spitsyn at oracle.com> wrote:
>
>> Hi Jc,
>>
>> Not sure, I understand a motivation for this change:
>>
>> - if (JvmtiExport::should_post_sampled_object_alloc()) {+ {
>>
>> Also, I'm not sure this is still needed:
>>
>> +#include "prims/jvmtiEventController.inline.hpp"+#include "prims/jvmtiThreadState.inline.hpp"
>>
>> I expected you'd just revert all the changes in the memAllocator.cpp.
>>
>> Also, it is up to you to make a decision if these performance-related fix
>> is needed or not.
>>
>> But it needs to be well tested so that both global+thread event
>> management works correctly.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 11/6/18 9:42 AM, JC Beyler wrote:
>>
>> Hi Serguei,
>>
>> Yes exactly it was an optimization. When using a 512k sampling rate, I
>> don't see a no real difference (the overhead is anyway low for that
>> sampling rate), I imagine there would be a difference if trying to sample
>> every allocation or with a low sampling interval. But because you are right
>> and it is an optimization of the system and not a functional need, I've
>> reverted it and now the webrev is updated here:
>>
>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.03/
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8201655
>>
>> The incremental webrev is here:
>> http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02_03/
>>
>> Let me know what you think,
>> Jc
>>
>> On Mon, Nov 5, 2018 at 6:51 PM serguei.spitsyn at oracle.com <
>> serguei.spitsyn at oracle.com> wrote:
>>
>>> Hi Jc,
>>>
>>> Okay, I see your point: the change in memAllocator.cpp is for
>>> performance.
>>> Do you have any measurements showing a performance difference?
>>> Also, do you need me to submit a mach5 test run?
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 11/5/18 15:14, JC Beyler wrote:
>>>
>>> Hi Serguei,
>>>
>>> First off, thanks as always for looking at this :-) Let me inline my
>>> answers:
>>>
>>> I actually "struggled" with this part of the change. My change is
>>> correct semantically and if you care about performance for when sampling a
>>> given thread.
>>> Your change will work semantically but the performance is the same as
>>> the global sampling.
>>>
>>> What I mean by working semantically is that that the tests and the code
>>> will work. However, this means that all threads will be doing the sampling
>>> work but when the code will post the event here:
>>> ->
>>> http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/src/hotspot/share/prims/jvmtiExport.cpp.udiff.html
>>>
>>> (which is why your suggestion works, the change in jvmtiExport basically
>>> ensures only the threads requested are posting events)
>>>
>>> The code will check that we actually only post for threads we care
>>> about. The code above ensures that only threads that were requested to be
>>> sampling are the ones that are sampling internally.
>>>
>>> Note: I REALLY prefer your suggestion for two reasons:
>>> - We do not change the runtime/GC code at all, it remains "simple"
>>> - The overhead in the general case goes away and this is a NOP for my
>>> actual use-case from a performance point of view (sampling every thread)
>>>
>>> But:
>>> - Then sampling per thread really is just telling the system don't
>>> pollute the callbacks, though internally you are doing all the work anyway.
>>>
>>> Let me know which you prefer :)
>>>
>>>
>>>
>>>> Also, do you see that enabling the sampling events globally still works?
>>>>
>>>>
>>> Yes, otherwise HeapMonitorThreadTest.java would fail since it checks
>>> that.
>>>
>>>
>>>> http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html
>>>>
>>>> A couple of places where err is declared as int instead of jvmtiError:
>>>> 714 int err; 742 int err;
>>>>
>>>> Should not be silent in a case of JVMTI error:
>>>> 744 err = (*jvmti)->GetThreadInfo(jvmti, thread, &info);
>>>> 745 if (err != JVMTI_ERROR_NONE) { 746 return;
>>>>
>>>>
>>>>
>>> Done and done, I added a fprintf on stderr saying the GetThreadInfo
>>> failed and the test is ignoring the add count.
>>>
>>> Thanks again for looking and let me know what you think,
>>> Jc
>>>
>>> On Mon, Nov 5, 2018 at 2:25 PM serguei.spitsyn at oracle.com <
>>> serguei.spitsyn at oracle.com> wrote:
>>>
>>>> Hi Jc,
>>>>
>>>> It looks good in general but I have some comments below.
>>>>
>>>>
>>>>
>>>> http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/gc/shared/memAllocator.cpp.udiff.html
>>>>
>>>> +static bool thread_enabled_for_one_jvmti_env() {+ JavaThread *thread = JavaThread::current();+ JvmtiThreadState *state = thread->jvmti_thread_state();+ if (state == NULL) {+ return false;+ }++ JvmtiEnvThreadStateIterator it(state);+ for (JvmtiEnvThreadState* ets = it.first(); ets != NULL; ets = it.next(ets)) {+ if (ets->is_enabled(JVMTI_EVENT_SAMPLED_OBJECT_ALLOC)) {+ return true;+ }+ }++ return false;+}+
>>>> void MemAllocator::Allocation::notify_allocation_jvmti_sampler() {
>>>> // support for JVMTI VMObjectAlloc event (no-op if not enabled)
>>>> JvmtiExport::vm_object_alloc_event_collector(obj());
>>>>
>>>> if (!JvmtiExport::should_post_sampled_object_alloc()) {
>>>> // Sampling disabled
>>>> return;
>>>> }
>>>> + // Sampling is enabled for at least one thread, is it this one?+ if (!thread_enabled_for_one_jvmti_env()) {+ return;+ }+
>>>>
>>>> I don't think you need this change as this condition already does it: if (!JvmtiExport::should_post_sampled_object_alloc()) {
>>>> // Sampling disabled
>>>> return;
>>>> }
>>>>
>>>> Please, look at the following line in the jvmtiEventController.cpp:
>>>> JvmtiExport::set_should_post_sampled_object_alloc((any_env_thread_enabled & SAMPLED_OBJECT_ALLOC_BIT) != 0);
>>>>
>>>> I hope, testing will prove my suggestion is correct.
>>>> Also, do you see that enabling the sampling events globally still works?
>>>>
>>>> http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html
>>>>
>>>> A couple of places where err is declared as int instead of jvmtiError:
>>>> 714 int err; 742 int err;
>>>>
>>>> Should not be silent in a case of JVMTI error:
>>>> 744 err = (*jvmti)->GetThreadInfo(jvmti, thread, &info);
>>>> 745 if (err != JVMTI_ERROR_NONE) { 746 return;
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 10/26/18 10:48, JC Beyler wrote:
>>>>
>>>> Hi all,
>>>>
>>>> When working on the heap sampling, I had promised to do the per thread
>>>> event so here it is!
>>>>
>>>> Could I get a review for this:
>>>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8201655
>>>>
>>>> I was thinking of adding GC-dev for the memAllocator change once I get
>>>> favorable reviews for the rest of the change.
>>>>
>>>> I've done a bit of performance testing and on the Dacapo benchmark I
>>>> see no change in performance when turned off (logical, any code change is
>>>> behind a flag check already in place) and when turned on it is comparable
>>>> to the current performance.
>>>>
>>>> (More information is: I see a very slight degradation if we are doing
>>>> 512k sampling but no degradation at 2MB).
>>>>
>>>> Thanks,
>>>> Jc
>>>>
>>>>
>>>>
>>>
>>> --
>>>
>>> Thanks,
>>> Jc
>>>
>>>
>>>
>>
>> --
>>
>> Thanks,
>> Jc
>>
>>
>>
>
> --
>
> Thanks,
> Jc
>
>
>
--
Thanks,
Jc
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20181112/995eb534/attachment-0001.html>
More information about the serviceability-dev
mailing list