RFR (M) 8201655: Add thread-enabled support for the Heap Sampling

serguei.spitsyn at oracle.com serguei.spitsyn at oracle.com
Wed Dec 12 20:17:52 UTC 2018


LGTM++

Thanks,
Serguei


On 12/12/18 12:15, Alex Menkov wrote:
> Looks good to me.
>
> --alex
>
> On 12/12/2018 11:25, JC Beyler wrote:
>> Thanks both for the review, I fixed both issues and here is the new 
>> webrev, let me know what you think:
>>
>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.06/
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8201655
>>
>> Thanks!
>> Jc
>>
>> On Tue, Dec 11, 2018 at 5:01 PM <serguei.spitsyn at oracle.com 
>> <mailto:serguei.spitsyn at oracle.com>> wrote:
>>
>>     Hi Alex,
>>
>>     Nice catch!
>>
>>     It is about the following fragment:
>>
>>       726   for (i = 0; i < thread_stats.number_threads; i++) {
>>     727 if (strcmp(expected_name, thread_stats.threads[i])) {
>>     728 return FALSE;
>>       729     } else {
>>     730 found_thread = TRUE;
>>       731     }
>>       732   }
>>     733 return found_thread;
>>     734 }
>>
>>     Also, I'd also use 'count' in place of 'number'.
>>     It is to avoid association with thread identification number.
>>
>>     Thanks,
>>     Serguei
>>
>>
>>     On 12/11/18 4:42 PM, Alex Menkov wrote:
>>>     Hi Jc,
>>>
>>>     The fix looks good.
>>>     The only note is checkThreadSamplesOnlyFrom native function
>>>     implementation - the cycle looks confusing.
>>>     As far as I got the function should check that thread_stats
>>>     contains only 1 thread and name of the thread is the same as name
>>>     of the specified thread.
>>>     And for error analysis it would be great to provide good error
>>>     description.
>>>     So I'd make it like
>>>
>>>     if (thread_stats.number_threads != 1) {
>>>       fprintf(stderr, "Wrong thread number: %d (expected 1)\n",
>>>     thread_stats.number_threads);
>>>       return FALSE;
>>>     }
>>>     if (strcmp(expected_name, thread_stats.threads[i]) != 0) {
>>>       fprintf(stderr, "Unexpected thread name: '%s' (expected
>>>     '%s')\n", thread_stats.threads[i], expected_name);
>>>       return FALSE;
>>>     }
>>>     return TRUE;
>>>
>>>     --alex
>>>
>>>     On 12/11/2018 15:11, serguei.spitsyn at oracle.com
>>>     <mailto:serguei.spitsyn at oracle.com> wrote:
>>>>     Hi Jc,
>>>>
>>>>     Alex will take a look at the test update.
>>>>
>>>>     Thanks,
>>>>     Serguei
>>>>
>>>>     On 11/12/18 9:53 AM, JC Beyler wrote:
>>>>>     Hi Serguei,
>>>>>
>>>>>     Thanks for the update and thanks for testing mach5. Serguei sent
>>>>>     me that the testing passed mach5 testing, could I get another
>>>>>     review to be able to push it?
>>>>>
>>>>>     Thanks!
>>>>>     Jc
>>>>>
>>>>>     On Tue, Nov 6, 2018 at 10:41 PM serguei.spitsyn at oracle.com
>>>>>     <mailto:serguei.spitsyn at oracle.com>
>>>>>     <mailto:serguei.spitsyn at oracle.com>
>>>>>     <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com
>>>>>     <mailto:serguei.spitsyn at oracle.com>
>>>>>     <mailto:serguei.spitsyn at oracle.com>
>>>>>     <mailto:serguei.spitsyn at oracle.com>> wrote:
>>>>>
>>>>>         Hi Jc,
>>>>>
>>>>>         Thank you for the update!
>>>>>         It looks good.
>>>>>         It is great that testing on your side is Okay.
>>>>>
>>>>>         I'll submit a mach5 job soon (today or tomorrow).
>>>>>
>>>>>         Thanks,
>>>>>         Serguei
>>>>>
>>>>>
>>>>>         On 11/6/18 20:03, JC Beyler wrote:
>>>>>>         Hi Serguei,
>>>>>>
>>>>>>         You are right, I should have reverted the memAllocator.cpp
>>>>>>     file,
>>>>>>         sorry about that.
>>>>>>
>>>>>>         Here is the new webrev:
>>>>>> http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.04/
>>>>>> <http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.04/>
>>>>>> <http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.04/>
>>>>>>
>>>>>>         I think we are good by testing standards, like I
>>>>>>         said HeapMonitorThreadTest.java tests multiple threads. I 
>>>>>> did
>>>>>>         test an example with a thousand threads and I get the 
>>>>>> samples
>>>>>>         from 1000 threads so it seems to work there too.
>>>>>>
>>>>>>         Per thread is tested via the new
>>>>>>         HeapMonitorThreadDisabledTest.java so I think we are good
>>>>>>     there too.
>>>>>>
>>>>>>         I would recommend a mach-5 testing just in case for this
>>>>>>     one if
>>>>>>         you can, it will be better to have that reinsurance.
>>>>>>
>>>>>>         Thanks for your help,
>>>>>>         Jc
>>>>>>
>>>>>>         On Tue, Nov 6, 2018 at 4:29 PM <serguei.spitsyn at oracle.com
>>>>>>     <mailto:serguei.spitsyn at oracle.com>
>>>>>>     <mailto:serguei.spitsyn at oracle.com>
>>>>>>     <mailto:serguei.spitsyn at oracle.com>> wrote:
>>>>>>
>>>>>>             Hi Jc,
>>>>>>
>>>>>>             Not sure, I understand a motivation for this change:
>>>>>>
>>>>>>             - if (JvmtiExport::should_post_sampled_object_alloc()) {
>>>>>>             + {
>>>>>>
>>>>>>             Also, I'm not sure this is still needed:
>>>>>>
>>>>>>             +#include "prims/jvmtiEventController.inline.hpp"
>>>>>>             +#include "prims/jvmtiThreadState.inline.hpp"
>>>>>>
>>>>>>             I expected you'd just revert all the changes in the
>>>>>>             memAllocator.cpp.
>>>>>>
>>>>>>             Also, it is up to you to make a decision if these
>>>>>>             performance-related fix is needed or not.
>>>>>>
>>>>>>             But it needs to be well tested so that both 
>>>>>> global+thread
>>>>>>             event management works correctly.
>>>>>>
>>>>>>             Thanks,
>>>>>>             Serguei
>>>>>>
>>>>>>
>>>>>>             On 11/6/18 9:42 AM, JC Beyler wrote:
>>>>>>>             Hi Serguei,
>>>>>>>
>>>>>>>             Yes exactly it was an optimization. When using a 512k
>>>>>>>             sampling rate, I don't see a no real difference (the
>>>>>>>             overhead is anyway low for that sampling rate), I 
>>>>>>> imagine
>>>>>>>             there would be a difference if trying to sample every
>>>>>>>             allocation or with a low sampling interval. But
>>>>>>>     because you
>>>>>>>             are right and it is an optimization of the system and
>>>>>>>     not a
>>>>>>>             functional need, I've reverted it and now the webrev is
>>>>>>>             updated here:
>>>>>>>
>>>>>>>             Webrev:
>>>>>>> http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.03/
>>>>>>> <http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.03/>
>>>>>>> <http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.03/>
>>>>>>>             Bug: https://bugs.openjdk.java.net/browse/JDK-8201655
>>>>>>>
>>>>>>>             The incremental webrev is here:
>>>>>>> http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02_03/
>>>>>>> <http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02_03/>
>>>>>>> <http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02_03/>
>>>>>>>
>>>>>>>             Let me know what you think,
>>>>>>>             Jc
>>>>>>>
>>>>>>>             On Mon, Nov 5, 2018 at 6:51 PM
>>>>>>>     serguei.spitsyn at oracle.com <mailto:serguei.spitsyn at oracle.com>
>>>>>>>     <mailto:serguei.spitsyn at oracle.com>
>>>>>>>     <mailto:serguei.spitsyn at oracle.com>
>>>>>>>             <serguei.spitsyn at oracle.com
>>>>>>>     <mailto:serguei.spitsyn at oracle.com>
>>>>>>>     <mailto:serguei.spitsyn at oracle.com>
>>>>>>>     <mailto:serguei.spitsyn at oracle.com>> wrote:
>>>>>>>
>>>>>>>                 Hi Jc,
>>>>>>>
>>>>>>>                 Okay, I see your point: the change in
>>>>>>>     memAllocator.cpp
>>>>>>>                 is for performance.
>>>>>>>                 Do you have any measurements showing a performance
>>>>>>>                 difference?
>>>>>>>                 Also, do you need me to submit a mach5 test run?
>>>>>>>
>>>>>>>                 Thanks,
>>>>>>>                 Serguei
>>>>>>>
>>>>>>>
>>>>>>>                 On 11/5/18 15:14, JC Beyler wrote:
>>>>>>>>                 Hi Serguei,
>>>>>>>>
>>>>>>>>                 First off, thanks as always for looking at this
>>>>>>>>     :-) Let
>>>>>>>>                 me inline my answers:
>>>>>>>>
>>>>>>>>                 I actually "struggled" with this part of the
>>>>>>>>     change. My
>>>>>>>>                 change is correct semantically and if you care 
>>>>>>>> about
>>>>>>>>                 performance for when sampling a given thread.
>>>>>>>>                 Your change will work semantically but the
>>>>>>>>     performance
>>>>>>>>                 is the same as the global sampling.
>>>>>>>>
>>>>>>>>                 What I mean by working semantically is that 
>>>>>>>> that the
>>>>>>>>                 tests and the code will work. However, this means
>>>>>>>>     that
>>>>>>>>                 all threads will be doing the sampling work but 
>>>>>>>> when
>>>>>>>>                 the code will post the event here:
>>>>>>>>                    ->
>>>>>>>> http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/src/hotspot/share/prims/jvmtiExport.cpp.udiff.html
>>>>>>>> <http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/prims/jvmtiExport.cpp.udiff.html>
>>>>>>>> <http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/prims/jvmtiExport.cpp.udiff.html>
>>>>>>>>
>>>>>>>>                 (which is why your suggestion works, the change in
>>>>>>>>                 jvmtiExport basically ensures only the threads
>>>>>>>>                 requested are posting events)
>>>>>>>>
>>>>>>>>                 The code will check that we actually only post for
>>>>>>>>                 threads we care about. The code above ensures
>>>>>>>>     that only
>>>>>>>>                 threads that were requested to be sampling are
>>>>>>>>     the ones
>>>>>>>>                 that are sampling internally.
>>>>>>>>
>>>>>>>>                 Note: I REALLY prefer your suggestion for two
>>>>>>>>     reasons:
>>>>>>>>                   - We do not change the runtime/GC code at 
>>>>>>>> all, it
>>>>>>>>                 remains "simple"
>>>>>>>>                   - The overhead in the general case goes away
>>>>>>>>     and this
>>>>>>>>                 is a NOP for my actual use-case from a performance
>>>>>>>>                 point of view (sampling every thread)
>>>>>>>>
>>>>>>>>                 But:
>>>>>>>>                   - Then sampling per thread really is just
>>>>>>>>     telling the
>>>>>>>>                 system don't pollute the callbacks, though
>>>>>>>>     internally
>>>>>>>>                 you are doing all the work anyway.
>>>>>>>>
>>>>>>>>                 Let me know which you prefer :)
>>>>>>>>
>>>>>>>>
>>>>>>>>                       Also, do you see that enabling the sampling
>>>>>>>>     events globally still works?
>>>>>>>>
>>>>>>>>
>>>>>>>>                 Yes, otherwise HeapMonitorThreadTest.java would 
>>>>>>>> fail
>>>>>>>>                 since it checks that.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html
>>>>>>>>
>>>>>>>>                       A couple of places where err is declared as
>>>>>>>>     int instead of jvmtiError:
>>>>>>>>                          714   int err;
>>>>>>>>                     742 int err; Should not be silent in a case of
>>>>>>>>                     JVMTI error:   744   err =
>>>>>>>>     (*jvmti)->GetThreadInfo(jvmti, thread, &info);
>>>>>>>>                       745   if (err != JVMTI_ERROR_NONE) {
>>>>>>>>                     746 return;
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>                 Done and done, I added a fprintf on stderr saying
>>>>>>>>     the
>>>>>>>>                 GetThreadInfo failed and the test is ignoring the
>>>>>>>>     add
>>>>>>>>                 count.
>>>>>>>>
>>>>>>>>                 Thanks again for looking and let me know what you
>>>>>>>>     think,
>>>>>>>>                 Jc
>>>>>>>>
>>>>>>>>                 On Mon, Nov 5, 2018 at 2:25 PM
>>>>>>>>     serguei.spitsyn at oracle.com <mailto:serguei.spitsyn at oracle.com>
>>>>>>>>     <mailto:serguei.spitsyn at oracle.com>
>>>>>>>>     <mailto:serguei.spitsyn at oracle.com>
>>>>>>>>                 <serguei.spitsyn at oracle.com
>>>>>>>>     <mailto:serguei.spitsyn at oracle.com>
>>>>>>>>     <mailto:serguei.spitsyn at oracle.com>
>>>>>>>>     <mailto:serguei.spitsyn at oracle.com>> wrote:
>>>>>>>>
>>>>>>>>                     Hi Jc,
>>>>>>>>
>>>>>>>>                     It looks good in general but I have some
>>>>>>>>     comments
>>>>>>>>                     below.
>>>>>>>>
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/gc/shared/memAllocator.cpp.udiff.html
>>>>>>>>
>>>>>>>>                     +static bool
>>>>>>>>     thread_enabled_for_one_jvmti_env() {
>>>>>>>>                     + JavaThread *thread = JavaThread::current();
>>>>>>>>                     + JvmtiThreadState *state =
>>>>>>>>                     thread->jvmti_thread_state();
>>>>>>>>                     + if (state == NULL) {
>>>>>>>>                     + return false;
>>>>>>>>                     + }
>>>>>>>>                     +
>>>>>>>>                     + JvmtiEnvThreadStateIterator it(state);
>>>>>>>>                     + for (JvmtiEnvThreadState* ets = it.first();
>>>>>>>>     ets
>>>>>>>>                     != NULL; ets = it.next(ets)) {
>>>>>>>>                     + if
>>>>>>>> (ets->is_enabled(JVMTI_EVENT_SAMPLED_OBJECT_ALLOC)) {
>>>>>>>>                     + return true;
>>>>>>>>                     + }
>>>>>>>>                     + }
>>>>>>>>                     +
>>>>>>>>                     + return false;
>>>>>>>>                     +}
>>>>>>>>                     +
>>>>>>>>                       void
>>>>>>>> MemAllocator::Allocation::notify_allocation_jvmti_sampler() {
>>>>>>>>                         // support for JVMTI VMObjectAlloc event
>>>>>>>>     (no-op if not enabled)
>>>>>>>> JvmtiExport::vm_object_alloc_event_collector(obj());
>>>>>>>>                                           if
>>>>>>>> (!JvmtiExport::should_post_sampled_object_alloc()) {
>>>>>>>>                           // Sampling disabled
>>>>>>>>                           return;
>>>>>>>>                         }
>>>>>>>>                                       + // Sampling is enabled
>>>>>>>>     for at least one thread,
>>>>>>>>                     is it this one?
>>>>>>>>                     + if (!thread_enabled_for_one_jvmti_env()) {
>>>>>>>>                     + return;
>>>>>>>>                     + }
>>>>>>>>                     + I don't think you need this change as this
>>>>>>>>                     condition already does it: if
>>>>>>>> (!JvmtiExport::should_post_sampled_object_alloc()) {
>>>>>>>>                           // Sampling disabled
>>>>>>>>                           return;
>>>>>>>>                         }
>>>>>>>>
>>>>>>>>                       Please, look at the following line in the
>>>>>>>>     jvmtiEventController.cpp:
>>>>>>>> JvmtiExport::set_should_post_sampled_object_alloc((any_env_thread_enabled
>>>>>>>>     & SAMPLED_OBJECT_ALLOC_BIT) != 0);
>>>>>>>>
>>>>>>>>                       I hope, testing will prove my suggestion is
>>>>>>>>     correct.
>>>>>>>>                       Also, do you see that enabling the sampling
>>>>>>>>     events globally still works?
>>>>>>>>
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html
>>>>>>>>
>>>>>>>>                       A couple of places where err is declared as
>>>>>>>>     int instead of jvmtiError:
>>>>>>>>                          714   int err;
>>>>>>>>                     742 int err; Should not be silent in a case of
>>>>>>>>                     JVMTI error:   744   err =
>>>>>>>>     (*jvmti)->GetThreadInfo(jvmti, thread, &info);
>>>>>>>>                       745   if (err != JVMTI_ERROR_NONE) {
>>>>>>>>                     746 return;
>>>>>>>>
>>>>>>>>
>>>>>>>>                     Thanks,
>>>>>>>>                     Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>>                     On 10/26/18 10:48, JC Beyler wrote:
>>>>>>>>>                     Hi all,
>>>>>>>>>
>>>>>>>>>                     When working on the heap sampling, I had
>>>>>>>>>     promised
>>>>>>>>>                     to do the per thread event so here it is!
>>>>>>>>>
>>>>>>>>>                     Could I get a review for this:
>>>>>>>>>                     Webrev:
>>>>>>>>> http://cr.openjdk.java.net/~jcbeyler/8201655/webrev.02/
>>>>>>>>> <http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/>
>>>>>>>>> <http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/>
>>>>>>>>>                     Bug:
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8201655
>>>>>>>>>
>>>>>>>>>                     I was thinking of adding GC-dev for the
>>>>>>>>>                     memAllocator change once I get favorable
>>>>>>>>>     reviews
>>>>>>>>>                     for the rest of the change.
>>>>>>>>>
>>>>>>>>>                     I've done a bit of performance testing and
>>>>>>>>>     on the
>>>>>>>>>                     Dacapo benchmark I see no change in 
>>>>>>>>> performance
>>>>>>>>>                     when turned off (logical, any code change is
>>>>>>>>>                     behind a flag check already in place) and 
>>>>>>>>> when
>>>>>>>>>                     turned on it is comparable to the current
>>>>>>>>>     performance.
>>>>>>>>>
>>>>>>>>>                     (More information is: I see a very slight
>>>>>>>>>                     degradation if we are doing 512k sampling
>>>>>>>>>     but no
>>>>>>>>>                     degradation at 2MB).
>>>>>>>>>
>>>>>>>>>                     Thanks,
>>>>>>>>>                     Jc
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>                 --
>>>>>>>>                 Thanks,
>>>>>>>>                 Jc
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>             --
>>>>>>>             Thanks,
>>>>>>>             Jc
>>>>>>
>>>>>>
>>>>>>
>>>>>>         --
>>>>>>         Thanks,
>>>>>>         Jc
>>>>>
>>>>>
>>>>>
>>>>>     --
>>>>>     Thanks,
>>>>>     Jc
>>>>
>>
>>
>>
>> -- 
>>
>> Thanks,
>> Jc



More information about the serviceability-dev mailing list