Low-Overhead Heap Profiling

Wed Oct 25 07:30:38 UTC 2017

Hi,

325     HeapWord *tlab_old_end = thread->tlab().return end();

Should be something like:

325     HeapWord *tlab_old_end = thread->tlab().end();

Thanks, Robbin

On 2017-10-23 17:27, JC Beyler wrote:
> Dear all,
> 
> Small update this week with this new webrev:
>    - http://cr.openjdk.java.net/~rasbold/8171119/webrev.13/
>    - Incremental is here: http://cr.openjdk.java.net/~rasbold/8171119/webrev.12_13/
> 
> I patched the code changes showed by Robbin last week and I refactored 
> collectedHeap.cpp:
> http://cr.openjdk.java.net/~rasbold/8171119/webrev.12_13/src/hotspot/share/gc/shared/collectedHeap.cpp.patch
> 
> The original code became a bit too complex in my opinion with the 
> handle_heap_sampling handling too many things. So I subdivided the logic into 
> two smaller methods and moved out a bit of the logic to make it more clear. 
> Hopefully it is :)
> 
> Let me know if you have any questions/comments :)
> Jc
> 
> On Mon, Oct 16, 2017 at 9:34 AM, JC Beyler <jcbeyler at google.com 
> <mailto:jcbeyler at google.com>> wrote:
> 
>     Hi Robbin,
> 
>     That is because version 11 to 12 was only a test change. I was going to
>     write about it and say here are the webrev links:
>     Incremental:
>     http://cr.openjdk.java.net/~rasbold/8171119/webrev.11_12/
>     <http://cr.openjdk.java.net/~rasbold/8171119/webrev.11_12/>
> 
>     Full webrev:
>     http://cr.openjdk.java.net/~rasbold/8171119/webrev.12/
>     <http://cr.openjdk.java.net/~rasbold/8171119/webrev.12/>
> 
>     This change focused only on refactoring the tests to be more manageable,
>     readable, maintainable. As all tests are looking at allocations, I moved
>     common code to a java class:
>     http://cr.openjdk.java.net/~rasbold/8171119/webrev.11_12/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor.java.patch
>     <http://cr.openjdk.java.net/~rasbold/8171119/webrev.11_12/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor.java.patch>
> 
>     And then most tests call into that class to turn on/off the sampling,
>     allocate, etc. This has removed almost 500 lines of test code so I'm happy
>     about that.
> 
>     Thanks for your changes, a bit of relics of previous versions :). I've
>     already integrated them into my code and will make a new webrev end of this
>     week with a bit of refactor of the code handling the tlab slow path. I find
>     it could use a bit of refactoring to make it easier to follow so I'm going
>     to take a stab at it this week.
> 
>     Any other issues/comments?
> 
>     Thanks!
>     Jc
> 
> 
>     On Mon, Oct 16, 2017 at 8:46 AM, Robbin Ehn <robbin.ehn at oracle.com
>     <mailto:robbin.ehn at oracle.com>> wrote:
> 
>         Hi JC,
> 
>         I saw a webrev.12 in the directory, with only test changes(11->12), so I
>         took that version.
>         I had a look and tested the tests, worked fine!
> 
>         First glance at the code (looking at full v12) some minor things below,
>         mostly unused stuff.
> 
>         Thanks, Robbin
> 
>         diff -r 9047e0d726d6 src/hotspot/share/runtime/heapMonitoring.cpp
>         --- a/src/hotspot/share/runtime/heapMonitoring.cpp      Mon Oct 16
>         16:54:06 2017 +0200
>         +++ b/src/hotspot/share/runtime/heapMonitoring.cpp      Mon Oct 16
>         17:42:42 2017 +0200
>         @@ -211,2 +211,3 @@
>             void initialize(int max_storage) {
>         +    // validate max_storage to sane value ? What would 0 mean ?
>               MutexLocker mu(HeapMonitor_lock);
>         @@ -227,8 +228,4 @@
>             bool initialized() { return _initialized; }
>         -  volatile bool *initialized_address() { return &_initialized; }
> 
>            private:
>         -  // Protects the traces currently sampled (below).
>         -  volatile intptr_t _stack_storage_lock[1];
>         -
>             // The traces currently sampled.
>         @@ -313,3 +310,2 @@
>             _initialized(false) {
>         -    _stack_storage_lock[0] = 0;
>           }
>         @@ -532,13 +528,2 @@
> 
>         -// Delegate the initialization question to the underlying storage system.
>         -bool HeapMonitoring::initialized() {
>         -  return StackTraceStorage::storage()->initialized();
>         -}
>         -
>         -// Delegate the initialization question to the underlying storage system.
>         -bool *HeapMonitoring::initialized_address() {
>         -  return
>         -     
>         const_cast<bool*>(StackTraceStorage::storage()->initialized_address());
>         -}
>         -
>           void HeapMonitoring::get_live_traces(jvmtiStackTraces *traces) {
>         diff -r 9047e0d726d6 src/hotspot/share/runtime/heapMonitoring.hpp
>         --- a/src/hotspot/share/runtime/heapMonitoring.hpp      Mon Oct 16
>         16:54:06 2017 +0200
>         +++ b/src/hotspot/share/runtime/heapMonitoring.hpp      Mon Oct 16
>         17:42:42 2017 +0200
>         @@ -35,3 +35,2 @@
>             static uint64_t _rnd;
>         -  static bool _initialized;
>             static jint _monitoring_rate;
>         @@ -92,7 +91,2 @@
> 
>         -  // Is the profiler initialized and where is the address to the
>         initialized
>         -  // boolean.
>         -  static bool initialized();
>         -  static bool *initialized_address();
>         -
>             // Called when o is to be sampled from a given thread and a given size.
> 
> 
> 
>         On 10/10/2017 12:57 AM, JC Beyler wrote:
> 
>             Dear all,
> 
>             Thread-safety is back!! Here is the update webrev:
>             http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/>
> 
>             Full webrev is here:
>             http://cr.openjdk.java.net/~rasbold/8171119/webrev.11/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.11/>
> 
>             In order to really test this, I needed to add this so thought now
>             was a good time. It required a few changes here for the creation to
>             ensure correctness and safety. Now we keep the static pointer but
>             clear the data internally so on re-initialize, it will be a bit more
>             costly than before. I don't think this is a huge use-case so I did
>             not think it was a problem. I used the internal MutexLocker, I think
>             I used it well, let me know.
> 
>             I also added three tests:
> 
>             1) Stack depth test:
>             http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStackDepthTest.java.patch
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStackDepthTest.java.patch>
> 
>             This test shows that the maximum stack depth system is working.
> 
>             2) Thread safety:
>             http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorThreadTest.java.patch
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorThreadTest.java.patch>
> 
>             The test creates 24 threads and they all allocate at the same time.
>             The test then checks it does find samples from all the threads.
> 
>             3) Thread on/off safety
>             http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorThreadOnOffTest.java.patch
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorThreadOnOffTest.java.patch>
> 
>             The test creates 24 threads that all allocate a bunch of memory.
>             Then another thread turns the sampling on/off.
> 
>             Btw, both tests 2 & 3 failed without the locks.
> 
>             As I worked on this, I saw a lot of places where the tests are doing
>             very similar things, I'm going to clean up the code a bit and make a
>             HeapAllocator class that all tests can call directly. This will
>             greatly simplify the code.
> 
>             Thanks for any comments/criticisms!
>             Jc
> 
> 
>             On Mon, Oct 2, 2017 at 8:52 PM, JC Beyler <jcbeyler at google.com
>             <mailto:jcbeyler at google.com> <mailto:jcbeyler at google.com
>             <mailto:jcbeyler at google.com>>> wrote:
> 
>                  Dear all,
> 
>                  Small update to the webrev:
>             http://cr.openjdk.java.net/~rasbold/8171119/webrev.09_10/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.09_10/>
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.09_10/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.09_10/>>
> 
>                  Full webrev is here:
>             http://cr.openjdk.java.net/~rasbold/8171119/webrev.10/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.10/>
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.10/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.10/>>
> 
>                  I updated a bit of the naming, removed a TODO comment, and I
>             added a test for testing the sampling rate. I also updated the
>             maximum stack depth to 1024, there is no
>                  reason to keep it so small. I did a micro benchmark that tests
>             the overhead and it seems relatively the same.
> 
>                  I compared allocations from a stack depth of 10 and allocations
>             from a stack depth of 1024 (allocations are from the same helper
>             method in
>             http://cr.openjdk.java.net/~rasbold/8171119/webrev.10/raw_files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.10/raw_files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java>
>                 
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.10/raw_files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.10/raw_files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java>>):
>                             - For an array of 1 integer allocated in a loop;
>             stack depth 1024 vs stack depth 10: 1% slower
>                             - For an array of 200k integers allocated in a loop;
>             stack depth 1024 vs stack depth 10: 3% slower
> 
>                  So basically now moving the maximum stack depth to 1024 but we
>             only copy over the stack depths actually used.
> 
>                  For the next webrev, I will be adding a stack depth test to
>             show that it works and probably put back the mutex locking so that
>             we can see how difficult it is to keep
>                  thread safe.
> 
>                  Let me know what you think!
>                  Jc
> 
> 
> 
>                  On Mon, Sep 25, 2017 at 3:02 PM, JC Beyler <jcbeyler at google.com
>             <mailto:jcbeyler at google.com> <mailto:jcbeyler at google.com
>             <mailto:jcbeyler at google.com>>> wrote:
> 
>                      Forgot to say that for my numbers:
>                        - Not in the test are the actual numbers I got for the
>             various array sizes, I ran the program 30 times and parsed the
>             output; here are the averages and standard
>                      deviation:
>                             1000:     1.28% average; 1.13% standard deviation
>                             10000:    1.59% average; 1.25% standard deviation
>                             100000:   1.26% average; 1.26% standard deviation
> 
>                      The 1000/10000/100000 are the sizes of the arrays being
>             allocated. These are allocated 100k times and the sampling rate is
>             111 times the size of the array.
> 
>                      Thanks!
>                      Jc
> 
> 
>                      On Mon, Sep 25, 2017 at 3:01 PM, JC Beyler
>             <jcbeyler at google.com <mailto:jcbeyler at google.com>
>             <mailto:jcbeyler at google.com <mailto:jcbeyler at google.com>>> wrote:
> 
>                          Hi all,
> 
>                          After a bit of a break, I am back working on this :).
>             As before, here are two webrevs:
> 
>                          - Full change set:
>             http://cr.openjdk.java.net/~rasbold/8171119/webrev.09/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.09/>
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.09/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.09/>>
>                          - Compared to version 8:
>             http://cr.openjdk.java.net/~rasbold/8171119/webrev.08_09/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.08_09/>
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.08_09/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.08_09/>>
>                               (This version is compared to version 8 I last
>             showed but ported to the new folder hierarchy)
> 
>                          In this version I have:
>                             - Handled Thomas' comments from his email of 07/03:
>                                  - Merged the logging to be standard
>                                  - Fixed up the code a bit where asked
>                                  - Added some notes about the code not being
>             thread-safe yet
>                              - Removed additional dead code from the version
>             that modifies interpreter/c1/c2
>                              - Fixed compiler issues so that it compiles with
>             --disable-precompiled-header
>                                   - Tested with ./configure
>             --with-boot-jdk=<jdk8> --with-debug-level=slowdebug
>             --disable-precompiled-headers
> 
>                          Additionally, I added a test to check the sanity of the
>             sampler: HeapMonitorStatCorrectnessTest
>                         
>             (http://cr.openjdk.java.net/~rasbold/8171119/webrev.08_09/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatCorrectnessTest.java.patch
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.08_09/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatCorrectnessTest.java.patch>
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.08_09/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatCorrectnessTest.java.patch
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.08_09/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatCorrectnessTest.java.patch>>)
>                              - This allocates a number of arrays and checks that
>             we obtain the number of samples we want with an accepted error of
>             5%. I tested it 100 times and it
>                          passed everytime, I can test more if wanted
>                              - Not in the test are the actual numbers I got for
>             the various array sizes, I ran the program 30 times and parsed the
>             output; here are the averages and
>                          standard deviation:
>                                 1000:     1.28% average; 1.13% standard deviation
>                                 10000:    1.59% average; 1.25% standard deviation
>                                 100000:   1.26% average; 1.26% standard deviation
> 
>                          What this means is that we were always at about 1~2% of
>             the number of samples the test expected.
> 
>                          Let me know what you think,
>                          Jc
> 
>                          On Wed, Jul 5, 2017 at 9:31 PM, JC Beyler
>             <jcbeyler at google.com <mailto:jcbeyler at google.com>
>             <mailto:jcbeyler at google.com <mailto:jcbeyler at google.com>>> wrote:
> 
>                              Hi all,
> 
>                              I apologize, I have not yet handled your remarks
>             but thought this new webrev would also be useful to see and comment
>             on perhaps.
> 
>                              Here is the latest webrev, it is generated slightly
>             different than the others since now I'm using webrev.ksh without the
>             -N option:
>             http://cr.openjdk.java.net/~rasbold/8171119/webrev.08/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.08/>
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.08/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.08/>>
> 
>                              And the webrev.07 to webrev.08 diff is here:
>             http://cr.openjdk.java.net/~rasbold/8171119/webrev.07_08/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07_08/>
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07_08/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07_08/>>
> 
>                              (Let me know if it works well)
> 
>                              It's a small change between versions but it:
>                                 - provides a fix that makes the average sample
>             rate correct (more on that below).
>                                 - fixes the code to actually have it play nicely
>             with the fast tlab refill
>                                 - cleaned up a bit the JVMTI text and now use
>             jvmtiFrameInfo
>                              - moved the capability to be onload solo
> 
>                              With this webrev, I've done a small study of the
>             random number generator we use here for the sampling rate. I took a
>             small program and it can be simplified to:
> 
>                              for (outer loop)
>                              for (inner loop)
>                              int[] tmp = new int[arraySize];
> 
>                              - I've fixed the outer and inner loops to being 800
>             for this experiment, meaning we allocate 640000 times an array of a
>             given array size.
> 
>                              - Each program provides the average sample size
>             used for the whole execution
> 
>                              - Then, I ran each variation 30 times and then
>             calculated the average of the average sample size used for various
>             array sizes. I selected the array size to
>                              be one of the following: 1, 10, 100, 1000.
> 
>                              - When compared to 512kb, the average sample size
>             of 30 runs:
>                              1: 4.62% of error
>                              10: 3.09% of error
>                              100: 0.36% of error
>                              1000: 0.1% of error
>                              10000: 0.03% of error
> 
>                              What it shows is that, depending on the number of
>             samples, the average does become better. This is because with an
>             allocation of 1 element per array, it
>                              will take longer to hit one of the thresholds. This
>             is seen by looking at the sample count statistic I put in. For the
>             same number of iterations (800 *
>                              800), the different array sizes provoke:
>                              1: 62 samples
>                              10: 125 samples
>                              100: 788 samples
>                              1000: 6166 samples
>                              10000: 57721 samples
> 
>                              And of course, the more samples you have, the more
>             sample rates you pick, which means that your average gets closer
>             using that math.
> 
>                              Thanks,
>                              Jc
> 
>                              On Thu, Jun 29, 2017 at 10:01 PM, JC Beyler
>             <jcbeyler at google.com <mailto:jcbeyler at google.com>
>             <mailto:jcbeyler at google.com <mailto:jcbeyler at google.com>>> wrote:
> 
>                                  Thanks Robbin,
> 
>                                  This seems to have worked. When I have the next
>             webrev ready, we will find out but I'm fairly confident it will work!
> 
>                                  Thanks agian!
>                                  Jc
> 
>                                  On Wed, Jun 28, 2017 at 11:46 PM, Robbin Ehn
>             <robbin.ehn at oracle.com <mailto:robbin.ehn at oracle.com>
>             <mailto:robbin.ehn at oracle.com <mailto:robbin.ehn at oracle.com>>> wrote:
> 
>                                      Hi JC,
> 
>                                      On 06/29/2017 12:15 AM, JC Beyler wrote:
> 
>                                          B) Incremental changes
> 
> 
>                                      I guess the most common work flow here is
>             using mq :
>                                      hg qnew fix_v1
>                                      edit files
>                                      hg qrefresh
>                                      hg qnew fix_v2
>                                      edit files
>                                      hg qrefresh
> 
>                                      if you do hg log you will see 2 commits
> 
>                                      webrev.ksh -r -2 -o my_inc_v1_v2
>                                      webrev.ksh -o my_full_v2
> 
> 
>                                      In  your .hgrc you might need:
>                                      [extensions]
>                                      mq =
> 
>                                      /Robbin
> 
> 
>                                          Again another newbiew question here...
> 
>                                          For showing the incremental changes, is
>             there a link that explains how to do that? I apologize for my newbie
>             questions all the time :)
> 
>                                          Right now, I do:
> 
>                                             ksh ../webrev.ksh -m -N
> 
>                                          That generates a webrev.zip and send it
>             to Chuck Rasbold. He then uploads it to a new webrev.
> 
>                                          I tried commiting my change and adding
>             a small change. Then if I just do ksh ../webrev.ksh without any
>             options, it seems to produce a similar
>                                          page but now with only the changes I
>             had (so the 06-07 comparison you were talking about) and a changeset
>             that has it all. I imagine that is
>                                          what you meant.
> 
>                                          Which means that my workflow would become:
> 
>                                          1) Make changes
>                                          2) Make a webrev without any options to
>             show just the differences with the tip
>                                          3) Amend my changes to my local commit
>             so that I have it done with
>                                          4) Go to 1
> 
>                                          Does that seem correct to you?
> 
>                                          Note that when I do this, I only see
>             the full change of a file in the full change set (Side note here:
>             now the page says change set and not
>                                          patch, which is maybe why Serguei was
>             having issues?).
> 
>                                          Thanks!
>                                          Jc
> 
> 
> 
>                                          On Wed, Jun 28, 2017 at 1:12 AM, Robbin
>             Ehn <robbin.ehn at oracle.com <mailto:robbin.ehn at oracle.com>
>             <mailto:robbin.ehn at oracle.com <mailto:robbin.ehn at oracle.com>>
>             <mailto:robbin.ehn at oracle.com <mailto:robbin.ehn at oracle.com>
>                                          <mailto:robbin.ehn at oracle.com
>             <mailto:robbin.ehn at oracle.com>>>> wrote:
> 
>                                               Hi,
> 
>                                               On 06/28/2017 12:04 AM, JC Beyler
>             wrote:
> 
>                                                   Dear Thomas et al,
> 
>                                                   Here is the newest webrev:
>             http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/>
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/>>
>                                         
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/>
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/>>>
> 
> 
> 
>                                               You have some more bits to in
>             there but generally this looks good and really nice with more tests.
>                                               I'll do and deep dive and re-test
>             this when I get back from my long vacation with whatever patch
>             version you have then.
> 
>                                               Also I think it's time you provide
>             incremental (v06->07 changes) as well as complete change-sets.
> 
>                                               Thanks, Robbin
> 
> 
> 
> 
>                                                   Thomas, I "think" I have
>             answered all your remarks. The summary is:
> 
>                                                   - The statistic system is up
>             and provides insight on what the heap sampler is doing
>                                                        - I've noticed that,
>             though the sampling rate is at the right mean, we are missing some
>             samples, I have not yet tracked out why
>                                          (details below)
> 
>                                                   - I've run a tiny benchmark
>             that is the worse case: it is a very tight loop and allocated a
>             small array
>                                                        - In this case, I see no
>             overhead when the system is off so that is a good start :)
>                                                        - I see right now a high
>             overhead in this case when sampling is on. This is not a really too
>             surprising but I'm going to see if
>                                          this is consistent with our
>                                                   internal implementation. The
>             benchmark is really allocation stressful so I'm not too surprised
>             but I want to do the due diligence.
> 
>                                                      - The statistic system up
>             is up and I have a new test
>             http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatTest.java.patch
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatTest.java.patch>
>                                         
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatTest.java.patch
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatTest.java.patch>>
>                                                  
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatTest.java.patch
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatTest.java.patch>
>                                         
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatTest.java.patch
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatTest.java.patch>>>
>                                                         - I did a bit of a study
>             about the random generator here, more details are below but
>             basically it seems to work well
> 
>                                                      - I added a capability but
>             since this is the first time doing this, I was not sure I did it right
>                                                        - I did add a test though
>             for it and the test seems to do what I expect (all methods are
>             failing with the
>                                          JVMTI_ERROR_MUST_POSSESS_CAPABILITY error).
>                                                            -
>             http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorNoCapabilityTest.java.patch
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorNoCapabilityTest.java.patch>
>                                         
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorNoCapabilityTest.java.patch
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorNoCapabilityTest.java.patch>>
>                                                                             
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorNoCapabilityTest.java.patch
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorNoCapabilityTest.java.patch>
>                                         
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorNoCapabilityTest.java.patch
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorNoCapabilityTest.java.patch>>>
> 
>                                                      - I still need to figure
>             out what to do about the multi-agent vs single-agent issue
> 
>                                                      - As far as measurements,
>             it seems I still need to look at:
>                                                        - Why we do the 20 random
>             calls first, are they necessary?
>                                                        - Look at the mean of the
>             sampling rate that the random generator does and also what is
>             actually sampled
>                                                        - What is the overhead in
>             terms of memory/performance when on?
> 
>                                                   I have inlined my answers, I
>             think I got them all in the new webrev, let me know your thoughts.
> 
>                                                   Thanks again!
>                                                   Jc
> 
> 
>                                                   On Fri, Jun 23, 2017 at 3:52
>             AM, Thomas Schatzl <thomas.schatzl at oracle.com
>             <mailto:thomas.schatzl at oracle.com> <mailto:thomas.schatzl at oracle.com
>             <mailto:thomas.schatzl at oracle.com>>
>                                          <mailto:thomas.schatzl at oracle.com
>             <mailto:thomas.schatzl at oracle.com> <mailto:thomas.schatzl at oracle.com
>             <mailto:thomas.schatzl at oracle.com>>>
>             <mailto:thomas.schatzl at oracle.com <mailto:thomas.schatzl at oracle.com>
>             <mailto:thomas.schatzl at oracle.com <mailto:thomas.schatzl at oracle.com>>
> 
>                                                  
>             <mailto:thomas.schatzl at oracle.com <mailto:thomas.schatzl at oracle.com>
>             <mailto:thomas.schatzl at oracle.com
>             <mailto:thomas.schatzl at oracle.com>>>>> wrote:
> 
>                                                        Hi,
> 
>                                                        On Wed, 2017-06-21 at
>             13:45 -0700, JC Beyler wrote:
>                                                        > Hi all,
>                                                        >
>                                                        > First off: Thanks again
>             to Robbin and Thomas for their reviews :)
>                                                        >
>                                                        > Next, I've uploaded a
>             new webrev:
>                                                        >
>             http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/>
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/>>
>                                         
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/>
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/>>>
>                                                  
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/>
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/>>
>                                         
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/>
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/>>>>
> 
>                                                        >
>                                                        > Here is an update:
>                                                        >
>                                                        > - @Robbin, I forgot to
>             say that yes I need to look at implementing
>                                                        > this for the other
>             architectures and testing it before it is all
>                                                        > ready to go. Is it
>             common to have it working on all possible
>                                                        > combinations or is
>             there a subset that I should be doing first and we
>                                                        > can do the others later?
>                                                        > - I've tested
>             slowdebug, built and ran the JTreg tests I wrote with
>                                                        > slowdebug and fixed a
>             few more issues
>                                                        > - I've refactored a bit
>             of the code following Thomas' comments
>                                                        >    - I think I've
>             handled all the comments from Thomas (I put
>                                                        > comments inline below
>             for the specifics)
> 
>                                                        Thanks for handling all
>             those.
> 
>                                                        > - Following Thomas'
>             comments on statistics, I want to add some
>                                                        > quality assurance tests
>             and find that the easiest way would be to
>                                                        > have a few counters of
>             what is happening in the sampler and expose
>                                                        > that to the user.
>                                                        >    - I'll be adding
>             that in the next version if no one sees any
>                                                        > objections to that.
>                                                        >    - This will allow me
>             to add a sanity test in JTreg about number of
>                                                        > samples and average of
>             sampling rate
>                                                        >
>                                                        > @Thomas: I had a few
>             questions that I inlined below but I will
>                                                        > summarize the "bigger
>             ones" here:
>                                                        >    - You mentioned
>             constants are not using the right conventions, I
>                                                        > looked around and
>             didn't see any convention except normal naming then
>                                                        > for static constants.
>             Is that right?
> 
>                                                        I looked through
>             https://wiki.openjdk.java.net/display/HotSpot/StyleGui
>             <https://wiki.openjdk.java.net/display/HotSpot/StyleGui>
>             <https://wiki.openjdk.java.net/display/HotSpot/StyleGui
>             <https://wiki.openjdk.java.net/display/HotSpot/StyleGui>>
>                                         
>             <https://wiki.openjdk.java.net/display/HotSpot/StyleGui
>             <https://wiki.openjdk.java.net/display/HotSpot/StyleGui>
>             <https://wiki.openjdk.java.net/display/HotSpot/StyleGui
>             <https://wiki.openjdk.java.net/display/HotSpot/StyleGui>>>
>                                                  
>             <https://wiki.openjdk.java.net/display/HotSpot/StyleGui
>             <https://wiki.openjdk.java.net/display/HotSpot/StyleGui>
>             <https://wiki.openjdk.java.net/display/HotSpot/StyleGui
>             <https://wiki.openjdk.java.net/display/HotSpot/StyleGui>>
>                                         
>             <https://wiki.openjdk.java.net/display/HotSpot/StyleGui
>             <https://wiki.openjdk.java.net/display/HotSpot/StyleGui>
>             <https://wiki.openjdk.java.net/display/HotSpot/StyleGui
>             <https://wiki.openjdk.java.net/display/HotSpot/StyleGui>>>>
>                                                        de and the rule is to
>             "follow an existing pattern and must have a
>                                                        distinct appearance from
>             other names". Which does not help a lot I
>                                                        guess :/ The GC team
>             started using upper camel case, e.g.
>                                                        SomeOtherConstant, but
>             very likely this is probably not applied
>                                                        consistently throughout.
>             So I am fine with not adding another style
>                                                        (like kMaxStackDepth with
>             the "k" in front with some unknown meaning)
>                                                        is fine.
> 
>                                                        (Chances are you will
>             find that style somewhere used anyway too,
>                                                        apologies if so :/)
> 
> 
>                                                   Thanks for that link, now I
>             know where to look. I used the upper camel case in my code as well
>             then :) I should have gotten them all.
> 
> 
>                                                         > PS: I've also inlined
>             my answers to Thomas below:
>                                                         >
>                                                         > On Tue, Jun 13, 2017
>             at 8:03 AM, Thomas Schatzl <thomas.schatzl at oracl
>                                                         > e.com <http://e.com>
>             <http://e.com> <http://e.com> <http://e.com>> wrote:
>                                                         > > Hi all,
>                                                         > >
>                                                         > > On Mon, 2017-06-12
>             at 11:11 -0700, JC Beyler wrote:
>                                                         > > > Dear all,
>                                                         > > >
>                                                         > > > I've continued
>             working on this and have done the following
>                                                         > > webrev:
>                                                         > > >
>             http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/>
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/>>
>                                         
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/>
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/>>>
>                                                  
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/>
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/>>
>                                         
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/>
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/
>             <http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/>>>>
> 
>                                                         > >
>                                                         > > [...]
>                                                         > > > Things I still
>             need to do:
>                                                         > > >    - Have to fix
>             that TLAB case for the FastTLABRefill
>                                                         > > >    - Have to start
>             looking at the data to see that it is
>                                                         > > consistent and does
>             gather the right samples, right frequency, etc.
>                                                         > > >    - Have to check
>             the GC elements and what that produces
>                                                         > > >    - Run a
>             slowdebug run and ensure I fixed all those issues you
>                                                         > > saw > Robbin
>                                                         > > >
>                                                         > > > Thanks for looking
>             at the webrev and have a great week!
>                                                         > >
>                                                         > >   scratching a bit
>             on the surface of this change, so apologies for
>                                                         > > rather shallow comments:
>                                                         > >
>                                                         > > -
>             macroAssembler_x86.cpp:5604: while this is compiler code, and I
>                                                         > > am not sure this is
>             final, please avoid littering the code with
>                                                         > > TODO remarks :) They
>             tend to be candidates for later wtf moments
>                                                         > > only.
>                                                         > >
>                                                         > > Just file a CR for that.
>                                                         > >
>                                                         > Newcomer question:
>             what is a CR and not sure I have the rights to do
>                                                         > that yet ? :)
> 
>                                                        Apologies. CR is a change
>             request, this suggests to file a bug in the
>                                                        bug tracker. And you are
>             right, you can't just create a new account in
>                                                        the OpenJDK JIRA
>             yourselves. :(
> 
> 
>                                                   Ok good to know, I'll continue
>             with my own todo list but I'll work hard on not letting it slip in
>             the webrevs anymore :)
> 
> 
>                                                        I was mostly referring to
>             the "... but it is a TODO" part of that
>                                                        comment in
>             macroassembler_x86.cpp. Comments about the why of the code
>                                                        are appreciated.
> 
>                                                        [Note that I now
>             understand that this is to some degree still work in
>                                                        progress. As long as the
>             final changeset does no contain TODO's I am
>                                                        fine (and it's not a hard
>             objection, rather their use in "final" code
>                                                        is typically limited in
>             my experience)]
> 
>                                                        5603   // Currently, if
>             this happens, just set back the actual end to
>                                                        where it was.
>                                                        5604   // We miss a
>             chance to sample here.
> 
>                                                        Would be okay, if
>             explaining "this" and the "why" of missing a chance
>                                                        to sample here would be best.
> 
>                                                        Like maybe:
> 
>                                                        // If we needed to refill
>             TLABs, just set the actual end point to
>                                                        // the end of the TLAB
>             again. We do not sample here although we could.
> 
>                                                   Done with your comment, it
>             works well in my mind.
> 
>                                                        I am not sure whether
>             "miss a chance to sample" meant "we could, but
>                                                        consciously don't because
>             it's not that useful" or "it would be
>                                                        necessary but don't
>             because it's too complicated to do.".
> 
>                                                        Looking at the original
>             comment once more, I am also not sure if that
>                                                        comment shouldn't
>             referring to the "end" variable (not actual_end)
>                                                        because that's the
>             variable that is responsible for taking the sampling
>                                                        path? (Going from the
>             member description of ThreadLocalAllocBuffer).
> 
> 
>                                                   I've moved this code and it no
>             longer shows up here but the rationale and answer was:
> 
>                                                   So.. Yes, end is the variable
>             provoking the sampling. Actual end is the actual end of the TLAB.
> 
>                                                   What was happening here is
>             that the code is resetting _end to point towards the end of the new
>             TLAB. Because, we now have the end for
>                                          sampling and _actual_end for
>                                                   the actual end, we need to
>             update the actual_end as well.
> 
>                                                   Normally, were we to do the
>             real work here, we would calculate the (end - start) offset, then do:
> 
>                                                   - Set the new end to : start +
>             (old_end - old_start)
>                                                   - Set the actual end like we
>             do here now where it because it is the actual end.
> 
>                                                   Why is this not done here now
>             anymore?
>                                                       - I was still debating
>             which path to take:
>                                                          - Do it in the fast
>             refill code, it has its perks:
>                                                              - In a world where
>             fast refills are happening all the time or a lot, we can augment
>             there the code to do the sampling
>                                                          - Remember what we had
>             as an end before leaving the slowpath and check on return
>                                                              - This is what I'm
>             doing now, it removes the need to go fix up all fast refill paths
>             but if you remain in fast refill paths,
>                                          you won't get sampling. I
>                                                   have to think of the
>             consequences of that, maybe a future change later on?
>                                                                 - I have the
>             statistics now so I'm going to study that
>                                                                    -> By the
>             way, though my statistics are showing I'm missing some samples, if I
>             turn off FastTlabRefill, it is the same
>                                          loss so for now, it seems
>                                                   this does not occur in my
>             simple test.
> 
> 
> 
>                                                        But maybe I am only
>             confused and it's best to just leave the comment
>                                                        away. :)
> 
>                                                        Thinking about it some
>             more, doesn't this not-sampling in this case
>                                                        mean that sampling does
>             not work in any collector that does inline TLAB
>                                                        allocation at the moment?
>             (Or is inline TLAB alloc automatically
>                                                        disabled with sampling
>             somehow?)
> 
>                                                        That would indeed be a
>             bigger TODO then :)
> 
> 
>                                                   Agreed, this remark made me
>             think that perhaps as a first step the new way of doing it is better
>             but I did have to:
>                                                      - Remove the const of the
>             ThreadLocalBuffer remaining and hard_end methods
>                                                      - Move hard_end out of the
>             header file to have a bit more logic there
> 
>                                                   Please let me know what you
>             think of that and if you prefer it this way or changing the fast
>             refills. (I prefer this way now because it
>                                          is more incremental).
> 
> 
>                                                        > > - calling
>             HeapMonitoring::do_weak_oops() (which should probably be
>                                                        > > called weak_oops_do()
>             like other similar methods) only if string
>                                                        > > deduplication is
>             enabled (in g1CollectedHeap.cpp:4511) seems wrong.
>                                                        >
>                                                        > The call should be at
>             least around 6 lines up outside the if.
>                                                        >
>                                                        > Preferentially in a
>             method like process_weak_jni_handles(), including
>                                                        > additional logging. (No
>             new (G1) gc phase without minimal logging
>                                                        > :)).
>                                                        > Done but really not
>             sure because:
>                                                        >
>                                                        > I put for logging:
>                                                        >   log_develop_trace(gc,
>             freelist)("G1ConcRegionFreeing [other] : heap
>                                                        > monitoring");
> 
>                                                        I would think that "gc,
>             ref" would be more appropriate log tags for
>                                                        this similar to jni handles.
>                                                        (I am als not sure what
>             weak reference handling has to do with
>                                                        G1ConcRegionFreeing, so I
>             am a bit puzzled)
> 
> 
>                                                   I was not sure what to put for
>             the tags or really as the message. I cleaned it up a bit now to:
>                                                        log_develop_trace(gc,
>             ref)("HeapSampling [other] : heap monitoring processing");
> 
> 
> 
>                                                        > Since weak_jni_handles
>             didn't have logging for me to be inspired
>                                                        > from, I did that but
>             unconvinced this is what should be done.
> 
>                                                        The JNI handle processing
>             does have logging, but only in
>                                                      
>               ReferenceProcessor::process_discovered_references(). In
>                                                      
>               process_weak_jni_handles() only overall time is measured (in a G1
>                                                        specific way, since only
>             G1 supports disabling reference procesing) :/
> 
>                                                        The code in
>             ReferenceProcessor prints both time taken
>                                                      
>               referenceProcessor.cpp:254, as well as the count, but strangely
>             only in
>                                                        debug VMs.
> 
>                                                        I have no idea why this
>             logging is that unimportant to only print that
>                                                        in a debug VM. However
>             there are reviews out for changing this area a
>                                                        bit, so it might be
>             useful to wait for that (JDK-8173335).
> 
> 
>                                                   I cleaned it up a bit anyway
>             and now it returns the count of objects that are in the system.
> 
> 
>                                                        > > - the change doubles
>             the size of
>                                                        > >
>             CollectedHeap::allocate_from_tlab_slow() above the "small and nice"
>                                                        > > threshold. Maybe it
>             could be refactored a bit.
>                                                        > Done I think, it looks
>             better to me :).
> 
>                                                        In
>             ThreadLocalAllocBuffer::handle_sample() I think the
>                                                      
>               set_back_actual_end()/pick_next_sample() calls could be hoisted out of
>                                                        the "if" :)
> 
> 
>                                                   Done!
> 
> 
>                                                        > > -
>             referenceProcessor.cpp:261: the change should add logging about
>                                                        > > the number of
>             references encountered, maybe after the corresponding
>                                                        > > "JNI weak reference
>             count" log message.
>                                                        > Just to double check,
>             are you saying that you'd like to have the heap
>                                                        > sampler to keep in
>             store how many sampled objects were encountered in
>                                                        > the
>             HeapMonitoring::weak_oops_do?
>                                                        >    - Would a return of
>             the method with the number of handled
>                                                        > references and logging
>             that work?
> 
>                                                        Yes, it's fine if
>             HeapMonitoring::weak_oops_do() only returned the
>                                                        number of processed weak
>             oops.
> 
> 
>                                                   Done also (but I admit I have
>             not tested the output yet) :)
> 
> 
>                                                        >    - Additionally,
>             would you prefer it in a separate block with its
>                                                        > GCTraceTime?
> 
>                                                        Yes. Both kinds of
>             information is interesting: while the time taken is
>                                                        typically more important,
>             the next question would be why, and the
>                                                        number of references
>             typically goes a long way there.
> 
>                                                        See above though, it is
>             probably best to wait a bit.
> 
> 
>                                                   Agreed that I "could" wait
>             but, if it's ok, I'll just refactor/remove this when we get closer
>             to something final. Either, JDK-8173335
>                                                   has gone in and I will notice
>             it now or it will soon and I can change it then.
> 
> 
>                                                        > > -
>             threadLocalAllocBuffer.cpp:331: one more "TODO"
>                                                        > Removed it and added it
>             to my personal todos to look at.
>                                                        >      > >
>                                                        > > -
>             threadLocalAllocBuffer.hpp: ThreadLocalAllocBuffer class
>                                                        > > documentation should
>             be updated about the sampling additions. I
>                                                        > > would have no clue
>             what the difference between "actual_end" and
>                                                        > > "end" would be from
>             the given information.
>                                                        > If you are talking
>             about the comments in this file, I made them more
>                                                        > clear I hope in the new
>             webrev. If it was somewhere else, let me know
>                                                        > where to change.
> 
> 
>