Low-Overhead Heap Profiling

Mon Jun 12 18:11:45 UTC 2017

Dear all,

I've continued working on this and have done the following webrev:
http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/

What it does:
   - Implement the turn on/off system for the TLAB implementation
      - Basically you can turn the system on/off using the JVMTI API
(StartHeapProfiling and StopHeapProfiling here
http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/src/share/vm/prims/jvmti.xml.patch
)
         - I've currently implemented that the system resets the profiling
data at the StartHeapProfiling, this allows you to start profiling, stop
profiling, and then the user can query what happened during profiling as a
post-processing step :)

   - I've currently, for sake of simplicity and Robbin hinted it would be
better for now, removed the mutex code for the data but think that will
have to come back in a next patch or in this one at some point

   - I've also really worked on the testing code to make it more expandable
in this case

http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/test/serviceability/jvmti/HeapMonitor/libHeapMonitor.c.patch
is
now a bit more generic and allows to turn on/off the sampling; it allows to
query the existence or not of frames passed from Java world to JNI, which
allows the test to define what should be seen and have the code in one
place.
          - That is done using the helper class
http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/test/serviceability/jvmti/HeapMonitor/MyPackage/Frame.java.patch
             - Basically each test can now provide an array of Frames that
the native library can check the internal data for a match. This allows to
have various tests have their own signatures, etc.

          - This has allowed me to add a nice test here to test the wipe
out of the data:

http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorOnOffTest.java.patch

  - Finally, I've done initial overhead testing and, though my data was a
bit noisy, I saw no real overhead using the Tlab approach while off. I will
try to get better data and more importantly, more stable data when turning
it on.

Things I still need to do:
   - Have to fix that TLAB case for the FastTLABRefill
   - Have to start looking at the data to see that it is consistent and
does gather the right samples, right frequency, etc.
   - Have to check the GC elements and what that produces
   - Run a slowdebug run and ensure I fixed all those issues you saw Robbin

Thanks for looking at the webrev and have a great week!
Jc

On Fri, Jun 2, 2017 at 8:57 AM, JC Beyler <jcbeyler at google.com> wrote:

> Dear all,
>
> Thanks Robbin for the comments. I have left the MuxLocker for now and will
> look at removing it or as a future enhancement as you say. I did make the
> class extends and added a TODO for myself to test this in slowdebug.
>
> I have also put a new webrev up that is still a work in progress but
> should allow us to talk about TLAB vs C1/C2 modifications.
>
> TLAB implementation: http://cr.openjdk.java.net/~
> rasbold/8171119/webrev.04/
> C1/C2 implementation: http://cr.openjdk.java.net/~
> rasbold/8171119/webrev.03/
>
> What this webrev has is:
>    - I have put in a TLAB implementation, it is a WIP, and I have not yet
> done the qualitative/quantitative study on it vs the implementation using
> compilation changes but the big parts are in place
>        - Caveats:
>            - There is a TODO: http://cr.openjdk.java.n
> et/~rasbold/8171119/webrev.04/src/cpu/x86/vm/macroAssembler_x86.cpp.patch
>                - I have not fixed the calculation in the case of a
> FastTLABRefill case
>            - This is always on right now, there is no way to turn it off,
> that's also a TODO to be directed by the JVMTI API
>
>     - I also have circumvented the AsyncGetCallTrace using the snippet of
> code you showed Robbin, it works for here/now
>           - But we might have to revisit this one day because it then does
> not try to get some of the native stacks and jumps directly to the Java
> stacks (I see cases where this could be an issue)
>           - However, this has cleaned up quite a bit of the code and I
> have removed all mention of ASGCT and its structures now and use directly
> the JVMTI structures
>
>    - GC is handled now, I have not yet done the qualitative/quantitative
> study on it but the big parts are in place
>
>    - Due to the way the TLAB is called, the stack walker is now correct
> and produces the right stacks it seems (this is a bold sentence from my ONE
> JTREG test :))
>
> Final notes on this webrev:
>    - Have to fix that TLAB case for the FastTLABRefill
>    - Implement the turn on/off system for the TLAB implementation
>    - Have to start looking at the data to see that it is consistent and
> does gather the right samples, right frequency, etc.
>    - Have to check the GC elements and what that produces
>    - Run a slowdebug run and ensure I fixed all those issues you saw Robbin
>
> As always, your comments and feedback are greatly appreciated! Happy
> Friday!
> Jc
>
>
> On Tue, May 30, 2017 at 5:33 AM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
>
>> Hi Jc,
>>
>> On 05/22/2017 08:47 PM, JC Beyler wrote:
>>
>>> Dear all,
>>>
>>> I have a new webrev up:
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.03/
>>>
>>
>> I liked this!
>>
>> Two small things:
>>
>> heapMonitoring.hpp
>> class HeapMonitoring should extend AllStatic
>>
>> heapMonitoring.cpp
>> class MuxLocker should extend StackObj
>> But I think you should skip MuxLocker or push it separate generic
>> enhancement.
>>
>> Great with the jtreg test, thanks alot!
>>
>>
>>> This webrev has, I hope, fixed a lot of the comments from Robbin:
>>>    - The casts normally are all C++ style
>>>    - Moved this to jdk10-hs
>>>       - I have not tested slowdebug yet, hopefully it does not break
>>> there
>>>    - Added the garbage collection system:
>>>       - Now live sampled allocations are tracked throughout their
>>> lifetime
>>>          - When GC happens, it moves the sampled allocation information
>>> to two lists: recent and frequent GC lists
>>>              - Those lists use the array system that the live objects
>>> were using before but have different re-use strategies
>>>       - Added the JVMTI API for them via a GetFrequentGarbageTraces and
>>> GetGarbageTraces
>>> - Both use the same JVMTI structures
>>>       - Added the calls to them for the test, though I've kept that test
>>> simple for now:
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.03/raw_fi
>>> les/new/test/serviceability/jvmti/HeapMonitor/libHeapMonitor.c
>>>          - As I write this, I notice my webrev is missing a final change
>>> I made to the test that calls the same ReleaseTraces to each
>>> live/garbage/frequent structure. This is updated in my local repo and will
>>> get in the next webrev.
>>>
>>> Next steps for this work are:
>>>     - Putting the TLAB implementation (yes not forgotten ;-))
>>>     - Adding more testing and separate the current test system to check
>>> things a bit more thoroughly
>>>     - Have not tried to circumvent AsyncGetCallTrace yet
>>>     - Still have to double check the stack walker a bit more
>>>
>>
>> Looking forward to this.
>>
>> Could someone from compiler take a look please?
>>
>> Thanks!
>>
>> /Robbin
>>
>>
>>> Happy webrev perusal!
>>> Jc
>>>
>>>
>>> On Tue, May 16, 2017 at 5:20 AM, Robbin Ehn <robbin.ehn at oracle.com
>>> <mailto:robbin.ehn at oracle.com>> wrote:
>>>
>>>     Just a few answers,
>>>
>>>     On 05/15/2017 06:48 PM, JC Beyler wrote:
>>>
>>>         Dear all,
>>>
>>>         I've updated the webrev to:
>>>         http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/ <
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/>
>>>         <http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/ <
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/>>
>>>
>>>
>>>     I'll look at this later, thanks!
>>>
>>>
>>>         Robbin,
>>>         I believe I have addressed most of your items with webrev 02:
>>>             - I added a JTreg test to show how it works:
>>>         http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/raw_fi
>>> les/new/test/serviceability/jvmti/HeapMonitor/libHeapMonitor.c
>>>         <http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/raw_f
>>> iles/new/test/serviceability/jvmti/HeapMonitor/libHeapMonitor.c>
>>>         <http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/raw_f
>>> iles/new/test/serviceability/jvmti/HeapMonitor/libHeapMonitor.c
>>>         <http://cr.openjdk.java.net/~rasbold/8171119/webrev.02/raw_f
>>> iles/new/test/serviceability/jvmti/HeapMonitor/libHeapMonitor.c>>
>>>            - I've modified the code to use its own data structures both
>>> internally and externally, this will make it easier to move out of
>>> AsyncGetCallTrace as we move
>>>         forward, that is still on my TODOs
>>>            - I cleaned up the JVMTI API by passing a structure that
>>> handles the num_traces and put in a ReleaseTraces as well
>>>            - I cleaned up other issues as well.
>>>
>>>         However, I have three questions, which are probably because I'm
>>> new in this community:
>>>            1) My previous webrevs were based off of JDK9 by mistake.
>>> When I took JDK10 via : hg clone http://hg.openjdk.java.net/jdk10/jdk10
>>>         <http://hg.openjdk.java.net/jdk10/jdk10> <
>>> http://hg.openjdk.java.net/jdk10/jdk10 <http://hg.openjdk.java.net/jd
>>> k10/jdk10>> jdk10
>>>                - I don't see code compatible with what you were showing
>>> (ie your patches don't make sense for that code base; ex: klass is still
>>> accessed via klass() for
>>>         example in collectedHeap.inline.hpp)
>>>                - Would you know what is the right hg clone command so we
>>> are working on the same code base?
>>>
>>>
>>>     We use jdk10-hs, e.g.
>>>     hg tclone http://hg.openjdk.java.net/jdk10/hs <
>>> http://hg.openjdk.java.net/jdk10/hs> 10-hs
>>>
>>>     There is sporadic big merges going from jdk9->jdk10->jdk10-hs and
>>> jdk10-hs->jdk10, so 10 is moving...
>>>
>>>
>>>            2) You mentioned I was using os::malloc, new,
>>> NEW_C_HEAP_ARRAY; I cleaned out the os::malloc but which of the new vs
>>> NEW_C_HEAP_ARRAY should I use. It might be
>>>         that I don't understand when one uses one or the other but I see
>>> both used around the code base?
>>>              - Is it that new is to be used for anything internal and
>>> NEW_C_HEAP_ARRAY anything provided to the JVMTI users outside of the JVM?
>>>
>>>
>>>     We overload new operator when you extend correct base class, e.g.
>>> CHeapObj<mtInternal> so use 'new'
>>>     But for arrays you will need the macro NEW_C_HEAP_ARRAY.
>>>
>>>
>>>            3) Casts: same kind question: which should I use. The code
>>> was using a bit of everything, I'll refactor it entirely but I was not
>>> clear if I should go to C casts
>>>         or C++ casts as I see both in the codebase. What is the
>>> convention I should use?
>>>
>>>
>>>     Just be consist, use what suites you, C++ casts might be preferable,
>>> if we are moving towards C++11.
>>>     And use 'right' cast, e.g. going from Thread* to JavaThread* you
>>> should use C cast or static_cast, not reinterpret_cast I would say.
>>>
>>>
>>>         Final notes on this webrev:
>>>             - I am still missing:
>>>               - Putting a TLAB implementation so that we can compare
>>> both webrevs
>>>               - Have not tried to circumvent AsyncGetCallTrace
>>>               - Putting in the handling of GC'd objects
>>>               - Fix a stack walker issue I have seen, I think I know the
>>> problem and will test that theory out for the next webrev
>>>
>>>         I will work on integrating those items for the next webrev!
>>>
>>>
>>>     Thanks!
>>>
>>>
>>>         Thanks for your help,
>>>         Jc
>>>
>>>         Ps:  I tested this on a new repo:
>>>
>>>         hg clone http://hg.openjdk.java.net/jdk10/jdk10 <
>>> http://hg.openjdk.java.net/jdk10/jdk10> <http://hg.openjdk.java.net/jd
>>> k10/jdk10
>>>         <http://hg.openjdk.java.net/jdk10/jdk10>> jdk10
>>>         ... building it
>>>         cd test
>>>         jtreg -nativepath:<path-to-jdk10>/bu
>>> ild/linux-x86_64-normal-server-release/support/test/hotspot/jtreg/native/lib/
>>> -jdk
>>>         <path-to-jdk10>/linux-x86_64-normal-server-release/images/jdk
>>> ../hotspot/test/serviceability/jvmti/HeapMonitor/
>>>
>>>
>>>     I'll test it out!
>>>
>>>     /Robbin
>>>
>>>
>>>
>>>         On Thu, May 4, 2017 at 11:21 PM, serguei.spitsyn at oracle.com
>>> <mailto:serguei.spitsyn at oracle.com> <mailto:serguei.spitsyn at oracle.com
>>>         <mailto:serguei.spitsyn at oracle.com>> <serguei.spitsyn at oracle.com
>>> <mailto:serguei.spitsyn at oracle.com> <mailto:serguei.spitsyn at oracle.com
>>>
>>>         <mailto:serguei.spitsyn at oracle.com>>> wrote:
>>>
>>>              Robbin,
>>>
>>>              Thank you for forwarding!
>>>              I will review it.
>>>
>>>              Thanks,
>>>              Serguei
>>>
>>>
>>>
>>>              On 5/4/17 02:13, Robbin Ehn wrote:
>>>
>>>                  Hi,
>>>
>>>                  To me the compiler changes looks what is expected.
>>>                  It would be good if someone from compiler could take a
>>> look at that.
>>>                  Added compiler to mail thread.
>>>
>>>                  Also adding Serguei, It would be good with his view
>>> also.
>>>
>>>                  My initial take on it, read through most of the code
>>> and took it for a ride.
>>>
>>>                  ##############################
>>>                  - Regarding the compiler changes: I think we need the
>>> 'TLAB end' trickery (mentioned by Tony P)
>>>                  instead of a separate check for sampling in fast path
>>> for the final version.
>>>
>>>                  ##############################
>>>                  - This patch I had to apply to get it compile on JDK 10:
>>>
>>>                  diff -r ac3ded340b35 src/share/vm/gc/shared/collect
>>> edHeap.inline.hpp
>>>                  --- a/src/share/vm/gc/shared/collectedHeap.inline.hpp
>>>   Fri Apr 28 14:31:38 2017 +0200
>>>                  +++ b/src/share/vm/gc/shared/collectedHeap.inline.hpp
>>>   Thu May 04 10:22:56 2017 +0200
>>>                  @@ -87,3 +87,3 @@
>>>                        // support for object alloc event (no-op most of
>>> the time)
>>>                  -    if (klass() != NULL && klass()->name() != NULL) {
>>>                  +    if (klass != NULL && klass->name() != NULL) {
>>>                          Thread *base_thread = Thread::current();
>>>                  diff -r ac3ded340b35 src/share/vm/runtime/heapMonit
>>> oring.cpp
>>>                  --- a/src/share/vm/runtime/heapMonitoring.cpp    Fri
>>> Apr 28 14:31:38 2017 +0200
>>>                  +++ b/src/share/vm/runtime/heapMonitoring.cpp    Thu
>>> May 04 10:22:56 2017 +0200
>>>                  @@ -316,3 +316,3 @@
>>>                      JavaThread *thread = reinterpret_cast<JavaThread
>>> *>(Thread::current());
>>>                  -  assert(o->size() << LogHeapWordSize == byte_size,
>>>                  +  assert(o->size() << LogHeapWordSize ==
>>> (long)byte_size,
>>>                             "Object size is incorrect.");
>>>
>>>                  ##############################
>>>                  - This patch I had to apply to get it not asserting
>>> during slowdebug:
>>>
>>>                  --- a/src/share/vm/runtime/heapMonitoring.cpp    Fri
>>> Apr 28 15:15:16 2017 +0200
>>>                  +++ b/src/share/vm/runtime/heapMonitoring.cpp    Thu
>>> May 04 10:24:25 2017 +0200
>>>                  @@ -32,3 +32,3 @@
>>>                    // TODO(jcbeyler): should we make this into a JVMTI
>>> structure?
>>>                  -struct StackTraceData {
>>>                  +struct StackTraceData : CHeapObj<mtInternal> {
>>>                      ASGCT_CallTrace *trace;
>>>                  @@ -143,3 +143,2 @@
>>>                    StackTraceStorage::StackTraceStorage() :
>>>                  -    _allocated_traces(new
>>> StackTraceData*[MaxHeapTraces]),
>>>                        _allocated_traces_size(MaxHeapTraces),
>>>                  @@ -147,2 +146,3 @@
>>>                        _allocated_count(0) {
>>>                  +  _allocated_traces = NEW_C_HEAP_ARRAY(StackTraceData*,
>>> MaxHeapTraces, mtInternal);
>>>                      memset(_allocated_traces, 0,
>>> sizeof(*_allocated_traces) * MaxHeapTraces);
>>>                  @@ -152,3 +152,3 @@
>>>                    StackTraceStorage::~StackTraceStorage() {
>>>                  -  delete[] _allocated_traces;
>>>                  +  FREE_C_HEAP_ARRAY(StackTraceData*,
>>> _allocated_traces);
>>>                    }
>>>
>>>                  - Classes should extend correct base class for which
>>> type of memory is used for it e.g.: CHeapObj<mt????> or StackObj or
>>> AllStatic
>>>                  - The style in heapMonitoring.cpp is a bit different
>>> from normal vm-style, e.g. using C++ casts instead of C. You mix
>>> NEW_C_HEAP_ARRAY, os::malloc and new.
>>>                  - In jvmtiHeapTransition.hpp you use C cast instead.
>>>
>>>                  ##############################
>>>                  - This patch I had apply to get traces without setting
>>> an ‘unrelated’ capability
>>>                  - Should this not be a new capability?
>>>
>>>                  diff -r c02a5d8785bf src/share/vm/prims/forte.cpp
>>>                  --- a/src/share/vm/prims/forte.cpp    Fri Apr 28
>>> 15:15:16 2017 +0200
>>>                  +++ b/src/share/vm/prims/forte.cpp    Thu May 04
>>> 10:24:25 2017 +0200
>>>                  @@ -530,6 +530,6 @@
>>>
>>>                  -  if (!JvmtiExport::should_post_class_load()) {
>>>                  +/*  if (!JvmtiExport::should_post_class_load()) {
>>>                        trace->num_frames = ticks_no_class_load; // -1
>>>                        return;
>>>                  -  }
>>>                  +  }*/
>>>
>>>                  ##############################
>>>                  - forte.cpp: (I know this is not part of your changes
>>> but)
>>>                  find_jmethod_id_or_null give me NULL for my test.
>>>                  It looks like we actually want the regular jmethod_id()
>>> ?
>>>
>>>                  Since we are the thread we are talking about (and in
>>> same ucontext) and thread is in vm and have a last java frame,
>>>                  I think most of the checks done in AsyncGetCallTrace is
>>> irrelevant, so you should be-able to call forte_fill_call_trace_given_top
>>> directly.
>>>                  But since we might need jmethod_id() if possible to
>>> avoid getting method id NULL,
>>>                  we need some fixes in forte code, or just do the
>>> vframStream loop inside heapMonitoring.cpp and not use forte.cpp.
>>>
>>>                  Something like:
>>>
>>>                     if (jthread->has_last_Java_frame()) { // just to be
>>> safe
>>>                       vframeStream vfst(jthread);
>>>                       while (!vfst.at_end()) {
>>>                         Method* m = vfst.method();
>>>                         m->jmethod_id();
>>>                         m->line_number_from_bci(vfst.bci());
>>>                         vfst.next();
>>>                       }
>>>
>>>                  - This is a bit confusing in forte.cpp,
>>> trace->frames[count].lineno = bci.
>>>                  Line number should be m->line_number_from_bci(bci);
>>>                  Do the heapMonitoring suppose to trace with bci or line
>>> number?
>>>                  I would say bci, meaning we should either rename
>>> ASGCT_CallFrame→lineno or use another data structure which says bci.
>>>
>>>                  ##############################
>>>                  - // TODO(jcbeyler): remove this extra code handling
>>> the extra trace for
>>>                  Please fix all these TODO's :)
>>>
>>>                  ##############################
>>>                  - heapMonitoring.hpp:
>>>                  // TODO(jcbeyler): is this algorithm acceptable in open
>>> source?
>>>
>>>                  Why is this comment here? What is the implication?
>>>                  Have you tested any simpler algorithm?
>>>
>>>                  ##############################
>>>                  - Create a sanity jtreg test.
>>> (./hotspot/make/test/JtregNative.gmk for building the agent)
>>>
>>>                  ##############################
>>>                  - monitoring_period vs HeapMonitorRate, pick rate or
>>> period.
>>>
>>>                  ##############################
>>>                  - globals.hpp
>>>                  Why is MaxHeapTraces not settable/overridable from
>>> jvmti interface? That would be handy.
>>>
>>>                  ##############################
>>>                  - jvmtiStackTraceData + ASGCT_CallFrame memory
>>>                  Are the agent suppose to loop through and free all
>>> ASGCT_CallFrame?
>>>                  Wouldn't it be better with some kinda protocol, like:
>>>                  (*jvmti)->GetLiveTraces(jvmti, &stack_traces,
>>> &num_traces);
>>>                  (*jvmti)->ReleaseTraces(jvmti, stack_traces,
>>> num_traces);
>>>
>>>                  Also using another data structure that have num_traces
>>> inside it simplifies things.
>>>                  So I'm not convinced using the async structure is the
>>> best way forward.
>>>
>>>
>>>                  I have more questions, but I think it's better if you
>>> respond and update the code first.
>>>
>>>                  Thanks!
>>>
>>>                  /Robbin
>>>
>>>
>>>                  On 04/21/2017 11:34 PM, JC Beyler wrote:
>>>
>>>                      Hi all,
>>>
>>>                      I've added size information to the allocation
>>> sampling system. This allows the callback to remember the size of each
>>> sampled allocation.
>>>         http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/ <
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/>
>>>         <http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/ <
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/>>
>>>
>>>                      The new webrev.01 also adds the actual heap
>>> monitoring sampling system in files:
>>>         http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/sh
>>> are/vm/runtime/heapMonitoring.cpp.patch
>>>         <http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/s
>>> hare/vm/runtime/heapMonitoring.cpp.patch>
>>>                      <http://cr.openjdk.java.net/~
>>> rasbold/8171119/webrev.01/src/share/vm/runtime/heapMonitoring.cpp.patch
>>>         <http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/s
>>> hare/vm/runtime/heapMonitoring.cpp.patch>>
>>>                      and
>>>         http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/sh
>>> are/vm/runtime/heapMonitoring.hpp.patch
>>>         <http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/s
>>> hare/vm/runtime/heapMonitoring.hpp.patch>
>>>                      <http://cr.openjdk.java.net/~
>>> rasbold/8171119/webrev.01/src/share/vm/runtime/heapMonitoring.hpp.patch
>>>         <http://cr.openjdk.java.net/~rasbold/8171119/webrev.01/src/s
>>> hare/vm/runtime/heapMonitoring.hpp.patch>>
>>>
>>>                      My next step is to add the GC part to the webrev,
>>> which will allow users to determine what objects are live and what are
>>> garbage.
>>>
>>>                      Thanks for your attention and let me know if there
>>> are any questions!
>>>
>>>                      Have a wonderful Friday!
>>>                      Jc
>>>
>>>                      On Mon, Apr 17, 2017 at 12:37 PM, JC Beyler <
>>> jcbeyler at google.com <mailto:jcbeyler at google.com> <mailto:
>>> jcbeyler at google.com <mailto:jcbeyler at google.com>>
>>>         <mailto:jcbeyler at google.com <mailto:jcbeyler at google.com>
>>> <mailto:jcbeyler at google.com <mailto:jcbeyler at google.com>>>> wrote:
>>>
>>>                           Hi all,
>>>
>>>                           I worked on getting a few numbers for overhead
>>> and accuracy for my feature. I'm unsure if here is the right place to
>>> provide the full data, so I
>>>         am just
>>>                      summarizing
>>>                           here for now.
>>>
>>>                           - Overhead of the feature
>>>
>>>                           Using the Dacapo benchmark (
>>> http://dacapobench.org/). My initial results are that sampling provides
>>> 2.4% with a 512k sampling, 512k being our
>>>         default setting.
>>>
>>>                           - Note: this was without the tradesoap,
>>> tradebeans and tomcat benchmarks since they did not work with my JDK9
>>> (issue between Dacapo and JDK9 it seems)
>>>                           - I want to rerun next week to ensure number
>>> stability
>>>
>>>                           - Accuracy of the feature
>>>
>>>                           I wrote a small microbenchmark that allocates
>>> from two different stacktraces at a given ratio. For example, 10% of
>>> stacktrace S1 and 90% from
>>>         stacktrace
>>>                      S2. The
>>>                           microbenchmark was run 20 times, I averaged
>>> the results and looked for accuracy. It seems that statistically it is
>>> sound since if I allocated10%
>>>         S1 and 90%
>>>                      S2, with a
>>>                           sampling rate of 512k, I obtained 9.61% S1 and
>>> 90.49% S2.
>>>
>>>                           Let me know if there are any questions on the
>>> numbers and if you'd like to see some more data.
>>>
>>>                           Note: this was done using our internal JDK8
>>> implementation since the webrev provided by
>>>         http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html <
>>> http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html>
>>>                      <http://cr.openjdk.java.net/~
>>> rasbold/heapz/webrev.00/index.html <http://cr.openjdk.java.net/~r
>>> asbold/heapz/webrev.00/index.html>>
>>>                      <http://cr.openjdk.java.net/~
>>> rasbold/heapz/webrev.00/index.html <http://cr.openjdk.java.net/~r
>>> asbold/heapz/webrev.00/index.html>
>>>         <http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html
>>> <http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html>>> does
>>> not yet contain the whole
>>>                      implementation and therefore would have been
>>> misleading.
>>>
>>>                           Thanks,
>>>                           Jc
>>>
>>>
>>>                           On Tue, Apr 4, 2017 at 3:55 PM, JC Beyler <
>>> jcbeyler at google.com <mailto:jcbeyler at google.com> <mailto:
>>> jcbeyler at google.com
>>>         <mailto:jcbeyler at google.com>> <mailto:jcbeyler at google.com
>>> <mailto:jcbeyler at google.com> <mailto:jcbeyler at google.com <mailto:
>>> jcbeyler at google.com>>>> wrote:
>>>
>>>                               Hi all,
>>>
>>>                               To move the discussion forward, with Chuck
>>> Rasbold's help to make a webrev, we pushed this:
>>>         http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html <
>>> http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html>
>>>         <http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html
>>> <http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html>>
>>>                      <http://cr.openjdk.java.net/~
>>> rasbold/heapz/webrev.00/index.html <http://cr.openjdk.java.net/~r
>>> asbold/heapz/webrev.00/index.html>
>>>         <http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html
>>> <http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/index.html>>>
>>>                               415 lines changed: 399 ins; 13 del; 3 mod;
>>> 51122 unchg
>>>
>>>                               This is not a final change that does the
>>> whole proposition from the JBS entry: https://bugs.openjdk.java.net/
>>> browse/JDK-8177374
>>>         <https://bugs.openjdk.java.net/browse/JDK-8177374>
>>>                      <https://bugs.openjdk.java.net/browse/JDK-8177374 <
>>> https://bugs.openjdk.java.net/browse/JDK-8177374>>
>>>                      <https://bugs.openjdk.java.net/browse/JDK-8177374 <
>>> https://bugs.openjdk.java.net/browse/JDK-8177374> <
>>> https://bugs.openjdk.java.net/browse/JDK-8177374
>>>         <https://bugs.openjdk.java.net/browse/JDK-8177374>>>; what it
>>> does show is parts of the implementation that is
>>>                      proposed and hopefully can start the conversation
>>> going
>>>                               as I work through the details.
>>>
>>>                               For example, the changes to C2 are done
>>> here for the allocations:
>>>         http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/src/shar
>>> e/vm/opto/macro.cpp.patch
>>>         <http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/src/sha
>>> re/vm/opto/macro.cpp.patch>
>>>                      <http://cr.openjdk.java.net/~
>>> rasbold/heapz/webrev.00/src/share/vm/opto/macro.cpp.patch
>>>         <http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/src/sha
>>> re/vm/opto/macro.cpp.patch>>
>>>                      <http://cr.openjdk.java.net/~
>>> rasbold/heapz/webrev.00/src/share/vm/opto/macro.cpp.patch
>>>         <http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/src/sha
>>> re/vm/opto/macro.cpp.patch>
>>>                      <http://cr.openjdk.java.net/~
>>> rasbold/heapz/webrev.00/src/share/vm/opto/macro.cpp.patch
>>>         <http://cr.openjdk.java.net/~rasbold/heapz/webrev.00/src/sha
>>> re/vm/opto/macro.cpp.patch>>>
>>>
>>>                               Hopefully this all makes sense and thank
>>> you for all your future comments!
>>>                               Jc
>>>
>>>
>>>                               On Tue, Dec 13, 2016 at 1:11 PM, JC Beyler
>>> <jcbeyler at google.com <mailto:jcbeyler at google.com> <mailto:
>>> jcbeyler at google.com
>>>         <mailto:jcbeyler at google.com>> <mailto:jcbeyler at google.com
>>> <mailto:jcbeyler at google.com> <mailto:jcbeyler at google.com <mailto:
>>> jcbeyler at google.com>>>>
>>>                      wrote:
>>>
>>>                                   Hello all,
>>>
>>>                                   This is a follow-up from Jeremy's
>>> initial email from last year:
>>>         http://mail.openjdk.java.net/pipermail/serviceability-dev/20
>>> 15-June/017543.html <http://mail.openjdk.java.net/
>>> pipermail/serviceability-dev/2015-June/017543.html>
>>>                      <http://mail.openjdk.java.net
>>> /pipermail/serviceability-dev/2015-June/017543.html
>>>         <http://mail.openjdk.java.net/pipermail/serviceability-dev/2
>>> 015-June/017543.html>>
>>>                      <http://mail.openjdk.java.net
>>> /pipermail/serviceability-dev/2015-June/017543.html
>>>         <http://mail.openjdk.java.net/pipermail/serviceability-dev/2
>>> 015-June/017543.html> <http://mail.openjdk.java.net/
>>> pipermail/serviceability-dev/2015-June/017543.html
>>>         <http://mail.openjdk.java.net/pipermail/serviceability-dev/2
>>> 015-June/017543.html>>>
>>>
>>>                                   I've gone ahead and started working on
>>> preparing this and Jeremy and I went down the route of actually writing it
>>> up in JEP form:
>>>         https://bugs.openjdk.java.net/browse/JDK-8171119 <
>>> https://bugs.openjdk.java.net/browse/JDK-8171119> <
>>> https://bugs.openjdk.java.net/browse/JDK-8171119
>>>         <https://bugs.openjdk.java.net/browse/JDK-8171119>>
>>>
>>>                                   I think original conversation that
>>> happened last year in that thread still holds true:
>>>
>>>                                     - We have a patch at Google that we
>>> think others might be interested in
>>>                                        - It provides a means to
>>> understand where the allocation hotspots are at a very low overhead
>>>                                        - Since it is at a low overhead,
>>> we can leave it on by default
>>>
>>>                                   So I come to the mailing list with
>>> Jeremy's initial question:
>>>                                   "I thought I would ask if there is any
>>> interest / if I should write a JEP / if I should just forget it."
>>>
>>>                                   A year ago, it seemed some thought it
>>> was a good idea, is this still true?
>>>
>>>                                   Thanks,
>>>                                   Jc
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20170612/d93a01c3/attachment-0001.html>