Low-Overhead Heap Profiling
JC Beyler
jcbeyler at google.com
Mon Oct 23 15:27:50 UTC 2017
Dear all,
Small update this week with this new webrev:
- http://cr.openjdk.java.net/~rasbold/8171119/webrev.13/
- Incremental is here:
http://cr.openjdk.java.net/~rasbold/8171119/webrev.12_13/
I applied the code changes shown by Robbin last week and refactored
collectedHeap.cpp:
http://cr.openjdk.java.net/~rasbold/8171119/webrev.12_13/src/hotspot/share/gc/shared/collectedHeap.cpp.patch
The original code had become a bit too complex in my opinion, with
handle_heap_sampling doing too many things. So I subdivided the logic
into two smaller methods and moved some of it out to make it clearer.
Hopefully it is :)
Let me know if you have any questions/comments :)
Jc
On Mon, Oct 16, 2017 at 9:34 AM, JC Beyler <jcbeyler at google.com> wrote:
> Hi Robbin,
>
> That is because version 11 to 12 was only a test change. I was going to
> write about it and say here are the webrev links:
> Incremental:
> http://cr.openjdk.java.net/~rasbold/8171119/webrev.11_12/
>
> Full webrev:
> http://cr.openjdk.java.net/~rasbold/8171119/webrev.12/
>
> This change focused only on refactoring the tests to be more manageable,
> readable, and maintainable. As all the tests look at allocations, I moved
> common code to a java class:
> http://cr.openjdk.java.net/~rasbold/8171119/webrev.11_12/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor.java.patch
>
> And then most tests call into that class to turn sampling on/off,
> allocate, etc. This has removed almost 500 lines of test code, so I'm
> happy about that.
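As a rough illustration of the shared-helper idea (the class and method names here are hypothetical, not the actual HeapMonitor.java from the webrev, which drives sampling through native JVMTI calls):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a shared test helper. In the real webrev the
// sampling toggle is a native JVMTI call; a volatile flag stands in here.
public class HeapMonitorSketch {
  private static volatile boolean samplingEnabled;

  public static void enableSampling()  { samplingEnabled = true;  }
  public static void disableSampling() { samplingEnabled = false; }
  public static boolean isSamplingEnabled() { return samplingEnabled; }

  // Allocate `count` int arrays of `size` elements and keep them live
  // so the allocations cannot be optimized away.
  public static List<int[]> allocate(int count, int size) {
    List<int[]> keep = new ArrayList<>(count);
    for (int i = 0; i < count; i++) {
      keep.add(new int[size]);
    }
    return keep;
  }
}
```

Each test then reduces to a few calls: enable sampling, allocate, check the samples, disable sampling.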
>
> Thanks for your changes; they removed a few relics of previous versions :).
> I've already integrated them into my code and will make a new webrev at the
> end of this week with a bit of refactoring of the code handling the TLAB
> slow path. I find it could be easier to follow, so I'm going to take a stab
> at it this week.
>
> Any other issues/comments?
>
> Thanks!
> Jc
>
>
> On Mon, Oct 16, 2017 at 8:46 AM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
>
>> Hi JC,
>>
>> I saw a webrev.12 in the directory with only test changes (11->12), so I
>> took that version.
>> I had a look and tested the tests; they worked fine!
>>
>> First glance at the code (looking at full v12) some minor things below,
>> mostly unused stuff.
>>
>> Thanks, Robbin
>>
>> diff -r 9047e0d726d6 src/hotspot/share/runtime/heapMonitoring.cpp
>> --- a/src/hotspot/share/runtime/heapMonitoring.cpp Mon Oct 16
>> 16:54:06 2017 +0200
>> +++ b/src/hotspot/share/runtime/heapMonitoring.cpp Mon Oct 16
>> 17:42:42 2017 +0200
>> @@ -211,2 +211,3 @@
>> void initialize(int max_storage) {
>> + // validate max_storage to sane value ? What would 0 mean ?
>> MutexLocker mu(HeapMonitor_lock);
>> @@ -227,8 +228,4 @@
>> bool initialized() { return _initialized; }
>> - volatile bool *initialized_address() { return &_initialized; }
>>
>> private:
>> - // Protects the traces currently sampled (below).
>> - volatile intptr_t _stack_storage_lock[1];
>> -
>> // The traces currently sampled.
>> @@ -313,3 +310,2 @@
>> _initialized(false) {
>> - _stack_storage_lock[0] = 0;
>> }
>> @@ -532,13 +528,2 @@
>>
>> -// Delegate the initialization question to the underlying storage system.
>> -bool HeapMonitoring::initialized() {
>> - return StackTraceStorage::storage()->initialized();
>> -}
>> -
>> -// Delegate the initialization question to the underlying storage system.
>> -bool *HeapMonitoring::initialized_address() {
>> -    return const_cast<bool*>(StackTraceStorage::storage()->initialized_address());
>> -}
>> -
>> void HeapMonitoring::get_live_traces(jvmtiStackTraces *traces) {
>> diff -r 9047e0d726d6 src/hotspot/share/runtime/heapMonitoring.hpp
>> --- a/src/hotspot/share/runtime/heapMonitoring.hpp Mon Oct 16
>> 16:54:06 2017 +0200
>> +++ b/src/hotspot/share/runtime/heapMonitoring.hpp Mon Oct 16
>> 17:42:42 2017 +0200
>> @@ -35,3 +35,2 @@
>> static uint64_t _rnd;
>> - static bool _initialized;
>> static jint _monitoring_rate;
>> @@ -92,7 +91,2 @@
>>
>> -   // Is the profiler initialized and where is the address to the initialized
>> -   // boolean.
>> - static bool initialized();
>> - static bool *initialized_address();
>> -
>> // Called when o is to be sampled from a given thread and a given size.
>>
>>
>>
>> On 10/10/2017 12:57 AM, JC Beyler wrote:
>>
>>> Dear all,
>>>
>>> Thread-safety is back!! Here is the update webrev:
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/
>>>
>>> Full webrev is here:
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.11/
>>>
>>> In order to really test the thread safety, I needed to add it, so I
>>> thought now was a good time. It required a few changes to the creation
>>> path to ensure correctness and safety. We now keep the static pointer but
>>> clear the data internally, so re-initialization will be a bit more costly
>>> than before. I don't think this is a common use case, so I did not
>>> consider it a problem. I used the internal MutexLocker; I think I used it
>>> correctly, let me know.
>>>
>>> I also added three tests:
>>>
>>> 1) Stack depth test:
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStackDepthTest.java.patch
>>>
>>> This test shows that the maximum stack depth system is working.
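The idea behind such a test can be sketched roughly as follows (a hypothetical illustration, not the actual HeapMonitorStackDepthTest.java): recurse to a known depth, allocate there, and then check that the recorded trace never exceeds the configured maximum depth.

```java
// Hypothetical sketch: build a call stack of a known depth before
// allocating, so the sampled allocation sits `depth` frames deep.
// The real test would then verify the recorded frame count is capped
// at the configured maximum stack depth.
public class StackDepthSketch {
  public static int[] allocateAtDepth(int depth, int arraySize) {
    if (depth <= 1) {
      return new int[arraySize]; // sampled allocation happens here
    }
    return allocateAtDepth(depth - 1, arraySize);
  }
}
```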
>>>
>>> 2) Thread safety:
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorThreadTest.java.patch
>>>
>>> The test creates 24 threads that all allocate at the same time. The
>>> test then checks that it finds samples from all the threads.
>>>
>>> 3) Thread on/off safety
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.10_11/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorThreadOnOffTest.java.patch
>>>
>>> The test creates 24 threads that all allocate a bunch of memory. Then
>>> another thread turns the sampling on/off.
>>>
>>> Btw, both tests 2 & 3 failed without the locks.
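The on/off stress pattern can be sketched roughly like this (a hypothetical model, not the actual HeapMonitorThreadOnOffTest.java; in the real test the flag flip is a native JVMTI sampling enable/disable call, while a plain AtomicBoolean stands in for it here):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: allocating threads race against one thread that
// flips a sampling flag, mimicking the on/off safety test's shape.
public class ThreadOnOffSketch {
  public static long run(int nThreads, int iterations) throws InterruptedException {
    AtomicBoolean sampling = new AtomicBoolean(true);
    AtomicLong allocations = new AtomicLong();
    List<Thread> threads = new ArrayList<>();
    for (int t = 0; t < nThreads; t++) {
      threads.add(new Thread(() -> {
        List<int[]> keep = new ArrayList<>();
        for (int i = 0; i < iterations; i++) {
          keep.add(new int[64]);          // allocation that may be sampled
          allocations.incrementAndGet();
        }
      }));
    }
    Thread toggler = new Thread(() -> {
      for (int i = 0; i < iterations; i++) {
        sampling.set(!sampling.get());    // stands in for enable/disable sampling
      }
    });
    threads.forEach(Thread::start);
    toggler.start();
    for (Thread t : threads) t.join();
    toggler.join();
    return allocations.get();
  }
}
```

The point of the real test is simply that this pattern must complete without crashing the VM, which is exactly what failed before the locks went in.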
>>>
>>> As I worked on this, I saw a lot of places where the tests do very
>>> similar things, so I'm going to clean up the code a bit and make a
>>> HeapAllocator class that all tests can call directly. This will greatly
>>> simplify the code.
>>>
>>> Thanks for any comments/criticisms!
>>> Jc
>>>
>>>
>>> On Mon, Oct 2, 2017 at 8:52 PM, JC Beyler <jcbeyler at google.com> wrote:
>>>
>>> Dear all,
>>>
>>> Small update to the webrev:
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.09_10/
>>>
>>> Full webrev is here:
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.10/
>>>
>>> I updated a bit of the naming, removed a TODO comment, and added a
>>> test for the sampling rate. I also updated the maximum stack depth to
>>> 1024; there is no reason to keep it so small. I did a micro benchmark
>>> that tests the overhead and it seems relatively the same.
>>>
>>> I compared allocations from a stack depth of 10 and allocations from
>>> a stack depth of 1024 (allocations are from the same helper method in
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.10/raw_files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java):
>>> - For an array of 1 integer allocated in a loop; stack
>>> depth 1024 vs stack depth 10: 1% slower
>>> - For an array of 200k integers allocated in a loop;
>>> stack depth 1024 vs stack depth 10: 3% slower
>>>
>>> So basically we now move the maximum stack depth to 1024, but we only
>>> copy over the stack frames actually used.
>>>
>>> For the next webrev, I will add a stack depth test to show that it
>>> works, and probably put back the mutex locking so that we can see how
>>> difficult it is to keep thread safe.
>>>
>>> Let me know what you think!
>>> Jc
>>>
>>>
>>>
>>> On Mon, Sep 25, 2017 at 3:02 PM, JC Beyler <jcbeyler at google.com> wrote:
>>>
>>> Forgot to say that for my numbers:
>>> - Not in the test are the actual numbers I got for the various
>>> array sizes. I ran the program 30 times and parsed the output; here are
>>> the averages and standard deviations:
>>> 1000: 1.28% average; 1.13% standard deviation
>>> 10000: 1.59% average; 1.25% standard deviation
>>> 100000: 1.26% average; 1.26% standard deviation
>>>
>>> The 1000/10000/100000 are the sizes of the arrays being allocated.
>>> They are allocated 100k times and the sampling rate is 111 times the
>>> size of the array.
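For a rough sanity check of such numbers: with a byte-based sampling rate, the expected sample count is about totalBytes / rate. A hypothetical back-of-the-envelope model (the byte figures in the test below are illustrative assumptions that ignore object headers, not measurements from the webrev):

```java
// Hypothetical back-of-the-envelope check of sampler statistics:
// expected samples ~ total bytes allocated / sampling rate in bytes.
public class ExpectedSamples {
  public static double expected(long allocations, long arrayBytes, long rateBytes) {
    return (double) (allocations * arrayBytes) / rateBytes;
  }

  // Relative error in percent between an observed and an expected count.
  public static double relativeErrorPercent(double observed, double expected) {
    return Math.abs(observed - expected) / expected * 100.0;
  }
}
```

With a rate of 111 times the array size, the expected count is simply allocations / 111, independent of the array size, which makes the observed 1-2% errors easy to interpret.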
>>>
>>> Thanks!
>>> Jc
>>>
>>>
>>> On Mon, Sep 25, 2017 at 3:01 PM, JC Beyler <jcbeyler at google.com> wrote:
>>>
>>> Hi all,
>>>
>>> After a bit of a break, I am back working on this :). As
>>> before, here are two webrevs:
>>>
>>> - Full change set: http://cr.openjdk.java.net/~rasbold/8171119/webrev.09/
>>> - Compared to version 8: http://cr.openjdk.java.net/~rasbold/8171119/webrev.08_09/
>>> (This version is compared to the version 8 I last showed,
>>> but ported to the new folder hierarchy)
>>>
>>> In this version I have:
>>> - Handled Thomas' comments from his email of 07/03:
>>> - Merged the logging to be standard
>>> - Fixed up the code a bit where asked
>>> - Added some notes about the code not being
>>> thread-safe yet
>>> - Removed additional dead code from the version that
>>> modifies interpreter/c1/c2
>>> - Fixed compiler issues so that it compiles with
>>> --disable-precompiled-header
>>> - Tested with ./configure --with-boot-jdk=<jdk8>
>>> --with-debug-level=slowdebug --disable-precompiled-headers
>>>
>>> Additionally, I added a test to check the sanity of the
>>> sampler: HeapMonitorStatCorrectnessTest
>>> (http://cr.openjdk.java.net/~rasbold/8171119/webrev.08_09/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatCorrectnessTest.java.patch)
>>> - This allocates a number of arrays and checks that we obtain the
>>> number of samples we want, with an accepted error of 5%. I tested it
>>> 100 times and it passed every time; I can test more if wanted.
>>> - Not in the test are the actual numbers I got for the
>>> various array sizes, I ran the program 30 times and parsed the output; here
>>> are the averages and
>>> standard deviation:
>>> 1000: 1.28% average; 1.13% standard deviation
>>> 10000: 1.59% average; 1.25% standard deviation
>>> 100000: 1.26% average; 1.26% standard deviation
>>>
>>> What this means is that we were always at about 1~2% of the
>>> number of samples the test expected.
>>>
>>> Let me know what you think,
>>> Jc
>>>
>>> On Wed, Jul 5, 2017 at 9:31 PM, JC Beyler <jcbeyler at google.com> wrote:
>>>
>>> Hi all,
>>>
>>> I apologize, I have not yet handled your remarks, but I thought this
>>> new webrev would also be useful to see and comment on.
>>>
>>> Here is the latest webrev; it is generated slightly differently than
>>> the others since I'm now using webrev.ksh without the -N option:
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.08/
>>>
>>> And the webrev.07 to webrev.08 diff is here:
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.07_08/
>>>
>>> (Let me know if it works well)
>>>
>>> It's a small change between versions but it:
>>> - provides a fix that makes the average sample rate correct (more
>>> on that below)
>>> - fixes the code to actually play nicely with the fast TLAB refill
>>> - cleans up the JVMTI text a bit and now uses jvmtiFrameInfo
>>> - moves the capability to be onload solo
>>>
>>> With this webrev, I've done a small study of the random
>>> number generator we use here for the sampling rate. I took a small program
>>> and it can be simplified to:
>>>
>>> for (outer loop)
>>> for (inner loop)
>>> int[] tmp = new int[arraySize];
>>>
>>> - I've fixed the outer and inner loops to 800 for this experiment,
>>> meaning we allocate an array of a given size 640,000 times.
>>>
>>> - Each program provides the average sample size used for
>>> the whole execution
>>>
>>> - Then, I ran each variation 30 times and then
>>> calculated the average of the average sample size used for various array
>>> sizes. I selected the array size to
>>> be one of the following: 1, 10, 100, 1000.
>>>
>>> - When compared to the 512kb target, the average sample size over
>>> 30 runs was off by:
>>> 1: 4.62% of error
>>> 10: 3.09% of error
>>> 100: 0.36% of error
>>> 1000: 0.1% of error
>>> 10000: 0.03% of error
>>>
>>> What it shows is that, depending on the number of samples, the
>>> average does get better. This is because with an allocation of 1
>>> element per array, it will take longer to hit one of the thresholds.
>>> This is seen by looking at the sample count statistic I put in. For
>>> the same number of iterations (800 * 800), the different array sizes
>>> produce:
>>> 1: 62 samples
>>> 10: 125 samples
>>> 100: 788 samples
>>> 1000: 6166 samples
>>> 10000: 57721 samples
>>>
>>> And of course, the more samples you have, the more sample intervals
>>> you draw, which means that your average gets closer to the target.
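That convergence is what you would expect if the sampler draws each next sample interval from an exponential distribution around the chosen mean, a common way to implement Poisson-style heap sampling. A minimal sketch of such a draw (the use of java.util.Random and the class shape are illustrative assumptions, not the VM's actual PRNG):

```java
import java.util.Random;

// Hypothetical sketch of picking the next byte interval at which to
// sample, drawn from an exponential distribution with the configured
// mean. Small arrays accumulate bytes slowly, so they hit the threshold
// rarely, which is why runs with tiny arrays see far fewer samples.
public class SampleIntervalSketch {
  private final Random rnd = new Random();
  private final long meanBytes;

  public SampleIntervalSketch(long meanBytes) { this.meanBytes = meanBytes; }

  public long nextInterval() {
    double u = rnd.nextDouble();
    // -ln(1 - U) has mean 1 for U uniform in [0, 1), so the result
    // has mean ~meanBytes; (1 - u) avoids log(0).
    return (long) Math.ceil(-Math.log(1.0 - u) * meanBytes);
  }
}
```

Averaging many such draws converges to the mean, mirroring the error figures above shrinking as the sample count grows.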
>>>
>>> Thanks,
>>> Jc
>>>
>>> On Thu, Jun 29, 2017 at 10:01 PM, JC Beyler <jcbeyler at google.com> wrote:
>>>
>>> Thanks Robbin,
>>>
>>> This seems to have worked. When I have the next
>>> webrev ready, we will find out but I'm fairly confident it will work!
>>>
>>> Thanks again!
>>> Jc
>>>
>>> On Wed, Jun 28, 2017 at 11:46 PM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
>>>
>>> Hi JC,
>>>
>>> On 06/29/2017 12:15 AM, JC Beyler wrote:
>>>
>>> B) Incremental changes
>>>
>>>
>>> I guess the most common work flow here is using
>>> mq :
>>> hg qnew fix_v1
>>> edit files
>>> hg qrefresh
>>> hg qnew fix_v2
>>> edit files
>>> hg qrefresh
>>>
>>> if you do hg log you will see 2 commits
>>>
>>> webrev.ksh -r -2 -o my_inc_v1_v2
>>> webrev.ksh -o my_full_v2
>>>
>>>
>>> In your .hgrc you might need:
>>> [extensions]
>>> mq =
>>>
>>> /Robbin
>>>
>>>
>>> Again, another newbie question here...
>>>
>>> For showing the incremental changes, is
>>> there a link that explains how to do that? I apologize for my newbie
>>> questions all the time :)
>>>
>>> Right now, I do:
>>>
>>> ksh ../webrev.ksh -m -N
>>>
>>> That generates a webrev.zip and send it to
>>> Chuck Rasbold. He then uploads it to a new webrev.
>>>
>>> I tried committing my change and adding a
>>> small change. Then if I just do ksh ../webrev.ksh without any options, it
>>> seems to produce a similar
>>> page but now with only the changes I had (so
>>> the 06-07 comparison you were talking about) and a changeset that has it
>>> all. I imagine that is
>>> what you meant.
>>>
>>> Which means that my workflow would become:
>>>
>>> 1) Make changes
>>> 2) Make a webrev without any options to show
>>> just the differences with the tip
>>> 3) Amend my changes to my local commit so
>>> that I have it done with
>>> 4) Go to 1
>>>
>>> Does that seem correct to you?
>>>
>>> Note that when I do this, I only see the
>>> full change of a file in the full change set (Side note here: now the page
>>> says change set and not
>>> patch, which is maybe why Serguei was having
>>> issues?).
>>>
>>> Thanks!
>>> Jc
>>>
>>>
>>>
>>> On Wed, Jun 28, 2017 at 1:12 AM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
>>>
>>> Hi,
>>>
>>> On 06/28/2017 12:04 AM, JC Beyler wrote:
>>>
>>> Dear Thomas et al,
>>>
>>> Here is the newest webrev:
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/
>>>
>>>
>>>
>>> You have some more bits in there, but generally this looks good and
>>> really nice with more tests.
>>> I'll do a deep dive and re-test this when I get back from my long
>>> vacation, with whatever patch version you have then.
>>>
>>> Also I think it's time you provide
>>> incremental (v06->07 changes) as well as complete change-sets.
>>>
>>> Thanks, Robbin
>>>
>>>
>>>
>>>
>>> Thomas, I "think" I have answered
>>> all your remarks. The summary is:
>>>
>>> - The statistic system is up and
>>> provides insight on what the heap sampler is doing
>>> - I've noticed that, though the sampling rate is at the right
>>> mean, we are missing some samples; I have not yet tracked down why
>>> (details below)
>>>
>>> - I've run a tiny benchmark that is the worst case: it is a very
>>> tight loop allocating a small array
>>> - In this case, I see no overhead when the system is off, so
>>> that is a good start :)
>>> - I see right now a high overhead in this case when sampling
>>> is on. This is not really too surprising, but I'm going to see if it
>>> is consistent with our internal implementation. The benchmark is
>>> really allocation stressful so I'm not too surprised, but I want to
>>> do the due diligence.
>>>
>>> - The statistic system is up and I have a new test:
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatTest.java.patch
>>> - I did a bit of a study about the random generator here;
>>> more details are below, but basically it seems to work well
>>>
>>> - I added a capability but since
>>> this is the first time doing this, I was not sure I did it right
>>> - I did add a test though for
>>> it and the test seems to do what I expect (all methods are failing with the
>>> JVMTI_ERROR_MUST_POSSESS_CAPABILITY error).
>>> - http://cr.openjdk.java.net/~rasbold/8171119/webrev.07/test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorNoCapabilityTest.java.patch
>>>
>>> - I still need to figure out
>>> what to do about the multi-agent vs single-agent issue
>>>
>>> - As far as measurements, it
>>> seems I still need to look at:
>>> - Why we do the 20 random
>>> calls first, are they necessary?
>>> - Look at the mean of the
>>> sampling rate that the random generator does and also what is actually
>>> sampled
>>> - What is the overhead in
>>> terms of memory/performance when on?
>>>
>>> I have inlined my answers, I think
>>> I got them all in the new webrev, let me know your thoughts.
>>>
>>> Thanks again!
>>> Jc
>>>
>>>
>>> On Fri, Jun 23, 2017 at 3:52 AM, Thomas Schatzl
>>> <thomas.schatzl at oracle.com> wrote:
>>>
>>> Hi,
>>>
>>> On Wed, 2017-06-21 at 13:45
>>> -0700, JC Beyler wrote:
>>> > Hi all,
>>> >
>>> > First off: Thanks again to
>>> Robbin and Thomas for their reviews :)
>>> >
>>> > Next, I've uploaded a new
>>> webrev:
>>> >
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/
>>>
>>> >
>>> > Here is an update:
>>> >
>>> > - @Robbin, I forgot to say
>>> that yes I need to look at implementing
>>> > this for the other
>>> architectures and testing it before it is all
>>> > ready to go. Is it common to
>>> have it working on all possible
>>> > combinations or is there a
>>> subset that I should be doing first and we
>>> > can do the others later?
>>> > - I've tested slowdebug,
>>> built and ran the JTreg tests I wrote with
>>> > slowdebug and fixed a few
>>> more issues
>>> > - I've refactored a bit of
>>> the code following Thomas' comments
>>> > - I think I've handled
>>> all the comments from Thomas (I put
>>> > comments inline below for
>>> the specifics)
>>>
>>> Thanks for handling all those.
>>>
>>> > - Following Thomas' comments
>>> on statistics, I want to add some
>>> > quality assurance tests and
>>> find that the easiest way would be to
>>> > have a few counters of what
>>> is happening in the sampler and expose
>>> > that to the user.
>>> > - I'll be adding that in
>>> the next version if no one sees any
>>> > objections to that.
>>> > - This will allow me to
>>> add a sanity test in JTreg about number of
>>> > samples and average of
>>> sampling rate
>>> >
>>> > @Thomas: I had a few
>>> questions that I inlined below but I will
>>> > summarize the "bigger ones"
>>> here:
>>> > - You mentioned constants
>>> are not using the right conventions, I
>>> > looked around and didn't see
>>> any convention except normal naming then
>>> > for static constants. Is
>>> that right?
>>>
>>> I looked through
>>> https://wiki.openjdk.java.net/display/HotSpot/StyleGuide
>>> and the rule is to "follow an existing pattern and must have a
>>> distinct appearance from other names". Which does not help a lot I
>>> guess :/ The GC team started using upper camel case, e.g.
>>> SomeOtherConstant, but very likely this is probably not applied
>>> consistently throughout. So I am fine with not adding another style
>>> (like kMaxStackDepth with the "k" in front with some unknown meaning).
>>>
>>> (Chances are you will find that style somewhere used anyway too,
>>> apologies if so :/)
>>>
>>>
>>> Thanks for that link, now I know
>>> where to look. I used the upper camel case in my code as well then :) I
>>> should have gotten them all.
>>>
>>>
>>> > PS: I've also inlined my
>>> answers to Thomas below:
>>> >
>>> > On Tue, Jun 13, 2017 at 8:03 AM, Thomas Schatzl
>>> > <thomas.schatzl at oracle.com> wrote:
>>> > > Hi all,
>>> > >
>>> > > On Mon, 2017-06-12 at
>>> 11:11 -0700, JC Beyler wrote:
>>> > > > Dear all,
>>> > > >
>>> > > > I've continued working
>>> on this and have done the following
>>> > > webrev:
>>> > > >
>>> http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/
>>>
>>> > >
>>> > > [...]
>>> > > > Things I still need to
>>> do:
>>> > > > - Have to fix that
>>> TLAB case for the FastTLABRefill
>>> > > > - Have to start
>>> looking at the data to see that it is
>>> > > consistent and does
>>> gather the right samples, right frequency, etc.
>>> > > > - Have to check the
>>> GC elements and what that produces
>>> > > > - Run a slowdebug
>>> run and ensure I fixed all those issues you
>>> > > saw > Robbin
>>> > > >
>>> > > > Thanks for looking at
>>> the webrev and have a great week!
>>> > >
>>> > > scratching a bit on the
>>> surface of this change, so apologies for
>>> > > rather shallow comments:
>>> > >
>>> > > -
>>> macroAssembler_x86.cpp:5604: while this is compiler code, and I
>>> > > am not sure this is
>>> final, please avoid littering the code with
>>> > > TODO remarks :) They tend
>>> to be candidates for later wtf moments
>>> > > only.
>>> > >
>>> > > Just file a CR for that.
>>> > >
>>> > Newcomer question: what is
>>> a CR and not sure I have the rights to do
>>> > that yet ? :)
>>>
>>> Apologies. CR is a change request; this suggests filing a bug in the
>>> bug tracker. And you are right, you can't just create a new account in
>>> the OpenJDK JIRA yourself. :(
>>>
>>>
>>> Ok good to know, I'll continue with
>>> my own todo list but I'll work hard on not letting it slip in the webrevs
>>> anymore :)
>>>
>>>
>>> I was mostly referring to the
>>> "... but it is a TODO" part of that
>>> comment in
>>> macroassembler_x86.cpp. Comments about the why of the code
>>> are appreciated.
>>>
>>> [Note that I now understand that this is to some degree still work in
>>> progress. As long as the final changeset does not contain TODOs I am
>>> fine (and it's not a hard objection; rather, their use in "final" code
>>> is typically limited in my experience)]
>>>
>>> 5603 // Currently, if this
>>> happens, just set back the actual end to
>>> where it was.
>>> 5604 // We miss a chance to
>>> sample here.
>>>
>>> Would be okay, if explaining
>>> "this" and the "why" of missing a chance
>>> to sample here would be best.
>>>
>>> Like maybe:
>>>
>>> // If we needed to refill
>>> TLABs, just set the actual end point to
>>> // the end of the TLAB again.
>>> We do not sample here although we could.
>>>
>>> Done with your comment, it works
>>> well in my mind.
>>>
>>> I am not sure whether "miss a chance to sample" meant "we could, but
>>> consciously don't because it's not that useful" or "it would be
>>> necessary but we don't because it's too complicated to do".
>>>
>>> Looking at the original comment once more, I am also not sure if that
>>> comment shouldn't be referring to the "end" variable (not actual_end),
>>> because that's the variable that is responsible for taking the sampling
>>> path? (Going from the member description of ThreadLocalAllocBuffer.)
>>>
>>>
>>> I've moved this code and it no
>>> longer shows up here but the rationale and answer was:
>>>
>>> So... yes, end is the variable provoking the sampling; actual_end is
>>> the actual end of the TLAB.
>>>
>>> What was happening here is that the code resets _end to point to the
>>> end of the new TLAB. Because we now have _end for sampling and
>>> _actual_end for the actual end, we need to update _actual_end as well.
>>>
>>> Normally, were we to do the real work here, we would calculate the
>>> (end - start) offset, then:
>>> - Set the new end to: start + (old_end - old_start)
>>> - Set the actual end like we do here now, because it is the
>>> actual end.
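A hypothetical model of the bookkeeping being described (plain arithmetic, not the HotSpot C++; the field names just mirror the discussion): `end` is lowered below `actual_end` so the bump-pointer fast path bails out to the slow path when the next sample is due, even though TLAB space remains.

```java
// Hypothetical model of TLAB sampling bookkeeping. `end` is the value
// the fast path compares against; `actualEnd` is the real TLAB limit.
public class TlabModel {
  public long start, top, end, actualEnd;

  public void fill(long newStart, long size, long bytesUntilSample) {
    start = newStart;
    top = newStart;
    actualEnd = newStart + size;                       // real end of the TLAB
    end = Math.min(actualEnd, top + bytesUntilSample); // sampling trigger point
  }

  // Returns true if the bump-pointer fast path fits; false means the
  // slow path is taken, where the VM would record a sample and move
  // `end` back toward `actualEnd`.
  public boolean fastAllocate(long bytes) {
    if (top + bytes <= end) {
      top += bytes;
      return true;
    }
    return false;
  }
}
```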
>>>
>>> Why is this no longer done here?
>>> - I was still debating which
>>> path to take:
>>> - Do it in the fast refill
>>> code, it has its perks:
>>> - In a world where fast
>>> refills are happening all the time or a lot, we can augment there the code
>>> to do the sampling
>>> - Remember what we had as an
>>> end before leaving the slowpath and check on return
>>> - This is what I'm doing
>>> now, it removes the need to go fix up all fast refill paths but if you
>>> remain in fast refill paths,
>>> you won't get sampling. I
>>> have to think of the consequences
>>> of that, maybe a future change later on?
>>> - I have the
>>> statistics now so I'm going to study that
>>> -> By the way, though my statistics show I'm missing some
>>> samples, if I turn off FastTlabRefill the loss is the same, so for
>>> now it seems the issue does not come from there in my simple test.
>>>
>>>
>>>
>>> But maybe I am only confused and it's best to just leave the comment
>>> out. :)
>>>
>>> Thinking about it some more,
>>> doesn't this not-sampling in this case
>>> mean that sampling does not
>>> work in any collector that does inline TLAB
>>> allocation at the moment? (Or
>>> is inline TLAB alloc automatically
>>> disabled with sampling
>>> somehow?)
>>>
>>> That would indeed be a bigger
>>> TODO then :)
>>>
>>>
>>> Agreed, this remark made me think that perhaps, as a first step, the
>>> new way of doing it is better, but I did have to:
>>> - Remove the const from the ThreadLocalAllocBuffer remaining and
>>> hard_end methods
>>> - Move hard_end out of the header file to have a bit more logic there
>>>
>>> Please let me know what you think of it and whether you prefer it this
>>> way or changing the fast refills. (I prefer this way now because it is
>>> more incremental.)
>>>
>>>
>>> > > - calling
>>> HeapMonitoring::do_weak_oops() (which should probably be
>>> > > called weak_oops_do() like
>>> other similar methods) only if string
>>> > > deduplication is enabled
>>> (in g1CollectedHeap.cpp:4511) seems wrong.
>>> >
>>> > The call should be at least
>>> around 6 lines up outside the if.
>>> >
>>> > Preferentially in a method
>>> like process_weak_jni_handles(), including
>>> > additional logging. (No new
>>> (G1) gc phase without minimal logging
>>> > :)).
>>> > Done but really not sure
>>> because:
>>> >
>>> > I put for logging:
>>> > log_develop_trace(gc,
>>> freelist)("G1ConcRegionFreeing [other] : heap
>>> > monitoring");
>>>
>>> I would think that "gc, ref" would be more appropriate log tags for
>>> this, similar to jni handles.
>>> (I am also not sure what weak reference handling has to do with
>>> G1ConcRegionFreeing, so I am a bit puzzled)
>>>
>>>
>>> I was not sure what to put for the
>>> tags or really as the message. I cleaned it up a bit now to:
>>> log_develop_trace(gc,
>>> ref)("HeapSampling [other] : heap monitoring processing");
>>>
>>>
>>>
>>> > Since weak_jni_handles
>>> didn't have logging for me to be inspired
>>> > from, I did that, but I'm unconvinced this is what should be done.
>>>
>>> The JNI handle processing does have logging, but only in
>>> ReferenceProcessor::process_discovered_references().
>>> In process_weak_jni_handles() only overall time is measured (in a G1
>>> specific way, since only G1 supports disabling reference processing) :/
>>>
>>> The code in ReferenceProcessor
>>> prints both time taken
>>> referenceProcessor.cpp:254, as
>>> well as the count, but strangely only in
>>> debug VMs.
>>>
>>> I have no idea why this
>>> logging is that unimportant to only print that
>>> in a debug VM. However there
>>> are reviews out for changing this area a
>>> bit, so it might be useful to
>>> wait for that (JDK-8173335).
>>>
>>>
>>> I cleaned it up a bit anyway and
>>> now it returns the count of objects that are in the system.
>>>
>>>
>>> > > - the change doubles the
>>> size of
>>> > >
>>> CollectedHeap::allocate_from_tlab_slow() above the "small and nice"
>>> > > threshold. Maybe it could
>>> be refactored a bit.
>>> > Done I think, it looks
>>> better to me :).
>>>
>>> In
>>> ThreadLocalAllocBuffer::handle_sample() I think the
>>> set_back_actual_end()/pick_next_sample()
>>> calls could be hoisted out of
>>> the "if" :)
>>>
>>>
>>> Done!
>>>
>>>
>>> > > -
>>> referenceProcessor.cpp:261: the change should add logging about
>>> > > the number of references
>>> encountered, maybe after the corresponding
>>> > > "JNI weak reference count"
>>> log message.
>>> > Just to double check, are you saying that you'd like the heap
>>> > sampler to keep track of how many sampled objects were encountered
>>> > in HeapMonitoring::weak_oops_do?
>>> > - Would a return of the
>>> method with the number of handled
>>> > references and logging that
>>> work?
>>>
>>> Yes, it's fine if
>>> HeapMonitoring::weak_oops_do() only returned the
>>> number of processed weak oops.
>>>
>>>
>>> Done also (but I admit I have not
>>> tested the output yet) :)
>>>
>>>
>>> > - Additionally, would you
>>> prefer it in a separate block with its
>>> > GCTraceTime?
>>>
>>> Yes. Both kinds of information are interesting: while the time taken
>>> is typically more important, the next question would be why, and the
>>> number of references typically goes a long way there.
>>>
>>> See above though, it is
>>> probably best to wait a bit.
>>>
>>>
>>> Agreed that I "could" wait but, if it's ok, I'll just refactor/remove
>>> this when we get closer to something final. Either JDK-8173335 will
>>> have gone in by then and I will notice it, or it will go in soon after
>>> and I can change it then.
>>>
>>>
>>> > > -
>>> threadLocalAllocBuffer.cpp:331: one more "TODO"
>>> > Removed it and added it to
>>> my personal todos to look at.
>>> > > >
>>> > > -
>>> threadLocalAllocBuffer.hpp: ThreadLocalAllocBuffer class
>>> > > documentation should be
>>> updated about the sampling additions. I
>>> > > would have no clue what
>>> the difference between "actual_end" and
>>> > > "end" would be from the
>>> given information.
>>> > If you are talking about the
>>> comments in this file, I made them more
>>> > clear I hope in the new
>>> webrev. If it was somewhere else, let me know
>>> > where to change.
>>
>>
>