JDK-8171119: Low-Overhead Heap Profiling

Tue Feb 6 04:40:13 UTC 2018

Hi Jc,

I've just been browsing this and have a few comments/queries

src/hotspot/share/prims/jvmtiHeapTransition.hpp

In HeapThreadTransition what are the possible entry states? The primary 
state is presumably _in_native, but what else is expected? You can't 
transition from arbitrary states to _in_vm

Also in that file the include guard has the wrong name:

  #ifndef SHARE_VM_PRIMS_JVMTIHEAPSAMPLING_HPP

etc.

---

src/hotspot/share/runtime/heapMonitoring.cpp

Can you change StackTraceData from a struct to a class please.

StackTraceData::free_data could still be an instance method even if not 
done in the destructor. I also think the caller should do the delete 
rather than hiding it in here, as this obscures the lifecycle of the 
instances.

  215   void initialize(int max_storage) {
  216     MutexLocker mu(HeapMonitorStorage_lock);
  217     allocate_storage(max_storage);
  218     memset(&_stats, 0, sizeof(_stats));
  219   }

You're using a lock so obviously expect multiple threads to try this, 
but you aren't checking if initialization has already been done. That 
check is inside allocate_storage, but you're doing the memset 
unconditionally.

You're inconsistent with accessing/modifying _initialized under the 
HeapMonitorStorage_lock. The destructor calls free_storage with no lock 
(obviously) held. It's unclear how this lock-free destructor fits into 
the concurrent usage expected of this class. I see from your reply to 
Erik that you've now added some orderAccess usage to be "paranoid". But 
we don't want paranoid, we want correct. If there is expected to be a 
lock-free, but potentially concurrent path, then you will need to use 
OrderAccess appropriately. If everything is actually being done under a 
lock, then you don't need to use OrderAccess. But you need to be clear 
on exactly how concurrency comes into play with your code.

  397   _allocated_traces = new (ResourceObj::C_HEAP, mtInternal)
  398       GrowableArray<StackTraceData>(128, true);

These are allocations that will abort the VM if they fail - correct? If 
so that seems rather user-unfriendly for a profiling tool!

  422 void StackTraceStorage::weak_oops_do(BoolObjectClosure *is_alive,
  423                                      OopClosure *f) {
  424   MutexLocker mu(HeapMonitorStorage_lock);

IIUC oops-do methods get called at safepoints, but you are then taking a 
Mutex unconditionally. So any other code that acquires the same mutex 
must be guaranteed not to hit a safepoint check - so NoSafepointVerifier 
should be used to check that.

  575 void StackTraceStorage::store_garbage_trace(const StackTraceData 
&trace) {
  576   StackTraceData *new_trace = new StackTraceData();
  577   *new_trace = trace;

This looks quite odd to me. Shouldn't this just be using a copy-constructor?

Thanks,
David
-----

On 27/01/2018 2:01 PM, JC Beyler wrote:
> (Change of subject to contain the bug number :)).
> 
> New webrev is here:
> Incremental:
> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.03_04/
> 
> Full webrev:
> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.04/
> 
> Inlined are my answers again :).
> 
> On Fri, Jan 26, 2018 at 5:51 AM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
>> Hi JC, (dropping compiler on mail thread)
>>
>> On 01/26/2018 06:45 AM, JC Beyler wrote:
>>>
>>> Thanks Robbin for the reviews :)
>>>
>>> The new full webrev is here:
>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.03/
>>> The incremental webrev is here:
>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.02_03/
>>
>>
>> Thanks!
>>
>> I got this compile issue:
>> /home/rehn/source/jdk/small/open/src/hotspot/share/gc/shared/threadLocalAllocBuffer.hpp:
>> In static member function 'static void
>> G1ResManTLABCache::put(ThreadLocalAllocBuffer&, AllocationContext_t)':
>> /home/rehn/source/jdk/small/open/src/hotspot/share/gc/shared/threadLocalAllocBuffer.hpp:97:13:
>> error: 'HeapWord* ThreadLocalAllocBuffer::hard_end()' is private
>>     HeapWord* hard_end();
>>               ^
>> /home/rehn/source/jdk/small/closed/src/hotspot/share/gc/g1/g1ResManTLABCache.cpp:52:50:
>> error: within this context
>>     size_t remaining = pointer_delta(tlab.hard_end(), tlab.top());
> 
> This is strange but I'm assuming it is because we are not working on
> the same repo?
> 
> I used:
> hg clone http://hg.openjdk.java.net/jdk/hs jdkhs-heap
> 
> I'll try a new clone on Monday and see. My new version moved hard_end
> back to public so it should work now.
> 
> 
> 
>>
>>>
>>> I inlined my answers:
>>>
>>> On Thu, Jan 25, 2018 at 1:15 AM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
>>>>
>>>> Hi JC, great to see another revision!
>>>>
>>>> ####
>>>> heapMonitoring.cpp
>>>>
>>>> StackTraceData should not contain the oop for 'safety' reasons.
>>>> When StackTraceData is moved from _allocated_traces:
>>>> L452 store_garbage_trace(trace);
>>>> it contains a dead oop.
>>>> _allocated_traces could instead be a tupel of oop and StackTraceData thus
>>>> dead oops are not kept.
>>>
>>>
>>> Done I used inheritance to make the copier work regardless but the
>>> idea is the same.
>>
>>
>> Looks good.
>>
>>>
>>>>
>>>> You should use the new Access API for loading the oop, something like
>>>> this:
>>>> RootAccess<ON_PHANTOM_OOP_REF | AS_NO_KEEPALIVE>::load(...)
>>>> I don't think you need to use Access API for clearing the oop, but it
>>>> would
>>>> look nicer. And you shouldn't probably be using:
>>>> Universe::heap()->is_in_reserved(value)
>>>
>>>
>>> I am unfamiliar with this but I think I did do it like you wanted me
>>> to (all tests pass so that's a start). I'm not sure how to clear the
>>> oop exactly, is there somewhere that does that, which I can use to do
>>> the same?
>>>
>>> I removed the is_in_reserved, this came from our internal version, I
>>> don't know why it was there but my tests work without so I removed it
>>> :)
>>
>>
>> I talked a bit with GC folks about this today.
>> So actually the oop should be in the oopStorage and your should have a
>> weakhandle to that oop (at least in the near future).
>> But this should work for now...
>> In the future when we have the oop in oppStorage you will not get called, so
>> you will not know when the oops are dead, either GC must trigger a
>> concurrent vm operation (e.g. _no_ safepoint operation) so we can clear dead
>> oops or do periodic scanning.
>>
>> It would be good with some input here from Thomas that knows what you are
>> doing and knows GC.
> 
> Fair enough, hopefully Thomas will chime in. Are you saying that this
> first version could go in and we can work on a refinement? Or are you
> saying I should work on this now at the same time and fix it before
> this V1 goes in? (Just so I know :))
> 
>>
>>>
>>>
>>>>
>>>> The lock:
>>>> L424   MutexLocker mu(HeapMonitorStorage_lock);
>>>> Is not needed as far as I can see.
>>>> weak_oops_do is called in a safepoint, no TLAB allocation can happen and
>>>> JVMTI thread can't access these data-structures. Is there something more
>>>> to
>>>> this lock that I'm missing?
>>>
>>>
>>> Since a thread can call the JVMTI getLiveTraces (or any of the other
>>> ones), it can get to the point of trying to copying the
>>> _allocated_traces. I imagine it is possible that this is happening
>>> during a GC or that it can be started and a GC happens afterwards.
>>> Therefore, it seems to me that you want this protected, no?
> 
> Looks like it yes, I removed it, ran my tests and they work so the new
> webrev no longer has that mutex at all.
> 
>>
>>
>> A thread calling getLiveTraces will be stopped in the HeapThreadTransition.
>> A safepoint don't allow any threads to be in_vm or to be in_java during
>> safepoint. Only threads in native can execute, but they will be stopped on
>> any transition. If a thread is in_vm the safepoint waits to a transition
>> (locking a mutex is also transition to blocked). So if weak_oops is called
>> you have an guarantee that no threads are executing inside the VM or
>> executing Java code. (not counting GC threads of course)
>> This also means that the lock can never be contented when weak_oops is
>> called, so it's not harmful.
>>
>>>
>>>
>>>>
>>>> ####
>>>> You have 6 files without any changes in them (any more):
>>>> g1CollectedHeap.cpp
>>>> psMarkSweep.cpp
>>>> psParallelCompact.cpp
>>>> genCollectedHeap.cpp
>>>> referenceProcessor.cpp
>>>> thread.hpp
>>>>
>>>
>>> Done.
>>>
>>>> ####
>>>> I have not looked closely, but is it possible to hide heap sampling in
>>>> AllocTracer ? (with some minor changes to the AllocTracer API)
>>>>
>>>
>>> I am imagining that you are saying to move the code that does the
>>> sampling code (change the tlab end, do the call to HeapMonitoring,
>>> etc.) into the AllocTracer code itself? I think that is right and I'll
>>> look if that is possible and prepare a webrev to show what would be
>>> needed to make that happen.
>>
>>
>> Sounds good.
> 
> I'll look at this on Monday then!
> 
> Thanks for the reply and have a great weekend!
> Jc
> 
>>
>>>
>>>> ####
>>>> Minor nit, when declaring pointer there is a little mix of having the
>>>> pointer adjacent by type name and data name. (Most hotspot code is by
>>>> type
>>>> name)
>>>> E.g.
>>>> heapMonitoring.cpp:711     jvmtiStackTrace *trace = ....
>>>> heapMonitoring.cpp:733         Method* m = vfst.method();
>>>> (not just this file)
>>>>
>>>
>>> Done!
>>>
>>>> ####
>>>> HeapMonitorThreadOnOffTest.java:77
>>>> I would make g_tmp volatile, otherwise the assignment in loop may
>>>> theoretical be skipped.
>>>>
>>>
>>> Also done!
>>
>>
>> Looks good, thanks!
>>
>> /Robbin
>>
>>>
>>> Thanks again!
>>> Jc
>>>
>>