Low-Overhead Heap Profiling
Erik Österlund
erik.osterlund at oracle.com
Fri Feb 2 16:44:37 UTC 2018
Hi JC,
Hope I am reviewing the right version of your work. Here goes...
src/hotspot/share/gc/shared/collectedHeap.inline.hpp:
159 AllocTracer::send_allocation_outside_tlab(klass, result, size
* HeapWordSize, THREAD);
160
161 THREAD->tlab().handle_sample(THREAD, result, size);
162 return result;
163 }
Should not call tlab()->X without checking if (UseTLAB) IMO.
src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp:
So first of all, there seems to quite a few ends. There is an "end", a
"hard end", a "slow path end", and an "actual end". Moreover, it seems
like the "hard end" is actually further away than the "actual end". So
the "hard end" seems like more of a "really definitely actual end" or
something. I don't know about you, but I think it looks kind of messy.
In particular, I don't feel like the name "actual end" reflects what it
represents, especially when there is another end that is behind the
"actual end".
413 HeapWord* ThreadLocalAllocBuffer::hard_end() {
414 // Did a fast TLAB refill occur?
415 if (_slow_path_end != _end) {
416 // Fix up the actual end to be now the end of this TLAB.
417 _slow_path_end = _end;
418 _actual_end = _end;
419 }
420
421 return _actual_end + alignment_reserve();
422 }
I really do not like making getters unexpectedly have these kind of side
effects. It is not expected that when you ask for the "hard end", you
implicitly update the "slow path end" and "actual end" to new values.
src/hotspot/share/prims/jvmti.xml:
10357 <capabilityfield id="can_sample_heap" since="9">
10358 <description>
10359 Can sample the heap.
10360 If this capability is enabled then the heap sampling
methods can be called.
10361 </description>
10362 </capabilityfield>
Looks like this capability should not be "since 9" if it gets integrated
now.
src/hotspot/share/runtime/heapMonitoring.cpp:
448 if (is_alive->do_object_b(value)) {
449 // Update the oop to point to the new object if it is
still alive.
450 f->do_oop(&(trace.obj));
451
452 // Copy the old trace, if it is still live.
453 _allocated_traces->at_put(curr_pos++, trace);
454
455 // Store the live trace in a cache, to be served up on /heapz.
456 _traces_on_last_full_gc->append(trace);
457
458 count++;
459 } else {
460 // If the old trace is no longer live, add it to the list of
461 // recently collected garbage.
462 store_garbage_trace(trace);
463 }
In the case where the oop was not live, I would like it to be explicitly
cleared.
Also I see a lot of concurrent-looking use of the following field:
267 volatile bool _initialized;
Please note that the "volatile" qualifier does not help with reordering
here. Reordering between volatile and non-volatile fields is completely
free for both compiler and hardware, except for windows with MSVC, where
volatile semantics is defined to use acquire/release semantics, and the
hardware is TSO. But for the general case, I would expect this field to
be stored with OrderAccess::release_store and loaded with
OrderAccess::load_acquire. Otherwise it is not thread safe.
As a kind of meta comment, I wonder if it would make sense to add
sampling for non-TLAB allocations. Seems like if someone is rapidly
allocating a whole bunch of 1 MB objects that never fit in a TLAB, I
might still be interested in seeing that in my traces, and not get
surprised that the allocation rate is very high yet not showing up in
any profiles.
Thanks,
/Erik
On 2018-01-26 06:45, JC Beyler wrote:
> Thanks Robbin for the reviews :)
>
> The new full webrev is here:
> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.03/
> The incremental webrev is here:
> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.02_03/
>
> I inlined my answers:
>
> On Thu, Jan 25, 2018 at 1:15 AM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
>> Hi JC, great to see another revision!
>>
>> ####
>> heapMonitoring.cpp
>>
>> StackTraceData should not contain the oop for 'safety' reasons.
>> When StackTraceData is moved from _allocated_traces:
>> L452 store_garbage_trace(trace);
>> it contains a dead oop.
>> _allocated_traces could instead be a tupel of oop and StackTraceData thus
>> dead oops are not kept.
> Done I used inheritance to make the copier work regardless but the
> idea is the same.
>
>> You should use the new Access API for loading the oop, something like this:
>> RootAccess<ON_PHANTOM_OOP_REF | AS_NO_KEEPALIVE>::load(...)
>> I don't think you need to use Access API for clearing the oop, but it would
>> look nicer. And you shouldn't probably be using:
>> Universe::heap()->is_in_reserved(value)
> I am unfamiliar with this but I think I did do it like you wanted me
> to (all tests pass so that's a start). I'm not sure how to clear the
> oop exactly, is there somewhere that does that, which I can use to do
> the same?
>
> I removed the is_in_reserved, this came from our internal version, I
> don't know why it was there but my tests work without so I removed it
> :)
>
>
>> The lock:
>> L424 MutexLocker mu(HeapMonitorStorage_lock);
>> Is not needed as far as I can see.
>> weak_oops_do is called in a safepoint, no TLAB allocation can happen and
>> JVMTI thread can't access these data-structures. Is there something more to
>> this lock that I'm missing?
> Since a thread can call the JVMTI getLiveTraces (or any of the other
> ones), it can get to the point of trying to copying the
> _allocated_traces. I imagine it is possible that this is happening
> during a GC or that it can be started and a GC happens afterwards.
> Therefore, it seems to me that you want this protected, no?
>
>
>> ####
>> You have 6 files without any changes in them (any more):
>> g1CollectedHeap.cpp
>> psMarkSweep.cpp
>> psParallelCompact.cpp
>> genCollectedHeap.cpp
>> referenceProcessor.cpp
>> thread.hpp
>>
> Done.
>
>> ####
>> I have not looked closely, but is it possible to hide heap sampling in
>> AllocTracer ? (with some minor changes to the AllocTracer API)
>>
> I am imagining that you are saying to move the code that does the
> sampling code (change the tlab end, do the call to HeapMonitoring,
> etc.) into the AllocTracer code itself? I think that is right and I'll
> look if that is possible and prepare a webrev to show what would be
> needed to make that happen.
>
>> ####
>> Minor nit, when declaring pointer there is a little mix of having the
>> pointer adjacent by type name and data name. (Most hotspot code is by type
>> name)
>> E.g.
>> heapMonitoring.cpp:711 jvmtiStackTrace *trace = ....
>> heapMonitoring.cpp:733 Method* m = vfst.method();
>> (not just this file)
>>
> Done!
>
>> ####
>> HeapMonitorThreadOnOffTest.java:77
>> I would make g_tmp volatile, otherwise the assignment in loop may
>> theoretical be skipped.
>>
> Also done!
>
> Thanks again!
> Jc
More information about the serviceability-dev
mailing list