RFR(M, v7): JDK-8059036 : Implement Diagnostic Commands for heap and finalizerinfo

Wed May 20 11:19:58 UTC 2015

Dmitry,

I’ve look at the changes on the hotspot side.

vm/services/diagnosticCommand.hpp:

 270   static const char* impact() {
 271     return "Low";
 272   }

I wonder if the impact should be “Medium” instead. There aren’t any good guidelines for what impact means, but finalizerinfo does have non-negible impact.

test/serviceability/dcmd/gc/FinalizerInfoTest.java:

 69             while(wasInitialized != objectsCount);

I don’t get the point of this loop. wasInitialized will always be equal to objectsCount at this point.

 72             while(wasTrapped < 1);

Perhaps the System.gc() call should be inside this loop?

test/serviceability/dcmd/gc/HeapInfoTest.java:

Can you add some output verification here?

/Staffan

> On 18 maj 2015, at 14:17, Dmitry Samersoff <dmitry.samersoff at oracle.com> wrote:
> 
> Everyone,
> 
> Please review updated version of the fix:
> 
> http://cr.openjdk.java.net/~dsamersoff/JDK-8059036/webrev.07/
> 
> Most important part of the fix provided by Peter Levart, so all
> credentials belongs to him.
> 
> -Dmitry
> 
> On 2015-05-16 15:48, Peter Levart wrote:
>> 
>> 
>> On 05/16/2015 02:38 PM, Peter Levart wrote:
>>> 
>>> 
>>> On 05/16/2015 09:35 AM, Dmitry Samersoff wrote:
>>>> Derek,
>>>> 
>>>> Personally, I'm for volatile over explicit fence too.
>>>> 
>>>> So I'll change the code if Peter don't have objections.
>>> 
>>> There are no objections. There's one unneeded volatile store to .next
>>> field then in enqueue(), but it is followed by another volatile store,
>>> so there shouldn't be overhead on Intel and SPARC at least.
>>> 
>>> Regards, Peter
>> 
>> If you make .next field volatile, then perhaps you could also cache it's
>> value in reallyPoll() into a local variable, so that it is not read
>> twice. Like this:
>> 
>>    private Reference<? extends T> reallyPoll() {       /* Must hold lock */
>>        Reference<? extends T> r = head;
>>        if (r != null) {
>>            Reference rn = r.next; // HERE !!!
>>            head = (rn == r) ?
>>                null :
>>                rn; // Unchecked due to the next field having a raw type
>> in Reference
>>            r.queue = NULL;
>>            r.next = r;
>>            queueLength--;
>>            if (r instanceof FinalReference) {
>>                sun.misc.VM.addFinalRefCount(-1);
>>            }
>>            return r;
>>        }
>>        return null;
>>    }
>> 
>> 
>> 
>> Peter
>> 
>> 
>>> 
>>>> -Dmitry
>>>> 
>>>> On 2015-05-16 01:59, Derek White wrote:
>>>>> Hi Dmitry, Peter,
>>>>> 
>>>>> I should read my email more frequently - I missed Dmitry's last webrev
>>>>> email.
>>>>> 
>>>>> My comments on a preference for volatile over Unsafe are not a strong
>>>>> objection to using Unsafe fences. I just don't have enough experience
>>>>> with Unsafe fences yet to agree that this use is correct.
>>>>> 
>>>>> While looking over this Reference code I found 3-6 really simple things
>>>>> that need cleaning up that have been there for years, so I'm not a big
>>>>> cheerleader for more complicated things here :-)
>>>>> 
>>>>> - Derek
>>>>> 
>>>>> On 5/15/15 6:46 PM, Derek White wrote:
>>>>>> Hi Peter,
>>>>>> 
>>>>>> Some comments below...
>>>>>> 
>>>>>> On 5/14/15 3:50 AM, Peter Levart wrote:
>>>>>>> Hi Derek,
>>>>>>> 
>>>>>>> On 05/13/2015 10:34 PM, Derek White wrote:
>>>>>>>> Hi Peter,
>>>>>>>> 
>>>>>>>> I don't have smoking gun evidence that your change introduces a bug,
>>>>>>>> but I have some concerns...
>>>>>>> I did have a concern too, ...
>>>>>>> 
>>>>>>>> On 5/12/15 6:05 PM, Peter Levart wrote:
>>>>>>>>> Hi Dmitry,
>>>>>>>>> 
>>>>>>>>> You iterate the queue then, not the unfinalized list. That's more
>>>>>>>>> logical.
>>>>>>>>> 
>>>>>>>>> Holding the queue's lock may pause reference handler and finalizer
>>>>>>>>> threads for the entire time of iteration. This can blow up the
>>>>>>>>> application. Suppose one wants to diagnose the application because
>>>>>>>>> he suspects that finalizer thread hardly keeps up with production
>>>>>>>>> of finalizable instances. This can happen if the allocation rate of
>>>>>>>>> finalizable objects is very high and/or finalize() methods are not
>>>>>>>>> coded to be fast enough. Suppose the queue of Finalizer(s) builds
>>>>>>>>> up gradually and is already holding several million objects. Now
>>>>>>>>> you initiate the diagnostic command to print the queue. The
>>>>>>>>> iteration over and grouping/counting of several millions of
>>>>>>>>> Finalizer(s) takes some time. This blocks finalizer thread and
>>>>>>>>> reference handler thread. But does not block the threads that
>>>>>>>>> allocate finalizable objects. By the time the iteration is over,
>>>>>>>>> the JVM blows up and application slows down to a halt doing GCs
>>>>>>>>> most of the time, getting OOMEs, etc...
>>>>>>>>> 
>>>>>>>>> It is possible to iterate the elements of the queue for diagnostic
>>>>>>>>> purposes without holding a lock. The modification required to do
>>>>>>>>> that is the following (havent tested this, but I think it should work):
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> http://cr.openjdk.java.net/~plevart/misc/Finalizer.printFinalizationQueue/webrev.01/
>>>>>>>>> 
>>>>>>>> One issue to watch out for is the garbage collectors inspect the
>>>>>>>> Reference.next field from C code directly (to distinguish active vs.
>>>>>>>> pending, iterate over oops, etc.). You can search hotspot for
>>>>>>>> java_lang_ref_Reference::next*, java_lang_ref_Reference::set_next*.
>>>>>>>> 
>>>>>>>> Your change makes "inactive" References superficially look like
>>>>>>>> "enqueued" References. The GC code is rather subtle and I haven't
>>>>>>>> yet seen a case where it would get confused by this change, but
>>>>>>>> there could easily be something like that lurking in the GC code.
>>>>>>> ...but then I thought that GC can in no way treat a Reference
>>>>>>> differently whether it is still enqueued in a ReferenceQueue or
>>>>>>> already dequeued (inactive) - responsibility is already on the Java
>>>>>>> side. Currently the definition of Reference.next is this:
>>>>>>> 
>>>>>>>    /* When active:   NULL
>>>>>>>     *     pending:   this
>>>>>>>     *    Enqueued:   next reference in queue (or this if last)
>>>>>>>     *    Inactive:   this
>>>>>>>     */
>>>>>>>    @SuppressWarnings("rawtypes")
>>>>>>>    Reference next;
>>>>>>> 
>>>>>>> We see that, unless GC inspects all ReferenceQueue instances and
>>>>>>> scans their lists too, the state of a Reference that is enqueued as
>>>>>>> last in list is indistinguishable from the state of inactive
>>>>>>> Reference. So I deduced that this distinction (enqueued/inactive)
>>>>>>> can't be established solely on the value of .next field ( == this or
>>>>>>> != this)...
>>>>>>> 
>>>>>>> But I should inspect the GC code too to build a better understanding
>>>>>>> of that part of the story...
>>>>>>> 
>>>>>>> Anyway. It turns out that there is already enough state in Reference
>>>>>>> to destinguish between it being enqueued as last in list and already
>>>>>>> dequeued (inactive) - the additional state is Reference.queue that
>>>>>>> transitions from ReferenceQueue.ENQUEUED -> ReferenceQueue.NULL in
>>>>>>> ReferenceQueue.reallyPoll. We need to just make sure the two fields
>>>>>>> (r.next and r.queue) are assigned and read in correct order. This can
>>>>>>> be achieved either by making Reference.next a volatile field or by a
>>>>>>> couple of explicit fences:
>>>>>>> 
>>>>>>> 
>>>>>>> http://cr.openjdk.java.net/~plevart/misc/Finalizer.printFinalizationQueue/webrev.02/
>>>>>>> 
>>>>>>> The assignment of r.queue to ReferenceQueue.ENQUEUED already happens
>>>>>>> before linking the reference into the queue's head in
>>>>>>> ReferenceQueue.enqueue():
>>>>>>> 
>>>>>>>            r.queue = ENQUEUED;
>>>>>>>            r.next = (head == null) ? r : head;
>>>>>>>            head = r;
>>>>>>> 
>>>>>>> Both stores are volatile.
>>>>>> This sounds a bit better. I don't have a good handle on the pros and
>>>>>> cons of a volatile field over explicit fences via Unsafe in this case.
>>>>>> Kim might jump in soon.
>>>>>> 
>>>>>> I'd like to suggest an entirely less-clever solution. Since the
>>>>>> problem is that forEach() might see inconsistent values for the queue
>>>>>> and next fields of a Reference, but we don't want to grab a lock
>>>>>> walking the whole queue, we could instead grab the lock as we look at
>>>>>> each Reference (and do the "back-up" trick if we were interfered
>>>>>> with). Of course this is slower than going lock-free, but it's not any
>>>>>> more overhead than the ReferenceHandler thread or FinalizerThreads incur.
>>>>>> 
>>>>>> The only benefit of this solution over the others is that it is less
>>>>>> clever, and I'm not sure how much cleverness this problem deserves.
>>>>>> Given that, and ranking the solutions as "lock" < (clever) "volatile"
>>>>>> < "fence", I'd be happier with the "volatile" or "lock" solution if
>>>>>> that is sufficient.
>>>>>> 
>>>>>> - Derek
>>>>>>>>> I also suggest the addition to the ReferenceQueue to be contained
>>>>>>>>> (package-private) and as generic as possible. That's why I suggest
>>>>>>>>> forEach iteration method with no concrete logic.
>>>>>>>>> 
>>>>>>>>> It would be possible to encapsulate the entire logic into a special
>>>>>>>>> package-private class (say java.lang.ref.DiagnosticCommands) with
>>>>>>>>> static method(s) (printFinalizationQueue)... You could just expose
>>>>>>>>> a package-private forEach static method from Finalizer and code the
>>>>>>>>> rest in DiagnosticCommands.
>>>>>>>> That's good for encapsulation. But I'm concerned that if "forEach"
>>>>>>>> got exposed beyond careful use in DiagnosticCommands, and the
>>>>>>>> Referents were leaked back into the heap, then we could get
>>>>>>>> unexpected object resurrection after finalization. This isn't a bug
>>>>>>>> on it's own - any finalizer might resurrect its object if not
>>>>>>>> careful, but ordinarily /only/ the finalizer could resurrect the
>>>>>>>> object. This could invalidate application invariants?
>>>>>>> Well, it all stays in the confines of package-private API - internal
>>>>>>> to JDK. Any future additional use should be reviewed carefully.
>>>>>>> Comments on the forEach() method should warn about that.
>>>>>>> 
>>>>>>>> I agree that using a lock over the ReferenceQueue iteration opens up
>>>>>>>> /another/ case of diagnostics affecting application behavior. And
>>>>>>>> there are pathological scenarios where this gets severe. This is
>>>>>>>> unfortunate but not uncommon. There is enough complication here that
>>>>>>>> you should be sure that the fix for diagnostics performance doesn't
>>>>>>>> introduce subtle bugs to the system in general. A change in this
>>>>>>>> area should get a thorough review from both the runtime and GC sides.
>>>>>>> Is the webrev.02 proposed above more acceptable in that respect? It
>>>>>>> does not introduce any logical changes to existing code.
>>>>>>> 
>>>>>>>> Better yet, the reference handling code in GC and runtime should
>>>>>>>> probably be thrown out and re-written, but that's another issue :-)
>>>>>>> I may have some proposals in that direction. Stay tuned.
>>>>>>> 
>>>>>>> Regards, Peter
>>>>>>> 
>>>>>>>> - Derek
>>>>>>>> 
>>>>>>>>> Regards, Peter
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 05/12/2015 07:10 PM, Dmitry Samersoff wrote:
>>>>>>>>>> Everybody,
>>>>>>>>>> 
>>>>>>>>>> Updated version:
>>>>>>>>>> 
>>>>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8059036/webrev.03/
>>>>>>>>>> 
>>>>>>>>>> Now it iterates over queue and output result sorted by number of instances.
>>>>>>>>>> 
>>>>>>>>>> -Dmitry
>>>>>>>>>> 
>>>>>>>>>> On 2015-05-07 00:51, Derek White wrote:
>>>>>>>>>>> Hi Dmitry, Staffan,
>>>>>>>>>>> 
>>>>>>>>>>> Lots of good comments here.
>>>>>>>>>>> 
>>>>>>>>>>> On the topic of what list should be printed out, I think we should focus
>>>>>>>>>>> on objects waiting to be finalized - e.g. the contents of the
>>>>>>>>>>> ReferenceQueue. It's more of a pain to walk the ReferenceQueue, but you
>>>>>>>>>>> could add a summerizeQueue(TreeMap<String, Integer>) method, or a
>>>>>>>>>>> general iterator and lambda.
>>>>>>>>>>> 
>>>>>>>>>>> A histogram of objects with finalize methods might also be interesting,
>>>>>>>>>>> but you can get most of the same information from a heap histogram
>>>>>>>>>>> (ClassHistogramDCmd, and search for count of Finalizer instances).
>>>>>>>>>>> 
>>>>>>>>>>> BTW, by only locking the ReferenceQueue, we should only be blocking the
>>>>>>>>>>> FinalizerThread from making progress (and I think there's a GC thread
>>>>>>>>>>> that runs after GC that sorts found References objects that need
>>>>>>>>>>> processing into their respective ReferenceQueues). But locking the
>>>>>>>>>>> "unfinalized" list blocks initializing any object with a finalize() method.
>>>>>>>>>>> 
>>>>>>>>>>> The sorting suggestion is a nice touch.
>>>>>>>>>>> 
>>>>>>>>>>> - Derek White, GC team
>>>>>>>>>>> 
>>>>>>>>>>> On 5/5/15 10:40 AM, Peter Levart wrote:
>>>>>>>>>>>> Hi Dmitry, Staffan,
>>>>>>>>>>>> 
>>>>>>>>>>>> On 05/05/2015 12:38 PM, Staffan Larsen wrote:
>>>>>>>>>>>>> Dmitry,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I think this should be reviewed on hotspot-gc and core-libs-dev as
>>>>>>>>>>>>> well considering the changes to Finalizer.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I’m a little worried about the potentially very large String that is
>>>>>>>>>>>>> returned from printFinalizationQueue(). A possible different approach
>>>>>>>>>>>>> would be to write printFinalizationQueue() in C++ inside Hotspot.
>>>>>>>>>>>>> That would allow for outputting one line at a time. The output would
>>>>>>>>>>>>> still be saved in memory (since the stream is buffered), but at least
>>>>>>>>>>>>> the data is only saved once in memory, then. It would make the code a
>>>>>>>>>>>>> bit harder to write, so its a question of how scared we are of
>>>>>>>>>>>>> running out of memory.
>>>>>>>>>>>> If the output is just a histogram of # of instances per class name,
>>>>>>>>>>>> then it should not be too large, as there are not many different
>>>>>>>>>>>> classes overriding finalize() method (I counted 72 overriddings of
>>>>>>>>>>>> finalize() method in the whole jdk/src tree).
>>>>>>>>>>>> 
>>>>>>>>>>>> I'm more concerned about the fact that while traversing the list, a
>>>>>>>>>>>> lock is held, which might impact prompt finalization(). Is it
>>>>>>>>>>>> acceptable for diagnostic output to impact performance and/or
>>>>>>>>>>>> interfere with synchronization?
>>>>>>>>>>>> 
>>>>>>>>>>>> It might be possible to scan the list without holding a lock for
>>>>>>>>>>>> diagnostic purposes if Finalizer.remove() didn't set the removed
>>>>>>>>>>>> Finalizer.next pointer to point back to itself:
>>>>>>>>>>>> 
>>>>>>>>>>>>    private void remove() {
>>>>>>>>>>>>        synchronized (lock) {
>>>>>>>>>>>>            if (unfinalized == this) {
>>>>>>>>>>>>                if (this.next != null) {
>>>>>>>>>>>>                    unfinalized = this.next;
>>>>>>>>>>>>                } else {
>>>>>>>>>>>>                    unfinalized = this.prev;
>>>>>>>>>>>>                }
>>>>>>>>>>>>            }
>>>>>>>>>>>>            if (this.next != null) {
>>>>>>>>>>>>                this.next.prev = this.prev;
>>>>>>>>>>>>            }
>>>>>>>>>>>>            if (this.prev != null) {
>>>>>>>>>>>>                this.prev.next = this.next;
>>>>>>>>>>>>            }
>>>>>>>>>>>>            // this.next = this; must not be set so that we can
>>>>>>>>>>>> traverse the list unsynchronized
>>>>>>>>>>>>            this.prev = this;   /* Indicates that this has been
>>>>>>>>>>>> finalized */
>>>>>>>>>>>>        }
>>>>>>>>>>>>    }
>>>>>>>>>>>> 
>>>>>>>>>>>> For detecting whether a Finalizer is already processed, the 'prev'
>>>>>>>>>>>> pointer could be used instead:
>>>>>>>>>>>> 
>>>>>>>>>>>>    private boolean hasBeenFinalized() {
>>>>>>>>>>>>        return (prev == this);
>>>>>>>>>>>>    }
>>>>>>>>>>>> 
>>>>>>>>>>>> Also, to make sure a data race dereferencing 'unfinalized' in
>>>>>>>>>>>> unsynchronized printFinalizationQueue() would get you a fully
>>>>>>>>>>>> initialized Finalizer instance (in particular the next pointer), you
>>>>>>>>>>>> would have to make the 'unfinalized' field volatile or insert an
>>>>>>>>>>>> Unsafe.storeFence() before assigning to 'unfinalized':
>>>>>>>>>>>> 
>>>>>>>>>>>>    private void add() {
>>>>>>>>>>>>        synchronized (lock) {
>>>>>>>>>>>>            if (unfinalized != null) {
>>>>>>>>>>>>                this.next = unfinalized;
>>>>>>>>>>>>                unfinalized.prev = this;
>>>>>>>>>>>>            }
>>>>>>>>>>>>            // make sure a data race dereferencing 'unfinalized'
>>>>>>>>>>>>            // in printFinalizationQueue() does see the Finalizer
>>>>>>>>>>>>            // instance fully initialized
>>>>>>>>>>>>            // (in particular the 'next' pointer)
>>>>>>>>>>>>            U.storeFence();
>>>>>>>>>>>>            unfinalized = this;
>>>>>>>>>>>>        }
>>>>>>>>>>>>    }
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> By doing these modifications, I think you can remove
>>>>>>>>>>>> synchronized(lock) {} from printFinalizationQueue().
>>>>>>>>>>>> 
>>>>>>>>>>>>> I see you are traversing the ‘unfinalized’ list in Finalizer, whereas
>>>>>>>>>>>>> the old SA code for ‘-finalizerinfo' traverses the ‘queue’ list. I am
>>>>>>>>>>>>> not sure of the difference, perhaps someone from the GC team can help
>>>>>>>>>>>>> out?
>>>>>>>>>>>> The 'queue' is a ReferenceQueue of Finalizer (FinalReference)
>>>>>>>>>>>> instances pending processing by finalizer thread because their
>>>>>>>>>>>> referents are eligible for finalization (they are not reachable any
>>>>>>>>>>>> more). The 'unfinalized' is a doubly-linked list of all Finalizer
>>>>>>>>>>>> instances for which their referents have not been finalized yet
>>>>>>>>>>>> (including those that are still reachable and alive). The later serves
>>>>>>>>>>>> two purposes:
>>>>>>>>>>>> - it keeps Finalizer instances reachable until they are processed
>>>>>>>>>>>> - it is a source of unfinalized instances for
>>>>>>>>>>>> running-finalizers-on-exit if requested by
>>>>>>>>>>>> System.runFinalizersOnExit(true);
>>>>>>>>>>>> 
>>>>>>>>>>>> So it really depends on what one would like to see. Showing the queue
>>>>>>>>>>>> may be interesting if one wants to see how the finalizer thread is
>>>>>>>>>>>> coping with processing the finalize() invocations. Showing unfinalized
>>>>>>>>>>>> list may be interesting if one wants to know how many live +
>>>>>>>>>>>> finalization pending instances are there on the heap that override
>>>>>>>>>>>> finalize() method.
>>>>>>>>>>>> 
>>>>>>>>>>>> Regards, Peter
>>>>>>>>>>>> 
>>>>>>>>>>>>> For the output, it would be a nice touch to sort it on the number of
>>>>>>>>>>>>> references for each type. Perhaps outputting it more like a table
>>>>>>>>>>>>> (see the old code again) would also make it easier to digest.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> In diagnosticCommand.hpp, the new commands should override
>>>>>>>>>>>>> permission() and limit access:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>   static const JavaPermission permission() {
>>>>>>>>>>>>>     JavaPermission p = {"java.lang.management.ManagementPermission",
>>>>>>>>>>>>>                         "monitor", NULL};
>>>>>>>>>>>>>     return p;
>>>>>>>>>>>>>   }
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The two tests don’t validate the output in any way. Would it be
>>>>>>>>>>>>> possible to add validation? Perhaps hard to make sure an object is on
>>>>>>>>>>>>> the finalizer queue… Hmm.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> /Staffan
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On 5 maj 2015, at 12:01, Dmitry Samersoff
>>>>>>>>>>>>>> <dmitry.samersoff at oracle.com> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Everyone,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Please review the fix:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8059036/webrev.01/
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> heap dcmd outputs the same information as SIGBREAK
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> finalizerinfo dcmd outputs a list of all classes in finalization queue
>>>>>>>>>>>>>> with count
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>> Dmitry Samersoff
>>>>>>>>>>>>>> Oracle Java development team, Saint Petersburg, Russia
>>>>>>>>>>>>>> * I would love to change the world, but they won't give me the sources.
>>> 
>> 
> 
> 
> -- 
> Dmitry Samersoff
> Oracle Java development team, Saint Petersburg, Russia
> * I would love to change the world, but they won't give me the sources.