RFR(S): 8200720: Print additional information in thread dump (times, allocated bytes etc.)

David Holmes david.holmes at oracle.com
Tue Jun 5 02:53:05 UTC 2018


Hi Gunter,

On 5/06/2018 1:27 AM, Haug, Gunter wrote:
> Hi David, Chris,
> 
> would it be an option to unconditionally print the additional information? Regardless which way this is implemented it will be rather complicated and it only switches on/off a few field in the thread dump.

I'm not convinced this is all suitable information for a thread stack 
dump. I can see the time information being useful if using the dump to 
try and diagnoze a hang or responsiveness issue. But the allocated-bytes 
and classes-defined just doesn't seem useful in the context of a thread 
dump to me. Anyone interested in allocation stats is going to be looking 
at GC logs etc which is where this belongs. Ditto the class-stats belong 
in some kind of classloading logging IMHO.

I'm very reluctant to burden the implementation with capturing these 
kinds of stats, just to use for diagnostic purposes that may not even be 
asked for. I'm very much in the "you shouldn't pay for what you don't 
use" camp in that regard. (See my comments in JDK-8143176 referenced by 
Sean.)

The ThreadStatisticInfo adds overhead to every thread object - and I'd 
have to suspect there could be overlap with whatever flight recorder 
sticks in there too. (Thread has become even more bloated in recent time!).

Cheers,
David


> Thanks,
> Gunter
> 
> On 04.06.18, 01:13, "David Holmes" <david.holmes at oracle.com> wrote:
> 
>      Hi Gunter, Chris,
>      
>      Sorry just trying to catch up and this is only a partial look so far ...
>      
>      BTW these changes are not limited to serviceability code so adding in
>      runtime team as well.
>      
>      On 2/06/2018 9:12 AM, Chris Plummer wrote:
>      > Hi Gunter,
>      >
>      > On 6/1/18 3:17 AM, Haug, Gunter wrote:
>      >> Hi Chris,
>      >>
>      >> thanks for looking into this!
>      >>
>      >> Re the synchronization: The value is stored only in a VM operation at
>      >> a safepoint by the VM thread. I was of the opinion, that this could
>      >> not be interrupted by a second VM operation (of the same type). Or am
>      >> I missing something else?
>      > I was really thinking more about the temporary changing of
>      > PrintExtendedThreadInfo, not the value stored in the VMOp. You may be be
>      > correct that no more than one VMOp is executing, but while it is
>      > executing it is has changed the value of PrintExtendedThreadInfo, which
>      > might have an impact on anything else that triggers printing thread info
>      > while the VMOp is executing.
>      
>      Even if nothing else can possibly be running during the safepoint I find
>      it extremely bad form to change a command-line flag in this way,
>      particularly from a safepoint!
>      
>      If this flag is intended to be dynamically enabled as part of a dcmd
>      then it should also be a Manageable flag IMHO.
>      
>      That said ...
>      
>      >>   I did think about passing an argument to the various print_on member
>      >> functions of the thread classes, but this would require rather
>      >> extensive code changes for a rather tiny extension. Therefore I feel
>      >> doing it like this is the lesser evil.
>      
>      ... it's obviously not truly a global setting, but one that is needed on
>      a per-print-request basis. The flag is just the default, but if the
>      default is off you still want to enable extended printing on a
>      per-request basis.
>      
>      The current print_on mechanics is not set up for this kind of
>      flexibility. I think this needs more thought.
>      
>      ---
>      
>      Re osThread.cpp shared code changes ... I really hate to see all the
>      ifdefs in there. Please add a pd_print_on function to support this if
>      you truly want platform specific information.
>      
>      ---
>      
>      threadStatisticInfo.hpp
>      
>      typo: getElepsedTime() -> getElapsedTime()
>      
>      Thanks,
>      David
>      
>      
>      
>      >>
>      >> Re thread_dump(): I think it's correct or, well, at least it works ;-)
>      >> In fact jstack will transfer the "-e" and "-l" in only one string,
>      >> i.e. op->arg(0).
>      > So you are saying that op->arg(0) is "-e -l" when using jstack? I think
>      > you really need to clean up the parsing. As it stands right now passing,
>      > I get the feeling that if a user erroneously asks for help by using
>      > "jcmd <pid> Thread.Print -help", it will end up executing with -e an -l
>      > enabled, and no failure for the invalid "-help" option.
>      >>   Christoph has already explained my reasoning. But I agree, it's not
>      >> nice and I would be happy to do it like Christoph suggested.
>      > I'll respond separately to his suggestion.
>      >
>      > thanks,
>      >
>      > Chris
>      >>
>      >> And typo fixed, sorry.
>      >>
>      >> Thanks again,
>      >> Gunter
>      >>
>      >> On 01.06.18, 00:03, "Chris Plummer" <chris.plummer at oracle.com> wrote:
>      >>
>      >>      Hi Gunter,
>      >>      globals.hpp: fix typo "informatiuon"
>      >>      I worry a little bit about the synchronizing (if that's the right
>      >> word)
>      >>      of PrintExtendedThreadInfo and the dcmd's -e flag. When using -e,
>      >> you
>      >>      are temporarily enabling PrintExtendedThreadInfo if it was false.
>      >> This
>      >>      temporarily changes the behavior of thread dumps, and could
>      >> impact other
>      >>      uses that happen in parallel. Also, could two simultaneous uses
>      >> of -e
>      >>      result in PrintExtendedThreadInfo not getting restored properly?
>      >>      thread_dump() doesn't look right. It looks like you are iterating
>      >> char
>      >>      by char over the argument, and expect something like "-el" to be
>      >>      specified rather then "-e -l". The loop should be iterating over
>      >>      op->arg(i), not op->arg(0)[i].
>      >>      The rest of the changes look fine.
>      >>      thanks,
>      >>      Chris
>      >>      On 5/30/18 8:12 AM, Haug, Gunter wrote:
>      >>      > Hi all,
>      >>      >
>      >>      > As Chris proposed, I have made an the extended output
>      >> switchable. There is an VM flag (PrintExtendedThreadInfo), which is
>      >> false by default. Moreover, there is an Option (-e) which can be used
>      >> with jcmd Thread.print as well as with jstack. The -e option
>      >> essentially sets PrintExtendedThreadInfo true just for the respective
>      >> thread dump.
>      >>      >
>      >>      > Here is the updated webrev:
>      >>      >
>      >>      > http://cr.openjdk.java.net/~ghaug/webrevs/8200720.v2
>      >>      >
>      >>      > (https://bugs.openjdk.java.net/browse/JDK-8200720)
>      >>      >
>      >>      > Thanks,
>      >>      > Gunter
>      >>      >
>      >>      >
>      >>      > On 02.05.18, 17:07, "serviceability-dev on behalf of Haug,
>      >> Gunter" <serviceability-dev-bounces at openjdk.java.net on behalf of
>      >> gunter.haug at sap.com> wrote:
>      >>      >
>      >>      >      Hi Chris,
>      >>      >
>      >>      >      Thanks for looking into this.
>      >>      >      You're right, there is a little more we have. We have
>      >> implemented an IO tracing mechanism which - rather as a byproduct -
>      >> keeps track of bytes read and written per thread. However, this of
>      >> course requires changes not only in hotspot. We would be happy to
>      >> contribute this as well, but this is a far bigger change and will
>      >> probably lead to a far bigger discussion. Anyway, with the number of
>      >> bytes read, the number of classes defined doesn't look that arbitrary
>      >> anymore, as one can correlate IO to class loading.
>      >>      >
>      >>      >      Regarding the verbose option I think that's a good idea!
>      >>      >
>      >>      >      Thanks again,
>      >>      >      Gunter
>      >>      >
>      >>      >      On 01.05.18, 22:55, "Chris Plummer"
>      >> <chris.plummer at oracle.com> wrote:
>      >>      >
>      >>      >          Hi Gunter,
>      >>      >
>      >>      >          The output you are adding is all useful. I think the
>      >> question is (and
>      >>      >          I'd like to see a few people chime in on this for this
>      >> review) is
>      >>      >          whether or not all of it is the appropriate for a
>      >> thread's stack dump.
>      >>      >          For example, defined_classes is on the fringe of what
>      >> I would call
>      >>      >          generally useful info in a stack dump. Sure, there
>      >> might be that rare
>      >>      >          case when you need it, but how often compared to other
>      >> useful info
>      >>      >          maintained on a per thread basis. How many other bits
>      >> of useful info are
>      >>      >          not being printed in favor of defined_classes? It
>      >> seems you have more in
>      >>      >          the queue. How many? I'm worried about how cluttered
>      >> the stack dumps
>      >>      >          will get. Maybe we should add some sort of verbose
>      >> thread dumping
>      >>      >          option. Just a thought.
>      >>      >
>      >>      >          As for the implementation, overall it looks good, but
>      >> I can't speak to
>      >>      >          whether or not you are doing proper accounting of
>      >> defined_classes and
>      >>      >          bytes allocated. You'll need input from someone with
>      >> more knowledge of
>      >>      >          those areas. We'll also need to do some testing to
>      >> make sure tool tests
>      >>      >          are not impacted.
>      >>      >
>      >>      >          thanks,
>      >>      >
>      >>      >          Chris
>      >>      >
>      >>      >          On 4/30/18 2:51 AM, Haug, Gunter wrote:
>      >>      >          > Hi,
>      >>      >          >
>      >>      >          > this is an update to an RFR I posted on hotspot-dev,
>      >> but it is probably more suitable to post it here. Can I please have a
>      >> review and a sponsor for the following enhancement:
>      >>      >          >
>      >>      >          > http://cr.openjdk.java.net/~ghaug/webrevs/8200720.v1
>      >>      >          > https://bugs.openjdk.java.net/browse/JDK-8200720
>      >>      >          >
>      >>      >          > We at SAP have extended the thread dumps (obtained
>      >> by jstack or jcmd) by several fields providing thread specific
>      >> information. These extensions are quite popular with our support team.
>      >> With some knowledge of the architecture of the application, they often
>      >> allow for quick and simple diagnosis of a running system. Therefore we
>      >> would like to contribute these enhancements.
>      >>      >          >
>      >>      >          > I took a few simple examples here, namely cpu time,
>      >> elapsed time since thread creation, bytes allocated and classes
>      >> defined by the thread and the pthread-id or equivalent on platforms
>      >> where it makes sense. Provided it is known how the application should
>      >> behave, a misbehaving thread can identified easily.
>      >>      >          >
>      >>      >          > There is no measurable overhead for this
>      >> enhancement. However, it may be a problem that the format of the
>      >> output is changed. Tools parsing the output may have to be changed.
>      >>      >          >
>      >>      >          > Here is an example of the output generated:
>      >>      >          >
>      >>      >          > ------------------------------------------------------
>      >>      >          >
>      >>      >          > "main" #1 prio=5 os_prio=31 cpu=6300.65ms
>      >> elapsed=123.28s allocated=242236760B defined_classes=1725
>      >> tid=0x00007fa13a806000 nid=0x1c03 pthread-id=0x109708000 waiting on
>      >> condition [0x0000000109707000]
>      >>      >          >     java.lang.Thread.State: TIMED_WAITING (sleeping)
>      >>      >          >     JavaThread state: _thread_blocked
>      >>      >          > Thread: 0x00007fa13a806000 [0x1c03] State:
>      >> _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>      >>      >          >     JavaThread state: _thread_blocked
>      >>      >          > at java.lang.Thread.sleep(java.base/Native Method)
>      >>      >          >             ...
>      >>      >          > ------------------------------------------------------
>      >>      >          >
>      >>      >          > As mentioned above, we have a whole bunch of other
>      >> enhancements to the thread dump similar to this one and would be
>      >> willing to contribute them if there is any interest.
>      >>      >          >
>      >>      >          > Thanks and best regards,
>      >>      >          > Gunter
>      >>      >          >
>      >>      >          >
>      >>      >
>      >>      >
>      >>      >
>      >>      >
>      >>      >
>      >>
>      >
>      >
>      
> 


More information about the hotspot-runtime-dev mailing list