RFR(S): 8200720: Print additional information in thread dump (times, allocated bytes etc.)

Langer, Christoph christoph.langer at sap.com
Mon Jun 11 07:30:30 UTC 2018


Hi Gunter,

the change looks good to me now.

I would think it is most pragmatic if parsing in the thread_dump function of attachListener.cpp could be done as you proposed. Let's wait if Chris is fine with it.

Small nit:
In src/jdk.jcmd/share/classes/sun/tools/jstack/JStack.java, around line 70, I'd prefer if you could write the if statements like this:

                if (arg.equals("-l")) {
                    locks = true;
                } else if (arg.equals("-e")) {
                    extended = true;
                } else {
                    usage(1);
                }

That'll keep the indentation a bit more flat...

Best regards
Christoph

> -----Original Message-----
> From: Haug, Gunter
> Sent: Freitag, 8. Juni 2018 16:05
> To: David Holmes <david.holmes at oracle.com>; Lindenmaier, Goetz
> <goetz.lindenmaier at sap.com>; Chris Plummer
> <chris.plummer at oracle.com>; serviceability-dev <serviceability-
> dev at openjdk.java.net>; Langer, Christoph <christoph.langer at sap.com>;
> hotspot-runtime-dev at openjdk.java.net; Kirk Pepperdine
> <kirk at kodewerk.com>; Thomas Stüfe <thomas.stuefe at gmail.com>
> Subject: Re: RFR(S): 8200720: Print additional information in thread dump
> (times, allocated bytes etc.)
> 
> Hi all,
> 
> thanks a lot for all the input! I have prepared a new version of the webrev
> incorporating the suggestions you made (at least I tried):
> 
> http://cr.openjdk.java.net/~ghaug/webrevs/8200720.v3/
> 
> This version outputs the thread times unconditionally while the other
> information is guarded by a flags. I think, most participants found the thread
> times the most valuable information and had no (strong) objections to
> adding them unconditionally.
> 
> @David
> Implementation is much simpler with conditional output only for
> JavaThreads. I could get rid of the terrible hack (changing the flag) without
> having to change too many source files.
> 
> The information on allocated bytes is present in the Thread class already
> before this proposed change, I just print it. It might be sensible to move
> _allocated_bytes and the respective methods to the ThreadStatisticInfo class
> as Götz suggested. I haven’t done that in the current version, though.
> 
> @David and Thomas
> I've removed the pthread-id output. I'm unsure myself what it could be good
> for. Anyway, we could add it with a separate change (with a better
> implementation) if there is a need to.
> 
> @Chris
> As you have written, the user has no direct contact to the attach listener of
> the VM and jstack won't misinterpret e.g. -help. I agree that the parsing in
> attachListener.cpp could be more robust. However, it hasn't been so far
> either. All that is done in the current implementation is a strcmp to -l
> everything else will be silently ignored. I can try to make this more robust or
> we could say that the new output is only available via jcmd. OTH nothing bad
> can happen with the current version of the proposed change, so we could
> also leave it as it is. What would you prefer?
> 
> @Kirk and Thomas
> Implementing a new diagnostic command is certainly well worth a thought.
> However, it appears to me that it's not necessary in this case. There are
> already flags to jstack and jcmd Thread.print, so it's not that uncommon. The
> amount of information that would be available by a new diagnostic command
> and that is now added to the thread dump is quite small and I think that is
> tolerable. Moreover, and most importantly, our support team is use to find
> the information in the thread dump and it would make things easier to us if I
> could leave it there.
> 
> Thanks again and have a nice weekend,
> Gunter
> 
> 
> On 06.06.18, 04:56, "David Holmes" <david.holmes at oracle.com> wrote:
> 
>     Hi Goetz,
> 
>     On 5/06/2018 11:07 PM, Lindenmaier, Goetz wrote:
>     > Hi
>     >
>     > this discussion touched a lot of points so far, which seem
>     > to lead to different conclusions.
>     >
>     > I think we should look at the values printed:
>     >
>     > 1. cpu=6300.65ms elapsed=123.28s
>     > Overhead
>     >   cpu time:
>     >     * system calls at thread dump time
>     >   elapsed time:
>     >     * 1 system call at thread creation time
>     >     * 1 64-bit field per thread for the thread start time
>     >     * 1 system call at thread dump time
>     >
>     > As I understand, JDK-8143176 would have had to get the
>     > time at each locking, which is much more time critical
>     > than doing this during thread creation. While
> 
>     Correct.
> 
>     > the time a lock was held would be much more useful,
>     > the ratio here, combined with knowledge about the application,
>     > could lead to the conclusion that the thread is wrongly
>     > blocked at much lower cost.
> 
>     Agreed. I see no issue with unconditionally adding this information as
>     it should address some of the concerns of 8143176 as well.
> 
>     >
>     > 2. pthread-id=0x109708000
>     > Overhead:
>     >   * a field access at thread dump time. Negligible I would say.
> 
>     I'd overlooked this addition, but it is of course the chunk of ifdef
>     code in osThread.cpp that I objected to. The problem I have here is the
>     convoluted mess of "thread identifiers" that we already have. We
>     currently print:
> 
>       st->print("tid=" INTPTR_FORMAT " ", p2i(this));
> 
>     which is just the address of the Thread/JavaThread object. Then:
> 
>       st->print("nid=0x%x ", thread_id());
> 
>     where 'n' is supposed to represent "native thread id", which is:
>     - linux: kernel thread id from syscall gettid
>     - solaris: thread library identity from thr_create() or thr_self()
>     - windows: thread id from _beginthreadex
>     - OS X: kernel thread id from pthread_mach_thread_np(pthread_self());
>     - Other BSD: kernel (?) thread id from syscall thr_self or getthrid
>     - AIX: thread library identity from pthread_create() or pthread_self()
> 
>     It's telling that the code Gunter added gets the thread library id on
>     linux, but a "kernel thread id" on other platforms! Do we want to see a
>     thread library id and a kernel thread id? IIRC nid is supposed be the id
>     you need to see the thread in a debugger. In which case I'm unclear what
>     role the thread id Gunter wants to add is playing? I don't think we need
>     to see kernel ids in general, and pthread-id isn't useful for debugging
>     is it?
> 
>     >
>     > 3. allocated=242236760B defined_classes=1725
>     > Overhead:
>     >    * 1 64-bit field per thread.
>     >    * counter increment on class creation
>     >
>     > Especially defined_classes seems to be controversial.
>     > As this only makes sense for Java threads, this could
>     > be printed in a line of it's own in Threads::print_on_error()
>     > (which already gets a flag to customize for jstack:
>     > print_concurrent_locks) by calling a dedicated function in Thread.
>     > No need for flag PrintExtendedThreadInfo thus.
>     >
>     > But I would propose to drop this information as it is
>     > too controversial.
> 
>     Agreed.
> 
>     > This leaves the times and the pthread-id to be decided whether
>     > they are worth the cost. If so, I think they should be printed
>     > unconditional.
> 
>     Agreed. With some refinement of the "thread id" issue.
> 
>     > If a flag is required to avoid bloating the info in the default case,
>     > I would include more infos, as os_prio and nid under that flag.
> 
>     Of course those things exist for good reason. The native id is supposed
>     to be what aids in you matching what you see in our thread dumps with
>     what you see in a debugger. Priority is less of an issue these days, but
>     there was a time ... ;-)
> 
>     > As jstack is deprecated, changes to jstack could be skipped always
>     > disabling the new printouts.
>     >
>     > Finally, I would propose to move _allocated_bytes into
>     > threadStatisticInfo.hpp.
> 
>     Didn't that part get dropped? (You don't need it for the time info or
>     thread id). That's a GC logging issue to me.
> 
>     Thanks,
>     David
> 
>     > Best regards,
>     >    Goetz.
>     >
>     >
>     >
>     >
>     >> -----Original Message-----
>     >> From: serviceability-dev [mailto:serviceability-dev-
>     >> bounces at openjdk.java.net] On Behalf Of David Holmes
>     >> Sent: Dienstag, 5. Juni 2018 04:53
>     >> To: Haug, Gunter <gunter.haug at sap.com>; Chris Plummer
>     >> <chris.plummer at oracle.com>; serviceability-dev <serviceability-
>     >> dev at openjdk.java.net>; Langer, Christoph
> <christoph.langer at sap.com>;
>     >> hotspot-runtime-dev at openjdk.java.net
>     >> Subject: Re: RFR(S): 8200720: Print additional information in thread
> dump
>     >> (times, allocated bytes etc.)
>     >>
>     >> Hi Gunter,
>     >>
>     >> On 5/06/2018 1:27 AM, Haug, Gunter wrote:
>     >>> Hi David, Chris,
>     >>>
>     >>> would it be an option to unconditionally print the additional
> information?
>     >> Regardless which way this is implemented it will be rather complicated
> and it
>     >> only switches on/off a few field in the thread dump.
>     >>
>     >> I'm not convinced this is all suitable information for a thread stack
>     >> dump. I can see the time information being useful if using the dump to
>     >> try and diagnoze a hang or responsiveness issue. But the allocated-
> bytes
>     >> and classes-defined just doesn't seem useful in the context of a thread
>     >> dump to me. Anyone interested in allocation stats is going to be looking
>     >> at GC logs etc which is where this belongs. Ditto the class-stats belong
>     >> in some kind of classloading logging IMHO.
>     >>
>     >> I'm very reluctant to burden the implementation with capturing these
>     >> kinds of stats, just to use for diagnostic purposes that may not even be
>     >> asked for. I'm very much in the "you shouldn't pay for what you don't
>     >> use" camp in that regard. (See my comments in JDK-8143176 referenced
> by
>     >> Sean.)
>     >>
>     >> The ThreadStatisticInfo adds overhead to every thread object - and I'd
>     >> have to suspect there could be overlap with whatever flight recorder
>     >> sticks in there too. (Thread has become even more bloated in recent
> time!).
>     >>
>     >> Cheers,
>     >> David
>     >>
>     >>
>     >>> Thanks,
>     >>> Gunter
>     >>>
>     >>> On 04.06.18, 01:13, "David Holmes" <david.holmes at oracle.com>
> wrote:
>     >>>
>     >>>       Hi Gunter, Chris,
>     >>>
>     >>>       Sorry just trying to catch up and this is only a partial look so far ...
>     >>>
>     >>>       BTW these changes are not limited to serviceability code so adding
> in
>     >>>       runtime team as well.
>     >>>
>     >>>       On 2/06/2018 9:12 AM, Chris Plummer wrote:
>     >>>       > Hi Gunter,
>     >>>       >
>     >>>       > On 6/1/18 3:17 AM, Haug, Gunter wrote:
>     >>>       >> Hi Chris,
>     >>>       >>
>     >>>       >> thanks for looking into this!
>     >>>       >>
>     >>>       >> Re the synchronization: The value is stored only in a VM
> operation at
>     >>>       >> a safepoint by the VM thread. I was of the opinion, that this
> could
>     >>>       >> not be interrupted by a second VM operation (of the same
> type). Or
>     >> am
>     >>>       >> I missing something else?
>     >>>       > I was really thinking more about the temporary changing of
>     >>>       > PrintExtendedThreadInfo, not the value stored in the VMOp. You
> may
>     >> be be
>     >>>       > correct that no more than one VMOp is executing, but while it is
>     >>>       > executing it is has changed the value of PrintExtendedThreadInfo,
>     >> which
>     >>>       > might have an impact on anything else that triggers printing
> thread info
>     >>>       > while the VMOp is executing.
>     >>>
>     >>>       Even if nothing else can possibly be running during the safepoint I
> find
>     >>>       it extremely bad form to change a command-line flag in this way,
>     >>>       particularly from a safepoint!
>     >>>
>     >>>       If this flag is intended to be dynamically enabled as part of a dcmd
>     >>>       then it should also be a Manageable flag IMHO.
>     >>>
>     >>>       That said ...
>     >>>
>     >>>       >>   I did think about passing an argument to the various print_on
>     >> member
>     >>>       >> functions of the thread classes, but this would require rather
>     >>>       >> extensive code changes for a rather tiny extension. Therefore I
> feel
>     >>>       >> doing it like this is the lesser evil.
>     >>>
>     >>>       ... it's obviously not truly a global setting, but one that is needed on
>     >>>       a per-print-request basis. The flag is just the default, but if the
>     >>>       default is off you still want to enable extended printing on a
>     >>>       per-request basis.
>     >>>
>     >>>       The current print_on mechanics is not set up for this kind of
>     >>>       flexibility. I think this needs more thought.
>     >>>
>     >>>       ---
>     >>>
>     >>>       Re osThread.cpp shared code changes ... I really hate to see all the
>     >>>       ifdefs in there. Please add a pd_print_on function to support this if
>     >>>       you truly want platform specific information.
>     >>>
>     >>>       ---
>     >>>
>     >>>       threadStatisticInfo.hpp
>     >>>
>     >>>       typo: getElepsedTime() -> getElapsedTime()
>     >>>
>     >>>       Thanks,
>     >>>       David
>     >>>
>     >>>
>     >>>
>     >>>       >>
>     >>>       >> Re thread_dump(): I think it's correct or, well, at least it works ;-)
>     >>>       >> In fact jstack will transfer the "-e" and "-l" in only one string,
>     >>>       >> i.e. op->arg(0).
>     >>>       > So you are saying that op->arg(0) is "-e -l" when using jstack? I
> think
>     >>>       > you really need to clean up the parsing. As it stands right now
> passing,
>     >>>       > I get the feeling that if a user erroneously asks for help by using
>     >>>       > "jcmd <pid> Thread.Print -help", it will end up executing with -e
> an -l
>     >>>       > enabled, and no failure for the invalid "-help" option.
>     >>>       >>   Christoph has already explained my reasoning. But I agree, it's
> not
>     >>>       >> nice and I would be happy to do it like Christoph suggested.
>     >>>       > I'll respond separately to his suggestion.
>     >>>       >
>     >>>       > thanks,
>     >>>       >
>     >>>       > Chris
>     >>>       >>
>     >>>       >> And typo fixed, sorry.
>     >>>       >>
>     >>>       >> Thanks again,
>     >>>       >> Gunter
>     >>>       >>
>     >>>       >> On 01.06.18, 00:03, "Chris Plummer"
> <chris.plummer at oracle.com>
>     >> wrote:
>     >>>       >>
>     >>>       >>      Hi Gunter,
>     >>>       >>      globals.hpp: fix typo "informatiuon"
>     >>>       >>      I worry a little bit about the synchronizing (if that's the right
>     >>>       >> word)
>     >>>       >>      of PrintExtendedThreadInfo and the dcmd's -e flag. When
> using -e,
>     >>>       >> you
>     >>>       >>      are temporarily enabling PrintExtendedThreadInfo if it was
> false.
>     >>>       >> This
>     >>>       >>      temporarily changes the behavior of thread dumps, and could
>     >>>       >> impact other
>     >>>       >>      uses that happen in parallel. Also, could two simultaneous
> uses
>     >>>       >> of -e
>     >>>       >>      result in PrintExtendedThreadInfo not getting restored
> properly?
>     >>>       >>      thread_dump() doesn't look right. It looks like you are
> iterating
>     >>>       >> char
>     >>>       >>      by char over the argument, and expect something like "-el" to
> be
>     >>>       >>      specified rather then "-e -l". The loop should be iterating over
>     >>>       >>      op->arg(i), not op->arg(0)[i].
>     >>>       >>      The rest of the changes look fine.
>     >>>       >>      thanks,
>     >>>       >>      Chris
>     >>>       >>      On 5/30/18 8:12 AM, Haug, Gunter wrote:
>     >>>       >>      > Hi all,
>     >>>       >>      >
>     >>>       >>      > As Chris proposed, I have made an the extended output
>     >>>       >> switchable. There is an VM flag (PrintExtendedThreadInfo),
> which is
>     >>>       >> false by default. Moreover, there is an Option (-e) which can be
> used
>     >>>       >> with jcmd Thread.print as well as with jstack. The -e option
>     >>>       >> essentially sets PrintExtendedThreadInfo true just for the
> respective
>     >>>       >> thread dump.
>     >>>       >>      >
>     >>>       >>      > Here is the updated webrev:
>     >>>       >>      >
>     >>>       >>      > http://cr.openjdk.java.net/~ghaug/webrevs/8200720.v2
>     >>>       >>      >
>     >>>       >>      > (https://bugs.openjdk.java.net/browse/JDK-8200720)
>     >>>       >>      >
>     >>>       >>      > Thanks,
>     >>>       >>      > Gunter
>     >>>       >>      >
>     >>>       >>      >
>     >>>       >>      > On 02.05.18, 17:07, "serviceability-dev on behalf of Haug,
>     >>>       >> Gunter" <serviceability-dev-bounces at openjdk.java.net on
> behalf of
>     >>>       >> gunter.haug at sap.com> wrote:
>     >>>       >>      >
>     >>>       >>      >      Hi Chris,
>     >>>       >>      >
>     >>>       >>      >      Thanks for looking into this.
>     >>>       >>      >      You're right, there is a little more we have. We have
>     >>>       >> implemented an IO tracing mechanism which - rather as a
> byproduct -
>     >>>       >> keeps track of bytes read and written per thread. However, this
> of
>     >>>       >> course requires changes not only in hotspot. We would be
> happy to
>     >>>       >> contribute this as well, but this is a far bigger change and will
>     >>>       >> probably lead to a far bigger discussion. Anyway, with the
> number of
>     >>>       >> bytes read, the number of classes defined doesn't look that
> arbitrary
>     >>>       >> anymore, as one can correlate IO to class loading.
>     >>>       >>      >
>     >>>       >>      >      Regarding the verbose option I think that's a good idea!
>     >>>       >>      >
>     >>>       >>      >      Thanks again,
>     >>>       >>      >      Gunter
>     >>>       >>      >
>     >>>       >>      >      On 01.05.18, 22:55, "Chris Plummer"
>     >>>       >> <chris.plummer at oracle.com> wrote:
>     >>>       >>      >
>     >>>       >>      >          Hi Gunter,
>     >>>       >>      >
>     >>>       >>      >          The output you are adding is all useful. I think the
>     >>>       >> question is (and
>     >>>       >>      >          I'd like to see a few people chime in on this for this
>     >>>       >> review) is
>     >>>       >>      >          whether or not all of it is the appropriate for a
>     >>>       >> thread's stack dump.
>     >>>       >>      >          For example, defined_classes is on the fringe of what
>     >>>       >> I would call
>     >>>       >>      >          generally useful info in a stack dump. Sure, there
>     >>>       >> might be that rare
>     >>>       >>      >          case when you need it, but how often compared to
> other
>     >>>       >> useful info
>     >>>       >>      >          maintained on a per thread basis. How many other bits
>     >>>       >> of useful info are
>     >>>       >>      >          not being printed in favor of defined_classes? It
>     >>>       >> seems you have more in
>     >>>       >>      >          the queue. How many? I'm worried about how
> cluttered
>     >>>       >> the stack dumps
>     >>>       >>      >          will get. Maybe we should add some sort of verbose
>     >>>       >> thread dumping
>     >>>       >>      >          option. Just a thought.
>     >>>       >>      >
>     >>>       >>      >          As for the implementation, overall it looks good, but
>     >>>       >> I can't speak to
>     >>>       >>      >          whether or not you are doing proper accounting of
>     >>>       >> defined_classes and
>     >>>       >>      >          bytes allocated. You'll need input from someone with
>     >>>       >> more knowledge of
>     >>>       >>      >          those areas. We'll also need to do some testing to
>     >>>       >> make sure tool tests
>     >>>       >>      >          are not impacted.
>     >>>       >>      >
>     >>>       >>      >          thanks,
>     >>>       >>      >
>     >>>       >>      >          Chris
>     >>>       >>      >
>     >>>       >>      >          On 4/30/18 2:51 AM, Haug, Gunter wrote:
>     >>>       >>      >          > Hi,
>     >>>       >>      >          >
>     >>>       >>      >          > this is an update to an RFR I posted on hotspot-dev,
>     >>>       >> but it is probably more suitable to post it here. Can I please have
> a
>     >>>       >> review and a sponsor for the following enhancement:
>     >>>       >>      >          >
>     >>>       >>      >          >
> http://cr.openjdk.java.net/~ghaug/webrevs/8200720.v1
>     >>>       >>      >          > https://bugs.openjdk.java.net/browse/JDK-8200720
>     >>>       >>      >          >
>     >>>       >>      >          > We at SAP have extended the thread dumps
> (obtained
>     >>>       >> by jstack or jcmd) by several fields providing thread specific
>     >>>       >> information. These extensions are quite popular with our
> support
>     >> team.
>     >>>       >> With some knowledge of the architecture of the application,
> they
>     >> often
>     >>>       >> allow for quick and simple diagnosis of a running system.
> Therefore
>     >> we
>     >>>       >> would like to contribute these enhancements.
>     >>>       >>      >          >
>     >>>       >>      >          > I took a few simple examples here, namely cpu time,
>     >>>       >> elapsed time since thread creation, bytes allocated and classes
>     >>>       >> defined by the thread and the pthread-id or equivalent on
> platforms
>     >>>       >> where it makes sense. Provided it is known how the application
>     >> should
>     >>>       >> behave, a misbehaving thread can identified easily.
>     >>>       >>      >          >
>     >>>       >>      >          > There is no measurable overhead for this
>     >>>       >> enhancement. However, it may be a problem that the format of
> the
>     >>>       >> output is changed. Tools parsing the output may have to be
> changed.
>     >>>       >>      >          >
>     >>>       >>      >          > Here is an example of the output generated:
>     >>>       >>      >          >
>     >>>       >>      >          > ------------------------------------------------------
>     >>>       >>      >          >
>     >>>       >>      >          > "main" #1 prio=5 os_prio=31 cpu=6300.65ms
>     >>>       >> elapsed=123.28s allocated=242236760B defined_classes=1725
>     >>>       >> tid=0x00007fa13a806000 nid=0x1c03 pthread-id=0x109708000
> waiting
>     >> on
>     >>>       >> condition [0x0000000109707000]
>     >>>       >>      >          >     java.lang.Thread.State: TIMED_WAITING (sleeping)
>     >>>       >>      >          >     JavaThread state: _thread_blocked
>     >>>       >>      >          > Thread: 0x00007fa13a806000 [0x1c03] State:
>     >>>       >> _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>     >>>       >>      >          >     JavaThread state: _thread_blocked
>     >>>       >>      >          > at java.lang.Thread.sleep(java.base/Native Method)
>     >>>       >>      >          >             ...
>     >>>       >>      >          > ------------------------------------------------------
>     >>>       >>      >          >
>     >>>       >>      >          > As mentioned above, we have a whole bunch of
> other
>     >>>       >> enhancements to the thread dump similar to this one and would
> be
>     >>>       >> willing to contribute them if there is any interest.
>     >>>       >>      >          >
>     >>>       >>      >          > Thanks and best regards,
>     >>>       >>      >          > Gunter
>     >>>       >>      >          >
>     >>>       >>      >          >
>     >>>       >>      >
>     >>>       >>      >
>     >>>       >>      >
>     >>>       >>      >
>     >>>       >>      >
>     >>>       >>
>     >>>       >
>     >>>       >
>     >>>
>     >>>
> 



More information about the serviceability-dev mailing list