RFR(S): 8200720: Print additional information in thread dump (times, allocated bytes etc.)
Thomas Stüfe
thomas.stuefe at gmail.com
Wed Jun 6 04:45:05 UTC 2018
On Wed, Jun 6, 2018 at 4:55 AM, David Holmes <david.holmes at oracle.com> wrote:
> Hi Goetz,
>
> On 5/06/2018 11:07 PM, Lindenmaier, Goetz wrote:
>>
>> Hi
>>
>> this discussion touched a lot of points so far, which seem
>> to lead to different conclusions.
>>
>> I think we should look at the values printed:
>>
>> 1. cpu=6300.65ms elapsed=123.28s
>> Overhead
>> cpu time:
>> * system calls at thread dump time
>> elapsed time:
>> * 1 system call at thread creation time
>> * 1 64-bit field per thread for the thread start time
>> * 1 system call at thread dump time
>>
>> As I understand, JDK-8143176 would have had to get the
>> time at each locking, which is much more time critical
>> than doing this during thread creation. While
>
>
> Correct.
>
>> the time a lock was held would be much more useful,
>> the ratio here, combined with knowledge about the application,
>> could lead to the conclusion that the thread is wrongly
>> blocked at much lower cost.
>
>
> Agreed. I see no issue with unconditionally adding this information as it
> should address some of the concerns of 8143176 as well.
>
>>
>> 2. pthread-id=0x109708000
>> Overhead:
>> * a field access at thread dump time. Negligible I would say.
>
>
> I'd overlooked this addition, but it is of course the chunk of ifdef code in
> osThread.cpp that I objected to. The problem I have here is the convoluted
> mess of "thread identifiers" that we already have. We currently print:
>
> st->print("tid=" INTPTR_FORMAT " ", p2i(this));
>
> which is just the address of the Thread/JavaThread object. Then:
>
> st->print("nid=0x%x ", thread_id());
>
> where 'n' is supposed to represent "native thread id", which is:
> - linux: kernel thread id from syscall gettid
> - solaris: thread library identity from thr_create() or thr_self()
> - windows: thread id from _beginthreadex
> - OS X: kernel thread id from pthread_mach_thread_np(pthread_self());
> - Other BSD: kernel (?) thread id from syscall thr_self or getthrid
> - AIX: thread library identity from pthread_create() or pthread_self()
>
> It's telling that the code Gunter added gets the thread library id on linux,
> but a "kernel thread id" on other platforms! Do we want to see a thread
> library id and a kernel thread id? IIRC nid is supposed be the id you need
> to see the thread in a debugger. In which case I'm unclear what role the
> thread id Gunter wants to add is playing? I don't think we need to see
> kernel ids in general, and pthread-id isn't useful for debugging is it?
>
Yes, feels just wrong. Plus, not even sufficient, since you can have
both pthread-id and kernel thread id, e.g. on AIX, and depending on
your situation you may need both. But then, you may need a lot of
things, and where do we draw the line? I admit I also feel
uncomfortable with adding all this output to thread dump, which is
noise in 99% of cases.
(BTW what about tools parsing the thread dump in its current form?
Wont we make a lot of scripts out there unhappy?)
--
Code wise, at the very least, I would abstract this kind of coding
into something like
os::print_native_thread_id(outputStream* st)
with the expectation of "only writes a few words, no newline".
Platforms like AIX may then print both.
(Note: not "get_native_thread_id()", since that would stipulates that
there is only one and that is can be expressed as a common type over
all architectures. Also, I would not want anyone to use it as a
number.)
--
As others before me (Kirk) noted, I find the information added by this
patch useful. But after looking closely at the patch, I would strongly
prefer a new separate jcmd for this. Lets call it "VM.thread_stats".
You still would have to collect and carry the statistics by thread,
but output would be separate both from the user perspective and also
in coding - you would not have to force-shove it into
Thread::print_on, for instance. Such a hypothetical command could be
in the future extended with other statistics as we see fit. I do not
see any advantage in overloading the thread dump vehicle in this way.
Just my 5 cent.
Best Regards, Thomas
>>
>> 3. allocated=242236760B defined_classes=1725
>> Overhead:
>> * 1 64-bit field per thread.
>> * counter increment on class creation
>>
>> Especially defined_classes seems to be controversial.
>> As this only makes sense for Java threads, this could
>> be printed in a line of it's own in Threads::print_on_error()
>> (which already gets a flag to customize for jstack:
>> print_concurrent_locks) by calling a dedicated function in Thread.
>> No need for flag PrintExtendedThreadInfo thus.
>>
>> But I would propose to drop this information as it is
>> too controversial.
>
>
> Agreed.
>
>> This leaves the times and the pthread-id to be decided whether
>> they are worth the cost. If so, I think they should be printed
>> unconditional.
>
>
> Agreed. With some refinement of the "thread id" issue.
>
>> If a flag is required to avoid bloating the info in the default case,
>> I would include more infos, as os_prio and nid under that flag.
>
>
> Of course those things exist for good reason. The native id is supposed to
> be what aids in you matching what you see in our thread dumps with what you
> see in a debugger. Priority is less of an issue these days, but there was a
> time ... ;-)
>
>> As jstack is deprecated, changes to jstack could be skipped always
>> disabling the new printouts.
>>
>> Finally, I would propose to move _allocated_bytes into
>> threadStatisticInfo.hpp.
>
>
> Didn't that part get dropped? (You don't need it for the time info or thread
> id). That's a GC logging issue to me.
>
> Thanks,
> David
>
>
>> Best regards,
>> Goetz.
>>
>>
>>
>>
>>> -----Original Message-----
>>> From: serviceability-dev [mailto:serviceability-dev-
>>> bounces at openjdk.java.net] On Behalf Of David Holmes
>>> Sent: Dienstag, 5. Juni 2018 04:53
>>> To: Haug, Gunter <gunter.haug at sap.com>; Chris Plummer
>>> <chris.plummer at oracle.com>; serviceability-dev <serviceability-
>>> dev at openjdk.java.net>; Langer, Christoph <christoph.langer at sap.com>;
>>> hotspot-runtime-dev at openjdk.java.net
>>> Subject: Re: RFR(S): 8200720: Print additional information in thread dump
>>> (times, allocated bytes etc.)
>>>
>>> Hi Gunter,
>>>
>>> On 5/06/2018 1:27 AM, Haug, Gunter wrote:
>>>>
>>>> Hi David, Chris,
>>>>
>>>> would it be an option to unconditionally print the additional
>>>> information?
>>>
>>> Regardless which way this is implemented it will be rather complicated
>>> and it
>>> only switches on/off a few field in the thread dump.
>>>
>>> I'm not convinced this is all suitable information for a thread stack
>>> dump. I can see the time information being useful if using the dump to
>>> try and diagnoze a hang or responsiveness issue. But the allocated-bytes
>>> and classes-defined just doesn't seem useful in the context of a thread
>>> dump to me. Anyone interested in allocation stats is going to be looking
>>> at GC logs etc which is where this belongs. Ditto the class-stats belong
>>> in some kind of classloading logging IMHO.
>>>
>>> I'm very reluctant to burden the implementation with capturing these
>>> kinds of stats, just to use for diagnostic purposes that may not even be
>>> asked for. I'm very much in the "you shouldn't pay for what you don't
>>> use" camp in that regard. (See my comments in JDK-8143176 referenced by
>>> Sean.)
>>>
>>> The ThreadStatisticInfo adds overhead to every thread object - and I'd
>>> have to suspect there could be overlap with whatever flight recorder
>>> sticks in there too. (Thread has become even more bloated in recent
>>> time!).
>>>
>>> Cheers,
>>> David
>>>
>>>
>>>> Thanks,
>>>> Gunter
>>>>
>>>> On 04.06.18, 01:13, "David Holmes" <david.holmes at oracle.com> wrote:
>>>>
>>>> Hi Gunter, Chris,
>>>>
>>>> Sorry just trying to catch up and this is only a partial look so
>>>> far ...
>>>>
>>>> BTW these changes are not limited to serviceability code so adding
>>>> in
>>>> runtime team as well.
>>>>
>>>> On 2/06/2018 9:12 AM, Chris Plummer wrote:
>>>> > Hi Gunter,
>>>> >
>>>> > On 6/1/18 3:17 AM, Haug, Gunter wrote:
>>>> >> Hi Chris,
>>>> >>
>>>> >> thanks for looking into this!
>>>> >>
>>>> >> Re the synchronization: The value is stored only in a VM
>>>> operation at
>>>> >> a safepoint by the VM thread. I was of the opinion, that this
>>>> could
>>>> >> not be interrupted by a second VM operation (of the same type).
>>>> Or
>>>
>>> am
>>>>
>>>> >> I missing something else?
>>>> > I was really thinking more about the temporary changing of
>>>> > PrintExtendedThreadInfo, not the value stored in the VMOp. You
>>>> may
>>>
>>> be be
>>>>
>>>> > correct that no more than one VMOp is executing, but while it is
>>>> > executing it is has changed the value of
>>>> PrintExtendedThreadInfo,
>>>
>>> which
>>>>
>>>> > might have an impact on anything else that triggers printing
>>>> thread info
>>>> > while the VMOp is executing.
>>>>
>>>> Even if nothing else can possibly be running during the safepoint
>>>> I find
>>>> it extremely bad form to change a command-line flag in this way,
>>>> particularly from a safepoint!
>>>>
>>>> If this flag is intended to be dynamically enabled as part of a
>>>> dcmd
>>>> then it should also be a Manageable flag IMHO.
>>>>
>>>> That said ...
>>>>
>>>> >> I did think about passing an argument to the various print_on
>>>
>>> member
>>>>
>>>> >> functions of the thread classes, but this would require rather
>>>> >> extensive code changes for a rather tiny extension. Therefore I
>>>> feel
>>>> >> doing it like this is the lesser evil.
>>>>
>>>> ... it's obviously not truly a global setting, but one that is
>>>> needed on
>>>> a per-print-request basis. The flag is just the default, but if
>>>> the
>>>> default is off you still want to enable extended printing on a
>>>> per-request basis.
>>>>
>>>> The current print_on mechanics is not set up for this kind of
>>>> flexibility. I think this needs more thought.
>>>>
>>>> ---
>>>>
>>>> Re osThread.cpp shared code changes ... I really hate to see all
>>>> the
>>>> ifdefs in there. Please add a pd_print_on function to support this
>>>> if
>>>> you truly want platform specific information.
>>>>
>>>> ---
>>>>
>>>> threadStatisticInfo.hpp
>>>>
>>>> typo: getElepsedTime() -> getElapsedTime()
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>
>>>>
>>>> >>
>>>> >> Re thread_dump(): I think it's correct or, well, at least it
>>>> works ;-)
>>>> >> In fact jstack will transfer the "-e" and "-l" in only one
>>>> string,
>>>> >> i.e. op->arg(0).
>>>> > So you are saying that op->arg(0) is "-e -l" when using jstack?
>>>> I think
>>>> > you really need to clean up the parsing. As it stands right now
>>>> passing,
>>>> > I get the feeling that if a user erroneously asks for help by
>>>> using
>>>> > "jcmd <pid> Thread.Print -help", it will end up executing with
>>>> -e an -l
>>>> > enabled, and no failure for the invalid "-help" option.
>>>> >> Christoph has already explained my reasoning. But I agree,
>>>> it's not
>>>> >> nice and I would be happy to do it like Christoph suggested.
>>>> > I'll respond separately to his suggestion.
>>>> >
>>>> > thanks,
>>>> >
>>>> > Chris
>>>> >>
>>>> >> And typo fixed, sorry.
>>>> >>
>>>> >> Thanks again,
>>>> >> Gunter
>>>> >>
>>>> >> On 01.06.18, 00:03, "Chris Plummer" <chris.plummer at oracle.com>
>>>
>>> wrote:
>>>>
>>>> >>
>>>> >> Hi Gunter,
>>>> >> globals.hpp: fix typo "informatiuon"
>>>> >> I worry a little bit about the synchronizing (if that's
>>>> the right
>>>> >> word)
>>>> >> of PrintExtendedThreadInfo and the dcmd's -e flag. When
>>>> using -e,
>>>> >> you
>>>> >> are temporarily enabling PrintExtendedThreadInfo if it was
>>>> false.
>>>> >> This
>>>> >> temporarily changes the behavior of thread dumps, and
>>>> could
>>>> >> impact other
>>>> >> uses that happen in parallel. Also, could two simultaneous
>>>> uses
>>>> >> of -e
>>>> >> result in PrintExtendedThreadInfo not getting restored
>>>> properly?
>>>> >> thread_dump() doesn't look right. It looks like you are
>>>> iterating
>>>> >> char
>>>> >> by char over the argument, and expect something like "-el"
>>>> to be
>>>> >> specified rather then "-e -l". The loop should be
>>>> iterating over
>>>> >> op->arg(i), not op->arg(0)[i].
>>>> >> The rest of the changes look fine.
>>>> >> thanks,
>>>> >> Chris
>>>> >> On 5/30/18 8:12 AM, Haug, Gunter wrote:
>>>> >> > Hi all,
>>>> >> >
>>>> >> > As Chris proposed, I have made an the extended output
>>>> >> switchable. There is an VM flag (PrintExtendedThreadInfo),
>>>> which is
>>>> >> false by default. Moreover, there is an Option (-e) which can
>>>> be used
>>>> >> with jcmd Thread.print as well as with jstack. The -e option
>>>> >> essentially sets PrintExtendedThreadInfo true just for the
>>>> respective
>>>> >> thread dump.
>>>> >> >
>>>> >> > Here is the updated webrev:
>>>> >> >
>>>> >> > http://cr.openjdk.java.net/~ghaug/webrevs/8200720.v2
>>>> >> >
>>>> >> > (https://bugs.openjdk.java.net/browse/JDK-8200720)
>>>> >> >
>>>> >> > Thanks,
>>>> >> > Gunter
>>>> >> >
>>>> >> >
>>>> >> > On 02.05.18, 17:07, "serviceability-dev on behalf of
>>>> Haug,
>>>> >> Gunter" <serviceability-dev-bounces at openjdk.java.net on behalf
>>>> of
>>>> >> gunter.haug at sap.com> wrote:
>>>> >> >
>>>> >> > Hi Chris,
>>>> >> >
>>>> >> > Thanks for looking into this.
>>>> >> > You're right, there is a little more we have. We
>>>> have
>>>> >> implemented an IO tracing mechanism which - rather as a
>>>> byproduct -
>>>> >> keeps track of bytes read and written per thread. However, this
>>>> of
>>>> >> course requires changes not only in hotspot. We would be happy
>>>> to
>>>> >> contribute this as well, but this is a far bigger change and
>>>> will
>>>> >> probably lead to a far bigger discussion. Anyway, with the
>>>> number of
>>>> >> bytes read, the number of classes defined doesn't look that
>>>> arbitrary
>>>> >> anymore, as one can correlate IO to class loading.
>>>> >> >
>>>> >> > Regarding the verbose option I think that's a good
>>>> idea!
>>>> >> >
>>>> >> > Thanks again,
>>>> >> > Gunter
>>>> >> >
>>>> >> > On 01.05.18, 22:55, "Chris Plummer"
>>>> >> <chris.plummer at oracle.com> wrote:
>>>> >> >
>>>> >> > Hi Gunter,
>>>> >> >
>>>> >> > The output you are adding is all useful. I
>>>> think the
>>>> >> question is (and
>>>> >> > I'd like to see a few people chime in on this
>>>> for this
>>>> >> review) is
>>>> >> > whether or not all of it is the appropriate for
>>>> a
>>>> >> thread's stack dump.
>>>> >> > For example, defined_classes is on the fringe
>>>> of what
>>>> >> I would call
>>>> >> > generally useful info in a stack dump. Sure,
>>>> there
>>>> >> might be that rare
>>>> >> > case when you need it, but how often compared
>>>> to other
>>>> >> useful info
>>>> >> > maintained on a per thread basis. How many
>>>> other bits
>>>> >> of useful info are
>>>> >> > not being printed in favor of defined_classes?
>>>> It
>>>> >> seems you have more in
>>>> >> > the queue. How many? I'm worried about how
>>>> cluttered
>>>> >> the stack dumps
>>>> >> > will get. Maybe we should add some sort of
>>>> verbose
>>>> >> thread dumping
>>>> >> > option. Just a thought.
>>>> >> >
>>>> >> > As for the implementation, overall it looks
>>>> good, but
>>>> >> I can't speak to
>>>> >> > whether or not you are doing proper accounting
>>>> of
>>>> >> defined_classes and
>>>> >> > bytes allocated. You'll need input from someone
>>>> with
>>>> >> more knowledge of
>>>> >> > those areas. We'll also need to do some testing
>>>> to
>>>> >> make sure tool tests
>>>> >> > are not impacted.
>>>> >> >
>>>> >> > thanks,
>>>> >> >
>>>> >> > Chris
>>>> >> >
>>>> >> > On 4/30/18 2:51 AM, Haug, Gunter wrote:
>>>> >> > > Hi,
>>>> >> > >
>>>> >> > > this is an update to an RFR I posted on
>>>> hotspot-dev,
>>>> >> but it is probably more suitable to post it here. Can I please
>>>> have a
>>>> >> review and a sponsor for the following enhancement:
>>>> >> > >
>>>> >> > >
>>>> http://cr.openjdk.java.net/~ghaug/webrevs/8200720.v1
>>>> >> > >
>>>> https://bugs.openjdk.java.net/browse/JDK-8200720
>>>> >> > >
>>>> >> > > We at SAP have extended the thread dumps
>>>> (obtained
>>>> >> by jstack or jcmd) by several fields providing thread specific
>>>> >> information. These extensions are quite popular with our
>>>> support
>>>
>>> team.
>>>>
>>>> >> With some knowledge of the architecture of the application,
>>>> they
>>>
>>> often
>>>>
>>>> >> allow for quick and simple diagnosis of a running system.
>>>> Therefore
>>>
>>> we
>>>>
>>>> >> would like to contribute these enhancements.
>>>> >> > >
>>>> >> > > I took a few simple examples here, namely cpu
>>>> time,
>>>> >> elapsed time since thread creation, bytes allocated and classes
>>>> >> defined by the thread and the pthread-id or equivalent on
>>>> platforms
>>>> >> where it makes sense. Provided it is known how the application
>>>
>>> should
>>>>
>>>> >> behave, a misbehaving thread can identified easily.
>>>> >> > >
>>>> >> > > There is no measurable overhead for this
>>>> >> enhancement. However, it may be a problem that the format of
>>>> the
>>>> >> output is changed. Tools parsing the output may have to be
>>>> changed.
>>>> >> > >
>>>> >> > > Here is an example of the output generated:
>>>> >> > >
>>>> >> > >
>>>> ------------------------------------------------------
>>>> >> > >
>>>> >> > > "main" #1 prio=5 os_prio=31 cpu=6300.65ms
>>>> >> elapsed=123.28s allocated=242236760B defined_classes=1725
>>>> >> tid=0x00007fa13a806000 nid=0x1c03 pthread-id=0x109708000
>>>> waiting
>>>
>>> on
>>>>
>>>> >> condition [0x0000000109707000]
>>>> >> > > java.lang.Thread.State: TIMED_WAITING
>>>> (sleeping)
>>>> >> > > JavaThread state: _thread_blocked
>>>> >> > > Thread: 0x00007fa13a806000 [0x1c03] State:
>>>> >> _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>>> >> > > JavaThread state: _thread_blocked
>>>> >> > > at java.lang.Thread.sleep(java.base/Native
>>>> Method)
>>>> >> > > ...
>>>> >> > >
>>>> ------------------------------------------------------
>>>> >> > >
>>>> >> > > As mentioned above, we have a whole bunch of
>>>> other
>>>> >> enhancements to the thread dump similar to this one and would
>>>> be
>>>> >> willing to contribute them if there is any interest.
>>>> >> > >
>>>> >> > > Thanks and best regards,
>>>> >> > > Gunter
>>>> >> > >
>>>> >> > >
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >>
>>>> >
>>>> >
>>>>
>>>>
>
More information about the serviceability-dev
mailing list