RFR(S): 8200720: Print additional information in thread dump (times, allocated bytes etc.)
David Holmes
david.holmes at oracle.com
Sun Jun 3 23:18:55 UTC 2018
PS.
uint64_t getElepsedTime() const { return
os::javaTimeMillis() - getStartTime(); }
Millisecond resolution seems rather coarse.
David
-----
On 4/06/2018 9:12 AM, David Holmes wrote:
> Hi Gunter, Chris,
>
> Sorry just trying to catch up and this is only a partial look so far ...
>
> BTW these changes are not limited to serviceability code so adding in
> runtime team as well.
>
> On 2/06/2018 9:12 AM, Chris Plummer wrote:
>> Hi Gunter,
>>
>> On 6/1/18 3:17 AM, Haug, Gunter wrote:
>>> Hi Chris,
>>>
>>> thanks for looking into this!
>>>
>>> Re the synchronization: The value is stored only in a VM operation at
>>> a safepoint by the VM thread. I was of the opinion, that this could
>>> not be interrupted by a second VM operation (of the same type). Or am
>>> I missing something else?
>> I was really thinking more about the temporary changing of
>> PrintExtendedThreadInfo, not the value stored in the VMOp. You may be
>> be correct that no more than one VMOp is executing, but while it is
>> executing it is has changed the value of PrintExtendedThreadInfo,
>> which might have an impact on anything else that triggers printing
>> thread info while the VMOp is executing.
>
> Even if nothing else can possibly be running during the safepoint I find
> it extremely bad form to change a command-line flag in this way,
> particularly from a safepoint!
>
> If this flag is intended to be dynamically enabled as part of a dcmd
> then it should also be a Manageable flag IMHO.
>
> That said ...
>
>>> I did think about passing an argument to the various print_on
>>> member functions of the thread classes, but this would require rather
>>> extensive code changes for a rather tiny extension. Therefore I feel
>>> doing it like this is the lesser evil.
>
> ... it's obviously not truly a global setting, but one that is needed on
> a per-print-request basis. The flag is just the default, but if the
> default is off you still want to enable extended printing on a
> per-request basis.
>
> The current print_on mechanics is not set up for this kind of
> flexibility. I think this needs more thought.
>
> ---
>
> Re osThread.cpp shared code changes ... I really hate to see all the
> ifdefs in there. Please add a pd_print_on function to support this if
> you truly want platform specific information.
>
> ---
>
> threadStatisticInfo.hpp
>
> typo: getElepsedTime() -> getElapsedTime()
>
> Thanks,
> David
>
>
>
>>>
>>> Re thread_dump(): I think it's correct or, well, at least it works
>>> ;-) In fact jstack will transfer the "-e" and "-l" in only one
>>> string, i.e. op->arg(0).
>> So you are saying that op->arg(0) is "-e -l" when using jstack? I
>> think you really need to clean up the parsing. As it stands right now
>> passing, I get the feeling that if a user erroneously asks for help by
>> using "jcmd <pid> Thread.Print -help", it will end up executing with
>> -e an -l enabled, and no failure for the invalid "-help" option.
>>> Christoph has already explained my reasoning. But I agree, it's not
>>> nice and I would be happy to do it like Christoph suggested.
>> I'll respond separately to his suggestion.
>>
>> thanks,
>>
>> Chris
>>>
>>> And typo fixed, sorry.
>>>
>>> Thanks again,
>>> Gunter
>>>
>>> On 01.06.18, 00:03, "Chris Plummer" <chris.plummer at oracle.com> wrote:
>>>
>>> Hi Gunter,
>>> globals.hpp: fix typo "informatiuon"
>>> I worry a little bit about the synchronizing (if that's the
>>> right word)
>>> of PrintExtendedThreadInfo and the dcmd's -e flag. When using
>>> -e, you
>>> are temporarily enabling PrintExtendedThreadInfo if it was
>>> false. This
>>> temporarily changes the behavior of thread dumps, and could
>>> impact other
>>> uses that happen in parallel. Also, could two simultaneous uses
>>> of -e
>>> result in PrintExtendedThreadInfo not getting restored properly?
>>> thread_dump() doesn't look right. It looks like you are
>>> iterating char
>>> by char over the argument, and expect something like "-el" to be
>>> specified rather then "-e -l". The loop should be iterating over
>>> op->arg(i), not op->arg(0)[i].
>>> The rest of the changes look fine.
>>> thanks,
>>> Chris
>>> On 5/30/18 8:12 AM, Haug, Gunter wrote:
>>> > Hi all,
>>> >
>>> > As Chris proposed, I have made an the extended output
>>> switchable. There is an VM flag (PrintExtendedThreadInfo), which is
>>> false by default. Moreover, there is an Option (-e) which can be used
>>> with jcmd Thread.print as well as with jstack. The -e option
>>> essentially sets PrintExtendedThreadInfo true just for the respective
>>> thread dump.
>>> >
>>> > Here is the updated webrev:
>>> >
>>> > http://cr.openjdk.java.net/~ghaug/webrevs/8200720.v2
>>> >
>>> > (https://bugs.openjdk.java.net/browse/JDK-8200720)
>>> >
>>> > Thanks,
>>> > Gunter
>>> >
>>> >
>>> > On 02.05.18, 17:07, "serviceability-dev on behalf of Haug,
>>> Gunter" <serviceability-dev-bounces at openjdk.java.net on behalf of
>>> gunter.haug at sap.com> wrote:
>>> >
>>> > Hi Chris,
>>> >
>>> > Thanks for looking into this.
>>> > You're right, there is a little more we have. We have
>>> implemented an IO tracing mechanism which - rather as a byproduct -
>>> keeps track of bytes read and written per thread. However, this of
>>> course requires changes not only in hotspot. We would be happy to
>>> contribute this as well, but this is a far bigger change and will
>>> probably lead to a far bigger discussion. Anyway, with the number of
>>> bytes read, the number of classes defined doesn't look that arbitrary
>>> anymore, as one can correlate IO to class loading.
>>> >
>>> > Regarding the verbose option I think that's a good idea!
>>> >
>>> > Thanks again,
>>> > Gunter
>>> >
>>> > On 01.05.18, 22:55, "Chris Plummer"
>>> <chris.plummer at oracle.com> wrote:
>>> >
>>> > Hi Gunter,
>>> >
>>> > The output you are adding is all useful. I think the
>>> question is (and
>>> > I'd like to see a few people chime in on this for
>>> this review) is
>>> > whether or not all of it is the appropriate for a
>>> thread's stack dump.
>>> > For example, defined_classes is on the fringe of what
>>> I would call
>>> > generally useful info in a stack dump. Sure, there
>>> might be that rare
>>> > case when you need it, but how often compared to
>>> other useful info
>>> > maintained on a per thread basis. How many other bits
>>> of useful info are
>>> > not being printed in favor of defined_classes? It
>>> seems you have more in
>>> > the queue. How many? I'm worried about how cluttered
>>> the stack dumps
>>> > will get. Maybe we should add some sort of verbose
>>> thread dumping
>>> > option. Just a thought.
>>> >
>>> > As for the implementation, overall it looks good, but
>>> I can't speak to
>>> > whether or not you are doing proper accounting of
>>> defined_classes and
>>> > bytes allocated. You'll need input from someone with
>>> more knowledge of
>>> > those areas. We'll also need to do some testing to
>>> make sure tool tests
>>> > are not impacted.
>>> >
>>> > thanks,
>>> >
>>> > Chris
>>> >
>>> > On 4/30/18 2:51 AM, Haug, Gunter wrote:
>>> > > Hi,
>>> > >
>>> > > this is an update to an RFR I posted on
>>> hotspot-dev, but it is probably more suitable to post it here. Can I
>>> please have a review and a sponsor for the following enhancement:
>>> > >
>>> > > http://cr.openjdk.java.net/~ghaug/webrevs/8200720.v1
>>> > > https://bugs.openjdk.java.net/browse/JDK-8200720
>>> > >
>>> > > We at SAP have extended the thread dumps (obtained
>>> by jstack or jcmd) by several fields providing thread specific
>>> information. These extensions are quite popular with our support
>>> team. With some knowledge of the architecture of the application,
>>> they often allow for quick and simple diagnosis of a running system.
>>> Therefore we would like to contribute these enhancements.
>>> > >
>>> > > I took a few simple examples here, namely cpu time,
>>> elapsed time since thread creation, bytes allocated and classes
>>> defined by the thread and the pthread-id or equivalent on platforms
>>> where it makes sense. Provided it is known how the application should
>>> behave, a misbehaving thread can identified easily.
>>> > >
>>> > > There is no measurable overhead for this
>>> enhancement. However, it may be a problem that the format of the
>>> output is changed. Tools parsing the output may have to be changed.
>>> > >
>>> > > Here is an example of the output generated:
>>> > >
>>> > > ------------------------------------------------------
>>> > >
>>> > > "main" #1 prio=5 os_prio=31 cpu=6300.65ms
>>> elapsed=123.28s allocated=242236760B defined_classes=1725
>>> tid=0x00007fa13a806000 nid=0x1c03 pthread-id=0x109708000 waiting on
>>> condition [0x0000000109707000]
>>> > > java.lang.Thread.State: TIMED_WAITING (sleeping)
>>> > > JavaThread state: _thread_blocked
>>> > > Thread: 0x00007fa13a806000 [0x1c03] State:
>>> _at_safepoint _has_called_back 0 _at_poll_safepoint 0
>>> > > JavaThread state: _thread_blocked
>>> > > at java.lang.Thread.sleep(java.base/Native Method)
>>> > > ...
>>> > > ------------------------------------------------------
>>> > >
>>> > > As mentioned above, we have a whole bunch of other
>>> enhancements to the thread dump similar to this one and would be
>>> willing to contribute them if there is any interest.
>>> > >
>>> > > Thanks and best regards,
>>> > > Gunter
>>> > >
>>> > >
>>> >
>>> >
>>> >
>>> >
>>> >
>>>
>>
>>
More information about the serviceability-dev
mailing list