RFR(s): 8149036: Add UL tracing for thread related events at os level
Thomas Stüfe
thomas.stuefe at gmail.com
Thu Feb 25 08:21:36 UTC 2016
Hi David,
I will open a new issue in JBS for the cleanups you requested.
Kind Regards, Thomas
On Thu, Feb 25, 2016 at 5:14 AM, David Holmes <david.holmes at oracle.com>
wrote:
> Hi Thomas,
>
> Seems this has already been pushed so a small follow up may be needed.
>
> On 25/02/2016 3:47 AM, Thomas Stüfe wrote:
>
>> Hi David,
>>
>> thank you for the review!
>>
>> New webrev:
>>
>> http://cr.openjdk.java.net/~stuefe/webrevs/8149036-add-tracing-for-thread-events/webrev.03/webrev/
>>
>
> Thanks for the updates.
>
> src/os/solaris/vm/os_solaris.cpp
>
> Style-nit:
>
> static char* describe_thr_create_attributes(char* buf, size_t buflen,
> size_t stacksize, long flags)
> {
>
> should be:
>
> static char* describe_thr_create_attributes(char* buf, size_t buflen,
> size_t stacksize, long flags) {
>
> Similarly for describe_beginthreadex_attributes.
>
> ---
>
> src/share/vm/runtime/thread.cpp
>
> Looking more closely I'm curious about the choice of location for the
> "thread exiting/detaching" report. Why in the middle of JavaThread::exit?
> And why not on the destroy_vm path? I would have put this right before
> Threads::remove(this). It also took me a while to realize why I saw a lot
> of "Thread finished" but no "exiting" - they were non JavaThreads. So maybe
> we should say "JavaThread exiting/detaching"?
>
> ---
>
> Now that I can see the output I also noticed other nits - sorry:
>
> [0.081s][info ][os,thread] Thread is alive (tid: 18347, pthread id:
> 2660186944).
> [0.083s][info ][os,thread] Thread finished (tid 18347, pthread id
> 2660186944).
>
> the alive line has colons and the finished line does not.
>
> There's some inconsistency regarding showing gettid and pthread_self
> values eg:
>
> [0.082s][info ][os,thread] Thread attached (tid: 17983, pthread id:
> 4135000896).
> [0.082s][info ][os,thread] Thread 17983 detaching.
> [0.081s][info ][os,thread] Thread started (pthread id: 2660186944,
> attributes: stacksize: 512k, guardsize: 4k, detached).
>
> One has both ids, one has only tid, and one has only pthread-id.
>
> Thanks,
> David
> -----
>
> The changes (one of these days I plan to figure out how to print
>> relative webrevs with mercurial queues) are:
>>
>> - I fixed the issues you found, see comments inline
>> - I changed most of the calls from log_debug() to log_info(), because I
>> found that tracing level more fitting
>>
>>
>> On Wed, Feb 24, 2016 at 1:17 PM, David Holmes <david.holmes at oracle.com
>> <mailto:david.holmes at oracle.com>> wrote:
>>
>> Hi Thomas,
>>
>> Been waiting for the dust to settle :)
>>
>>
>> :)
>>
>>
>>
>> On 24/02/2016 7:24 PM, Thomas Stüfe wrote:
>>
>> Coleen, Marcus,
>>
>> thank you for reviewing, and @Coleen, thanks for the sponsoring
>> offer!
>>
>> Here the latest Webrev:
>>
>> http://cr.openjdk.java.net/~stuefe/webrevs/8149036-add-tracing-for-thread-events/webrev.02/webrev/index.html
>>
>> As Marcus suggested, I added a new "thread" tag and added the
>> tag to the
>> logging calls.
>>
>>
>> I have a few comments ...
>>
>> src/os/bsd/vm/os_bsd.cpp
>>
>>
>> 786 LogHandle(os, thread) log;
>> 787 if (log.is_debug()) {
>>
>>
>> Oh, good catch, and this was actually a bug, because the "log.warning()"
>> was in the "log.is_debug()" condition so the warning would not have been
>> printed unless log level was debug or higher. I fixed this for the
>> affected platforms.
>>
>> I think it is preferable to only create the LogHandle inside the if
>> block. Normally we do an "if (is_enabled...())" check first.
>>
>> 793 log.warning("Failed to start thread - pthread_create
>> failed (%s) for attributes: %s.",
>> 794 strerror(errno),
>> os::Posix::describe_pthread_attr(buf, sizeof(buf), &attr));
>>
>> pthread_* functions don't set errno they return the error code
>> directly.
>>
>>
>> Oops. Yes, I fixed that.
>>
>> Ditto for src/os/linux/vm/os_linux.cpp
>>
>> ---
>>
>> src/os/solaris/vm/os_solaris.cpp
>>
>> Why did you use a different style for the thr_create logging
>> compared to *NIX? The flags can be described similarly to a
>> pthread-attr object.
>>
>> Similar comment for os_windows.cpp.
>>
>>
>> Okay, for both platforms I implemented a similar printout as for the
>> Posix platforms. Now we see flags and stacksize on all platforms.
>>
>> ---
>>
>> src/share/vm/runtime/thread.cpp
>>
>> Personally I think plain warning(...) calls should be converted to
>> untagged logging calls - otherwise the warning will likely never be
>> seen and we want these warnings to be quite visible. But the policy
>> is to do what you did.
>>
>>
>> That is a hot topic in our group and usually frowned at, because
>> unexpected warning output (which may or may not matter to the customer)
>> disturbs automated processes parsing this output.
>>
>>
>> btw, I found the command line a bit confusing:
>>
>> If a logging call is tagged with two tags A and B, I would have
>> thought the
>> default behaviour of "-Xlog:A" is to log all calls tagged at
>> least with A,
>> not all calls which are solely tagged with A and nothing else. I
>> could not
>> even think ot a use case for the latter.
>>
>>
>> The syntax/semantics is somewhat subjective and obviously have to be
>> learned. I like the fact -Xlog:os only turns on things tagged only
>> with "os".
>>
>> After some searching, I found that the behaviour I want and I
>> would think
>> makes sense as default is "-Xlog:A*".
>>
>>
>> -Xlog:help shows
>>
>> -Xlog Usage: -Xlog[:[what][:[output][:[decorators][:output-options]]]]
>> where 'what' is a combination of tags and levels on the
>> form tag1[+tag2...][*][=level][,...]
>> Unless wildcard (*) is specified, only log messages tagged
>> with exactly the tags specified will be matched.
>>
>> This also means that log lines will be mysteriously disappearing
>> from
>> output if someone adds another tag to that line, unless I always
>> use an
>> asterix? Or am I just slow to understand the command line syntax?
>>
>>
>> Unfortunately you need to know how the code is tagged to know what
>> log output to ask for. Once we put these in place we're introducing
>> a new set of compatibility constraints that will make it hard to
>> rework the tagging in the future.
>>
>>
>> Well, I think this design decision (if it even was a deliberate
>> decision) is unfortunate. In the end, like Coleen writes, many may
>> decide to skip to not use multiple tags. Which is unfortunate, because
>> the ability to have multiple tags per command line is quite a cool
>> feature.
>>
>> I also still fail to see any use case where I would want to log a tag
>> but only those log sites which are not tagged with something different
>> too. Similarly to a keyword based item search - One usually wants all
>> items for a keyword and does not care if the item is tagged with other
>> keywords too.
>>
>> Thanks,
>>
>> Thomas
>>
>> Thanks,
>> David
>>
>>
>> Kind Regards, Thomas
>>
>>
>>
>>
>>
>>
>> On Tue, Feb 23, 2016 at 6:52 PM, Marcus Larsson
>> <marcus.larsson at oracle.com <mailto:marcus.larsson at oracle.com>>
>>
>> wrote:
>>
>> Hi,
>>
>> On 2016-02-22 17:54, Thomas Stüfe wrote:
>>
>> Dear all,
>>
>> please take a look at this proposed addition to UL. This
>> adds a number of
>> trace points to thread creation. In detail:
>>
>> - it traces thread creation and thread creation errors,
>> including pthread
>> attributes (for Posix platforms)
>> - it traces stack location and creation/removal of stack
>> guard pages.
>>
>> This all was first AIX-only tracing, but I converted
>> this to UL and made
>> it
>> available on all platforms.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8149036
>> Webrev:
>>
>>
>> http://cr.openjdk.java.net/~stuefe/webrevs/8149036-add-tracing-for-thread-events/webrev.00/webrev/
>>
>>
>> It might be a good idea to add a 'thread' tag and use that
>> in addition to
>> the os tag for these messages. It would allow easy filtering
>> of these
>> messages for those interested/uninterested. Just 'os' alone
>> is quite a wide
>> area.
>>
>> Thanks,
>> Marcus
>>
>>
>> Note also that I added a helper function, os::errno_name(),
>> which is a very
>>
>> simple replacement for strerror() without its problems
>> (thread safety,
>> unwanted localizations...).
>>
>> What do you think?
>>
>> Kind Regards, Thomas
>>
>>
>>
>>
>>
More information about the hotspot-runtime-dev
mailing list