RFR(s): 8149036: Add UL tracing for thread related events at os level

Thu Feb 25 04:14:57 UTC 2016

Hi Thomas,

Seems this has already been pushed so a small follow up may be needed.

On 25/02/2016 3:47 AM, Thomas Stüfe wrote:
> Hi David,
>
> thank you for the review!
>
> New webrev:
> http://cr.openjdk.java.net/~stuefe/webrevs/8149036-add-tracing-for-thread-events/webrev.03/webrev/

Thanks for the updates.

src/os/solaris/vm/os_solaris.cpp

Style-nit:

static char* describe_thr_create_attributes(char* buf, size_t buflen,
   size_t stacksize, long flags)
{

should be:

static char* describe_thr_create_attributes(char* buf, size_t buflen,
                                             size_t stacksize, long flags) {

Similarly for describe_beginthreadex_attributes.

---

src/share/vm/runtime/thread.cpp

Looking more closely I'm curious about the choice of location for the 
"thread exiting/detaching" report. Why in the middle of 
JavaThread::exit? And why not on the destroy_vm path? I would have put 
this right before Threads::remove(this). It also took me a while to 
realize why I saw a lot of "Thread finished" but no "exiting" - they 
were non JavaThreads. So maybe we should say "JavaThread exiting/detaching"?

---

Now that I can see the output I also noticed other nits - sorry:

[0.081s][info ][os,thread] Thread is alive (tid: 18347, pthread id: 
2660186944).
[0.083s][info ][os,thread] Thread finished (tid 18347, pthread id 
2660186944).

the alive line has colons and the finished line does not.

There's some inconsistency regarding showing gettid and pthread_self 
values eg:

[0.082s][info ][os,thread] Thread attached (tid: 17983, pthread id: 
4135000896).
[0.082s][info ][os,thread] Thread 17983 detaching.
[0.081s][info ][os,thread] Thread started (pthread id: 2660186944, 
attributes: stacksize: 512k, guardsize: 4k, detached).

One has both ids, one has only tid, and one has only pthread-id.

Thanks,
David
-----

> The changes (one of these days I plan to figure out how to print
> relative webrevs with mercurial queues) are:
>
> - I fixed the issues you found, see comments inline
> - I changed most of the calls from log_debug() to log_info(), because I
> found that tracing level more fitting
>
>
> On Wed, Feb 24, 2016 at 1:17 PM, David Holmes <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>> wrote:
>
>     Hi Thomas,
>
>     Been waiting for the dust to settle :)
>
>
> :)
>
>
>
>     On 24/02/2016 7:24 PM, Thomas Stüfe wrote:
>
>         Coleen, Marcus,
>
>         thank you for reviewing, and @Coleen, thanks for the sponsoring
>         offer!
>
>         Here the latest Webrev:
>         http://cr.openjdk.java.net/~stuefe/webrevs/8149036-add-tracing-for-thread-events/webrev.02/webrev/index.html
>
>         As Marcus suggested, I added a new "thread" tag and added the
>         tag to the
>         logging calls.
>
>
>     I have a few comments ...
>
>     src/os/bsd/vm/os_bsd.cpp
>
>
>       786     LogHandle(os, thread) log;
>       787     if (log.is_debug()) {
>
>
> Oh, good catch, and this was actually a bug, because the "log.warning()"
> was in the "log.is_debug()" condition so the warning would not have been
> printed unless log level was debug or higher. I fixed this for the
> affected platforms.
>
>     I think it is preferable to only create the LogHandle inside the if
>     block. Normally we do an "if (is_enabled...())" check first.
>
>       793         log.warning("Failed to start thread - pthread_create
>     failed (%s) for attributes: %s.",
>       794           strerror(errno),
>     os::Posix::describe_pthread_attr(buf, sizeof(buf), &attr));
>
>     pthread_* functions don't set errno they return the error code directly.
>
>
> Oops. Yes, I fixed that.
>
>     Ditto for src/os/linux/vm/os_linux.cpp
>
>     ---
>
>     src/os/solaris/vm/os_solaris.cpp
>
>     Why did you use a different style for the thr_create logging
>     compared to *NIX? The flags can be described similarly to a
>     pthread-attr object.
>
>     Similar comment for os_windows.cpp.
>
>
> Okay, for both platforms I implemented a similar printout as for the
> Posix platforms. Now we see flags and stacksize on all platforms.
>
>     ---
>
>     src/share/vm/runtime/thread.cpp
>
>     Personally I think plain warning(...) calls should be converted to
>     untagged logging calls - otherwise the warning will likely never be
>     seen and we want these warnings to be quite visible. But the policy
>     is to do what you did.
>
>
> That is a hot topic in our group and usually frowned at, because
> unexpected warning output (which may or may not matter to the customer)
> disturbs automated processes parsing this output.
>
>
>         btw, I found the command line a bit confusing:
>
>         If a logging call is tagged with two tags A and B, I would have
>         thought the
>         default behaviour of "-Xlog:A" is to log all calls tagged at
>         least with A,
>         not all calls which are solely tagged with A and nothing else. I
>         could not
>         even think ot a use case for the latter.
>
>
>     The syntax/semantics is somewhat subjective and obviously have to be
>     learned. I like the fact -Xlog:os only turns on things tagged only
>     with "os".
>
>         After some searching, I found that the behaviour I want and I
>         would think
>         makes sense as default is "-Xlog:A*".
>
>
>     -Xlog:help shows
>
>     -Xlog Usage: -Xlog[:[what][:[output][:[decorators][:output-options]]]]
>               where 'what' is a combination of tags and levels on the
>     form tag1[+tag2...][*][=level][,...]
>               Unless wildcard (*) is specified, only log messages tagged
>     with exactly the tags specified will be matched.
>
>         This also means that log lines will be mysteriously disappearing
>         from
>         output if someone adds another tag to that line, unless I always
>         use an
>         asterix? Or am I just slow to understand the command line syntax?
>
>
>     Unfortunately you need to know how the code is tagged to know what
>     log output to ask for. Once we put these in place we're introducing
>     a new set of compatibility constraints that will make it hard to
>     rework the tagging in the future.
>
>
> Well, I think this design decision (if it even was a deliberate
> decision) is unfortunate. In the end, like Coleen writes, many may
> decide to skip to not use multiple tags. Which is unfortunate, because
> the  ability to have multiple tags per command line is quite a cool feature.
>
> I also still fail to see any use case where I would want to log a tag
> but only those log sites which are not tagged with something different
> too. Similarly to a keyword based item search - One usually wants all
> items for a keyword and does not care if the item is tagged with other
> keywords too.
>
> Thanks,
>
> Thomas
>
>     Thanks,
>     David
>
>
>         Kind Regards, Thomas
>
>
>
>
>
>
>         On Tue, Feb 23, 2016 at 6:52 PM, Marcus Larsson
>         <marcus.larsson at oracle.com <mailto:marcus.larsson at oracle.com>>
>         wrote:
>
>             Hi,
>
>             On 2016-02-22 17:54, Thomas Stüfe wrote:
>
>                 Dear all,
>
>                 please take a look at this proposed addition to UL. This
>                 adds a number of
>                 trace points to thread creation. In detail:
>
>                 - it traces thread creation and thread creation errors,
>                 including pthread
>                 attributes (for Posix platforms)
>                 - it traces stack location and creation/removal of stack
>                 guard pages.
>
>                 This all was first AIX-only tracing, but I converted
>                 this to UL and made
>                 it
>                 available on all platforms.
>
>                 Bug: https://bugs.openjdk.java.net/browse/JDK-8149036
>                 Webrev:
>
>                 http://cr.openjdk.java.net/~stuefe/webrevs/8149036-add-tracing-for-thread-events/webrev.00/webrev/
>
>
>             It might be a good idea to add a 'thread' tag and use that
>             in addition to
>             the os tag for these messages. It would allow easy filtering
>             of these
>             messages for those interested/uninterested. Just 'os' alone
>             is quite a wide
>             area.
>
>             Thanks,
>             Marcus
>
>
>             Note also that I added a helper function, os::errno_name(),
>             which is a very
>
>                 simple replacement for strerror() without its problems
>                 (thread safety,
>                 unwanted localizations...).
>
>                 What do you think?
>
>                 Kind Regards, Thomas
>
>
>
>