RFR: 8151526: Print -Xlog configuration in the hs_err_pid file
Coleen Phillimore
coleen.phillimore at oracle.com
Mon Apr 18 13:32:32 UTC 2016
On 4/17/16 5:46 AM, David Holmes wrote:
> Hi Max,
>
> Sorry for being late to the party here ...
>
> On 15/04/2016 2:22 AM, Max Ockner wrote:
>> new webrev:
>> http://cr.openjdk.java.net/~mockner/8151526.05/
>> - new describe_* methods are now private
>> - VMError is now a friend class in logConfiguration.
>
> but I also think reporting the available log level and tags is not the
> kind of thing we need in the hs_err file. The hs_err file captures
> information specific to the execution of the VM that crashed - the
> dynamic state if you like. The available log levels and tags are
> static information in the source code and don't tell me anything about
> the crash. (We didn't report which TraceXXX flags existed).
Yes, I agree, this line to describe_available() doesn't need to appear
in the crash log (but keep in vm.info because it's tied in with the main
'describe' function).
>
> While I'm all for providing useful crash information the hs_err file
> is starting to suffer from bloat in my opinion.
This one line is the least of the problems in the hs_err file. Wait
until you see it printing the 100s of GC threads and all the NTAMS
whatever they are.
Coleen
>
> Thanks,
> David
> -----
>
>
>> Thanks,
>> Max
>>
>> On 4/14/2016 4:30 AM, Marcus Larsson wrote:
>>> Hi,
>>>
>>> On 04/13/2016 09:33 PM, Max Ockner wrote:
>>>> Hello,
>>>> I have made these changes.
>>>> - refactored describe into a portion which uses locking and one that
>>>> does not.
>>>> - in the print_vm_info function in vmError.cpp, describe() is used,
>>>> since this function is called while the program is running.
>>>> - in the report function, the describe_no_lock and describe_with_lock
>>>> functions are both used without taking a lock. If it doesnt crash,
>>>> then fantastic. If it does crash, the error handler will skip this
>>>> step anyway.
>>>>
>>>> webrev: http://cr.openjdk.java.net/~mockner/8151526.04
>>>
>>> I suggest making the new two functions private in LogConfiguration,
>>> and friending VMError, instead of keeping them public. That should
>>> reduce confusion about which describe-method to use, and prevent
>>> misuse of the describe_with_lock in other places.
>>>
>>> Otherwise this looks good.
>>>
>>> Thanks,
>>> Marcus
>>>
>>>>
>>>> Thanks,
>>>> Max
>>>>
>>>> On 4/12/2016 1:30 PM, Marcus Larsson wrote:
>>>>> Hey,
>>>>>
>>>>> On 2016-04-12 17:01, Max Ockner wrote:
>>>>>> Marcus,
>>>>>>
>>>>>> I just spoke with Coleen about this. If I attempt to print
>>>>>> configuration during a crash without lock and it crashes, the error
>>>>>> handler will eat it up and start at the next step. It would be fun
>>>>>> to write a locking mechanism, but this seems better to me.
>>>>>>
>>>>>> I will continue to call describe() from VM.info, but I will
>>>>>> refactor the code in describe so that all of the information can be
>>>>>> printed easily without taking a lock. For a crash, I'll use the
>>>>>> refactored print functions to print all the necessary information
>>>>>> without taking a lock. Even if the vm crashes during this step it
>>>>>> is not such a bad thing because of the robust error handler.
>>>>>
>>>>> Alright, that sounds good!
>>>>>
>>>>> Thanks,
>>>>> Marcus
>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Max
>>>>>>
>>>>>> On 4/12/2016 8:46 AM, Marcus Larsson wrote:
>>>>>>> Hi Max,
>>>>>>>
>>>>>>> On 04/11/2016 07:36 PM, Max Ockner wrote:
>>>>>>>> Hello again,
>>>>>>>>
>>>>>>>> I have looked into this and I see a possible workaround to the
>>>>>>>> issue raised by Marcus:
>>>>>>>>
>>>>>>>> LogConfiguration::describe() takes the ConfigurationLock,
>>>>>>>> which
>>>>>>>> might be problematic for the error handler. I suppose the
>>>>>>>> error
>>>>>>>> handler could potentially deadlock if its thread already holds
>>>>>>>> the lock. Unsure how to resolve this, because skipping to take
>>>>>>>> the lock means that we might crash due to another thread
>>>>>>>> changing the configuration as we read it.
>>>>>>>>
>>>>>>>>
>>>>>>>> My proposed solution is this:
>>>>>>>>
>>>>>>>> If the ConfigurationLock is held by another thread during the
>>>>>>>> crash, then I think we could print "Log configuration can not be
>>>>>>>> accessed because ConfigurationLock is currently held by another
>>>>>>>> thread" or something similar. If the ConfigurationLock can be
>>>>>>>> taken by the current thread, then print the configuration.
>>>>>>>>
>>>>>>>> We can accomplish this as follows:
>>>>>>>>
>>>>>>>> In *logConfiguration.cpp*: Create a ConfigurationLockMaybe class
>>>>>>>> extending ConfigurationLock.
>>>>>>>>
>>>>>>>> class ConfigurationLockMaybe : ConfigurationLock {
>>>>>>>> private:
>>>>>>>> bool _success;
>>>>>>>> public:
>>>>>>>> ConfigurationLockMaybe() {
>>>>>>>> if (_semaphore.trywait()){_success = true;}
>>>>>>>> }
>>>>>>>> ~ConfigurationLock() {
>>>>>>>> if(_success){_semaphore.signal();}
>>>>>>>> }
>>>>>>>> bool has_lock(){return _success;}
>>>>>>>> };
>>>>>>>>
>>>>>>>> _semaphore is a static var in ConfigurationLock, so the same copy
>>>>>>>> is accessed in child classes.
>>>>>>>>
>>>>>>>> Another method LogConfiguration::describe_on_crash(outputStream*
>>>>>>>> out) can use ConfigurationLockMaybe to attempt to print the
>>>>>>>> configuration.
>>>>>>>>
>>>>>>>> In *vmError.cpp*: Replace LogConfiguration::describe with
>>>>>>>> LogConfiguration::describe_on_crash.
>>>>>>>
>>>>>>> Sounds good to me.
>>>>>>>
>>>>>>> ConfigurationLockMaybe sounds a bit weird but I don't have a
>>>>>>> better suggestion.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Marcus
>>>>>>>
>>>>>>>>
>>>>>>>> Max
>>>>>>>>
>>>>>>>> On 4/11/2016 10:23 AM, Max Ockner wrote:
>>>>>>>>> I missed this before. I'm going to investigate.
>>>>>>>>> Max
>>>>>>>>>
>>>>>>>>> On 4/11/2016 4:19 AM, Marcus Larsson wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 04/11/2016 09:00 AM, Thomas Stüfe wrote:
>>>>>>>>>>> Hi Max,
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Apr 8, 2016 at 9:56 PM, Max Ockner
>>>>>>>>>>> <max.ockner at oracle.com <mailto:max.ockner at oracle.com>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I forgot to respond to this:
>>>>>>>>>>> On 4/1/2016 4:45 AM, Marcus Larsson wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> On 03/31/2016 08:08 PM, Max Ockner wrote:
>>>>>>>>>>>>> (Replies in line)
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 3/30/2016 9:31 AM, Thomas Stüfe wrote:
>>>>>>>>>>>>>> Hi Max,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Disclaimer: not a (R)eviewer.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Do we really need a list of all tags and all decorators?
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Just going by what we currently put in the hs_err file,
>>>>>>>>>>>>> I think
>>>>>>>>>>>>> this may be the first "Do we really need" we've ever
>>>>>>>>>>>>> asked for
>>>>>>>>>>>>> this type of change.
>>>>>>>>>>>>>
>>>>>>>>>>>>> All joking aside, I think it is a good idea to direct
>>>>>>>>>>>>> users
>>>>>>>>>>>>> toward proper UL usage whenever we have a chance.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also: I assume what you print is the Log state as it is
>>>>>>>>>>>>>> at the
>>>>>>>>>>>>>> time the hs-err file is printed. If logging was
>>>>>>>>>>>>>> enabled/changed during lifetime of the VM, e.g. with
>>>>>>>>>>>>>> jcmd,
>>>>>>>>>>>>>> would it be possible to see that? At least a warning if
>>>>>>>>>>>>>> logging was not enabled from the start on.
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Would it be possible? Yes, but I think this would
>>>>>>>>>>>>> require a
>>>>>>>>>>>>> framework change. It does not look like any marks are
>>>>>>>>>>>>> made when
>>>>>>>>>>>>> the LogConfiguration changes during runtime, and we
>>>>>>>>>>>>> would need
>>>>>>>>>>>>> to record that in order to know what to print when we
>>>>>>>>>>>>> dump into
>>>>>>>>>>>>> hs_err.
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, this would required the framework to keep some sort
>>>>>>>>>>>> of log
>>>>>>>>>>>> of configuration changes. Is there value in knowing that
>>>>>>>>>>>> the log
>>>>>>>>>>>> configuration changed since startup?
>>>>>>>>>>>
>>>>>>>>>>> I think it is useful along with other information. An
>>>>>>>>>>> example case:
>>>>>>>>>>>
>>>>>>>>>>> Let's say we are logging a message whenever a lock is
>>>>>>>>>>> acquired or
>>>>>>>>>>> released. In a subsystem like this, the action which dooms
>>>>>>>>>>> the vm
>>>>>>>>>>> may not cause an immediate crash. This leaves time for
>>>>>>>>>>> logging to
>>>>>>>>>>> turn on or off in between the bad action and the crash. If
>>>>>>>>>>> you
>>>>>>>>>>> don't know when logging was active and when it was not,
>>>>>>>>>>> then the
>>>>>>>>>>> absence of a particular message does not give you much
>>>>>>>>>>> information. You may not see the message that has the
>>>>>>>>>>> answer for
>>>>>>>>>>> you because it was not logged, though the action may have
>>>>>>>>>>> occurred.
>>>>>>>>>>>
>>>>>>>>>>> However, if we don't have the framework support for this
>>>>>>>>>>> then I
>>>>>>>>>>> believe it should be added later.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> This is a good example. A small easy solution, as I mentioned
>>>>>>>>>>> before, could be just to distinguish between "logging
>>>>>>>>>>> parameters stable since startup" and "logging parameters
>>>>>>>>>>> changed at some time". That way you would at least know
>>>>>>>>>>> whether to trust the logging output.
>>>>>>>>>>>
>>>>>>>>>>> But you are right, this can be added later.
>>>>>>>>>>
>>>>>>>>>> I've filed https://bugs.openjdk.java.net/browse/JDK-8153945 for
>>>>>>>>>> this.
>>>>>>>>>>
>>>>>>>>>> Max: Did you see my comments about the ConfigurationLock? It
>>>>>>>>>> was an inline reply to your first mail.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Marcus
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Kind Regards, Thomas
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Max
>>>>>>>>>>>>>> Kind Regards, Thomas
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Mar 29, 2016 at 9:00 PM, Max Ockner
>>>>>>>>>>>>>> <max.ockner at oracle.com <mailto:max.ockner at oracle.com>>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>> Please review another Unified Logging change.
>>>>>>>>>>>>>> They are
>>>>>>>>>>>>>> almost done, and we are serious this time.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> bug:
>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8151526
>>>>>>>>>>>>>> webrev:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~mockner/8151526.01/
>>>>>>>>>>>>>> <http://cr.openjdk.java.net/%7Emockner/8151526.01/>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> LogConfiguration::describe() takes the ConfigurationLock,
>>>>>>>>>>>> which
>>>>>>>>>>>> might be problematic for the error handler. I suppose the
>>>>>>>>>>>> error
>>>>>>>>>>>> handler could potentially deadlock if its thread already
>>>>>>>>>>>> holds
>>>>>>>>>>>> the lock. Unsure how to resolve this, because skipping to
>>>>>>>>>>>> take
>>>>>>>>>>>> the lock means that we might crash due to another thread
>>>>>>>>>>>> changing the configuration as we read it.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Marcus
>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The logging configuration is now printed in each
>>>>>>>>>>>>>> hs_err
>>>>>>>>>>>>>> file. The output is the same as you would see from
>>>>>>>>>>>>>> -Xlog:logging=trace and it is obtained from
>>>>>>>>>>>>>> LogConfiguration::describe().
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Below is a sample of the hs_err contents. The
>>>>>>>>>>>>>> Logging info
>>>>>>>>>>>>>> is printed after VM Arguments and Whitebox, and
>>>>>>>>>>>>>> before
>>>>>>>>>>>>>> Environment Variables.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> VM Arguments:
>>>>>>>>>>>>>> java_command: Kaboom
>>>>>>>>>>>>>> java_class_path (initial): .
>>>>>>>>>>>>>> Launcher Type: SUN_STANDARD
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Logging:
>>>>>>>>>>>>>> Available log levels: off, trace, debug, info,
>>>>>>>>>>>>>> warning, error
>>>>>>>>>>>>>> Available log decorators: time (t), uptime (u),
>>>>>>>>>>>>>> timemillis
>>>>>>>>>>>>>> (tm), uptimemillis (um), timenanos (tn),
>>>>>>>>>>>>>> uptimenanos (un),
>>>>>>>>>>>>>> hostname (hn), pid (p), tid (ti), level (l), tags
>>>>>>>>>>>>>> (tg)
>>>>>>>>>>>>>> Available log tags: alloc, age, barrier,
>>>>>>>>>>>>>> biasedlocking,
>>>>>>>>>>>>>> bot, census, classhisto, classresolve, classinit,
>>>>>>>>>>>>>> classload, classloaderdata, classunload, classpath,
>>>>>>>>>>>>>> compaction, cpu, cset, defaultmethods, ergo,
>>>>>>>>>>>>>> exceptions,
>>>>>>>>>>>>>> exit, freelist, gc, heap, humongous, ihop, itables,
>>>>>>>>>>>>>> jni,
>>>>>>>>>>>>>> liveness, logging, marking, metaspace, modules,
>>>>>>>>>>>>>> monitorinflation, os, phases, plab, promotion,
>>>>>>>>>>>>>> preorder,
>>>>>>>>>>>>>> protectiondomain, ref, refine, region, remset,
>>>>>>>>>>>>>> safepoint,
>>>>>>>>>>>>>> safepointcleanup, scavenge, scrub, stacktrace,
>>>>>>>>>>>>>> start,
>>>>>>>>>>>>>> startuptime, state, stats, stringdedup, stringtable,
>>>>>>>>>>>>>> survivor, sweep, task, thread, tlab, time,
>>>>>>>>>>>>>> verboseverification, verify, vmoperation, vtables
>>>>>>>>>>>>>> Log output configuration:
>>>>>>>>>>>>>> #0: stdout all=off uptime,level,tags,
>>>>>>>>>>>>>> #1: stderr all=warning uptime,level,tags,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Environment Variables:
>>>>>>>>>>>>>> JAVA_HOME=/scratch/mockner/UL/8151526/build/linux-x86_64-normal-server-fastdebug/images/jdk
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> PATH=/home/mockner/tools/webrev:/bin:/usr/bin:/usr/local/bin:/usr/sbin:/home/mockner/bin:/home/mockner/linux/bin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> SHELL=/bin/bash
>>>>>>>>>>>>>> OS=Linux
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
More information about the hotspot-runtime-dev
mailing list