RFR: 8027429: Add diagnostic command VM.info to get hs_err print-out (NEW WEBREV)
David Holmes
david.holmes at oracle.com
Fri Nov 13 04:52:11 UTC 2015
Hi Coleen,
On 13/11/2015 7:41 AM, Coleen Phillimore wrote:
>
> Hi, I should learn my lesson that Frederic, Bertrand and David are
> generally right.
>
> My restructuring went afoul of the STEP framework, and so I have made
> VMError::print_vm_info() report the safe things that VMError::report()
> reports. Hopefully, this will address the concern about safety. It
> also allows us to remove the lock and static buffer in the VM.info
> function. The duplication isn't as bad as I thought and is a better
> solution than piggy backing from VMError::report().
>
> Please review:
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8027429.03/
> bug link https://bugs.openjdk.java.net/browse/JDK-8027429
This looks good to me. A few small queries:
888 if (Universe::is_fully_initialized()) {
I don't think it is possible to run a Dcmd before the VM is fully
initialized.
889 Universe::heap()->print_on_error(st);
Is this safe? I'm not sure what it does - print heap memory regions ??
891 st->print_cr("Polling page: " INTPTR_FORMAT,
p2i(os::get_polling_page()));
Is this useful ??
Thanks,
David
> I tested with jcmd <pid> VM.info concurrently and with an inserted crash
> during VM.info call to make sure the original hs_err_pid report comes
> out okay.
>
> Thanks,
> Coleen
>
> On 11/11/15 4:16 PM, Coleen Phillimore wrote:
>>
>>
>> On 11/10/15 7:12 PM, David Holmes wrote:
>>> Hi Coleen,
>>>
>>> I still do not like the intertwining of the info and error code, but
>>> I won't say more on that.
>>
>> Yes, I understand the approach has good arguments on both sides. I'm
>> trying to find a compromise that can move this feature forward.
>>>
>>> However you may now have a concurrency problem:
>>>
>>> 842 // Report for the vm_info_cmd, skipping crash specific code
>>> 843 void VMError::report_info(outputStream* st) {
>>> 844
>>> 845 // don't allocate large buffer on stack
>>> 846 static char buf[O_BUFLEN];
>>>
>>> Can you have concurrent info requests in flight? For the info case
>>> the buffer could be allocated on the stack I think, or even
>>> dynamically. (Aside: will this be executed in the service thread?)
>>
>> There is a lock added to protect the VM.info dcmd, so there would be
>> only sequential access to the static buffer. I don't think this is
>> executed in the Service Thread. I see no evidence of this anyway.
>>
>>>
>>> report_process could possibly be refactored further as there's a lot
>>> of initial stuff that won't apply to the info case:
>>> - we won't print all threads
>>> - the VM state will always be "not at safepoint"
>>> - The Universe must be initialized (not clear at what point during
>>> shutdown a dcmd can still come in?)
>>> - we don't print owned locks
>>
>> Yes, I can do this. It actually improves how this looks.
>>
>> Thanks,
>> Coleen
>>>
>>> That's all from me.
>>>
>>> Thanks,
>>> David
>>>
>>>
>>> On 6/11/2015 7:01 AM, Coleen Phillimore wrote:
>>>>
>>>> Hi,
>>>>
>>>> I've done some refactoring and cleanup on this which the refactoring
>>>> enabled. Again, it's important that the VM.info output be mostly the
>>>> same order as the hs_err_pid* report file so that sustaining can easily
>>>> match and find output that they are looking for. It's also important
>>>> not to copy this code because there may be new information added
>>>> that we
>>>> also want in the VM.info output. If any future information that we
>>>> want
>>>> to print seems unsafe, the vm_info_cmd can be used to exclude it or it
>>>> can be printed only in the error reporting function. Rejecting the
>>>> entire change because of the theoretical unsafe addition of code isn't
>>>> the right thing to do, although yeah, I've argued the other way in the
>>>> past. This is something that is helpful to sustaining and our
>>>> licensees.
>>>>
>>>> In this patch, the functions that VM.info calls are safe while the JVM
>>>> is running, but the stepping is required because at crash time, they
>>>> might not be. Although I've never seen the error reporting crash in
>>>> these places.
>>>>
>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8027429.01/
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8027429
>>>>
>>>> This refactoring has been run through RBT.
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>> On 11/4/15 2:25 AM, David Holmes wrote:
>>>>> On 4/11/2015 1:36 AM, Mattis Castegren wrote:
>>>>>> Also note that some of the crash information is not interesting for
>>>>>> VM.info. For example, we don't want to print a thread dump or the
>>>>>> registers/stack data for any specific thread. If people want thread
>>>>>> dumps, they can use the separate Dcmd. We only want the general
>>>>>> system state.
>>>>>
>>>>> I'm a little surprised that thread info is not of interest, but as you
>>>>> say that is available by other means - which was part of my earlier
>>>>> point about there already being safe means to get some of the info.
>>>>>
>>>>> If this is really only about the small subset of safe "system
>>>>> information" that VMError reports, then I see even less motivation to
>>>>> piggy-back VM-info onto VMerror and entangle the two. If each block of
>>>>> information can be considered a single "step" in the error case then
>>>>> we can simply factor out the actual printing parts and share them. If
>>>>> the error case wants fine-grained steps then we can't do that.
>>>>>
>>>>> David
>>>>>
>>>>>> Kind Regards
>>>>>> /Mattis
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Coleen Phillimore
>>>>>> Sent: den 3 november 2015 16:25
>>>>>> To: Bertrand Delsart; Thomas Stüfe; Mattis Castegren
>>>>>> Cc: David Holmes; hotspot-runtime-dev at openjdk.java.net
>>>>>> Subject: Re: RFR: 8027429: Add diagnostic command VM.info to get
>>>>>> hs_err print-out
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I've done a bit of refactoring with this change and have tested it
>>>>>> with and without intentional crashes to the VM.info reporting. If it
>>>>>> crashes within VM.info, there's a stack trace that clearly shows
>>>>>> where it comes from in the hs_err file. And the hs_err_pid file is
>>>>>> correctly produced. If we can make the jcmd come with a warning that
>>>>>> it *could* potentially crash the vm, I think this feature is a good
>>>>>> addition for sustaining.
>>>>>>
>>>>>> Without crashing, VM.info gives a lot of information that we normally
>>>>>> give with the hs_err file. The reason for the careful STEP'ping in
>>>>>> hs_err production is that the JVM has already crashed so we can't
>>>>>> count on anything. The VM.info command is designed to be run on a
>>>>>> moderately stable JVM so is less likely to crash on the things that
>>>>>> we're printing out.
>>>>>>
>>>>>> I think we should fix the Threads walking code that Fred pointed out
>>>>>> to take the Threads_lock, but I don't think we should *copy* code for
>>>>>> VM.info or copy print_on_error functions.
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>>
>>>>>> On 11/3/15 10:18 AM, Bertrand Delsart wrote:
>>>>>>> On 03/11/2015 15:06, Bertrand Delsart wrote:
>>>>>>>> Thanks Thomas,
>>>>>>>>
>>>>>>>> I fully agree it can work. The problem is that it may also fail
>>>>>>>> if we
>>>>>>>> are not careful.
>>>>>>>>
>>>>>>>> Fred noticed for instance that "Threads::print_on_error" is not
>>>>>>>> protected by _vm_info_cmd (step 290). This is an obvious source of
>>>>>>>> potential crashes if you take into account the fact that the
>>>>>>>> print_on_error methods were designed not to use locks.
>>>>>>>
>>>>>>> Fred just pointed out that this may be OK since _thread is NULL.
>>>>>>>
>>>>>>> I'm not sure of why we do not print the threads when there is notion
>>>>>>> of current thread (and I would wonder whether we should print
>>>>>>> them for
>>>>>>> VMInfo) but luckily that may have been sufficient to prevent the
>>>>>>> crash.
>>>>>>>
>>>>>>> However, I still do not like it from a design point of view for the
>>>>>>> other reasons stated in my e-mail.
>>>>>>>
>>>>>>> Bertrand.
>>>>>>>
>>>>>>>>
>>>>>>>> It seems that the focus has been on removing what was clearly
>>>>>>>> useless, not what might be dangerous due to how the method (and
>>>>>>>> what
>>>>>>>> it calls) has been designed.
>>>>>>>>
>>>>>>>> In addition, adding _vm_info_cmd is IMHO not sufficient unless you
>>>>>>>> pass that argument to all layers. This is the biggest design issue.
>>>>>>>> For instance all calls to print_on_error could have this additional
>>>>>>>> argument so that the different components can adapt their behavior
>>>>>>>> (and are aware there are now two use cases). I would start by
>>>>>>>> Threads::print_on_error, grabbing the threads lock in the info
>>>>>>>> case,
>>>>>>>> but the same if true for all methods called by VMError::report. I
>>>>>>>> would go as far as adding a comment in all print_on_error methods
>>>>>>>> stating why the method is safe for both use cases (e.g. why no
>>>>>>>> locks
>>>>>>>> are needed) or adding "assert(!vm_info_cmd)" for methods
>>>>>>>> VMError:report is supposed to skip.
>>>>>>>> The current design, where the various components may not be aware
>>>>>>>> that they are called for two different purposes, is not robust
>>>>>>>> enough.
>>>>>>>>
>>>>>>>> Note also that being 'robust enough for the "crash" case' is not a
>>>>>>>> strong enough guarantee for a general purpose info DCmd...
>>>>>>>> particularly if, as you point out, an error that occurs in one
>>>>>>>> of the
>>>>>>>> info STEP crashes the VM instead of just bringing us to the next
>>>>>>>> error reporting STEP. Now, as stated before, I could agree if we
>>>>>>>> use
>>>>>>>> this DCmd only when a crash is acceptable.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Bertrand
>>>>>>>>
>>>>>>>> On 02/11/2015 14:32, Thomas Stüfe wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> we have implemented in the SAP JVM the same mechanism. Our
>>>>>>>>> implementation is in use since a number of years and is quite
>>>>>>>>> useful. So I would agree with Coleen.
>>>>>>>>>
>>>>>>>>> However, there are some things to keep in mind when doing this:
>>>>>>>>>
>>>>>>>>> Error handling is different between the "info-only" and "crash"
>>>>>>>>> case. In
>>>>>>>>> "info-only" case, VMError::report() is called without enclosing
>>>>>>>>> VMError::report_and_die() and we run with the normal signal
>>>>>>>>> handler
>>>>>>>>> intact, not with the secondary signal handler. So, normal signal
>>>>>>>>> handling works, which is good. But it also means that any error
>>>>>>>>> inside one of the error reporting STEPs will bring down the VM,
>>>>>>>>> instead of just bringing us to the next error reporting STEP
>>>>>>>>> ("Error
>>>>>>>>> occurred during error reporting").
>>>>>>>>>
>>>>>>>>> The problem is that a number of error reporting steps are a bit
>>>>>>>>> risky and written with the assurance that the worst which can
>>>>>>>>> happen
>>>>>>>>> is that the STEP will be interrupted. These risky steps should not
>>>>>>>>> be executed in the "only-info" case.
>>>>>>>>>
>>>>>>>>> But the proposed change already does this, it introduces
>>>>>>>>> "_vm_info_cmd"
>>>>>>>>> and conditionally excludes dangerous or pointless error reporting
>>>>>>>>> STEPs (e.g. callstack is pointless).
>>>>>>>>>
>>>>>>>>> As far as I can see, the other concerns concern only the "crash"
>>>>>>>>> case and simply do not apply to the "info-only" case. If an error
>>>>>>>>> reporting step is robust enough for the "crash" case (low stack
>>>>>>>>> usage, not allocating memory to avoid deadlocks etc), it should be
>>>>>>>>> safe enough for the "info" case.
>>>>>>>>>
>>>>>>>>> Kind Regards, Thomas
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Nov 2, 2015 at 12:31 PM, Mattis Castegren
>>>>>>>>> <mattis.castegren at oracle.com <mailto:mattis.castegren at oracle.com>>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi
>>>>>>>>>
>>>>>>>>> Of course, when I wrote "Review and push this fix" I meant
>>>>>>>>> doing a
>>>>>>>>> thorough technical review. We do not want VM.info to be
>>>>>>>>> unsafe, and
>>>>>>>>> I will let David reply to the technical questions that has
>>>>>>>>> been
>>>>>>>>> raised. I know he did consider some of these issues, but last
>>>>>>>>> week
>>>>>>>>> he was on Java one, and this week he is out. I will let him
>>>>>>>>> comment
>>>>>>>>> once he gets back.
>>>>>>>>>
>>>>>>>>> Kind Regards
>>>>>>>>> /Mattis
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Bertrand Delsart
>>>>>>>>> Sent: den 2 november 2015 10:13
>>>>>>>>> To: David Holmes; Mattis Castegren; Coleen Phillimore;
>>>>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>>>>> <mailto:hotspot-runtime-dev at openjdk.java.net>
>>>>>>>>> Subject: Re: RFR: 8027429: Add diagnostic command VM.info
>>>>>>>>> to get
>>>>>>>>> hs_err print-out
>>>>>>>>>
>>>>>>>>> On 02/11/2015 02:45, David Holmes wrote:
>>>>>>>>> > On 30/10/2015 8:38 PM, Mattis Castegren wrote:
>>>>>>>>> >> Hi
>>>>>>>>> >>
>>>>>>>>> >> I agree that there may be things we could add to VM.info
>>>>>>>>> that we
>>>>>>>>> >> couldn't add in an hs_err file due to the fact that we
>>>>>>>>> are
>>>>>>>>> crashing.
>>>>>>>>> >> However, I really do believe that we should use the same
>>>>>>>>> code.
>>>>>>>>> The
>>>>>>>>> >> goal of the hs_err file is exactly the same as the
>>>>>>>>> goal of
>>>>>>>>> VM.info,
>>>>>>>>> >> to print as much information about the system as
>>>>>>>>> possible so
>>>>>>>>> that
>>>>>>>>> >> when we look at an hs_err file or the output from
>>>>>>>>> VM.info,
>>>>>>>>> we get as
>>>>>>>>> >> much information about the system as possible. That way
>>>>>>>>> we can
>>>>>>>>> reduce
>>>>>>>>> >> the need to go back and forth with the customer to ask
>>>>>>>>> for more
>>>>>>>>> information.
>>>>>>>>> >>
>>>>>>>>> >> If we enumerate what we need in VM.info, the list
>>>>>>>>> would look
>>>>>>>>> very
>>>>>>>>> >> much like what we already do in VMError, and therefore I
>>>>>>>>> think we
>>>>>>>>> >> should use the same code. If we find something that we
>>>>>>>>> would
>>>>>>>>> want,
>>>>>>>>> >> that is not currently printed in hs_err files, I think we
>>>>>>>>> have
>>>>>>>>> two options:
>>>>>>>>> >>
>>>>>>>>> >> 1) Add it to hs_err files as well. If it is
>>>>>>>>> interesting when
>>>>>>>>> >> gathering customer information before a crash, it
>>>>>>>>> probably is
>>>>>>>>> >> interesting after a crash
>>>>>>>>> >> 2) If it for some reason cannot be captured when
>>>>>>>>> crashing,
>>>>>>>>> add it to
>>>>>>>>> >> the code with the condition that we don't print it during
>>>>>>>>> crash
>>>>>>>>> time.
>>>>>>>>> >>
>>>>>>>>> >> I would therefore suggest the following plan of action
>>>>>>>>> >>
>>>>>>>>> >> 1) Review and push this fix to see if there are any
>>>>>>>>> technical issues
>>>>>>>>> >> with regards to thread safety, etc. I know David has had
>>>>>>>>> this in
>>>>>>>>> mind
>>>>>>>>> >> when doing the change. After that, we have a command that
>>>>>>>>> support can
>>>>>>>>> >> use.
>>>>>>>>> >> 2) Think about if there is any information missing, both
>>>>>>>>> from hs_err
>>>>>>>>> >> files and from VM.info. Here, we may want to ask support
>>>>>>>>> about if
>>>>>>>>> >> there is any additional information they need to ask for
>>>>>>>>> when they
>>>>>>>>> >> get this data. If possible, this information should be
>>>>>>>>> added
>>>>>>>>> to both
>>>>>>>>> >> hs_err files and VM.info, we don't want support to
>>>>>>>>> have to
>>>>>>>>> ask for
>>>>>>>>> >> VM.info if we already have an hs_err file. Of course, if
>>>>>>>>> there is
>>>>>>>>> >> anything Dev would want to see when analyzing crashes in
>>>>>>>>> testing, we
>>>>>>>>> >> should add that too.
>>>>>>>>> >
>>>>>>>>> > I have no issue with VMError and VM.Info using the same
>>>>>>>>> code
>>>>>>>>> (where
>>>>>>>>> > appropriate), but I think it wrong to try and implement
>>>>>>>>> VM.Info
>>>>>>>>> > directly using the actual VMError mechanism. It is
>>>>>>>>> crude and
>>>>>>>>> makes it
>>>>>>>>> > difficult to make independent changes to either facility.
>>>>>>>>> >
>>>>>>>>> > Sorry.
>>>>>>>>> >
>>>>>>>>> > David
>>>>>>>>>
>>>>>>>>> Furthermore, we cannot just "Review and push this fix to
>>>>>>>>> see if
>>>>>>>>> there are any technical issues". Two reviewers have already
>>>>>>>>> stated
>>>>>>>>> that there are issues and these issues must be addressed
>>>>>>>>> before
>>>>>>>>> pushing.
>>>>>>>>>
>>>>>>>>> The answer cannot just be that we need this feature, it must
>>>>>>>>> address
>>>>>>>>> the technical issues that were reported.
>>>>>>>>>
>>>>>>>>> One possible answer is that we all agree that VMInfo would be
>>>>>>>>> unsafe.
>>>>>>>>> The method would have to be renamed UnsafeVMInfo and it
>>>>>>>>> must be
>>>>>>>>> clear that its 'impact' would be very risky, potentially
>>>>>>>>> crashing
>>>>>>>>> the JVM.
>>>>>>>>>
>>>>>>>>> If this is sufficient for your needs then we could agree with
>>>>>>>>> it. My
>>>>>>>>> only additional requirement would be to clearly mark that the
>>>>>>>>> method
>>>>>>>>> has been called so that we do not waste time on non-issues if
>>>>>>>>> calling VMInfo leads to a crash (or, even worse but less
>>>>>>>>> likely, to
>>>>>>>>> a silent data corruption that may lead to problems later).
>>>>>>>>> The bare
>>>>>>>>> minimum would be to double check that a crash during an
>>>>>>>>> execution of
>>>>>>>>> VM.info is clearly identifiable (this is worth double
>>>>>>>>> checking
>>>>>>>>> because of the use of static buffers for the error reporting,
>>>>>>>>> see
>>>>>>>>> below).
>>>>>>>>>
>>>>>>>>> If an 'unsafe' VMInfo DCmd is not sufficient, you should
>>>>>>>>> wonder
>>>>>>>>> whether a safe version of it is still a "small enhancement".
>>>>>>>>> A JEP
>>>>>>>>> may be required.
>>>>>>>>>
>>>>>>>>> VMError::report and the methods it calls (print_on_error)
>>>>>>>>> have been
>>>>>>>>> written with assumptions clearly stated in the comments above
>>>>>>>>> VMError::report (see below the two most important ones).
>>>>>>>>> HotSpot
>>>>>>>>> error reporting strategy differed from JRockit. This explains
>>>>>>>>> why
>>>>>>>>> the JRockit approach cannot be applied as is.
>>>>>>>>>
>>>>>>>>> I agree with David that you can try leveraging some of the
>>>>>>>>> logic
>>>>>>>>> from VMError::report. However, to be sure there is no
>>>>>>>>> misunderstanding, let's be clearer about David's "(where
>>>>>>>>> appropriate)": we would for instance have to review all the
>>>>>>>>> called
>>>>>>>>> methods, check which ones purposefully ignore locking
>>>>>>>>> (potentially
>>>>>>>>> crashing) and provide alternate code for the unsafe ones
>>>>>>>>> (which may
>>>>>>>>> not be trivial in some cases). As a simple example, a method
>>>>>>>>> called
>>>>>>>>> "print_on_error" should not be called when we are not in an
>>>>>>>>> erroneous condition. However, in some components of the
>>>>>>>>> JVM, the
>>>>>>>>> low
>>>>>>>>> level implementations of 'print_on_error' and
>>>>>>>>> 'print_info_safely'
>>>>>>>>> could share the same code.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Bertrand.
>>>>>>>>>
>>>>>>>>> Here are the important comments about VMError::report
>>>>>>>>> assumptions:
>>>>>>>>>
>>>>>>>>> // In general, a hang in error handler is much worse than a
>>>>>>>>> crash or
>>>>>>>>> internal
>>>>>>>>> // error, as it's harder to recover from a hang. Deadlock can
>>>>>>>>> happen
>>>>>>>>> if we
>>>>>>>>> // try to grab a lock that is already owned by current
>>>>>>>>> thread,
>>>>>>>>> or if the
>>>>>>>>> // owner is blocked forever (e.g. in
>>>>>>>>> os::infinite_sleep()). If
>>>>>>>>> possible, the
>>>>>>>>> // error handler and all the functions it called should avoid
>>>>>>>>> grabbing any
>>>>>>>>> // lock. An important thing to notice is that memory
>>>>>>>>> allocation
>>>>>>>>> needs a
>>>>>>>>> lock.
>>>>>>>>> //
>>>>>>>>> // We should avoid using large stack allocated buffers. Many
>>>>>>>>> errors
>>>>>>>>> happen
>>>>>>>>> // when stack space is already low. Making things even
>>>>>>>>> worse is
>>>>>>>>> that
>>>>>>>>> there
>>>>>>>>> // could be nested report_and_die() calls on stack (see
>>>>>>>>> above).
>>>>>>>>> Only one
>>>>>>>>> // thread can report error, so large buffers are statically
>>>>>>>>> allocated in
>>>>>>>>> data
>>>>>>>>> // segment.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> >> Kind Regards
>>>>>>>>> >> /Mattis
>>>>>>>>> >>
>>>>>>>>> >> -----Original Message-----
>>>>>>>>> >> From: David Holmes
>>>>>>>>> >> Sent: den 29 oktober 2015 23:20
>>>>>>>>> >> To: Mattis Castegren; Coleen Phillimore;
>>>>>>>>> >> hotspot-runtime-dev at openjdk.java.net
>>>>>>>>> <mailto:hotspot-runtime-dev at openjdk.java.net>
>>>>>>>>> >> Subject: Re: RFR: 8027429: Add diagnostic command
>>>>>>>>> VM.info to
>>>>>>>>> get
>>>>>>>>> >> hs_err print-out
>>>>>>>>> >>
>>>>>>>>> >> On 29/10/2015 11:36 PM, Mattis Castegren wrote:
>>>>>>>>> >>> Some background: We have this command in JRockit. The
>>>>>>>>> information you
>>>>>>>>> >>> gather when you crash to give a good summary of what
>>>>>>>>> system
>>>>>>>>> you run
>>>>>>>>> >>> on is pretty much exactly the information you need to
>>>>>>>>> get a
>>>>>>>>> good
>>>>>>>>> >>> summary on a system that has not crashed. The JRockit
>>>>>>>>> command is
>>>>>>>>> >>> extremely useful for support, and saves a lot of work
>>>>>>>>> going
>>>>>>>>> back and
>>>>>>>>> >>> forth asking about system information.
>>>>>>>>> >>>
>>>>>>>>> >>> Also, if we write something new in the hs_err file,
>>>>>>>>> like if
>>>>>>>>> there has
>>>>>>>>> >>> been any out of memory errors, we often would want
>>>>>>>>> the same
>>>>>>>>> >>> information in the VM.info output. From my experience in
>>>>>>>>> >>> Sustaining/Support, I can't think of any information you
>>>>>>>>> would want
>>>>>>>>> >>> in VM.info that you wouldn't also want in the hs_err
>>>>>>>>> file
>>>>>>>>> and the
>>>>>>>>> >>> other way around, apart from details about the crash
>>>>>>>>> (obviously).
>>>>>>>>> >>>
>>>>>>>>> >>> I don't see a reason to exactly enumerate what
>>>>>>>>> information
>>>>>>>>> VM.info
>>>>>>>>> >>> should provide. From a sustaining/support
>>>>>>>>> perspective, we
>>>>>>>>> want a
>>>>>>>>> >>> one-stop command to gather as much useful information as
>>>>>>>>> possible,
>>>>>>>>> >>> which is the same idea we have for the hs_err file.
>>>>>>>>> >>
>>>>>>>>> >> The reason to enumerate what is required is to see where
>>>>>>>>> that
>>>>>>>>> >> information already exists and can be collected from. The
>>>>>>>>> VMError
>>>>>>>>> >> report has to be very careful about what it does and
>>>>>>>>> how it
>>>>>>>>> does it
>>>>>>>>> >> because of the fact we may have crashed and the general
>>>>>>>>> process
>>>>>>>>> state
>>>>>>>>> >> is indeterminate. A Dcmd can simply gather whatever
>>>>>>>>> information is
>>>>>>>>> >> required from the available sources in whatever order or
>>>>>>>>> format
>>>>>>>>> desired.
>>>>>>>>> >>
>>>>>>>>> >> I have no problem with this command, just how it has been
>>>>>>>>> proposed to
>>>>>>>>> >> implement it.
>>>>>>>>> >>
>>>>>>>>> >> David
>>>>>>>>> >>
>>>>>>>>> >>> Kind Regards
>>>>>>>>> >>> /Mattis
>>>>>>>>> >>>
>>>>>>>>> >>> -----Original Message-----
>>>>>>>>> >>> From: David Holmes
>>>>>>>>> >>> Sent: den 29 oktober 2015 13:08
>>>>>>>>> >>> To: Coleen Phillimore;
>>>>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>>>>> <mailto:hotspot-runtime-dev at openjdk.java.net>
>>>>>>>>> >>> Subject: Re: RFR: 8027429: Add diagnostic command
>>>>>>>>> VM.info
>>>>>>>>> to get
>>>>>>>>> >>> hs_err print-out
>>>>>>>>> >>>
>>>>>>>>> >>> On 29/10/2015 10:02 PM, Coleen Phillimore wrote:
>>>>>>>>> >>>>
>>>>>>>>> >>>>
>>>>>>>>> >>>> On 10/29/15 7:58 AM, David Holmes wrote:
>>>>>>>>> >>>>> On 29/10/2015 9:26 PM, Coleen Phillimore wrote:
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>> So I actually disagree. I don't think there should
>>>>>>>>> be an
>>>>>>>>> >>>>>> additional separate mechanism to get the same
>>>>>>>>> information
>>>>>>>>> that we
>>>>>>>>> >>>>>> get with hs_err reporting. I've wanted to have a
>>>>>>>>> feature like
>>>>>>>>> >>>>>> this for a long time.
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>> I pre-reviewed this change and I thought it looked
>>>>>>>>> good in
>>>>>>>>> general.
>>>>>>>>> >>>>>> I didn't see the thread iteration problem that Fred
>>>>>>>>> refers to
>>>>>>>>> >>>>>> below, but I think the individual problems can be
>>>>>>>>> fixed.
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>> The last thing I want is this code to be copied
>>>>>>>>> somewhere else.
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> Factored as needed not copied. VMError is not an
>>>>>>>>> "info"
>>>>>>>>> reporting
>>>>>>>>> >>>>> mechanism.
>>>>>>>>> >>>>
>>>>>>>>> >>>> If you look at the things that are reported in each
>>>>>>>>> "STEP",
>>>>>>>>> there's a
>>>>>>>>> >>>> small amount of code and the order is important.
>>>>>>>>> >>>
>>>>>>>>> >>> The order is less important in an "info" request I would
>>>>>>>>> think.
>>>>>>>>> >>>
>>>>>>>>> >>>> I'd like the vm info to use the same order and report
>>>>>>>>> what it
>>>>>>>>> can do
>>>>>>>>> >>>> safely. Refactoring 5 lines of code into functions
>>>>>>>>> doesn't
>>>>>>>>> make
>>>>>>>>> >>>> sense.
>>>>>>>>> >>>
>>>>>>>>> >>> I need to consider exactly what it is the "info"
>>>>>>>>> needs to
>>>>>>>>> report in
>>>>>>>>> >>> more detail. There are existing facilities (system
>>>>>>>>> properties,
>>>>>>>>> >>> management
>>>>>>>>> >>> APIs) for various bits of runtime information, which
>>>>>>>>> VMError can't
>>>>>>>>> >>> utilize but a Dcmd can.
>>>>>>>>> >>>
>>>>>>>>> >>> David
>>>>>>>>> >>>
>>>>>>>>> >>>> Coleen
>>>>>>>>> >>>>
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> David
>>>>>>>>> >>>>>
>>>>>>>>> >>>>>> Coleen
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>> On 10/28/15 8:48 PM, David Holmes wrote:
>>>>>>>>> >>>>>>> I agree with Fred, this kind of info reporting
>>>>>>>>> should
>>>>>>>>> not be
>>>>>>>>> >>>>>>> piggy-backed onto VMError handling for the reasons
>>>>>>>>> stated
>>>>>>>>> (and the
>>>>>>>>> >>>>>>> error handling logic is complicated enough as it
>>>>>>>>> is!). For
>>>>>>>>> things
>>>>>>>>> >>>>>>> like thread lists there are already safe management
>>>>>>>>> functions that
>>>>>>>>> >>>>>>> can/should be used.
>>>>>>>>> >>>>>>>
>>>>>>>>> >>>>>>> Thanks,
>>>>>>>>> >>>>>>> David H.
>>>>>>>>> >>>>>>>
>>>>>>>>> >>>>>>> On 29/10/2015 3:29 AM, Frederic Parain wrote:
>>>>>>>>> >>>>>>>> Hi David,
>>>>>>>>> >>>>>>>>
>>>>>>>>> >>>>>>>> I haven't review all the code yet, but I'm
>>>>>>>>> concerned
>>>>>>>>> with the
>>>>>>>>> >>>>>>>> fact that the diagnostic command is calling
>>>>>>>>> VMError::report().
>>>>>>>>> >>>>>>>> This method has been implemented to be executed
>>>>>>>>> in the
>>>>>>>>> particular
>>>>>>>>> >>>>>>>> context of fatal errors, and its usage while the
>>>>>>>>> VM is
>>>>>>>>> running
>>>>>>>>> >>>>>>>> normally seems dangerous.
>>>>>>>>> >>>>>>>>
>>>>>>>>> >>>>>>>> For instance, VMError::report() consciously avoids
>>>>>>>>> grabbing locks
>>>>>>>>> >>>>>>>> because of the risk of deadlock during the error
>>>>>>>>> reporting.
>>>>>>>>> >>>>>>>> The consequence is that some data structures are
>>>>>>>>> browsed in an
>>>>>>>>> >>>>>>>> unsafe way. One example: VMError::report() calls
>>>>>>>>> >>>>>>>> Threads::print_on_error() which iterates over the
>>>>>>>>> thread list
>>>>>>>>> >>>>>>>> *without owning the Threads_lock*.
>>>>>>>>> >>>>>>>>
>>>>>>>>> >>>>>>>> The implementation of the diagnostic command seems
>>>>>>>>> also to
>>>>>>>>> >>>>>>>> exclude a lot of reporting from the initial
>>>>>>>>> VMError::report()
>>>>>>>>> >>>>>>>> method. Have you considered implementing a new
>>>>>>>>> MT-safe
>>>>>>>>> reporting
>>>>>>>>> >>>>>>>> method rather than trying to modify the special
>>>>>>>>> VMError::report()
>>>>>>>>> >>>>>>>> methods? (Note that some code factorization between
>>>>>>>>> >>>>>>>> VMError::report() and this new method should be
>>>>>>>>> possible).
>>>>>>>>> >>>>>>>>
>>>>>>>>> >>>>>>>> Thanks,
>>>>>>>>> >>>>>>>>
>>>>>>>>> >>>>>>>> Fred
>>>>>>>>> >>>>>>>>
>>>>>>>>> >>>>>>>> On 28/10/2015 17:18, david buck wrote:
>>>>>>>>> >>>>>>>>> Hi!
>>>>>>>>> >>>>>>>>>
>>>>>>>>> >>>>>>>>> Please review my change for this small
>>>>>>>>> enhancement.
>>>>>>>>> >>>>>>>>>
>>>>>>>>> >>>>>>>>> bug:
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8027429
>>>>>>>>> >>>>>>>>> webrev:
>>>>>>>>> http://cr.openjdk.java.net/~dbuck/8027429_jdk9_01/
>>>>>>>>> >>>>>>>>>
>>>>>>>>> >>>>>>>>> Cheers,
>>>>>>>>> >>>>>>>>> -Buck
>>>>>>>>> >>>>>>>>
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Bertrand Delsart, Grenoble Engineering
>>>>>>>>> Center
>>>>>>>>> Oracle, 180 av. de l'Europe, ZIRST de
>>>>>>>>> Montbonnot
>>>>>>>>> 38330 Montbonnot Saint Martin, FRANCE
>>>>>>>>> bertrand.delsart at oracle.com
>>>>>>>>> <mailto:bertrand.delsart at oracle.com>
>>>>>>>>> Phone : +33 4 76 18 81 23
>>>>>>>>>
>>>>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>>>> NOTICE: This email message is for the sole use of the
>>>>>>>>> intended
>>>>>>>>> recipient(s) and may contain confidential and privileged
>>>>>>>>> information. Any unauthorized review, use, disclosure or
>>>>>>>>> distribution is prohibited. If you are not the intended
>>>>>>>>> recipient,
>>>>>>>>> please contact the sender by reply email and destroy all
>>>>>>>>> copies of
>>>>>>>>> the original message.
>>>>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>
>
More information about the hotspot-runtime-dev
mailing list