Project to improve hs_err files

Mattis Castegren mattis.castegren at oracle.com
Tue Feb 18 09:52:25 PST 2014


Hi

This request came in from the compiler team. Nils, I believe you brought it up. Do you have any examples of when getting register information from an assert would be useful?

We rarely get bugs filed on debug builds in Sustaining, so I don't have much of an opinion here

Kind Regards
/Mattis

-----Original Message-----
From: David Holmes 
Sent: den 18 februari 2014 05:45
To: Mattis Castegren; hotspot-dev at openjdk.java.net
Cc: David Buck; Roger Calnan
Subject: Re: Project to improve hs_err files

Hi Mattis,

As I wrote in 8035084 re gathering register info on assert/guarantee 
failures:

"It is not at all clear to me that this is needed or even that useful. 
When an assert or guarantee fails you know exactly why. If you need to 
then go and examine other state then a core dump is the best. Anything 
you do to try and capture the register information will modify some of 
the registers - and that could be more confusing than not having any 
register information. It may also be that the interesting register 
values have already been overwritten by the time the assert/guarantee 
actually fires. What you really want is the state of the registers 
_before_ the assert/guarantee is evaluated."

Cheers,
David

On 18/02/2014 3:25 AM, Mattis Castegren wrote:
> Hi
>
> Thanks for the comments on this thread lately. I have added all comments to my tracking page, and I have filed bugs for the new suggestions.
>
> Now that JDK8 is all but done, I would like to get this project moving again. However, before I ask someone in my team to start working on these bugs, I would like to make one last round on the mailing list to see if anyone has a strong opinion against any of the feature requests (all labeled with hs_err_improvements):
>
> https://bugs.openjdk.java.net/issues/?jql=labels%20%3D%20hs_err_improvements%20and%20resolution%20is%20empty
>
> I expect there to be some discussions about the robustness of the implementation, but that can be handled in the code reviews. What I want to know is if anyone has any larger objections that we should sort out before we even start implementation.
>
> The biggest change is https://bugs.openjdk.java.net/browse/JDK-8026324 - Add summary section to hs_err file
>
> Overall, the feedback I have got on this feature has been positive, but the last time I asked, everyone was busy with JDK8 Zero Bug Bounce, so I thought it best to ask one last time.
>
> We plan to start working on this sometime next week
>
> Kind Regards
> /Mattis
>
> PS: Still gathering suggestions, so send them if you have them.
>
>
> -----Original Message-----
> From: John Rose
> Sent: den 12 februari 2014 20:15
> To: Mattis Castegren
> Cc: hotspot-dev at openjdk.java.net
> Subject: Re: Project to improve hs_err files
>
> The hs_err file has grown to include lots of handy information.  I agree that it would be reasonable to add more, and I'm really glad that you are thinking about it at this level of detail.  This is especially good as you are an experienced consumer of these files.
>
> The typical size of such a file is currently about 40kb.  As long as the most useful information is kept near the top, there is (IMO) room for this file to grow 2x or more in typical size.
>
> Some of the configuration information you mention may be present at the top of the hotspot.log file, before the big <tty> element.  It might be fruitful to ensure that such preamble information is always captured at startup, and dumped into the log file.
>
> I don't think dump-time disassembly is practical, since we don't have an engine bundled in the JVM, but we should make it possible with post-processing to get a good disassembly from hex dumps in the error dump file.  This has been done before; perhaps it needs reviving or refinement.
>
> Here's another thought, along the lines of symbolic disassembly, but for data rather than code:
>
> One thing I would like to see more of is memory contents, along with a way to interpret their meaning.  The memory blocks around the current PC and SP are supplied.  It might be worthwhile dumping additional memory blocks one or two indirections away from the (apparent) pointers in those initial memory blocks.  I often wonder, "is that the object I care about?" when looking at those memory dumps.  I am guessing that there is a cheap, robust way to put more clues into the dump, without getting entangled in object parsing (which as David points out could cause further crashing).  Perhaps there is a way to classify data words in a post-processing tool, like we can pull out disassembled code.  At least, we can observe whether an apparent pointer refers into a live part of the heap (assuming we have the right few words of heap boundary info).
>
> We could also (maybe) identify Klass pointers in the headers of objects and output a little bit of data in the crash log to make it possible to identify the (apparent) classes of (apparent) object pointers in the regions dumped.  At least the values of well-known classes (in SystemDictionary::_something), if they occur as the (apparent) classes of hex dump addresses, could be supplied as an extra hint.  Clearly this could scale beyond the reasonable size of a crash dump, so some sort of size limit would need to be applied.  (The size limit could be set to zero, or the log file section removed, if customers are nervous about memory dumps.)  I think there is scope for tasteful engineering here, especially if we push fancy formatting work into a post-pass tool.
>
> Perhaps there is a way to join hands with the SA (serviceability agent) infrastructure, and run a tiny SA instance out of a relatively limited supply of hex dump from the crash file, instead of out of the full picture supplied by the core file or a live process.  It's at least an interesting thought experiment.
>
> Please keep up this good work!
>
> - John
>
> On Sep 9, 2013, at 10:38 AM, Mattis Castegren <mattis.castegren at oracle.com> wrote:
>
>> Hi. I sent this email to serviceability and runtime, but I got a request to forward the mail to all of hotspot dev as hs_err files affect all areas. Please let me know if you have any feedback. Don't worry about whether the suggestions are feasible or not, that will come in a second step.
>

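As a rough illustration of John's idea of classifying data words in a post-processing tool, here is a minimal Python sketch. It is not an existing tool. The dump line shape (an address, a colon, then 16-digit hex words), the heap bounds, and the names parse_words and classify are all assumptions made for the example; a real tool would have to read the actual layout and heap boundaries from the hs_err file it is given.

```python
import re

# Assumed line shape (hypothetical, modeled loosely on a stack dump):
#   0x00007f0000001000:   00000000e0001238 0000000000000000
DUMP_LINE = re.compile(r"^(0x[0-9a-fA-F]+):\s+((?:[0-9a-fA-F]{16}\s*)+)$")


def parse_words(dump_text):
    """Yield (address, value) for each 64-bit word found in dump lines."""
    for line in dump_text.splitlines():
        m = DUMP_LINE.match(line.strip())
        if not m:
            continue  # skip headers and anything that is not a dump line
        base = int(m.group(1), 16)
        for i, word in enumerate(m.group(2).split()):
            yield base + 8 * i, int(word, 16)


def classify(value, heap_start, heap_end, alignment=8):
    """Give a cheap hint about a word; never dereferences anything."""
    if value == 0:
        return "null"
    if heap_start <= value < heap_end:
        if value % alignment == 0:
            return "apparent heap pointer"
        return "in heap range but misaligned"
    return "other"


if __name__ == "__main__":
    sample = (
        "Top of Stack: (sp=0x00007f0000001000)\n"
        "0x00007f0000001000:   00000000e0001238 0000000000000000\n"
    )
    # Hypothetical heap bounds; a real tool would take them from the log.
    for addr, value in parse_words(sample):
        print(hex(addr), hex(value), classify(value, 0xE0000000, 0xF0000000))
```

Because the classifier only compares numbers against the recorded heap bounds, it cannot crash on a corrupt heap the way in-process object parsing could, which is the point of pushing this work into a post-pass.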
