PING: RFR: 8233373: hs_err should report the location of JFR files

Markus Gronlund markus.gronlund at oracle.com
Wed Jan 22 12:05:18 UTC 2020


Hi Yasumasa,

There is a problem with the current suggestion in that _dump_path is being used to represent two different concepts.

JFR emergency dump support consist of two stages:
Stage 1. The running recording(s) needs to be finalized by having an available chunk to back the chunkwriter (emergency_chunk_path()).
Stage 2. The information in the repository (if it exists) needs to be concatenated into a single, emergency JFR recording file (emergency_dump_path()).

Stage 1 (chunk file for finalization):
Mode 1: Currently running a disk recording, the chunkwriter has a valid fd mapped onto a chunk file in the repository (a chunk file has the name like repository/2020_01_22_11_09_00.jfr). It will use it as-is to finalize the recording(s).
Mode 2: Currently running in-memory recording(s), and the repository has not yet been created (lazy creation on first disk recording). There exist an optimization path here: the chunkwriter can take the emergency_dump_path() from stage 2 directly to finalize the chunk and hence the full JFR recording file. All done, stage 2 is nop.
Mode 3: Currently running an in-memory recording(s), and there exist a repository. Again, the chunkwriter needs a chunk file in the repository for finalization. The _dump_path now created points here into the repository. This means that in stage 2, the code will attempt to concatenate itself as part of creating the emergency JFR recording file from the chunks in the repository. In addition, the _dump_path now represents a chunk in the repository, not a proper path as represented in cwd or in the dump path settings. 
Mode 4: The existing disk recording was stopped just before emergency dump logic got exclusive access. Same problem as 3.

Stage 2 (emergency JFR recording file):
Creates a path and a file descriptor for the emergency JFR recording file (hs_err_pidXXXX.jfr), it considers the crash / OOM reason in building up the file name, usually in cwd but we want to change this to honor if an explicit filepath setting exist moving forward. A file iterator over the repository (if it exist) is created and the existing chunks in the repository is concatenated to create the recording file.

So, the _dump_path is being used to represent both emergency_chunk_path() and emergency_dump_path() which will lead to the problem described above.

Another aspect to consider is that we want to move the Jfr::on_vm_shutdown() to after the error reporting has completed. This will change the requirement on how and when the paths and file descriptors are constructed, leading to a situation more akin to how the core file (.core / .mdmp) reporting / creation is done. I think we should try to emulate how the core file reporting / creation is done, but without having to report back an explicit status onto the VMError instance. I also think, that if we emulate the core file creation, this could also open a way to only have a single reporting location (line 645, removing the _verbose flag) and not having to try to fit in the reporting at lines 1603- .

There are also minor aspects to consider in your patch, for example if you change to use a statically allocated array you need to remove the ResourceMark’s and associated comments. I would also remove the get_ prefix in the accessors.

We might also want to think about how to represent this suggested output:

#
# JFR data is saved as:
#   D:\temp\SPECjbb2005\SPECjbb2005\hs_err_pid23364.jfr

I would suggest to again emulate the core file creation, perhaps:

#
# JFR recording file will be written. Location: D:\temp\SPECjbb2005\SPECjbb2005\hs_err_pid23364.jfr
#



Thanks
Markus

-----Original Message-----
From: Yasumasa Suenaga <suenaga at oss.nttdata.com> 
Sent: den 22 januari 2020 06:44
To: hotspot-jfr-dev at openjdk.java.net
Cc: Daniel Daugherty <daniel.daugherty at oracle.com>; yasuenag at gmail.com; hotspot-runtime-dev at openjdk.java.net
Subject: Re: PING: RFR: 8233373: hs_err should report the location of JFR files

Thanks Coleen and Dan!

I uploaded new webrev which includes the fix of comment.

   http://cr.openjdk.java.net/~ysuenaga/JDK-8233373/webrev.02/

Can I get reviewer(s) from JFR folks?


Yasumasa


On 2020/01/22 5:39, Daniel D. Daugherty wrote:
> Adding the JFR folks to this review thread...
> 
> It would be good if one JFR person chimed in here...
> 
> Dan
> 
> 
> On 1/21/20 1:21 PM, coleen.phillimore at oracle.com wrote:
>>
>> This code looks fine to me.
>>
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233373/webrev.01/src/hotspo
>> t/share/utilities/vmError.cpp.udiff.html
>>
>> Small nit.  Can you change:
>>
>> + const char *path = NULL;
>> +
>> + path = Jfr::get_emergency_dump_path();
>>
>> to one line:
>>
>> + const char *path =Jfr::get_emergency_dump_path();
>>
>>
>> Thanks,
>> Coleen
>>
>>
>>
>>
>> On 1/18/20 9:19 PM, Yasumasa Suenaga wrote:
>>> PING: Could you review it?
>>>
>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8233373
>>>>    webrev: 
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233373/webrev.01/
>>>
>>> I believe this enhancement helps troubleshooters for postmortem analysis with JFR.
>>>
>>>
>>> Yasumasa
>>>
>>>
>>> On 2019/12/13 11:10, Yasumasa Suenaga wrote:
>>>> (Add hotspot-runtime-dev)
>>>>
>>>> PING: Could you review it?
>>>>
>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8233373
>>>>    webrev: 
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233373/webrev.01/
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2019/11/29 23:37, Yasumasa Suenaga wrote:
>>>>> Hi Markus,
>>>>>
>>>>> How about this?
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233373/webrev.01/
>>>>>
>>>>> I do no agree to pass outputStream instance to JFR implementation 
>>>>> in HotSpot because we need to consider '#' and indents.
>>>>> I think it might be hard to maintain if "#" and indents would be printed in different place.
>>>>>
>>>>> So I want to add getters for emergency dump path and repository path to Jfr class.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2019/11/29 19:26, Markus Gronlund wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I have not looked at all the details yet, but from a high level:
>>>>>>
>>>>>> You should try to keep JFR specific logic encapsulated in JFR code as much as possible. jfrRepository.hpp and jfrEmergencyDump.hpp are internal modules, so they should not leak out into vmError.cpp.
>>>>>> The jfr.hpp is the external interface for the VM to interact directly with JFR. I would hope that there is only one invocation needed in vmError.cpp, something like JFR_ONLY(Jfr::on_error_reporting(outputStream* out);).
>>>>>> If there are multiple callouts needed to coordinate the layout, then you might need multiple entries in jfr.hpp, but please try to keep them to a minimum.  Pure textual output, such as aligning spaces and # can be printed directly in vmError.cpp of course.
>>>>>>
>>>>>> Thanks
>>>>>> Markus
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Yasumasa Suenaga <suenaga at oss.nttdata.com>
>>>>>> Sent: den 19 november 2019 02:32
>>>>>> To: hotspot-jfr-dev at openjdk.java.net
>>>>>> Cc: yasuenag at gmail.com
>>>>>> Subject: PING: RFR: 8233373: hs_err should report the location of 
>>>>>> JFR files
>>>>>>
>>>>>> PING: Could you review?
>>>>>>
>>>>>>>     JBS: https://bugs.openjdk.java.net/browse/JDK-8233373
>>>>>>>     webrev: 
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233373/webrev.00/
>>>>>>
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2019/11/01 13:38, Yasumasa Suenaga wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Please review this change:
>>>>>>>
>>>>>>>     JBS: https://bugs.openjdk.java.net/browse/JDK-8233373
>>>>>>>     webrev: 
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233373/webrev.00/
>>>>>>>
>>>>>>> Since JFR Event Streaming (JEP 349), JFR file in the repository would keep most recently data.
>>>>>>> It is very useful for postmortem analysis. So it helps JVM engineer to acquire them from the user if hs_err log points it.
>>>>>>>
>>>>>>>
>>>>>>> Examples:
>>>>>>>
>>>>>>> * Show JFR repository path:
>>>>>>> ```
>>>>>>> # JFR files might be saved in the repository:
>>>>>>> # /tmp/2019_11_01_11_09_30_3529
>>>>>>> ```
>>>>>>>
>>>>>>> * Show emergency dump path
>>>>>>> ```
>>>>>>> # JFR data is saved as:
>>>>>>> # /home/ysuenaga/github/garakuta/NativeSEGV/hs_err_pid3807.jfr
>>>>>>> ```
>>>>>>>
>>>>>>>
>>>>>>> This webrev passed all tests on submit repo 
>>>>>>> (mach5-one-ysuenaga-JDK-8233373-20191101-0248-6329106)
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>
> 


More information about the hotspot-runtime-dev mailing list