RFR (trivial): 8250826: jhsdb does not work with coredump which comes from Substrate VM

Sun Aug 2 00:20:03 UTC 2020

Hi Chris,

Thanks for your comment!
I pushed new change to submit repo, but the build failed on macOS. Could you share details?
(I do not have Mac)

   commit: http://hg.openjdk.java.net/jdk/submit/rev/0eb1c497f297
   job: mach5-one-ysuenaga-JDK-8250826-1-20200801-1407-13098989

On 2020/08/01 13:06, Chris Plummer wrote:
> On 7/30/20 6:18 PM, Yasumasa Suenaga wrote:
>> Hi Chris,
>>
>> On 2020/07/31 7:29, Chris Plummer wrote:
>>> Hi Yasumasa,
>>>
>>> If I understand correctly we first call add_map_info() for all the PT_LOAD segments in the core file. We then process all the library segments, calling add_map_info() for them if the target_vaddr has not already been addded. If has already been added, which I assume is the case for any library segment that is already in the core file, then the core file version is replaced the the library version.  I'm a little unclear of the purpose of this replacing of the core PT_LOAD segments with those found in the libraries. If you could explain this that would help me understand your change.
>>
>> Read only segments in ELF should not be any different from PT_LOAD segments in the core.
>> And head of ELF header might be included in coredump (See JDK-7133122). Thus we need to replace PT_LOAD segments the library version.
> Ok. The code in the area really should have been commented better when first written. The purpose is not understandable simply by reading the code.

I added some comments to existing code. Please tell me if it is insufficient.

>>> I'm also unsure why existing_map->fd would ever be something other than the core file. Why would another library map the same target_vaddr.
>>
>> When mmap() is called to read-only ELF segments / sections, Linux kernel seems to allocate other memory segments which has same top virtual memory address. I've not yet found out from the code of Linux kernel, but I confirmed this behavior on GDB.
> Ok. Same comment as above. This should have been explained with comments in the code.

Added some comments.

> As for your fix, if I understand correctly the issue is that a single segment in the library is being split into two segments in the process (and therefore in the core file) due to an mprotect being done on part of the segment. Because of this the segment size in the library does match the segment size in the core file. So with your fix the library segment is used, but what about the other half of the segment that is in the core file? Don't we now have overlapping segments; the full original segment from the library, and then a second segment that overlaps the tail end of the library segment? Will that cause any confusion later on?

As long as vaddr is valid, it doesn't matter even if it overlaps because SA would sort the map with vaddr, and would lookup with it.
In Substrate VM, there are RO and RW sections in that order, so it is ok with webrev.00 . However it might not be appropriate because RW section might be top of PT_LOAD.

To make it more generalized, I changed it to the commit on submit repo.
It would check access flags between in coredump and in binary. If they are different, we respect current (loaded from coredump) map because it might be changed at runtime.

The change for LabsJDK 11 is more simple because JDK 11 does not have ps_core_common.c .
So I share you it. It may help you:

   http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/JDK-8250826-labsjdk11-0.patch

Thanks,

Yasumasa

> thanks,
> 
> Chris
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 7/30/20 1:18 PM, Chris Plummer wrote:
>>>> Hi Yasumasa,
>>>>
>>>> I'm reviewing this RFR, and I'd like to ask that it not be pushed as trivial. Although it is just a one line change, it takes an extensive knowledge to understand the impact. I'll read up on the filed graal issue and try to understand the ELF code a bit better.
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 7/30/20 6:45 AM, Yasumasa Suenaga wrote:
>>>>> Hi all,
>>>>>
>>>>> Please review this trivial change:
>>>>>
>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8250826
>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.00/
>>>>>
>>>>> I played Truffle NFI on GraalVM, but I cannot get Java stacks from coredump via jhsdb.
>>>>>
>>>>> I've reported this issue to GraalVM community [1], and I 've found out the cause of this issue is .svm_heap would be separated to RO and RW areas by mprotect() calls in run time in spite of .svm_heap is RO section in ELF (please see [1] for details).
>>>>>
>>>>> It is corner case, but we will see same problem on jhsdb when we attempt to analyze coredump which comes from some applications / libraries which would separate RO sections in ELF like Substrate VM.
>>>>>
>>>>> I sent PR to fix libsaproc.so in LabsJDK 11 for this issue [2], then community members suggested me to discuss in serviceability-dev.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> [1] https://github.com/oracle/graal/issues/2579
>>>>> [2] https://github.com/graalvm/labs-openjdk-11/pull/9
>>>>
>>>>
>>>
>>>
> 
>