RFR: 8250826: jhsdb does not work with coredump which comes from Substrate VM

Chris Plummer chris.plummer at oracle.com
Mon Aug 3 21:41:00 UTC 2020


Hi Yasumasa,

Your updated fix resulted in using the core file map whereas the 
original fix used the library map. In both cases the assert is avoided, 
which I think is the main goal. Does it matter which map is used?

   42 #ifndef PF_R
   43 #define PF_R 0x4
   44 #endif

  156   if ((map = allocate_init_map(ph->core->classes_jsa_fd,
  157                                offset, vaddr, memsz, PF_R)) == NULL) {

I'm not so sure this is appropriate for OSX. It uses mach-o files, not 
elf files. The segment_command flags field comes from loader.h [1]. I 
don't see anything in there that looks like the equivalent of ELF access 
flags.

/* Constants for the flags field of the segment_command */
#define    SG_HIGHVM    0x1    /* the file contents for this segment is for
                    the high part of the VM space, the low part
                    is zero filled (for stacks in core files) */
#define    SG_FVMLIB    0x2    /* this segment is the VM that is 
allocated by
                    a fixed VM library, for overlap checking in
                    the link editor */
#define    SG_NORELOC    0x4    /* this segment has nothing that was 
relocated
                    in it and nothing relocated to it, that is
                    it maybe safely replaced without relocation*/
#define SG_PROTECTED_VERSION_1    0x8 /* This segment is protected.  If the
                        segment starts at file offset 0, the
                        first page of the segment is not
                        protected.  All other pages of the
                        segment are protected. */

Since the flags don't matter for OSX, maybe you should just pass 0. You 
can do something like:

#ifndef PF_R
#define MAP_R_FLAG PF_R
#else
#define MAP_R_FLAG 0
#endif

Some minor comment fixes are needed:

  397         // Access flags fot this memory region is different 
between the library

"fot" -> "for"
"is" -> "are"

  399         // We should respect to coredump.

"to" -> "the"

  404         // And head of ELF header might be included in coredump 
(See JDK-7133122).
  405         // Thus we need to replace PT_LOAD segments the library 
version.

How about:

  404         // Also the first page of the ELF header might be included 
in the coredump (See JDK-7133122).
  405         // Thus we need to replace the PT_LOAD segment with the 
library version.

thanks,

Chris

[1] 
https://opensource.apple.com/source/xnu/xnu-1456.1.26/EXTERNAL_HEADERS/mach-o/loader.h.auto.html

On 8/2/20 12:18 AM, Yasumasa Suenaga wrote:
> Hi Chris,
>
> (Remove "trivial" from subject)
>
> Thanks for the information! I fixed errors in new webrev. It passed 
> tests on submit repo 
> (mach5-one-ysuenaga-JDK-8250826-1-20200802-0151-13109525)
>
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.01/
>
>
> I tried to use elf.h instead of #define for PF_R, however it failed 
> (mach5-one-ysuenaga-JDK-8250826-1-20200802-0542-13111335).
>
>   http://hg.openjdk.java.net/jdk/submit/rev/67baee1a1a1d
>
> Thus I added #define for it in this webrev.
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2020/08/02 10:22, Chris Plummer wrote:
>> Hi Yasumasa,
>>
>> [2020-08-01T14:15:42,514Z] Creating 
>> support/native/jdk.hotspot.agent/libsaproc/static/libsaproc.a from 8 
>> file(s)
>> [2020-08-01T14:15:43,961Z] 
>> ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:128:8: 
>> error: no member named 'flags' in 'struct map_info'
>> [2020-08-01T14:15:43,961Z]   map->flags  = flags;
>> [2020-08-01T14:15:43,961Z]   ~~~  ^
>> [2020-08-01T14:15:43,963Z] 
>> ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:153:54: 
>> error: use of undeclared identifier 'PF_R'
>> [2020-08-01T14:15:43,963Z] offset, vaddr, memsz, PF_R)) == NULL) {
>>
>> I'll look at the code changes later. No time at the moment.
>>
>> thanks,
>>
>> Chris
>>
>> 2020-08-01-1405571.suenaga.source2020-08-01-1405571.suenaga.source 
>> 2020-08-01-1405571.suenaga.source On 8/1/20 5:20 PM, Yasumasa Suenaga 
>> wrote:
>>> Hi Chris,
>>>
>>> Thanks for your comment!
>>> I pushed new change to submit repo, but the build failed on macOS. 
>>> Could you share details?
>>> (I do not have Mac)
>>>
>>>   commit: http://hg.openjdk.java.net/jdk/submit/rev/0eb1c497f297
>>>   job: mach5-one-ysuenaga-JDK-8250826-1-20200801-1407-13098989
>>>
>>> On 2020/08/01 13:06, Chris Plummer wrote:
>>>> On 7/30/20 6:18 PM, Yasumasa Suenaga wrote:
>>>>> Hi Chris,
>>>>>
>>>>> On 2020/07/31 7:29, Chris Plummer wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> If I understand correctly we first call add_map_info() for all 
>>>>>> the PT_LOAD segments in the core file. We then process all the 
>>>>>> library segments, calling add_map_info() for them if the 
>>>>>> target_vaddr has not already been addded. If has already been 
>>>>>> added, which I assume is the case for any library segment that is 
>>>>>> already in the core file, then the core file version is replaced 
>>>>>> the the library version.  I'm a little unclear of the purpose of 
>>>>>> this replacing of the core PT_LOAD segments with those found in 
>>>>>> the libraries. If you could explain this that would help me 
>>>>>> understand your change.
>>>>>
>>>>> Read only segments in ELF should not be any different from PT_LOAD 
>>>>> segments in the core.
>>>>> And head of ELF header might be included in coredump (See 
>>>>> JDK-7133122). Thus we need to replace PT_LOAD segments the library 
>>>>> version.
>>>> Ok. The code in the area really should have been commented better 
>>>> when first written. The purpose is not understandable simply by 
>>>> reading the code.
>>>
>>> I added some comments to existing code. Please tell me if it is 
>>> insufficient.
>>>
>>>
>>>>>> I'm also unsure why existing_map->fd would ever be something 
>>>>>> other than the core file. Why would another library map the same 
>>>>>> target_vaddr.
>>>>>
>>>>> When mmap() is called to read-only ELF segments / sections, Linux 
>>>>> kernel seems to allocate other memory segments which has same top 
>>>>> virtual memory address. I've not yet found out from the code of 
>>>>> Linux kernel, but I confirmed this behavior on GDB.
>>>> Ok. Same comment as above. This should have been explained with 
>>>> comments in the code.
>>>
>>> Added some comments.
>>>
>>>
>>>> As for your fix, if I understand correctly the issue is that a 
>>>> single segment in the library is being split into two segments in 
>>>> the process (and therefore in the core file) due to an mprotect 
>>>> being done on part of the segment. Because of this the segment size 
>>>> in the library does match the segment size in the core file. So 
>>>> with your fix the library segment is used, but what about the other 
>>>> half of the segment that is in the core file? Don't we now have 
>>>> overlapping segments; the full original segment from the library, 
>>>> and then a second segment that overlaps the tail end of the library 
>>>> segment? Will that cause any confusion later on?
>>>
>>> As long as vaddr is valid, it doesn't matter even if it overlaps 
>>> because SA would sort the map with vaddr, and would lookup with it.
>>> In Substrate VM, there are RO and RW sections in that order, so it 
>>> is ok with webrev.00 . However it might not be appropriate because 
>>> RW section might be top of PT_LOAD.
>>>
>>> To make it more generalized, I changed it to the commit on submit repo.
>>> It would check access flags between in coredump and in binary. If 
>>> they are different, we respect current (loaded from coredump) map 
>>> because it might be changed at runtime.
>>>
>>> The change for LabsJDK 11 is more simple because JDK 11 does not 
>>> have ps_core_common.c .
>>> So I share you it. It may help you:
>>>
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/JDK-8250826-labsjdk11-0.patch 
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 7/30/20 1:18 PM, Chris Plummer wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> I'm reviewing this RFR, and I'd like to ask that it not be 
>>>>>>> pushed as trivial. Although it is just a one line change, it 
>>>>>>> takes an extensive knowledge to understand the impact. I'll read 
>>>>>>> up on the filed graal issue and try to understand the ELF code a 
>>>>>>> bit better.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 7/30/20 6:45 AM, Yasumasa Suenaga wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> Please review this trivial change:
>>>>>>>>
>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8250826
>>>>>>>>   webrev: 
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.00/
>>>>>>>>
>>>>>>>> I played Truffle NFI on GraalVM, but I cannot get Java stacks 
>>>>>>>> from coredump via jhsdb.
>>>>>>>>
>>>>>>>> I've reported this issue to GraalVM community [1], and I 've 
>>>>>>>> found out the cause of this issue is .svm_heap would be 
>>>>>>>> separated to RO and RW areas by mprotect() calls in run time in 
>>>>>>>> spite of .svm_heap is RO section in ELF (please see [1] for 
>>>>>>>> details).
>>>>>>>>
>>>>>>>> It is corner case, but we will see same problem on jhsdb when 
>>>>>>>> we attempt to analyze coredump which comes from some 
>>>>>>>> applications / libraries which would separate RO sections in 
>>>>>>>> ELF like Substrate VM.
>>>>>>>>
>>>>>>>> I sent PR to fix libsaproc.so in LabsJDK 11 for this issue [2], 
>>>>>>>> then community members suggested me to discuss in 
>>>>>>>> serviceability-dev.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> [1] https://github.com/oracle/graal/issues/2579
>>>>>>>> [2] https://github.com/graalvm/labs-openjdk-11/pull/9
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>
>>




More information about the serviceability-dev mailing list