RFR: 8250826: jhsdb does not work with coredump which comes from Substrate VM

serguei.spitsyn at oracle.com serguei.spitsyn at oracle.com
Tue Aug 4 16:27:26 UTC 2020


Hi Yasumasa,

It looks good.
I forgot to say there is no need in new webrev.

Thanks,
Serguei


On 8/4/20 05:22, Yasumasa Suenaga wrote:
> Hi Serguei,
>
> Thanks for your comment!
>
> On 2020/08/04 18:11, serguei.spitsyn at oracle.com wrote:
>> Hi Yasumasa,
>>
>> The fix looks good to me.
>> Thanks to Chris for discussing the details in previous emails.
>>
>> Just one suggestion:
>>
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.02/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c.frames.html 
>>
>>
>> 44 #ifdef PF_R
>> 45 #define MAP_R_FLAG PF_R
>> 46 #else
>> 47 #define MAP_R_FLAG 0
>> 48 #endif
>>
>> Could you, please add a small comment before?
>> Something like this would be enough, I think:
>>    // Define a segment permission flag allowing read.
>
> I added it to new webrev:
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.03/
>
>
> Thanks,
>
> Yasumasa
>
>
>> Thanks,
>> Serguei
>>
>>
>> On 8/3/20 17:54, Yasumasa Suenaga wrote:
>>> Thanks Chris!
>>> I will push it when I got second reviewer.
>>>
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/08/04 9:54, Chris Plummer wrote:
>>>> Hi Yasumasa,
>>>>
>>>> Your changes look good now.
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 8/3/20 5:47 PM, Yasumasa Suenaga wrote:
>>>>> Hi Chris,
>>>>>
>>>>> Thank you for the comment!
>>>>> I updated webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.02/
>>>>>   Diff from webrev.01: 
>>>>> http://hg.openjdk.java.net/jdk/submit/rev/e98dc25b69c2
>>>>>
>>>>> On 2020/08/04 6:41, Chris Plummer wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> Your updated fix resulted in using the core file map whereas the 
>>>>>> original fix used the library map. In both cases the assert is 
>>>>>> avoided, which I think is the main goal. Does it matter which map 
>>>>>> is used?
>>>>>
>>>>> In GraalVM, read only segment is conflicted, thus it does not 
>>>>> matter which map is used.
>>>>> However this webrev is more generalize, so segments in coredump 
>>>>> should be used.
>>>>>
>>>>>>    42 #ifndef PF_R
>>>>>>    43 #define PF_R 0x4
>>>>>>    44 #endif
>>>>>>
>>>>>>   156   if ((map = allocate_init_map(ph->core->classes_jsa_fd,
>>>>>>   157                                offset, vaddr, memsz, PF_R)) 
>>>>>> == NULL) {
>>>>>>
>>>>>> I'm not so sure this is appropriate for OSX. It uses mach-o 
>>>>>> files, not elf files. The segment_command flags field comes from 
>>>>>> loader.h [1]. I don't see anything in there that looks like the 
>>>>>> equivalent of ELF access flags.
>>>>>>
>>>>>> /* Constants for the flags field of the segment_command */
>>>>>> #define    SG_HIGHVM    0x1    /* the file contents for this 
>>>>>> segment is for
>>>>>>                     the high part of the VM space, the low part
>>>>>>                     is zero filled (for stacks in core files) */
>>>>>> #define    SG_FVMLIB    0x2    /* this segment is the VM that is 
>>>>>> allocated by
>>>>>>                     a fixed VM library, for overlap checking in
>>>>>>                     the link editor */
>>>>>> #define    SG_NORELOC    0x4    /* this segment has nothing that 
>>>>>> was relocated
>>>>>>                     in it and nothing relocated to it, that is
>>>>>>                     it maybe safely replaced without relocation*/
>>>>>> #define SG_PROTECTED_VERSION_1    0x8 /* This segment is 
>>>>>> protected. If the
>>>>>>                         segment starts at file offset 0, the
>>>>>>                         first page of the segment is not
>>>>>>                         protected.  All other pages of the
>>>>>>                         segment are protected. */
>>>>>>
>>>>>> Since the flags don't matter for OSX, maybe you should just pass 
>>>>>> 0. You can do something like:
>>>>>>
>>>>>> #ifndef PF_R
>>>>>> #define MAP_R_FLAG PF_R
>>>>>> #else
>>>>>> #define MAP_R_FLAG 0
>>>>>> #endif
>>>>>
>>>>> Thanks!
>>>>> I thought PF_R can be used PF_R from elf.h on macOS:
>>>>> https://opensource.apple.com/source/dtrace/dtrace-90/sys/elf.h
>>>>>
>>>>> I merged your code in this webrev.
>>>>>
>>>>>> Some minor comment fixes are needed:
>>>>>>
>>>>>>   397         // Access flags fot this memory region is different 
>>>>>> between the library
>>>>>>
>>>>>> "fot" -> "for"
>>>>>> "is" -> "are"
>>>>>>
>>>>>>   399         // We should respect to coredump.
>>>>>>
>>>>>> "to" -> "the"
>>>>>>
>>>>>>   404         // And head of ELF header might be included in 
>>>>>> coredump (See JDK-7133122).
>>>>>>   405         // Thus we need to replace PT_LOAD segments the 
>>>>>> library version.
>>>>>>
>>>>>> How about:
>>>>>>
>>>>>>   404         // Also the first page of the ELF header might be 
>>>>>> included in the coredump (See JDK-7133122).
>>>>>>   405         // Thus we need to replace the PT_LOAD segment with 
>>>>>> the library version.
>>>>>
>>>>> Fixed them.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> [1] 
>>>>>> https://opensource.apple.com/source/xnu/xnu-1456.1.26/EXTERNAL_HEADERS/mach-o/loader.h.auto.html
>>>>>>
>>>>>> On 8/2/20 12:18 AM, Yasumasa Suenaga wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> (Remove "trivial" from subject)
>>>>>>>
>>>>>>> Thanks for the information! I fixed errors in new webrev. It 
>>>>>>> passed tests on submit repo 
>>>>>>> (mach5-one-ysuenaga-JDK-8250826-1-20200802-0151-13109525)
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.01/
>>>>>>>
>>>>>>>
>>>>>>> I tried to use elf.h instead of #define for PF_R, however it 
>>>>>>> failed (mach5-one-ysuenaga-JDK-8250826-1-20200802-0542-13111335).
>>>>>>>
>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/67baee1a1a1d
>>>>>>>
>>>>>>> Thus I added #define for it in this webrev.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2020/08/02 10:22, Chris Plummer wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> [2020-08-01T14:15:42,514Z] Creating 
>>>>>>>> support/native/jdk.hotspot.agent/libsaproc/static/libsaproc.a 
>>>>>>>> from 8 file(s)
>>>>>>>> [2020-08-01T14:15:43,961Z] 
>>>>>>>> ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:128:8: 
>>>>>>>> error: no member named 'flags' in 'struct map_info'
>>>>>>>> [2020-08-01T14:15:43,961Z]   map->flags  = flags;
>>>>>>>> [2020-08-01T14:15:43,961Z]   ~~~  ^
>>>>>>>> [2020-08-01T14:15:43,963Z] 
>>>>>>>> ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:153:54: 
>>>>>>>> error: use of undeclared identifier 'PF_R'
>>>>>>>> [2020-08-01T14:15:43,963Z] offset, vaddr, memsz, PF_R)) == NULL) {
>>>>>>>>
>>>>>>>> I'll look at the code changes later. No time at the moment.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> 2020-08-01-1405571.suenaga.source2020-08-01-1405571.suenaga.source 
>>>>>>>> 2020-08-01-1405571.suenaga.source On 8/1/20 5:20 PM, Yasumasa 
>>>>>>>> Suenaga wrote:
>>>>>>>>> Hi Chris,
>>>>>>>>>
>>>>>>>>> Thanks for your comment!
>>>>>>>>> I pushed new change to submit repo, but the build failed on 
>>>>>>>>> macOS. Could you share details?
>>>>>>>>> (I do not have Mac)
>>>>>>>>>
>>>>>>>>>   commit: http://hg.openjdk.java.net/jdk/submit/rev/0eb1c497f297
>>>>>>>>>   job: mach5-one-ysuenaga-JDK-8250826-1-20200801-1407-13098989
>>>>>>>>>
>>>>>>>>> On 2020/08/01 13:06, Chris Plummer wrote:
>>>>>>>>>> On 7/30/20 6:18 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>
>>>>>>>>>>> On 2020/07/31 7:29, Chris Plummer wrote:
>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>
>>>>>>>>>>>> If I understand correctly we first call add_map_info() for 
>>>>>>>>>>>> all the PT_LOAD segments in the core file. We then process 
>>>>>>>>>>>> all the library segments, calling add_map_info() for them 
>>>>>>>>>>>> if the target_vaddr has not already been addded. If has 
>>>>>>>>>>>> already been added, which I assume is the case for any 
>>>>>>>>>>>> library segment that is already in the core file, then the 
>>>>>>>>>>>> core file version is replaced the the library version.  I'm 
>>>>>>>>>>>> a little unclear of the purpose of this replacing of the 
>>>>>>>>>>>> core PT_LOAD segments with those found in the libraries. If 
>>>>>>>>>>>> you could explain this that would help me understand your 
>>>>>>>>>>>> change.
>>>>>>>>>>>
>>>>>>>>>>> Read only segments in ELF should not be any different from 
>>>>>>>>>>> PT_LOAD segments in the core.
>>>>>>>>>>> And head of ELF header might be included in coredump (See 
>>>>>>>>>>> JDK-7133122). Thus we need to replace PT_LOAD segments the 
>>>>>>>>>>> library version.
>>>>>>>>>> Ok. The code in the area really should have been commented 
>>>>>>>>>> better when first written. The purpose is not understandable 
>>>>>>>>>> simply by reading the code.
>>>>>>>>>
>>>>>>>>> I added some comments to existing code. Please tell me if it 
>>>>>>>>> is insufficient.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>> I'm also unsure why existing_map->fd would ever be 
>>>>>>>>>>>> something other than the core file. Why would another 
>>>>>>>>>>>> library map the same target_vaddr.
>>>>>>>>>>>
>>>>>>>>>>> When mmap() is called to read-only ELF segments / sections, 
>>>>>>>>>>> Linux kernel seems to allocate other memory segments which 
>>>>>>>>>>> has same top virtual memory address. I've not yet found out 
>>>>>>>>>>> from the code of Linux kernel, but I confirmed this behavior 
>>>>>>>>>>> on GDB.
>>>>>>>>>> Ok. Same comment as above. This should have been explained 
>>>>>>>>>> with comments in the code.
>>>>>>>>>
>>>>>>>>> Added some comments.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> As for your fix, if I understand correctly the issue is that 
>>>>>>>>>> a single segment in the library is being split into two 
>>>>>>>>>> segments in the process (and therefore in the core file) due 
>>>>>>>>>> to an mprotect being done on part of the segment. Because of 
>>>>>>>>>> this the segment size in the library does match the segment 
>>>>>>>>>> size in the core file. So with your fix the library segment 
>>>>>>>>>> is used, but what about the other half of the segment that is 
>>>>>>>>>> in the core file? Don't we now have overlapping segments; the 
>>>>>>>>>> full original segment from the library, and then a second 
>>>>>>>>>> segment that overlaps the tail end of the library segment? 
>>>>>>>>>> Will that cause any confusion later on?
>>>>>>>>>
>>>>>>>>> As long as vaddr is valid, it doesn't matter even if it 
>>>>>>>>> overlaps because SA would sort the map with vaddr, and would 
>>>>>>>>> lookup with it.
>>>>>>>>> In Substrate VM, there are RO and RW sections in that order, 
>>>>>>>>> so it is ok with webrev.00 . However it might not be 
>>>>>>>>> appropriate because RW section might be top of PT_LOAD.
>>>>>>>>>
>>>>>>>>> To make it more generalized, I changed it to the commit on 
>>>>>>>>> submit repo.
>>>>>>>>> It would check access flags between in coredump and in binary. 
>>>>>>>>> If they are different, we respect current (loaded from 
>>>>>>>>> coredump) map because it might be changed at runtime.
>>>>>>>>>
>>>>>>>>> The change for LabsJDK 11 is more simple because JDK 11 does 
>>>>>>>>> not have ps_core_common.c .
>>>>>>>>> So I share you it. It may help you:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/JDK-8250826-labsjdk11-0.patch 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Chris
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/30/20 1:18 PM, Chris Plummer wrote:
>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm reviewing this RFR, and I'd like to ask that it not be 
>>>>>>>>>>>>> pushed as trivial. Although it is just a one line change, 
>>>>>>>>>>>>> it takes an extensive knowledge to understand the impact. 
>>>>>>>>>>>>> I'll read up on the filed graal issue and try to 
>>>>>>>>>>>>> understand the ELF code a bit better.
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 7/30/20 6:45 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please review this trivial change:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8250826
>>>>>>>>>>>>>>   webrev: 
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.00/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I played Truffle NFI on GraalVM, but I cannot get Java 
>>>>>>>>>>>>>> stacks from coredump via jhsdb.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've reported this issue to GraalVM community [1], and I 
>>>>>>>>>>>>>> 've found out the cause of this issue is .svm_heap would 
>>>>>>>>>>>>>> be separated to RO and RW areas by mprotect() calls in 
>>>>>>>>>>>>>> run time in spite of .svm_heap is RO section in ELF 
>>>>>>>>>>>>>> (please see [1] for details).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It is corner case, but we will see same problem on jhsdb 
>>>>>>>>>>>>>> when we attempt to analyze coredump which comes from some 
>>>>>>>>>>>>>> applications / libraries which would separate RO sections 
>>>>>>>>>>>>>> in ELF like Substrate VM.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I sent PR to fix libsaproc.so in LabsJDK 11 for this 
>>>>>>>>>>>>>> issue [2], then community members suggested me to discuss 
>>>>>>>>>>>>>> in serviceability-dev.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1] https://github.com/oracle/graal/issues/2579
>>>>>>>>>>>>>> [2] https://github.com/graalvm/labs-openjdk-11/pull/9
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>



More information about the serviceability-dev mailing list