RFR: 8250826: jhsdb does not work with coredump which comes from Substrate VM
Chris Plummer
chris.plummer at oracle.com
Tue Aug 4 00:54:25 UTC 2020
Hi Yasumasa,
Your changes look good now.
thanks,
Chris
On 8/3/20 5:47 PM, Yasumasa Suenaga wrote:
> Hi Chris,
>
> Thank you for the comment!
> I updated webrev:
>
> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.02/
> Diff from webrev.01:
> http://hg.openjdk.java.net/jdk/submit/rev/e98dc25b69c2
>
> On 2020/08/04 6:41, Chris Plummer wrote:
>> Hi Yasumasa,
>>
>> Your updated fix resulted in using the core file map whereas the
>> original fix used the library map. In both cases the assert is
>> avoided, which I think is the main goal. Does it matter which map is
>> used?
>
> In GraalVM, read only segment is conflicted, thus it does not matter
> which map is used.
> However this webrev is more generalize, so segments in coredump should
> be used.
>
>> 42 #ifndef PF_R
>> 43 #define PF_R 0x4
>> 44 #endif
>>
>> 156 if ((map = allocate_init_map(ph->core->classes_jsa_fd,
>> 157 offset, vaddr, memsz, PF_R)) ==
>> NULL) {
>>
>> I'm not so sure this is appropriate for OSX. It uses mach-o files,
>> not elf files. The segment_command flags field comes from loader.h
>> [1]. I don't see anything in there that looks like the equivalent of
>> ELF access flags.
>>
>> /* Constants for the flags field of the segment_command */
>> #define SG_HIGHVM 0x1 /* the file contents for this segment
>> is for
>> the high part of the VM space, the low part
>> is zero filled (for stacks in core files) */
>> #define SG_FVMLIB 0x2 /* this segment is the VM that is
>> allocated by
>> a fixed VM library, for overlap checking in
>> the link editor */
>> #define SG_NORELOC 0x4 /* this segment has nothing that was
>> relocated
>> in it and nothing relocated to it, that is
>> it maybe safely replaced without relocation*/
>> #define SG_PROTECTED_VERSION_1 0x8 /* This segment is protected.
>> If the
>> segment starts at file offset 0, the
>> first page of the segment is not
>> protected. All other pages of the
>> segment are protected. */
>>
>> Since the flags don't matter for OSX, maybe you should just pass 0.
>> You can do something like:
>>
>> #ifndef PF_R
>> #define MAP_R_FLAG PF_R
>> #else
>> #define MAP_R_FLAG 0
>> #endif
>
> Thanks!
> I thought PF_R can be used PF_R from elf.h on macOS:
> https://opensource.apple.com/source/dtrace/dtrace-90/sys/elf.h
>
> I merged your code in this webrev.
>
>> Some minor comment fixes are needed:
>>
>> 397 // Access flags fot this memory region is different
>> between the library
>>
>> "fot" -> "for"
>> "is" -> "are"
>>
>> 399 // We should respect to coredump.
>>
>> "to" -> "the"
>>
>> 404 // And head of ELF header might be included in coredump
>> (See JDK-7133122).
>> 405 // Thus we need to replace PT_LOAD segments the library
>> version.
>>
>> How about:
>>
>> 404 // Also the first page of the ELF header might be
>> included in the coredump (See JDK-7133122).
>> 405 // Thus we need to replace the PT_LOAD segment with the
>> library version.
>
> Fixed them.
>
>
> Thanks,
>
> Yasumasa
>
>
>> thanks,
>>
>> Chris
>>
>> [1]
>> https://opensource.apple.com/source/xnu/xnu-1456.1.26/EXTERNAL_HEADERS/mach-o/loader.h.auto.html
>>
>> On 8/2/20 12:18 AM, Yasumasa Suenaga wrote:
>>> Hi Chris,
>>>
>>> (Remove "trivial" from subject)
>>>
>>> Thanks for the information! I fixed errors in new webrev. It passed
>>> tests on submit repo
>>> (mach5-one-ysuenaga-JDK-8250826-1-20200802-0151-13109525)
>>>
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.01/
>>>
>>>
>>> I tried to use elf.h instead of #define for PF_R, however it failed
>>> (mach5-one-ysuenaga-JDK-8250826-1-20200802-0542-13111335).
>>>
>>> http://hg.openjdk.java.net/jdk/submit/rev/67baee1a1a1d
>>>
>>> Thus I added #define for it in this webrev.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/08/02 10:22, Chris Plummer wrote:
>>>> Hi Yasumasa,
>>>>
>>>> [2020-08-01T14:15:42,514Z] Creating
>>>> support/native/jdk.hotspot.agent/libsaproc/static/libsaproc.a from
>>>> 8 file(s)
>>>> [2020-08-01T14:15:43,961Z]
>>>> ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:128:8:
>>>> error: no member named 'flags' in 'struct map_info'
>>>> [2020-08-01T14:15:43,961Z] map->flags = flags;
>>>> [2020-08-01T14:15:43,961Z] ~~~ ^
>>>> [2020-08-01T14:15:43,963Z]
>>>> ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:153:54:
>>>> error: use of undeclared identifier 'PF_R'
>>>> [2020-08-01T14:15:43,963Z] offset, vaddr, memsz, PF_R)) == NULL) {
>>>>
>>>> I'll look at the code changes later. No time at the moment.
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> 2020-08-01-1405571.suenaga.source2020-08-01-1405571.suenaga.source
>>>> 2020-08-01-1405571.suenaga.source On 8/1/20 5:20 PM, Yasumasa
>>>> Suenaga wrote:
>>>>> Hi Chris,
>>>>>
>>>>> Thanks for your comment!
>>>>> I pushed new change to submit repo, but the build failed on macOS.
>>>>> Could you share details?
>>>>> (I do not have Mac)
>>>>>
>>>>> commit: http://hg.openjdk.java.net/jdk/submit/rev/0eb1c497f297
>>>>> job: mach5-one-ysuenaga-JDK-8250826-1-20200801-1407-13098989
>>>>>
>>>>> On 2020/08/01 13:06, Chris Plummer wrote:
>>>>>> On 7/30/20 6:18 PM, Yasumasa Suenaga wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> On 2020/07/31 7:29, Chris Plummer wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> If I understand correctly we first call add_map_info() for all
>>>>>>>> the PT_LOAD segments in the core file. We then process all the
>>>>>>>> library segments, calling add_map_info() for them if the
>>>>>>>> target_vaddr has not already been addded. If has already been
>>>>>>>> added, which I assume is the case for any library segment that
>>>>>>>> is already in the core file, then the core file version is
>>>>>>>> replaced the the library version. I'm a little unclear of the
>>>>>>>> purpose of this replacing of the core PT_LOAD segments with
>>>>>>>> those found in the libraries. If you could explain this that
>>>>>>>> would help me understand your change.
>>>>>>>
>>>>>>> Read only segments in ELF should not be any different from
>>>>>>> PT_LOAD segments in the core.
>>>>>>> And head of ELF header might be included in coredump (See
>>>>>>> JDK-7133122). Thus we need to replace PT_LOAD segments the
>>>>>>> library version.
>>>>>> Ok. The code in the area really should have been commented better
>>>>>> when first written. The purpose is not understandable simply by
>>>>>> reading the code.
>>>>>
>>>>> I added some comments to existing code. Please tell me if it is
>>>>> insufficient.
>>>>>
>>>>>
>>>>>>>> I'm also unsure why existing_map->fd would ever be something
>>>>>>>> other than the core file. Why would another library map the
>>>>>>>> same target_vaddr.
>>>>>>>
>>>>>>> When mmap() is called to read-only ELF segments / sections,
>>>>>>> Linux kernel seems to allocate other memory segments which has
>>>>>>> same top virtual memory address. I've not yet found out from the
>>>>>>> code of Linux kernel, but I confirmed this behavior on GDB.
>>>>>> Ok. Same comment as above. This should have been explained with
>>>>>> comments in the code.
>>>>>
>>>>> Added some comments.
>>>>>
>>>>>
>>>>>> As for your fix, if I understand correctly the issue is that a
>>>>>> single segment in the library is being split into two segments in
>>>>>> the process (and therefore in the core file) due to an mprotect
>>>>>> being done on part of the segment. Because of this the segment
>>>>>> size in the library does match the segment size in the core file.
>>>>>> So with your fix the library segment is used, but what about the
>>>>>> other half of the segment that is in the core file? Don't we now
>>>>>> have overlapping segments; the full original segment from the
>>>>>> library, and then a second segment that overlaps the tail end of
>>>>>> the library segment? Will that cause any confusion later on?
>>>>>
>>>>> As long as vaddr is valid, it doesn't matter even if it overlaps
>>>>> because SA would sort the map with vaddr, and would lookup with it.
>>>>> In Substrate VM, there are RO and RW sections in that order, so it
>>>>> is ok with webrev.00 . However it might not be appropriate because
>>>>> RW section might be top of PT_LOAD.
>>>>>
>>>>> To make it more generalized, I changed it to the commit on submit
>>>>> repo.
>>>>> It would check access flags between in coredump and in binary. If
>>>>> they are different, we respect current (loaded from coredump) map
>>>>> because it might be changed at runtime.
>>>>>
>>>>> The change for LabsJDK 11 is more simple because JDK 11 does not
>>>>> have ps_core_common.c .
>>>>> So I share you it. It may help you:
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/JDK-8250826-labsjdk11-0.patch
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 7/30/20 1:18 PM, Chris Plummer wrote:
>>>>>>>>> Hi Yasumasa,
>>>>>>>>>
>>>>>>>>> I'm reviewing this RFR, and I'd like to ask that it not be
>>>>>>>>> pushed as trivial. Although it is just a one line change, it
>>>>>>>>> takes an extensive knowledge to understand the impact. I'll
>>>>>>>>> read up on the filed graal issue and try to understand the ELF
>>>>>>>>> code a bit better.
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 7/30/20 6:45 AM, Yasumasa Suenaga wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Please review this trivial change:
>>>>>>>>>>
>>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8250826
>>>>>>>>>> webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.00/
>>>>>>>>>>
>>>>>>>>>> I played Truffle NFI on GraalVM, but I cannot get Java stacks
>>>>>>>>>> from coredump via jhsdb.
>>>>>>>>>>
>>>>>>>>>> I've reported this issue to GraalVM community [1], and I 've
>>>>>>>>>> found out the cause of this issue is .svm_heap would be
>>>>>>>>>> separated to RO and RW areas by mprotect() calls in run time
>>>>>>>>>> in spite of .svm_heap is RO section in ELF (please see [1]
>>>>>>>>>> for details).
>>>>>>>>>>
>>>>>>>>>> It is corner case, but we will see same problem on jhsdb when
>>>>>>>>>> we attempt to analyze coredump which comes from some
>>>>>>>>>> applications / libraries which would separate RO sections in
>>>>>>>>>> ELF like Substrate VM.
>>>>>>>>>>
>>>>>>>>>> I sent PR to fix libsaproc.so in LabsJDK 11 for this issue
>>>>>>>>>> [2], then community members suggested me to discuss in
>>>>>>>>>> serviceability-dev.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> [1] https://github.com/oracle/graal/issues/2579
>>>>>>>>>> [2] https://github.com/graalvm/labs-openjdk-11/pull/9
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>
>>
More information about the serviceability-dev
mailing list