RFR: 8250826: jhsdb does not work with coredump which comes from Substrate VM
Chris Plummer
chris.plummer at oracle.com
Mon Aug 3 21:41:00 UTC 2020
Hi Yasumasa,
Your updated fix resulted in using the core file map whereas the
original fix used the library map. In both cases the assert is avoided,
which I think is the main goal. Does it matter which map is used?
42 #ifndef PF_R
43 #define PF_R 0x4
44 #endif
156 if ((map = allocate_init_map(ph->core->classes_jsa_fd,
157 offset, vaddr, memsz, PF_R)) == NULL) {
I'm not so sure this is appropriate for OSX. It uses mach-o files, not
elf files. The segment_command flags field comes from loader.h [1]. I
don't see anything in there that looks like the equivalent of ELF access
flags.
/* Constants for the flags field of the segment_command */
#define SG_HIGHVM 0x1 /* the file contents for this segment is for
the high part of the VM space, the low part
is zero filled (for stacks in core files) */
#define SG_FVMLIB 0x2 /* this segment is the VM that is
allocated by
a fixed VM library, for overlap checking in
the link editor */
#define SG_NORELOC 0x4 /* this segment has nothing that was
relocated
in it and nothing relocated to it, that is
it maybe safely replaced without relocation*/
#define SG_PROTECTED_VERSION_1 0x8 /* This segment is protected. If the
segment starts at file offset 0, the
first page of the segment is not
protected. All other pages of the
segment are protected. */
Since the flags don't matter for OSX, maybe you should just pass 0. You
can do something like:
#ifndef PF_R
#define MAP_R_FLAG PF_R
#else
#define MAP_R_FLAG 0
#endif
Some minor comment fixes are needed:
397 // Access flags fot this memory region is different
between the library
"fot" -> "for"
"is" -> "are"
399 // We should respect to coredump.
"to" -> "the"
404 // And head of ELF header might be included in coredump
(See JDK-7133122).
405 // Thus we need to replace PT_LOAD segments the library
version.
How about:
404 // Also the first page of the ELF header might be included
in the coredump (See JDK-7133122).
405 // Thus we need to replace the PT_LOAD segment with the
library version.
thanks,
Chris
[1]
https://opensource.apple.com/source/xnu/xnu-1456.1.26/EXTERNAL_HEADERS/mach-o/loader.h.auto.html
On 8/2/20 12:18 AM, Yasumasa Suenaga wrote:
> Hi Chris,
>
> (Remove "trivial" from subject)
>
> Thanks for the information! I fixed errors in new webrev. It passed
> tests on submit repo
> (mach5-one-ysuenaga-JDK-8250826-1-20200802-0151-13109525)
>
> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.01/
>
>
> I tried to use elf.h instead of #define for PF_R, however it failed
> (mach5-one-ysuenaga-JDK-8250826-1-20200802-0542-13111335).
>
> http://hg.openjdk.java.net/jdk/submit/rev/67baee1a1a1d
>
> Thus I added #define for it in this webrev.
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2020/08/02 10:22, Chris Plummer wrote:
>> Hi Yasumasa,
>>
>> [2020-08-01T14:15:42,514Z] Creating
>> support/native/jdk.hotspot.agent/libsaproc/static/libsaproc.a from 8
>> file(s)
>> [2020-08-01T14:15:43,961Z]
>> ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:128:8:
>> error: no member named 'flags' in 'struct map_info'
>> [2020-08-01T14:15:43,961Z] map->flags = flags;
>> [2020-08-01T14:15:43,961Z] ~~~ ^
>> [2020-08-01T14:15:43,963Z]
>> ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:153:54:
>> error: use of undeclared identifier 'PF_R'
>> [2020-08-01T14:15:43,963Z] offset, vaddr, memsz, PF_R)) == NULL) {
>>
>> I'll look at the code changes later. No time at the moment.
>>
>> thanks,
>>
>> Chris
>>
>> 2020-08-01-1405571.suenaga.source2020-08-01-1405571.suenaga.source
>> 2020-08-01-1405571.suenaga.source On 8/1/20 5:20 PM, Yasumasa Suenaga
>> wrote:
>>> Hi Chris,
>>>
>>> Thanks for your comment!
>>> I pushed new change to submit repo, but the build failed on macOS.
>>> Could you share details?
>>> (I do not have Mac)
>>>
>>> commit: http://hg.openjdk.java.net/jdk/submit/rev/0eb1c497f297
>>> job: mach5-one-ysuenaga-JDK-8250826-1-20200801-1407-13098989
>>>
>>> On 2020/08/01 13:06, Chris Plummer wrote:
>>>> On 7/30/20 6:18 PM, Yasumasa Suenaga wrote:
>>>>> Hi Chris,
>>>>>
>>>>> On 2020/07/31 7:29, Chris Plummer wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> If I understand correctly we first call add_map_info() for all
>>>>>> the PT_LOAD segments in the core file. We then process all the
>>>>>> library segments, calling add_map_info() for them if the
>>>>>> target_vaddr has not already been addded. If has already been
>>>>>> added, which I assume is the case for any library segment that is
>>>>>> already in the core file, then the core file version is replaced
>>>>>> the the library version. I'm a little unclear of the purpose of
>>>>>> this replacing of the core PT_LOAD segments with those found in
>>>>>> the libraries. If you could explain this that would help me
>>>>>> understand your change.
>>>>>
>>>>> Read only segments in ELF should not be any different from PT_LOAD
>>>>> segments in the core.
>>>>> And head of ELF header might be included in coredump (See
>>>>> JDK-7133122). Thus we need to replace PT_LOAD segments the library
>>>>> version.
>>>> Ok. The code in the area really should have been commented better
>>>> when first written. The purpose is not understandable simply by
>>>> reading the code.
>>>
>>> I added some comments to existing code. Please tell me if it is
>>> insufficient.
>>>
>>>
>>>>>> I'm also unsure why existing_map->fd would ever be something
>>>>>> other than the core file. Why would another library map the same
>>>>>> target_vaddr.
>>>>>
>>>>> When mmap() is called to read-only ELF segments / sections, Linux
>>>>> kernel seems to allocate other memory segments which has same top
>>>>> virtual memory address. I've not yet found out from the code of
>>>>> Linux kernel, but I confirmed this behavior on GDB.
>>>> Ok. Same comment as above. This should have been explained with
>>>> comments in the code.
>>>
>>> Added some comments.
>>>
>>>
>>>> As for your fix, if I understand correctly the issue is that a
>>>> single segment in the library is being split into two segments in
>>>> the process (and therefore in the core file) due to an mprotect
>>>> being done on part of the segment. Because of this the segment size
>>>> in the library does match the segment size in the core file. So
>>>> with your fix the library segment is used, but what about the other
>>>> half of the segment that is in the core file? Don't we now have
>>>> overlapping segments; the full original segment from the library,
>>>> and then a second segment that overlaps the tail end of the library
>>>> segment? Will that cause any confusion later on?
>>>
>>> As long as vaddr is valid, it doesn't matter even if it overlaps
>>> because SA would sort the map with vaddr, and would lookup with it.
>>> In Substrate VM, there are RO and RW sections in that order, so it
>>> is ok with webrev.00 . However it might not be appropriate because
>>> RW section might be top of PT_LOAD.
>>>
>>> To make it more generalized, I changed it to the commit on submit repo.
>>> It would check access flags between in coredump and in binary. If
>>> they are different, we respect current (loaded from coredump) map
>>> because it might be changed at runtime.
>>>
>>> The change for LabsJDK 11 is more simple because JDK 11 does not
>>> have ps_core_common.c .
>>> So I share you it. It may help you:
>>>
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/JDK-8250826-labsjdk11-0.patch
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 7/30/20 1:18 PM, Chris Plummer wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> I'm reviewing this RFR, and I'd like to ask that it not be
>>>>>>> pushed as trivial. Although it is just a one line change, it
>>>>>>> takes an extensive knowledge to understand the impact. I'll read
>>>>>>> up on the filed graal issue and try to understand the ELF code a
>>>>>>> bit better.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 7/30/20 6:45 AM, Yasumasa Suenaga wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> Please review this trivial change:
>>>>>>>>
>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8250826
>>>>>>>> webrev:
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.00/
>>>>>>>>
>>>>>>>> I played Truffle NFI on GraalVM, but I cannot get Java stacks
>>>>>>>> from coredump via jhsdb.
>>>>>>>>
>>>>>>>> I've reported this issue to GraalVM community [1], and I 've
>>>>>>>> found out the cause of this issue is .svm_heap would be
>>>>>>>> separated to RO and RW areas by mprotect() calls in run time in
>>>>>>>> spite of .svm_heap is RO section in ELF (please see [1] for
>>>>>>>> details).
>>>>>>>>
>>>>>>>> It is corner case, but we will see same problem on jhsdb when
>>>>>>>> we attempt to analyze coredump which comes from some
>>>>>>>> applications / libraries which would separate RO sections in
>>>>>>>> ELF like Substrate VM.
>>>>>>>>
>>>>>>>> I sent PR to fix libsaproc.so in LabsJDK 11 for this issue [2],
>>>>>>>> then community members suggested me to discuss in
>>>>>>>> serviceability-dev.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> [1] https://github.com/oracle/graal/issues/2579
>>>>>>>> [2] https://github.com/graalvm/labs-openjdk-11/pull/9
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>
>>
More information about the serviceability-dev
mailing list