RFR: 8250826: jhsdb does not work with coredump which comes from Substrate VM
serguei.spitsyn at oracle.com
serguei.spitsyn at oracle.com
Tue Aug 4 16:27:26 UTC 2020
Hi Yasumasa,
It looks good.
I forgot to say there is no need in new webrev.
Thanks,
Serguei
On 8/4/20 05:22, Yasumasa Suenaga wrote:
> Hi Serguei,
>
> Thanks for your comment!
>
> On 2020/08/04 18:11, serguei.spitsyn at oracle.com wrote:
>> Hi Yasumasa,
>>
>> The fix looks good to me.
>> Thanks to Chris for discussing the details in previous emails.
>>
>> Just one suggestion:
>>
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.02/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c.frames.html
>>
>>
>> 44 #ifdef PF_R
>> 45 #define MAP_R_FLAG PF_R
>> 46 #else
>> 47 #define MAP_R_FLAG 0
>> 48 #endif
>>
>> Could you, please add a small comment before?
>> Something like this would be enough, I think:
>> // Define a segment permission flag allowing read.
>
> I added it to new webrev:
> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.03/
>
>
> Thanks,
>
> Yasumasa
>
>
>> Thanks,
>> Serguei
>>
>>
>> On 8/3/20 17:54, Yasumasa Suenaga wrote:
>>> Thanks Chris!
>>> I will push it when I got second reviewer.
>>>
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/08/04 9:54, Chris Plummer wrote:
>>>> Hi Yasumasa,
>>>>
>>>> Your changes look good now.
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 8/3/20 5:47 PM, Yasumasa Suenaga wrote:
>>>>> Hi Chris,
>>>>>
>>>>> Thank you for the comment!
>>>>> I updated webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.02/
>>>>> Diff from webrev.01:
>>>>> http://hg.openjdk.java.net/jdk/submit/rev/e98dc25b69c2
>>>>>
>>>>> On 2020/08/04 6:41, Chris Plummer wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> Your updated fix resulted in using the core file map whereas the
>>>>>> original fix used the library map. In both cases the assert is
>>>>>> avoided, which I think is the main goal. Does it matter which map
>>>>>> is used?
>>>>>
>>>>> In GraalVM, read only segment is conflicted, thus it does not
>>>>> matter which map is used.
>>>>> However this webrev is more generalize, so segments in coredump
>>>>> should be used.
>>>>>
>>>>>> 42 #ifndef PF_R
>>>>>> 43 #define PF_R 0x4
>>>>>> 44 #endif
>>>>>>
>>>>>> 156 if ((map = allocate_init_map(ph->core->classes_jsa_fd,
>>>>>> 157 offset, vaddr, memsz, PF_R))
>>>>>> == NULL) {
>>>>>>
>>>>>> I'm not so sure this is appropriate for OSX. It uses mach-o
>>>>>> files, not elf files. The segment_command flags field comes from
>>>>>> loader.h [1]. I don't see anything in there that looks like the
>>>>>> equivalent of ELF access flags.
>>>>>>
>>>>>> /* Constants for the flags field of the segment_command */
>>>>>> #define SG_HIGHVM 0x1 /* the file contents for this
>>>>>> segment is for
>>>>>> the high part of the VM space, the low part
>>>>>> is zero filled (for stacks in core files) */
>>>>>> #define SG_FVMLIB 0x2 /* this segment is the VM that is
>>>>>> allocated by
>>>>>> a fixed VM library, for overlap checking in
>>>>>> the link editor */
>>>>>> #define SG_NORELOC 0x4 /* this segment has nothing that
>>>>>> was relocated
>>>>>> in it and nothing relocated to it, that is
>>>>>> it maybe safely replaced without relocation*/
>>>>>> #define SG_PROTECTED_VERSION_1 0x8 /* This segment is
>>>>>> protected. If the
>>>>>> segment starts at file offset 0, the
>>>>>> first page of the segment is not
>>>>>> protected. All other pages of the
>>>>>> segment are protected. */
>>>>>>
>>>>>> Since the flags don't matter for OSX, maybe you should just pass
>>>>>> 0. You can do something like:
>>>>>>
>>>>>> #ifndef PF_R
>>>>>> #define MAP_R_FLAG PF_R
>>>>>> #else
>>>>>> #define MAP_R_FLAG 0
>>>>>> #endif
>>>>>
>>>>> Thanks!
>>>>> I thought PF_R can be used PF_R from elf.h on macOS:
>>>>> https://opensource.apple.com/source/dtrace/dtrace-90/sys/elf.h
>>>>>
>>>>> I merged your code in this webrev.
>>>>>
>>>>>> Some minor comment fixes are needed:
>>>>>>
>>>>>> 397 // Access flags fot this memory region is different
>>>>>> between the library
>>>>>>
>>>>>> "fot" -> "for"
>>>>>> "is" -> "are"
>>>>>>
>>>>>> 399 // We should respect to coredump.
>>>>>>
>>>>>> "to" -> "the"
>>>>>>
>>>>>> 404 // And head of ELF header might be included in
>>>>>> coredump (See JDK-7133122).
>>>>>> 405 // Thus we need to replace PT_LOAD segments the
>>>>>> library version.
>>>>>>
>>>>>> How about:
>>>>>>
>>>>>> 404 // Also the first page of the ELF header might be
>>>>>> included in the coredump (See JDK-7133122).
>>>>>> 405 // Thus we need to replace the PT_LOAD segment with
>>>>>> the library version.
>>>>>
>>>>> Fixed them.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> [1]
>>>>>> https://opensource.apple.com/source/xnu/xnu-1456.1.26/EXTERNAL_HEADERS/mach-o/loader.h.auto.html
>>>>>>
>>>>>> On 8/2/20 12:18 AM, Yasumasa Suenaga wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> (Remove "trivial" from subject)
>>>>>>>
>>>>>>> Thanks for the information! I fixed errors in new webrev. It
>>>>>>> passed tests on submit repo
>>>>>>> (mach5-one-ysuenaga-JDK-8250826-1-20200802-0151-13109525)
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.01/
>>>>>>>
>>>>>>>
>>>>>>> I tried to use elf.h instead of #define for PF_R, however it
>>>>>>> failed (mach5-one-ysuenaga-JDK-8250826-1-20200802-0542-13111335).
>>>>>>>
>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/67baee1a1a1d
>>>>>>>
>>>>>>> Thus I added #define for it in this webrev.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2020/08/02 10:22, Chris Plummer wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> [2020-08-01T14:15:42,514Z] Creating
>>>>>>>> support/native/jdk.hotspot.agent/libsaproc/static/libsaproc.a
>>>>>>>> from 8 file(s)
>>>>>>>> [2020-08-01T14:15:43,961Z]
>>>>>>>> ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:128:8:
>>>>>>>> error: no member named 'flags' in 'struct map_info'
>>>>>>>> [2020-08-01T14:15:43,961Z] map->flags = flags;
>>>>>>>> [2020-08-01T14:15:43,961Z] ~~~ ^
>>>>>>>> [2020-08-01T14:15:43,963Z]
>>>>>>>> ./open/src/jdk.hotspot.agent/share/native/libsaproc/ps_core_common.c:153:54:
>>>>>>>> error: use of undeclared identifier 'PF_R'
>>>>>>>> [2020-08-01T14:15:43,963Z] offset, vaddr, memsz, PF_R)) == NULL) {
>>>>>>>>
>>>>>>>> I'll look at the code changes later. No time at the moment.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> 2020-08-01-1405571.suenaga.source2020-08-01-1405571.suenaga.source
>>>>>>>> 2020-08-01-1405571.suenaga.source On 8/1/20 5:20 PM, Yasumasa
>>>>>>>> Suenaga wrote:
>>>>>>>>> Hi Chris,
>>>>>>>>>
>>>>>>>>> Thanks for your comment!
>>>>>>>>> I pushed new change to submit repo, but the build failed on
>>>>>>>>> macOS. Could you share details?
>>>>>>>>> (I do not have Mac)
>>>>>>>>>
>>>>>>>>> commit: http://hg.openjdk.java.net/jdk/submit/rev/0eb1c497f297
>>>>>>>>> job: mach5-one-ysuenaga-JDK-8250826-1-20200801-1407-13098989
>>>>>>>>>
>>>>>>>>> On 2020/08/01 13:06, Chris Plummer wrote:
>>>>>>>>>> On 7/30/20 6:18 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>
>>>>>>>>>>> On 2020/07/31 7:29, Chris Plummer wrote:
>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>
>>>>>>>>>>>> If I understand correctly we first call add_map_info() for
>>>>>>>>>>>> all the PT_LOAD segments in the core file. We then process
>>>>>>>>>>>> all the library segments, calling add_map_info() for them
>>>>>>>>>>>> if the target_vaddr has not already been addded. If has
>>>>>>>>>>>> already been added, which I assume is the case for any
>>>>>>>>>>>> library segment that is already in the core file, then the
>>>>>>>>>>>> core file version is replaced the the library version. I'm
>>>>>>>>>>>> a little unclear of the purpose of this replacing of the
>>>>>>>>>>>> core PT_LOAD segments with those found in the libraries. If
>>>>>>>>>>>> you could explain this that would help me understand your
>>>>>>>>>>>> change.
>>>>>>>>>>>
>>>>>>>>>>> Read only segments in ELF should not be any different from
>>>>>>>>>>> PT_LOAD segments in the core.
>>>>>>>>>>> And head of ELF header might be included in coredump (See
>>>>>>>>>>> JDK-7133122). Thus we need to replace PT_LOAD segments the
>>>>>>>>>>> library version.
>>>>>>>>>> Ok. The code in the area really should have been commented
>>>>>>>>>> better when first written. The purpose is not understandable
>>>>>>>>>> simply by reading the code.
>>>>>>>>>
>>>>>>>>> I added some comments to existing code. Please tell me if it
>>>>>>>>> is insufficient.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>> I'm also unsure why existing_map->fd would ever be
>>>>>>>>>>>> something other than the core file. Why would another
>>>>>>>>>>>> library map the same target_vaddr.
>>>>>>>>>>>
>>>>>>>>>>> When mmap() is called to read-only ELF segments / sections,
>>>>>>>>>>> Linux kernel seems to allocate other memory segments which
>>>>>>>>>>> has same top virtual memory address. I've not yet found out
>>>>>>>>>>> from the code of Linux kernel, but I confirmed this behavior
>>>>>>>>>>> on GDB.
>>>>>>>>>> Ok. Same comment as above. This should have been explained
>>>>>>>>>> with comments in the code.
>>>>>>>>>
>>>>>>>>> Added some comments.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> As for your fix, if I understand correctly the issue is that
>>>>>>>>>> a single segment in the library is being split into two
>>>>>>>>>> segments in the process (and therefore in the core file) due
>>>>>>>>>> to an mprotect being done on part of the segment. Because of
>>>>>>>>>> this the segment size in the library does match the segment
>>>>>>>>>> size in the core file. So with your fix the library segment
>>>>>>>>>> is used, but what about the other half of the segment that is
>>>>>>>>>> in the core file? Don't we now have overlapping segments; the
>>>>>>>>>> full original segment from the library, and then a second
>>>>>>>>>> segment that overlaps the tail end of the library segment?
>>>>>>>>>> Will that cause any confusion later on?
>>>>>>>>>
>>>>>>>>> As long as vaddr is valid, it doesn't matter even if it
>>>>>>>>> overlaps because SA would sort the map with vaddr, and would
>>>>>>>>> lookup with it.
>>>>>>>>> In Substrate VM, there are RO and RW sections in that order,
>>>>>>>>> so it is ok with webrev.00 . However it might not be
>>>>>>>>> appropriate because RW section might be top of PT_LOAD.
>>>>>>>>>
>>>>>>>>> To make it more generalized, I changed it to the commit on
>>>>>>>>> submit repo.
>>>>>>>>> It would check access flags between in coredump and in binary.
>>>>>>>>> If they are different, we respect current (loaded from
>>>>>>>>> coredump) map because it might be changed at runtime.
>>>>>>>>>
>>>>>>>>> The change for LabsJDK 11 is more simple because JDK 11 does
>>>>>>>>> not have ps_core_common.c .
>>>>>>>>> So I share you it. It may help you:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/JDK-8250826-labsjdk11-0.patch
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Chris
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/30/20 1:18 PM, Chris Plummer wrote:
>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm reviewing this RFR, and I'd like to ask that it not be
>>>>>>>>>>>>> pushed as trivial. Although it is just a one line change,
>>>>>>>>>>>>> it takes an extensive knowledge to understand the impact.
>>>>>>>>>>>>> I'll read up on the filed graal issue and try to
>>>>>>>>>>>>> understand the ELF code a bit better.
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 7/30/20 6:45 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please review this trivial change:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8250826
>>>>>>>>>>>>>> webrev:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250826/webrev.00/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I played Truffle NFI on GraalVM, but I cannot get Java
>>>>>>>>>>>>>> stacks from coredump via jhsdb.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've reported this issue to GraalVM community [1], and I
>>>>>>>>>>>>>> 've found out the cause of this issue is .svm_heap would
>>>>>>>>>>>>>> be separated to RO and RW areas by mprotect() calls in
>>>>>>>>>>>>>> run time in spite of .svm_heap is RO section in ELF
>>>>>>>>>>>>>> (please see [1] for details).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It is corner case, but we will see same problem on jhsdb
>>>>>>>>>>>>>> when we attempt to analyze coredump which comes from some
>>>>>>>>>>>>>> applications / libraries which would separate RO sections
>>>>>>>>>>>>>> in ELF like Substrate VM.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I sent PR to fix libsaproc.so in LabsJDK 11 for this
>>>>>>>>>>>>>> issue [2], then community members suggested me to discuss
>>>>>>>>>>>>>> in serviceability-dev.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1] https://github.com/oracle/graal/issues/2579
>>>>>>>>>>>>>> [2] https://github.com/graalvm/labs-openjdk-11/pull/9
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>
More information about the serviceability-dev
mailing list