[PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686

Chris Plummer chris.plummer at oracle.com
Tue Aug 7 19:44:06 UTC 2018


Hi Severin,

The changes look fine. Thanks for adding the test case.

Chris

On 8/6/18 4:53 AM, Severin Gehwolf wrote:
> Hi,
>
> Latest webrev with JNI properly compiled with -fomit-frame-pointer and
> -O3. There was a bug in the test where the exception wasn't properly
> rethrown:
> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.03/
>
> I'm going to run this through jdk-submit too. Adding build-dev for
> build changes.
>
> Thanks,
> Severin
>
> On Fri, 2018-08-03 at 18:45 +0200, Severin Gehwolf wrote:
>> Hi all,
>>
>> On Mon, 2018-07-30 at 21:23 -0700, Sharath Ballal wrote:
>>>> Ok. It looks like we don't even have a "jstack --mixed" test. Could
>>>> you add one? It would be even better if the test included a JNI lib
>>>> that wasn't compiled with -fno-omit-frame-pointer so you don't need
>>>> to rely on glibc to reproduce this issue (or is glibc  pretty much
>>>> always compiled without -fno-omit-frame-pointer)? Or if Sharath
>>>> agrees, file a bug to have a test added.
>>> That’s a good suggestion.  Severin you can either write a test or
>>> open a bug for it.
>> Latest webrev with a test:
>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.02/
>>
>> The test fails prior the test on affected systems and passes after.
>> There are still issues with getting the test's JNI properly compiled
>> the way it's supposed to. I've asked for help on build-dev[1]. Example
>> runs:
>>
>> Before patch:
>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/before_patch.txt
>>
>> After patch:
>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/after_patch.txt
>>
>> Thanks,
>> Severin
>>
>> [1] http://mail.openjdk.java.net/pipermail/build-dev/2018-August/022819.html
>>
>>> -----Original Message-----
>>> From: Chris Plummer
>>> Sent: Monday, July 30, 2018 10:03 PM
>>> To: Severin Gehwolf; Sharath Ballal; serviceability-dev
>>> Subject: Re: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws
>>> UnmappedAddressException on i686
>>>
>>> Hi Severin,
>>>
>>> On 7/30/18 1:28 AM, Severin Gehwolf wrote:
>>>> Hi Chris,
>>>>
>>>> On Thu, 2018-07-26 at 14:07 -0700, Chris Plummer wrote:
>>>>> I had looked at this review when it came out, but was hesitant to
>>>>> ok
>>>>> it because I really don't know this code at all. If you can get
>>>>> another reviewer who does know the code, then I'll approve it.
>>>> Sharath Ballal reviewed it, but he's not a Reviewer as per the
>>>> OpenJDK
>>>> census. As to whether he knows the code, I don't know. He's on CC.
>>> Yes, but I was asking for a second reviewer (not counting me).
>>>>> This only impacts 32-bit, right? If so, keep in mind that it
>>>>> won't
>>>>> get tested by Oracle testing, including the submit repo, so make
>>>>> sure you do thorough testing.
>>>> It only impacts 32-bit, yes. I understand that Oracle isn't
>>>> testing
>>>> 32- bit x86 any more. The change itself should be fairly low risk
>>>> since it's changing only a 32-bit-x86-linux-only file and the
>>>> native
>>>> bits don't seem to match what the Java code does[1].
>>>> REG_INDEX(reg)
>>>> being defined as:
>>>>
>>>> #define REG_INDEX(reg)
>>>> sun_jvm_hotspot_debugger_x86_X86ThreadContext_##reg
>>>>
>>>> and being used as:
>>>>
>>>> REG_INDEX(SP)
>>>>
>>>> Thus, using
>>>>
>>>> sun_jvm_hotspot_debugger_x86_X86ThreadContext_SP
>>>>
>>>> The Java code uses:
>>>>
>>>> sun.jvm.hotspot.debugger.x86.X86ThreadContext.ESP
>>>>
>>>>> Also, why is there any code being executed that was not compiled
>>>>> with
>>>>> -fno-omit-frame-pointer? The description in the CR just shows a
>>>>> simple java program reproducing this, so all the mixed stack
>>>>> traces
>>>>> belong to the JVM and libs, and I thought we made sure to compile
>>>>> all
>>>>> of them with -fno-omit-frame-pointer.
>>>> The JVM uses glibc and that simple program is enough to see some
>>>> thread's stack currently being in a glibc function when getting a
>>>> mixed stack trace. We've originally seen this in JDK 8 with jstack
>>>> -m
>>>> and was reported in [2]. That comment has more details. The
>>>> problem
>>>> here isn't that it's a JDK lib which gets compiled without
>>>> -fno-omit-frame- pointer. It's glibc not being compiled with that
>>>> option.
>>>>
>>>> An example stack trace for a system where this happens looks like
>>>> this:
>>>>
>>>> Thread 7 (Thread 0xa3863b40 (LWP 834)):
>>>> #0  0xf771f430 in __kernel_vsyscall ()
>>>> #1  0xf7703acc in futex_abstimed_wait (cancel=true,
>>>> private=<optimized
>>>> out>, abstime=0x0, expected=1, futex=0xf770f000) at
>>>> ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:43
>>>> #2  do_futex_wait (sem=0xf770f000, sem at entry=0xf70ea854 <sig_sem>,
>>>> abstime=0x0) at
>>>> ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:226
>>>> #3  0xf7703bb7 in __new_sem_wait_slow (sem=0xf70ea854 <sig_sem>,
>>>> abstime=0x0) at
>>>> ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:407
>>>> #4  0xf6cc18d4 in check_pending_signals (wait=true) at
>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-
>>>> 8.b10.el7_5.i386/openjdk/h
>>>> otspot/src/os/linux/vm/os_linux.cpp:2522
>>>> #5  0xf6cbc632 in signal_thread_entry (thread=0xa37a4800,
>>>> __the_thread__=0xa37a4800) at
>>>> /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-
>>>> 8.b10.el7_5.i386/openjdk/h
>>>> otspot/src/share/vm/runtime/os.cpp:250
>>>>
>>>> That is, frames 0-3 are JDK foreign. This bug will happen on all
>>>> systems which use any native library which isn't compiled with
>>>> -fno-
>>>> omit-frame-pointer. Be it glibc or some other library.
>>> Ok. It looks like we don't even have a "jstack --mixed" test. Could
>>> you add one? It would be even better if the test included a JNI lib
>>> that wasn't compiled with -fno-omit-frame-pointer so you don't need
>>> to rely on glibc to reproduce this issue (or is glibc pretty much
>>> always compiled without -fno-omit-frame-pointer)? Or if Sharath
>>> agrees, file a bug to have a test added.
>>>
>>> thanks,
>>>
>>> Chris
>>>> Thanks,
>>>> Severin
>>>>
>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c9
>>>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c4
>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> On 7/26/18 10:11 AM, Severin Gehwolf wrote:
>>>>>> On Thu, 2018-07-26 at 10:04 -0700, Sharath Ballal wrote:
>>>>>>> Changes looks good Severin.
>>>>>> Thanks for the review, Sharath!
>>>>>>
>>>>>>> I am not a reviewer though, so you still need a Reviewer to
>>>>>>> review.
>>>>>> Anyone?
>>>>>>
>>>>>> Thanks,
>>>>>> Severin
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Severin Gehwolf [mailto:sgehwolf at redhat.com]
>>>>>>> Sent: Thursday, July 26, 2018 1:04 PM
>>>>>>> To: serviceability-dev
>>>>>>> Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed
>>>>>>> throws
>>>>>>> UnmappedAddressException on i686
>>>>>>>
>>>>>>> On Mon, 2018-07-23 at 18:27 +0200, Severin Gehwolf wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Could I please get a review of this one-liner change
>>>>>>>> related to
>>>>>>>> jhsdb --mixed when attaching to a running Java process? The
>>>>>>>> issue
>>>>>>>> arises when threads are in native code and that native code
>>>>>>>> has
>>>>>>>> frame pointers not properly preserved. In such a case the
>>>>>>>> SA
>>>>>>>> performs a simple frame pointer valididy check: ebp >= esp
>>>>>>>>
>>>>>>>> However, the code of retrieving the value for esp is
>>>>>>>> incorrect in
>>>>>>>> as much as it's not in sync with native code in regards to
>>>>>>>> the
>>>>>>>> register
>>>>>>>> index:
>>>>>>>>
>>>>>>>> native code => X86ThreadContext.SP
>>>>>>>> Java code   => X86ThreadContext.ESP
>>>>>>>>
>>>>>>>> X86ThreadContext.ESP is never being set by the native code.
>>>>>>>> Since
>>>>>>>> X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP)
>>>>>>>> then
>>>>>>>> returns null, ebp.lessThan(esp) wrongly returns false
>>>>>>>> causing the
>>>>>>>> issue. This webrev fixes it by using SP as index on the
>>>>>>>> Java side.
>>>>>>>> Thoughts?
>>>>>>>>
>>>>>>>> webrev:
>>>>>>>>
>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01
>>>>>>>> /
>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8208091
>>>>>>> Anyone willing to review this one-liner?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Severin
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Severin
>>>
>>





More information about the build-dev mailing list