[PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686

Sharath Ballal sharath.ballal at oracle.com
Tue Jul 31 04:23:46 UTC 2018


> Ok. It looks like we don't even have a "jstack --mixed" test. Could you add one? It would be even better if the test included a JNI lib that wasn't compiled with -fno-omit-frame-pointer so you don't need to rely on glibc to reproduce this issue (or is glibc  pretty much always compiled without -fno-omit-frame-pointer)? Or if Sharath agrees, file a bug to have a test added.

That’s a good suggestion.  Severin you can either write a test or open a bug for it.

Thanks,
Sharath


-----Original Message-----
From: Chris Plummer 
Sent: Monday, July 30, 2018 10:03 PM
To: Severin Gehwolf; Sharath Ballal; serviceability-dev
Subject: Re: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686

Hi Severin,

On 7/30/18 1:28 AM, Severin Gehwolf wrote:
> Hi Chris,
>
> On Thu, 2018-07-26 at 14:07 -0700, Chris Plummer wrote:
>> I had looked at this review when it came out, but was hesitant to ok 
>> it because I really don't know this code at all. If you can get 
>> another reviewer who does know the code, then I'll approve it.
> Sharath Ballal reviewed it, but he's not a Reviewer as per the OpenJDK 
> census. As to whether he knows the code, I don't know. He's on CC.
Yes, but I was asking for a second reviewer (not counting me).
>
>> This only impacts 32-bit, right? If so, keep in mind that it won't 
>> get tested by Oracle testing, including the submit repo, so make sure you do thorough testing.
> It only impacts 32-bit, yes. I understand that Oracle isn't testing 
> 32- bit x86 any more. The change itself should be fairly low risk 
> since it's changing only a 32-bit-x86-linux-only file and the native 
> bits don't seem to match what the Java code does[1]. REG_INDEX(reg) 
> being defined as:
>
> #define REG_INDEX(reg) 
> sun_jvm_hotspot_debugger_x86_X86ThreadContext_##reg
>
> and being used as:
>
> REG_INDEX(SP)
>
> Thus, using
>
> sun_jvm_hotspot_debugger_x86_X86ThreadContext_SP
>
> The Java code uses:
>
> sun.jvm.hotspot.debugger.x86.X86ThreadContext.ESP
>
>> Also, why is there any code being executed that was not compiled with 
>> -fno-omit-frame-pointer? The description in the CR just shows a 
>> simple java program reproducing this, so all the mixed stack traces 
>> belong to the JVM and libs, and I thought we made sure to compile all 
>> of them with -fno-omit-frame-pointer.
> The JVM uses glibc and that simple program is enough to see some 
> thread's stack currently being in a glibc function when getting a 
> mixed stack trace. We've originally seen this in JDK 8 with jstack -m 
> and was reported in [2]. That comment has more details. The problem 
> here isn't that it's a JDK lib which gets compiled without 
> -fno-omit-frame- pointer. It's glibc not being compiled with that option.
>
> An example stack trace for a system where this happens looks like this:
>
> Thread 7 (Thread 0xa3863b40 (LWP 834)):
> #0  0xf771f430 in __kernel_vsyscall ()
> #1  0xf7703acc in futex_abstimed_wait (cancel=true, private=<optimized 
> out>, abstime=0x0, expected=1, futex=0xf770f000) at 
> ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:43
> #2  do_futex_wait (sem=0xf770f000, sem at entry=0xf70ea854 <sig_sem>, 
> abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:226
> #3  0xf7703bb7 in __new_sem_wait_slow (sem=0xf70ea854 <sig_sem>, 
> abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:407
> #4  0xf6cc18d4 in check_pending_signals (wait=true) at 
> /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/h
> otspot/src/os/linux/vm/os_linux.cpp:2522
> #5  0xf6cbc632 in signal_thread_entry (thread=0xa37a4800, 
> __the_thread__=0xa37a4800) at 
> /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/h
> otspot/src/share/vm/runtime/os.cpp:250
>
> That is, frames 0-3 are JDK foreign. This bug will happen on all 
> systems which use any native library which isn't compiled with -fno- 
> omit-frame-pointer. Be it glibc or some other library.
Ok. It looks like we don't even have a "jstack --mixed" test. Could you add one? It would be even better if the test included a JNI lib that wasn't compiled with -fno-omit-frame-pointer so you don't need to rely on glibc to reproduce this issue (or is glibc pretty much always compiled without -fno-omit-frame-pointer)? Or if Sharath agrees, file a bug to have a test added.

thanks,

Chris
>
> Thanks,
> Severin
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c9
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c4
>
>> thanks,
>>
>> Chris
>>
>> On 7/26/18 10:11 AM, Severin Gehwolf wrote:
>>> On Thu, 2018-07-26 at 10:04 -0700, Sharath Ballal wrote:
>>>> Changes looks good Severin.
>>> Thanks for the review, Sharath!
>>>
>>>> I am not a reviewer though, so you still need a Reviewer to review.
>>> Anyone?
>>>
>>> Thanks,
>>> Severin
>>>
>>>> -----Original Message-----
>>>> From: Severin Gehwolf [mailto:sgehwolf at redhat.com]
>>>> Sent: Thursday, July 26, 2018 1:04 PM
>>>> To: serviceability-dev
>>>> Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws 
>>>> UnmappedAddressException on i686
>>>>
>>>> On Mon, 2018-07-23 at 18:27 +0200, Severin Gehwolf wrote:
>>>>> Hi,
>>>>>
>>>>> Could I please get a review of this one-liner change related to 
>>>>> jhsdb --mixed when attaching to a running Java process? The issue 
>>>>> arises when threads are in native code and that native code has 
>>>>> frame pointers not properly preserved. In such a case the SA 
>>>>> performs a simple frame pointer valididy check: ebp >= esp
>>>>>
>>>>> However, the code of retrieving the value for esp is incorrect in 
>>>>> as much as it's not in sync with native code in regards to the 
>>>>> register
>>>>> index:
>>>>>
>>>>> native code => X86ThreadContext.SP
>>>>> Java code   => X86ThreadContext.ESP
>>>>>
>>>>> X86ThreadContext.ESP is never being set by the native code. Since
>>>>> X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP) then 
>>>>> returns null, ebp.lessThan(esp) wrongly returns false causing the 
>>>>> issue. This webrev fixes it by using SP as index on the Java side.
>>>>> Thoughts?
>>>>>
>>>>> webrev:
>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01
>>>>> /
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8208091
>>>> Anyone willing to review this one-liner?
>>>>
>>>> Thanks,
>>>> Severin
>>>>
>>>>> Thanks,
>>>>> Severin
>>



More information about the serviceability-dev mailing list