[PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686

Severin Gehwolf sgehwolf at redhat.com
Mon Jul 30 08:28:22 UTC 2018


Hi Chris,

On Thu, 2018-07-26 at 14:07 -0700, Chris Plummer wrote:
> I had looked at this review when it came out, but was hesitant to ok it 
> because I really don't know this code at all. If you can get another 
> reviewer who does know the code, then I'll approve it.

Sharath Ballal reviewed it, but he's not a Reviewer as per the OpenJDK
census. As to whether he knows the code, I don't know. He's on CC.

> This only impacts 32-bit, right? If so, keep in mind that it won't get tested by Oracle 
> testing, including the submit repo, so make sure you do thorough testing.

It only impacts 32-bit, yes. I understand that Oracle isn't testing 32-
bit x86 any more. The change itself should be fairly low risk since
it's changing only a 32-bit-x86-linux-only file and the native bits
don't seem to match what the Java code does[1]. REG_INDEX(reg) being
defined as:

#define REG_INDEX(reg) sun_jvm_hotspot_debugger_x86_X86ThreadContext_##reg

and being used as:

REG_INDEX(SP)

Thus, using

sun_jvm_hotspot_debugger_x86_X86ThreadContext_SP

The Java code uses:

sun.jvm.hotspot.debugger.x86.X86ThreadContext.ESP

> Also, why is there any code being executed that was not compiled with 
> -fno-omit-frame-pointer? The description in the CR just shows a simple 
> java program reproducing this, so all the mixed stack traces belong to 
> the JVM and libs, and I thought we made sure to compile all of them with 
> -fno-omit-frame-pointer.

The JVM uses glibc and that simple program is enough to see some
thread's stack currently being in a glibc function when getting a mixed
stack trace. We've originally seen this in JDK 8 with jstack -m and was
reported in [2]. That comment has more details. The problem here isn't
that it's a JDK lib which gets compiled without -fno-omit-frame-
pointer. It's glibc not being compiled with that option.

An example stack trace for a system where this happens looks like this:

Thread 7 (Thread 0xa3863b40 (LWP 834)):
#0  0xf771f430 in __kernel_vsyscall ()
#1  0xf7703acc in futex_abstimed_wait (cancel=true, private=<optimized out>, abstime=0x0, expected=1, futex=0xf770f000) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:43
#2  do_futex_wait (sem=0xf770f000, sem at entry=0xf70ea854 <sig_sem>, abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:226
#3  0xf7703bb7 in __new_sem_wait_slow (sem=0xf70ea854 <sig_sem>, abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:407
#4  0xf6cc18d4 in check_pending_signals (wait=true) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:2522
#5  0xf6cbc632 in signal_thread_entry (thread=0xa37a4800, __the_thread__=0xa37a4800) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/hotspot/src/share/vm/runtime/os.cpp:250

That is, frames 0-3 are JDK foreign. This bug will happen on all
systems which use any native library which isn't compiled with -fno-
omit-frame-pointer. Be it glibc or some other library.

Thanks,
Severin

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c9
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c4

> thanks,
> 
> Chris
> 
> On 7/26/18 10:11 AM, Severin Gehwolf wrote:
> > On Thu, 2018-07-26 at 10:04 -0700, Sharath Ballal wrote:
> > > Changes looks good Severin.
> > 
> > Thanks for the review, Sharath!
> > 
> > > I am not a reviewer though, so you still need a Reviewer to review.
> > 
> > Anyone?
> > 
> > Thanks,
> > Severin
> > 
> > > -----Original Message-----
> > > From: Severin Gehwolf [mailto:sgehwolf at redhat.com]
> > > Sent: Thursday, July 26, 2018 1:04 PM
> > > To: serviceability-dev
> > > Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686
> > > 
> > > On Mon, 2018-07-23 at 18:27 +0200, Severin Gehwolf wrote:
> > > > Hi,
> > > > 
> > > > Could I please get a review of this one-liner change related to jhsdb
> > > > --mixed when attaching to a running Java process? The issue arises
> > > > when threads are in native code and that native code has frame
> > > > pointers not properly preserved. In such a case the SA performs a
> > > > simple frame pointer valididy check: ebp >= esp
> > > > 
> > > > However, the code of retrieving the value for esp is incorrect in as
> > > > much as it's not in sync with native code in regards to the register
> > > > index:
> > > > 
> > > > native code => X86ThreadContext.SP
> > > > Java code   => X86ThreadContext.ESP
> > > > 
> > > > X86ThreadContext.ESP is never being set by the native code. Since
> > > > X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP) then
> > > > returns null, ebp.lessThan(esp) wrongly returns false causing the
> > > > issue. This webrev fixes it by using SP as index on the Java side.
> > > > Thoughts?
> > > > 
> > > > webrev:
> > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01/
> > > > bug: https://bugs.openjdk.java.net/browse/JDK-8208091
> > > 
> > > Anyone willing to review this one-liner?
> > > 
> > > Thanks,
> > > Severin
> > > 
> > > > Thanks,
> > > > Severin
> 
> 



More information about the serviceability-dev mailing list