[PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686

Severin Gehwolf sgehwolf at redhat.com
Wed Aug 8 12:52:52 UTC 2018


On Tue, 2018-08-07 at 12:44 -0700, Chris Plummer wrote:
> Hi Severin,
> 
> The changes look fine. Thanks for adding the test case.

Thanks for the review, Chris!

Cheers,
Severin

> Chris
> 
> On 8/6/18 4:53 AM, Severin Gehwolf wrote:
> > Hi,
> > 
> > Latest webrev with JNI properly compiled with -fomit-frame-pointer and
> > -O3. There was a bug in the test where the exception wasn't properly
> > rethrown:
> > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.03/
> > 
> > I'm going to run this through jdk-submit too. Adding build-dev for
> > build changes.
> > 
> > Thanks,
> > Severin
> > 
> > On Fri, 2018-08-03 at 18:45 +0200, Severin Gehwolf wrote:
> > > Hi all,
> > > 
> > > On Mon, 2018-07-30 at 21:23 -0700, Sharath Ballal wrote:
> > > > > Ok. It looks like we don't even have a "jstack --mixed" test. Could
> > > > > you add one? It would be even better if the test included a JNI lib
> > > > > that wasn't compiled with -fno-omit-frame-pointer so you don't need
> > > > > to rely on glibc to reproduce this issue (or is glibc  pretty much
> > > > > always compiled without -fno-omit-frame-pointer)? Or if Sharath
> > > > > agrees, file a bug to have a test added.
> > > > 
> > > > That’s a good suggestion.  Severin you can either write a test or
> > > > open a bug for it.
> > > 
> > > Latest webrev with a test:
> > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.02/
> > > 
> > > The test fails prior the test on affected systems and passes after.
> > > There are still issues with getting the test's JNI properly compiled
> > > the way it's supposed to. I've asked for help on build-dev[1]. Example
> > > runs:
> > > 
> > > Before patch:
> > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/before_patch.txt
> > > 
> > > After patch:
> > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/after_patch.txt
> > > 
> > > Thanks,
> > > Severin
> > > 
> > > [1] http://mail.openjdk.java.net/pipermail/build-dev/2018-August/022819.html
> > > 
> > > > -----Original Message-----
> > > > From: Chris Plummer
> > > > Sent: Monday, July 30, 2018 10:03 PM
> > > > To: Severin Gehwolf; Sharath Ballal; serviceability-dev
> > > > Subject: Re: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws
> > > > UnmappedAddressException on i686
> > > > 
> > > > Hi Severin,
> > > > 
> > > > On 7/30/18 1:28 AM, Severin Gehwolf wrote:
> > > > > Hi Chris,
> > > > > 
> > > > > On Thu, 2018-07-26 at 14:07 -0700, Chris Plummer wrote:
> > > > > > I had looked at this review when it came out, but was hesitant to
> > > > > > ok
> > > > > > it because I really don't know this code at all. If you can get
> > > > > > another reviewer who does know the code, then I'll approve it.
> > > > > 
> > > > > Sharath Ballal reviewed it, but he's not a Reviewer as per the
> > > > > OpenJDK
> > > > > census. As to whether he knows the code, I don't know. He's on CC.
> > > > 
> > > > Yes, but I was asking for a second reviewer (not counting me).
> > > > > > This only impacts 32-bit, right? If so, keep in mind that it
> > > > > > won't
> > > > > > get tested by Oracle testing, including the submit repo, so make
> > > > > > sure you do thorough testing.
> > > > > 
> > > > > It only impacts 32-bit, yes. I understand that Oracle isn't
> > > > > testing
> > > > > 32- bit x86 any more. The change itself should be fairly low risk
> > > > > since it's changing only a 32-bit-x86-linux-only file and the
> > > > > native
> > > > > bits don't seem to match what the Java code does[1].
> > > > > REG_INDEX(reg)
> > > > > being defined as:
> > > > > 
> > > > > #define REG_INDEX(reg)
> > > > > sun_jvm_hotspot_debugger_x86_X86ThreadContext_##reg
> > > > > 
> > > > > and being used as:
> > > > > 
> > > > > REG_INDEX(SP)
> > > > > 
> > > > > Thus, using
> > > > > 
> > > > > sun_jvm_hotspot_debugger_x86_X86ThreadContext_SP
> > > > > 
> > > > > The Java code uses:
> > > > > 
> > > > > sun.jvm.hotspot.debugger.x86.X86ThreadContext.ESP
> > > > > 
> > > > > > Also, why is there any code being executed that was not compiled
> > > > > > with
> > > > > > -fno-omit-frame-pointer? The description in the CR just shows a
> > > > > > simple java program reproducing this, so all the mixed stack
> > > > > > traces
> > > > > > belong to the JVM and libs, and I thought we made sure to compile
> > > > > > all
> > > > > > of them with -fno-omit-frame-pointer.
> > > > > 
> > > > > The JVM uses glibc and that simple program is enough to see some
> > > > > thread's stack currently being in a glibc function when getting a
> > > > > mixed stack trace. We've originally seen this in JDK 8 with jstack
> > > > > -m
> > > > > and was reported in [2]. That comment has more details. The
> > > > > problem
> > > > > here isn't that it's a JDK lib which gets compiled without
> > > > > -fno-omit-frame- pointer. It's glibc not being compiled with that
> > > > > option.
> > > > > 
> > > > > An example stack trace for a system where this happens looks like
> > > > > this:
> > > > > 
> > > > > Thread 7 (Thread 0xa3863b40 (LWP 834)):
> > > > > #0  0xf771f430 in __kernel_vsyscall ()
> > > > > #1  0xf7703acc in futex_abstimed_wait (cancel=true,
> > > > > private=<optimized
> > > > > out>, abstime=0x0, expected=1, futex=0xf770f000) at
> > > > > ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:43
> > > > > #2  do_futex_wait (sem=0xf770f000, sem at entry=0xf70ea854 <sig_sem>,
> > > > > abstime=0x0) at
> > > > > ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:226
> > > > > #3  0xf7703bb7 in __new_sem_wait_slow (sem=0xf70ea854 <sig_sem>,
> > > > > abstime=0x0) at
> > > > > ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:407
> > > > > #4  0xf6cc18d4 in check_pending_signals (wait=true) at
> > > > > /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-
> > > > > 8.b10.el7_5.i386/openjdk/h
> > > > > otspot/src/os/linux/vm/os_linux.cpp:2522
> > > > > #5  0xf6cbc632 in signal_thread_entry (thread=0xa37a4800,
> > > > > __the_thread__=0xa37a4800) at
> > > > > /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-
> > > > > 8.b10.el7_5.i386/openjdk/h
> > > > > otspot/src/share/vm/runtime/os.cpp:250
> > > > > 
> > > > > That is, frames 0-3 are JDK foreign. This bug will happen on all
> > > > > systems which use any native library which isn't compiled with
> > > > > -fno-
> > > > > omit-frame-pointer. Be it glibc or some other library.
> > > > 
> > > > Ok. It looks like we don't even have a "jstack --mixed" test. Could
> > > > you add one? It would be even better if the test included a JNI lib
> > > > that wasn't compiled with -fno-omit-frame-pointer so you don't need
> > > > to rely on glibc to reproduce this issue (or is glibc pretty much
> > > > always compiled without -fno-omit-frame-pointer)? Or if Sharath
> > > > agrees, file a bug to have a test added.
> > > > 
> > > > thanks,
> > > > 
> > > > Chris
> > > > > Thanks,
> > > > > Severin
> > > > > 
> > > > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c9
> > > > > [2] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c4
> > > > > 
> > > > > > thanks,
> > > > > > 
> > > > > > Chris
> > > > > > 
> > > > > > On 7/26/18 10:11 AM, Severin Gehwolf wrote:
> > > > > > > On Thu, 2018-07-26 at 10:04 -0700, Sharath Ballal wrote:
> > > > > > > > Changes looks good Severin.
> > > > > > > 
> > > > > > > Thanks for the review, Sharath!
> > > > > > > 
> > > > > > > > I am not a reviewer though, so you still need a Reviewer to
> > > > > > > > review.
> > > > > > > 
> > > > > > > Anyone?
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > Severin
> > > > > > > 
> > > > > > > > -----Original Message-----
> > > > > > > > From: Severin Gehwolf [mailto:sgehwolf at redhat.com]
> > > > > > > > Sent: Thursday, July 26, 2018 1:04 PM
> > > > > > > > To: serviceability-dev
> > > > > > > > Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed
> > > > > > > > throws
> > > > > > > > UnmappedAddressException on i686
> > > > > > > > 
> > > > > > > > On Mon, 2018-07-23 at 18:27 +0200, Severin Gehwolf wrote:
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > Could I please get a review of this one-liner change
> > > > > > > > > related to
> > > > > > > > > jhsdb --mixed when attaching to a running Java process? The
> > > > > > > > > issue
> > > > > > > > > arises when threads are in native code and that native code
> > > > > > > > > has
> > > > > > > > > frame pointers not properly preserved. In such a case the
> > > > > > > > > SA
> > > > > > > > > performs a simple frame pointer valididy check: ebp >= esp
> > > > > > > > > 
> > > > > > > > > However, the code of retrieving the value for esp is
> > > > > > > > > incorrect in
> > > > > > > > > as much as it's not in sync with native code in regards to
> > > > > > > > > the
> > > > > > > > > register
> > > > > > > > > index:
> > > > > > > > > 
> > > > > > > > > native code => X86ThreadContext.SP
> > > > > > > > > Java code   => X86ThreadContext.ESP
> > > > > > > > > 
> > > > > > > > > X86ThreadContext.ESP is never being set by the native code.
> > > > > > > > > Since
> > > > > > > > > X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP)
> > > > > > > > > then
> > > > > > > > > returns null, ebp.lessThan(esp) wrongly returns false
> > > > > > > > > causing the
> > > > > > > > > issue. This webrev fixes it by using SP as index on the
> > > > > > > > > Java side.
> > > > > > > > > Thoughts?
> > > > > > > > > 
> > > > > > > > > webrev:
> > > > > > > > > 
> > > 
> > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01
> > > > > > > > > /
> > > > > > > > > bug: https://bugs.openjdk.java.net/browse/JDK-8208091
> > > > > > > > 
> > > > > > > > Anyone willing to review this one-liner?
> > > > > > > > 
> > > > > > > > Thanks,
> > > > > > > > Severin
> > > > > > > > 
> > > > > > > > > Thanks,
> > > > > > > > > Severin
> 
> 




More information about the build-dev mailing list