RFR: 8178287: AsyncGetCallTrace fails to traverse valid Java stacks [v3]

Andrei Pangin apangin at openjdk.java.net
Sun Jul 11 22:24:58 UTC 2021


On Fri, 18 Jun 2021 08:56:32 GMT, Ludovic Henry <luhenry at openjdk.org> wrote:

>> When the signal sent for AsyncGetCallTrace or JFR would land on a runtime stub (like arraycopy), a vtable stub, or the prolog of a compiled method,  it wouldn't be able to detect the sender (caller) frame for multiple reasons. This patch fixes these cases through adding CodeBlob-specific frame parser which are in the best position to know how a frame is setup.
>> 
>> The following examples have been profiled with honest-profiler which uses `AsyncGetCallTrace`.
>> 
>> # `Prof1`
>> 
>> public class Prof1 {
>> 
>>     public static void main(String[] args) {
>>         StringBuilder sb = new StringBuilder();
>>         for (int i = 0; i < 1000000; i++) {
>>             sb.append("ab");
>>             sb.delete(0, 1);
>>         }
>>         System.out.println(sb.length());
>>     }
>> }
>> 
>> 
>> - Baseline:
>> 
>> Flat Profile (by method):
>>         (t 99.4,s 99.4) AGCT::Unknown Java[ERR=-5]
>>         (t  0.5,s  0.2) Prof1::main
>>         (t  0.2,s  0.2) java.lang.AbstractStringBuilder::append
>>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::ensureCapacityInternal
>>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::shift
>>         (t  0.0,s  0.0) java.lang.String::getBytes
>>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::putStringAt
>>         (t  0.0,s  0.0) java.lang.StringBuilder::delete
>>         (t  0.2,s  0.0) java.lang.StringBuilder::append
>>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::delete
>>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::putStringAt
>> 
>> - With `StubRoutinesBlob::FrameParser`:
>> 
>> Flat Profile (by method):
>>         (t 98.7,s 98.7) java.lang.AbstractStringBuilder::ensureCapacityInternal
>>         (t  0.9,s  0.9) java.lang.AbstractStringBuilder::delete
>>         (t 99.8,s  0.2) Prof1::main
>>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>>         (t  0.0,s  0.0) AGCT::Unknown Java[ERR=-5]
>>         (t 98.8,s  0.0) java.lang.AbstractStringBuilder::append
>>         (t 98.8,s  0.0) java.lang.StringBuilder::append
>>         (t  0.9,s  0.0) java.lang.StringBuilder::delete
>> 
>> 
>> # `Prof2`
>> 
>> import java.util.function.Supplier;
>> 
>> public class Prof2 {
>> 
>>     public static void main(String[] args) {
>>         var rand = new java.util.Random(0);
>>         Supplier[] suppliers = {
>>                 () -> 0,
>>                 () -> 1,
>>                 () -> 2,
>>                 () -> 3,
>>         };
>> 
>>         long sum = 0;
>>         for (int i = 0; i >= 0; i++) {
>>             sum += (int)suppliers[i % suppliers.length].get();
>>         }
>>     }
>> }
>> 
>> 
>> - Baseline:
>> 
>> Flat Profile (by method):
>>         (t 60.7,s 60.7) AGCT::Unknown Java[ERR=-5]
>>         (t 39.2,s 35.2) Prof2::main
>>         (t  1.4,s  1.4) Prof2::lambda$main$3
>>         (t  1.0,s  1.0) Prof2::lambda$main$2
>>         (t  0.9,s  0.9) Prof2::lambda$main$1
>>         (t  0.7,s  0.7) Prof2::lambda$main$0
>>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>>         (t  0.0,s  0.0) java.lang.Thread::exit
>>         (t  0.9,s  0.0) Prof2$$Lambda$2.0x0000000800c00c28::get
>>         (t  1.0,s  0.0) Prof2$$Lambda$3.0x0000000800c01000::get
>>         (t  1.4,s  0.0) Prof2$$Lambda$4.0x0000000800c01220::get
>>         (t  0.7,s  0.0) Prof2$$Lambda$1.0x0000000800c00a08::get
>> 
>> 
>> - With `VtableBlob::FrameParser` and `nmethod::FrameParser`:
>> 
>> Flat Profile (by method):
>>         (t 74.1,s 70.3) Prof2::main
>>         (t  6.5,s  5.5) Prof2$$Lambda$29.0x0000000800081220::get
>>         (t  6.6,s  5.4) Prof2$$Lambda$28.0x0000000800081000::get
>>         (t  5.7,s  5.0) Prof2$$Lambda$26.0x0000000800080a08::get
>>         (t  5.9,s  5.0) Prof2$$Lambda$27.0x0000000800080c28::get
>>         (t  4.9,s  4.9) AGCT::Unknown Java[ERR=-5]
>>         (t  1.2,s  1.2) Prof2::lambda$main$2
>>         (t  0.9,s  0.9) Prof2::lambda$main$3
>>         (t  0.9,s  0.9) Prof2::lambda$main$1
>>         (t  0.7,s  0.7) Prof2::lambda$main$0
>>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>
> Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix comments

Hi Ludovic,

Thank you for working on this long-standing bug.
I like the idea of the proposed solution, but unfortunately it cannot be applied as is. Since the stack walking code runs inside a signal handler, it is very limited in things it can do. In particular, it must not allocate, acquire locks, etc. In your implementation, FrameParser does allocate though.

The issue is not just theoretical: when I ran JDK with this patch with async-profiler, I immediately got the following deadlock:


(gdb) bt
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007fa2363ca025 in __GI___pthread_mutex_lock (mutex=0x7fa235da5440 <tc_mutex>)
    at ../nptl/pthread_mutex_lock.c:80
#2  0x00007fa235696cb6 in ThreadCritical::ThreadCritical() () from /usr/java/jdk-18/lib/server/libjvm.so
#3  0x00007fa234b6fe53 in Chunk::next_chop() () from /usr/java/jdk-18/lib/server/libjvm.so
#4  0x00007fa234e88523 in frame::safe_for_sender(JavaThread*) () from /usr/java/jdk-18/lib/server/libjvm.so
#5  0x00007fa234e838f2 in vframeStreamForte::forte_next() () from /usr/java/jdk-18/lib/server/libjvm.so
#6  0x00007fa2349fbb9b in forte_fill_call_trace_given_top(JavaThread*, ASGCT_CallTrace*, int, frame) [clone .isra.20]
    () from /usr/java/jdk-18/lib/server/libjvm.so
#7  0x00007fa234e8426e in AsyncGetCallTrace () from /usr/java/jdk-18/lib/server/libjvm.so
#8  0x00007fa228519312 in Profiler::getJavaTraceAsync(void*, ASGCT_CallFrame*, int) ()
   from /mnt/c/Users/Andrei/java/async-profiler/build/libasyncProfiler.so
#9  0x00007fa228519c72 in Profiler::recordSample(void*, unsigned long long, int, Event*) ()
   from /mnt/c/Users/Andrei/java/async-profiler/build/libasyncProfiler.so
#10 0x00007fa2285164f8 in WallClock::signalHandler(int, siginfo_t*, void*) ()
   from /mnt/c/Users/Andrei/java/async-profiler/build/libasyncProfiler.so
#11 <signal handler called>
#12 __pthread_mutex_unlock_usercnt (decr=1, mutex=0x7fa235da5440 <tc_mutex>) at pthread_mutex_unlock.c:41
#13 __GI___pthread_mutex_unlock (mutex=0x7fa235da5440 <tc_mutex>) at pthread_mutex_unlock.c:356
#14 0x00007fa235696d3b in ThreadCritical::~ThreadCritical() () from /usr/java/jdk-18/lib/server/libjvm.so
#15 0x00007fa234b6fe71 in Chunk::next_chop() () from /usr/java/jdk-18/lib/server/libjvm.so
#16 0x00007fa234d1ca62 in ClassFileParser::parse_method(ClassFileStream const*, bool, ConstantPool const*, AccessFlags*, JavaThread*) () from /usr/java/jdk-18/lib/server/libjvm.so
#17 0x00007fa234d1e338 in ClassFileParser::parse_methods(ClassFileStream const*, bool, AccessFlags*, bool*, bool*, JavaThread*) () from /usr/java/jdk-18/lib/server/libjvm.so
#18 0x00007fa234d22459 in ClassFileParser::parse_stream(ClassFileStream const*, JavaThread*) ()
   from /usr/java/jdk-18/lib/server/libjvm.so
#19 0x00007fa234d2291c in ClassFileParser::ClassFileParser(ClassFileStream*, Symbol*, ClassLoaderData*, ClassLoadInfo const*, ClassFileParser::Publicity, JavaThread*) () from /usr/java/jdk-18/lib/server/libjvm.so
#20 0x00007fa2351febb6 in KlassFactory::create_from_stream(ClassFileStream*, Symbol*, ClassLoaderData*, ClassLoadInfo const&, JavaThread*) ()
   from /usr/java/jdk-18/lib/server/libjvm.so
#21 0x00007fa235645b40 in SystemDictionary::resolve_class_from_stream(ClassFileStream*, Symbol*, Handle, ClassLoadInfo const&, JavaThread*) ()
   from /usr/java/jdk-18/lib/server/libjvm.so
#22 0x00007fa2350bad0a in jvm_define_class_common(char const*, _jobject*, signed char const*, int, _jobject*, char const*, JavaThread*) [clone .constprop.299] ()
   from /usr/java/jdk-18/lib/server/libjvm.so
#23 0x00007fa2350bae6d in JVM_DefineClassWithSource () from /usr/java/jdk-18/lib/server/libjvm.so
#24 0x00007fa236a0ee12 in Java_java_lang_ClassLoader_defineClass1 () from /usr/java/jdk-18/lib/libjava.so

-------------

PR: https://git.openjdk.java.net/jdk/pull/4436


More information about the serviceability-dev mailing list