[Preliminary Review]: Proposal for back-porting JFR to OpenJDK8u

Andrey Petushkov andrey at azul.com
Wed Mar 27 15:03:53 UTC 2019


Dear Guangyu,

Thank you very much for the fix. I've ran a few tests, including all those failed earlier, on linux x86 and don't see problem anymore. So I believe we should consider problem fixed. We'll run more tests in next few days, will let you know if we find anything

Regards,
Andrey

> On 27 Mar 2019, at 06:12, guangyu.zhu <guangyu.zhu at aliyun.com> wrote:
> 
> Hi Andrey,
> 
> We have fixed the crash in thread sampling and finished the g1 heap summary porting. See webrev:
> http://cr.openjdk.java.net/~luchsh/jfr_cr/
> 
> The crash fix is very small, one line is missing when porting from jdk11.
> --- old/src/share/vm/classfile/javaClasses.cpp  2019-03-25 19:20:33.985861477 +0800
> +++ new/src/share/vm/classfile/javaClasses.cpp  2019-03-25 19:20:33.824866124 +0800
> @@ -1047,7 +1047,10 @@
> 
>  // Read thread status value from threadStatus field in java.lang.Thread java class.
>  java_lang_Thread::ThreadStatus java_lang_Thread::get_thread_status(oop java_thread) {
> -  assert(Thread::current()->is_Watcher_thread() || Thread::current()->is_VM_thread() ||
> +  assert(Threads_lock->owned_by_self() || Thread::current()->is_Watcher_thread() ||
> +         Thread::current()->is_VM_thread() ||
>           JavaThread::current()->thread_state() == _thread_in_vm,
>           "Java Thread is not running in vm");
> 
> Thanks,
> Guangyu
> ------------------------------------------------------------------
> Sender:guangyu.zhu <guangyu.zhu at aliyun.com>
> Sent At:2019 Mar. 21 (Thu.) 16:30
> Recipient:Andrey Petushkov <andrey at azul.com>
> Cc:jdk8u-dev <jdk8u-dev at openjdk.java.net>; Ekaterina Vergizova <katya at azul.com>; denghui.ddh <denghui.ddh at antfin.com>
> Subject:Re: [Preliminary Review]: Proposal for back-porting JFR to OpenJDK8u
> 
> Hi Andrey,
> 
> Thank you for pointing out. I ran the jtreg tests on the release build, but didn't run on the debug version, so the bug was hidden. With Alibaba's codebase, I can't reproduce this bug. Maybe I missed some code when doing the merge. Let me check.
> 
> Thanks,
> Guangyu
> ------------------------------------------------------------------
> Sender:Andrey Petushkov <andrey at azul.com>
> Sent At:2019 Mar. 20 (Wed.) 01:18
> Recipient:guangyu.zhu <guangyu.zhu at aliyun.com>
> Cc:Mario Torre <neugens.limasoftware at gmail.com>; jdk8u-dev <jdk8u-dev at openjdk.java.net>; denghui.ddh <denghui.ddh at antfin.com>; Ekaterina Vergizova <katya at azul.com>
> Subject:Re: [Preliminary Review]: Proposal for back-porting JFR to OpenJDK8u
> 
> Hi Guangyu,
> 
> We have found that your thread sampling implementation causes VM crashes. E.g. test jdk/jfr/cmd/TestPrintDefault could be used to
> reproduce it (we've found 29 JFR tests in total which exhibit this problem).
> On debug build this manifests as:
> 
> #  Internal Error (/home/andrey/ws/zulu8-emb-dev/hotspot/src/share/vm/runtime/thread.hpp:1789), pid=25987, tid=0x00007fffc0be1700
> #  assert(thread != NULL && thread->is_Java_thread()) failed: just checking
> 
> with thread list like this:
> 
> Java Threads: ( => current thread )
>   0x00007fff8c001800 JavaThread "JFR Recording Scheduler" daemon [_thread_blocked, id=26020, stack(0x00007fffc09e0000,0x00007fffc0ae1000)]
>   0x00007fff942f0000 JavaThread "JFR Periodic Tasks" daemon [_thread_blocked, id=26018, stack(0x00007fffc0be2000,0x00007fffc0ce3000)]
>   0x00007fff9420f800 JavaThread "JFR Recorder Thread" daemon [_thread_blocked, id=26017, stack(0x00007fffc0f4b000,0x00007fffc104c000)]
>   0x00007ffff0205800 JavaThread "MainThread" [_thread_in_Java, id=26013, stack(0x00007fffc229a000,0x00007fffc239b000)]
>   0x00007ffff0190800 JavaThread "Service Thread" daemon [_thread_blocked, id=26011, stack(0x00007fffc255b000,0x00007fffc265c000)]
>   0x00007ffff017c000 JavaThread "C1 CompilerThread2" daemon [_thread_in_native, id=26010, stack(0x00007fffc265c000,0x00007fffc275d000)]
>   0x00007ffff017a000 JavaThread "C2 CompilerThread1" daemon [_thread_in_native, id=26009, stack(0x00007fffc275d000,0x00007fffc285e000)]
>   0x00007ffff0176800 JavaThread "C2 CompilerThread0" daemon [_thread_in_vm, id=26008, stack(0x00007fffc285e000,0x00007fffc295f000)]
>   0x00007ffff0173000 JavaThread "Signal Dispatcher" daemon [_thread_blocked, id=26007, stack(0x00007fffc295f000,0x00007fffc2a60000)]
>   0x00007ffff0122800 JavaThread "Finalizer" daemon [_thread_blocked, id=26006, stack(0x00007fffc2d38000,0x00007fffc2e39000)]
>   0x00007ffff011d800 JavaThread "Reference Handler" daemon [_thread_blocked, id=26005, stack(0x00007fffc2e39000,0x00007fffc2f3a000)]
>   0x00007ffff000f800 JavaThread "main" [_thread_blocked, id=25997, stack(0x00007ffff7eda000,0x00007ffff7fdb000)]
> 
> Other Threads:
>   0x00007ffff010e000 VMThread [stack: 0x00007fffc2f3a000,0x00007fffc303b000] [id=26004]
>   0x00007ffff018e000 WatcherThread [stack: 0x00007fffc245a000,0x00007fffc255b000] [id=26012]
> 
> =>0x00007fff942e5000 (exited) Thread [stack: 0x00007fffc0ae1000,0x00007fffc0be2000] [id=26021]
> 
> Looks like if thread exits during JFR thread sampling the latter still tries to traverse it. Which is, well, not something to be done on that stage
> Would you be able to check the issue, please?
> 
> Thank you,
> Andrey
> 
> > On 18 Mar 2019, at 04:18, guangyu.zhu <guangyu.zhu at aliyun.com> wrote:
> >
> > ok, we will update the patch based on your comments.
> >
> > Thanks,
> > Guangyu
> > ------------------------------------------------------------------
> > Sender:Andrey Petushkov <andrey at azul.com>
> > Sent At:2019 Mar. 15 (Fri.) 21:44
> > Recipient:guangyu.zhu <guangyu.zhu at aliyun.com>
> > Cc:Mario Torre <neugens.limasoftware at gmail.com>; jdk8u-dev <jdk8u-dev at openjdk.java.net>; denghui.ddh <denghui.ddh at antfin.com>
> > Subject:Re: [Preliminary Review]: Proposal for back-porting JFR to OpenJDK8u
> >
> > Hi Guangyu,
> >
> > The G1 heap region type reporting looks good to me as well. Thanks a lot for doing it!
> >
> > Regards,
> > Andrey
> >
> > > On 14 Mar 2019, at 14:44, Andrey Petushkov <andrey at azul.com> wrote:
> > >
> > > Hi Guangyu,
> > >
> > > By the moment I've read first two patches (thread sampling and biased locks). These look good to me
> > > One very small note, I'd rather be verbose with the value of _trace_flag in thread.hpp. Just to prevent occasional use of the same value for another item of this enum in the (almost impossible) case that someone adds it
> > >
> > > I need a bit more time to read through G1 heap region types one.
> > > BTW, it looks I've forgotten to mention that Azul is also missing G1 heap occupancy percent data which is indeed supported by Alibaba. Would you be able to integrate it also, please
> > >
> > > Thanks a lot,
> > > Andrey
> > >
> > >> On 14 Mar 2019, at 05:39, guangyu.zhu <guangyu.zhu at aliyun.com> wrote:
> > >>
> > >> Ok. I look forward to the feedbacks from both of you.
> > >>
> > >> Thanks,
> > >> Guangyu
> > >> ------------------------------------------------------------------
> > >> Sender:Mario Torre <neugens.limasoftware at gmail.com>
> > >> Sent At:2019 Mar. 13 (Wed.) 19:26
> > >> Recipient:Andrey Petushkov <andrey at azul.com>
> > >> Cc:guangyu.zhu <guangyu.zhu at aliyun.com>; jdk8u-dev <jdk8u-dev at openjdk.java.net>; denghui.ddh <denghui.ddh at antfin.com>
> > >> Subject:Re: [Preliminary Review]: Proposal for back-porting JFR to OpenJDK8u
> > >>
> > >> Thanks!
> > >>
> > >> I’m currently traveling but I’ll offer my review as soon as I get back next week.
> > >>
> > >> Cheers,
> > >> Mario
> > >> On Tue 12. Mar 2019 at 09:35, Andrey Petushkov <andrey at azul.com> wrote:
> > >> Hi Guangyu,
> > >>
> > >> Cool! Thank you so much! Will review changes ASAP
> > >> The backporting is still in progress. There are quite a lot of changes to consider so we'll likely finish some time after
> > >> April update release (mostly limited by testing resources capacity)
> > >>
> > >> Regards,
> > >> Andrey
> > >>
> > >>> On 12 Mar 2019, at 16:29, guangyu.zhu <guangyu.zhu at aliyun.com> wrote:
> > >>>
> > >>> Hi Andrey, Mario
> > >>>
> > >>> Is there any progress in backporting? We have completed the patch for the missing features. Please review.
> > >>>
> > >>> - thread sampling:
> > >>> http://cr.openjdk.java.net/~luchsh/thread_sampling/
> > >>>
> > >>> - biased locking events:
> > >>> http://cr.openjdk.java.net/~luchsh/hs_biasedlock/
> > >>> http://cr.openjdk.java.net/~luchsh/jdk_biasedlock/
> > >>>
> > >>> - G1 heap region (heap summary is still missing, Alibaba's patch does not support heap summary either):
> > >>> http://cr.openjdk.java.net/~luchsh/g1region_type_change_event
> > >>>
> > >>> Thanks,
> > >>> Guangyu
> > >>>
> > >>>
> > >>> ------------------------------------------------------------------
> > >>> Sender:Andrey Petushkov <andrey at azul.com>
> > >>> Sent At:2019 Mar. 6 (Wed.) 00:34
> > >>> Recipient:Mario Torre <neugens at redhat.com>
> > >>> Cc:guangyu.zhu <guangyu.zhu at aliyun.com>; jdk8u-dev <jdk8u-dev at openjdk.java.net>; denghui.ddh <denghui.ddh at antfin.com>
> > >>> Subject:Re: [Preliminary Review]: Proposal for back-porting JFR to OpenJDK8u
> > >>>
> > >>> Hi Mario,
> > >>>
> > >>>> On 4 Mar 2019, at 14:19, Mario Torre <neugens at redhat.com> wrote:
> > >>>>
> > >>>> On Fri, Mar 1, 2019 at 8:09 PM Andrey Petushkov <andrey at azul.com> wrote:
> > >>>>
> > >>>>>> I'm going through the patch right now, but yes, from what I see the
> > >>>>>> trace is removed. I had too a concern about this and was about to send
> > >>>>>> a note. I'm not quite sure what to do, because Trace has been removed
> > >>>>>> in 11 as far as I know, but removing mid stream in 8 may be a more
> > >>>>>> interesting issue, however this isn't a user facing API, it was always
> > >>>>>> meant to be internal to the JVM, so I don't quite know if there's
> > >>>>>> really a reason we shouldn't change it. This is one question for the
> > >>>>>> CSR group I think.
> > >>>>> Since trace was removed by the same commit as JFR was added to jdk11 my guess is that trace
> > >>>>> was used internally at Oracle to integrate closed implementation of JFR. With this sense I see no point
> > >>>>> to keep it. However if the guess is wrong and there some alternative implementation of trace event consumer
> > >>>>> I will be happy to return it back
> > >>>>
> > >>>> Yes, I tend to agree with you, I do believe this is mostly an internal
> > >>>> API for easy of patching with the JFR code (which is almost
> > >>>> identical). The only concern is in the way the logging would be
> > >>>> triggered externally and the compile time options for it (I still see
> > >>>> a couple of instance where INCLUDE_TRACE is being used). As for
> > >>>> triggering the logs, I don't recall that 8 has any means of doing
> > >>>> this, I think some infrastructure came with 9 with the -Xlog option (I
> > >>>> didn't follow this however, I'm not sure the option ever landed in 9)?
> > >>>> In that case I guess it's safe to go after all.
> > >>> Right, the new logging infrastructure is badly missing here. Both Alibaba and Azul have added means of
> > >>> some JFR logging but far from what jdk11 could do. Let me check the rest of INCLUDE_TRACE places,
> > >>> IMHO we should get rid of all of them, but cannot tell for sure now
> > >>>
> > >>> Andrey
> > >>>>
> > >>>> Cheers,
> > >>>> Mario
> > >>>> --
> > >>>> Mario Torre
> > >>>> Associate Manager, Software Engineering
> > >>>> Red Hat GmbH <https://www.redhat.com>
> > >>>> 9704 A60C B4BE A8B8 0F30  9205 5D7E 4952 3F65 7898
> > >>>
> > >>
> > >>
> > >
> >
> >
> 
> 
> 



More information about the jdk8u-dev mailing list