[11u] RFR: 8258414: OldObjectSample events too expensive
Doerr, Martin
martin.doerr at sap.com
Thu Apr 22 09:19:13 UTC 2021
Hi,
it did not only break the debug build.
We have seen a crash on Windows:
# Internal Error (./src/hotspot/share/runtime/safepoint.cpp:894), pid=11412, tid=12932
# fatal error: Deadlock in safepoint code. Should have called back to the VM before blocking.
V [jvm.dll+0x247c64] report_fatal+0x64 (debug.cpp:268)
V [jvm.dll+0x65e30b] SafepointSynchronize::block+0x14b (safepoint.cpp:893)
V [jvm.dll+0x101d44] SafepointMechanism::block_if_requested+0x34 (safepointmechanism.inline.hpp:73)
V [jvm.dll+0x6f0d9a] JavaThread::check_safepoint_and_suspend_for_native_trans+0xca (thread.cpp:2508)
V [jvm.dll+0x5e3175] ObjectSampleCheckpoint::on_rotation+0xa5 (objectsamplecheckpoint.cpp:301)
V [jvm.dll+0x397694] JfrRecorderService::pre_safepoint_write+0x64 (jfrrecorderservice.cpp:428)
V [jvm.dll+0x3978a4] JfrRecorderService::rotate+0x194 (jfrrecorderservice.cpp:325)
V [jvm.dll+0x398324] recorderthread_entry+0xd4 (jfrrecorderthreadloop.cpp:76)
V [jvm.dll+0x6f7649] JavaThread::run+0x139 (thread.cpp:1840)
V [jvm.dll+0x6f06b4] Thread::call_run+0x84 (thread.cpp:391)
V [jvm.dll+0x5f8e36] thread_native_entry+0xd6 (os_windows.cpp:460)
Test:
jdk/jfr/startupargs/TestOldObjectQueueSize.java
Best regards,
Martin
> -----Original Message-----
> From: jdk-updates-dev <jdk-updates-dev-retn at openjdk.java.net> On
> Behalf Of Hohensee, Paul
> Sent: Mittwoch, 21. April 2021 19:42
> To: Jaroslav Bachorík <jaroslav.bachorik at datadoghq.com>; Florian David
> <florian.david at datadoghq.com>
> Cc: jdk-updates-dev <jdk-updates-dev at openjdk.java.net>; Marcus Hirt
> <marcus.hirt at datadoghq.com>
> Subject: RE: [11u] RFR: 8258414: OldObjectSample events too expensive
>
> Hi, Jaroslav.
>
> This push (https://hg.openjdk.java.net/jdk-updates/jdk11u-
> dev/rev/b103107d7ceb) broke the debug builds, vis
>
> /home/hohensee/workspaces/jdk11u-
> dev/src/hotspot/share/jfr/leakprofiler/checkpoint/objectSampleCheckpoint
> .cpp:298:14: error: ‘JfrJavaSupport’ has not been declared
> DEBUG_ONLY(JfrJavaSupport::check_java_thread_in_native(thread);)
>
> The fix is
>
> diff -r b103107d7ceb
> src/hotspot/share/jfr/leakprofiler/checkpoint/objectSampleCheckpoint.cpp
> ---
> a/src/hotspot/share/jfr/leakprofiler/checkpoint/objectSampleCheckpoint.cp
> p Wed Apr 21 12:24:28 2021 +0200
> +++
> b/src/hotspot/share/jfr/leakprofiler/checkpoint/objectSampleCheckpoint.c
> pp Wed Apr 21 17:40:23 2021 +0000
> @@ -24,6 +24,7 @@
>
> #include "precompiled.hpp"
> #include "jfr/jfrEvents.hpp"
> +#include "jfr/jni/jfrJavaSupport.hpp"
> #include "jfr/leakprofiler/chains/edgeStore.hpp"
> #include "jfr/leakprofiler/chains/objectSampleMarker.hpp"
> #include "jfr/leakprofiler/checkpoint/objectSampleCheckpoint.hpp"
>
> Thanks,
> Paul
>
> -----Original Message-----
> From: jdk-updates-dev <jdk-updates-dev-retn at openjdk.java.net> on
> behalf of Jaroslav Bachorík <jaroslav.bachorik at datadoghq.com>
> Date: Monday, April 12, 2021 at 1:40 AM
> To: Florian David <florian.david at datadoghq.com>
> Cc: jdk-updates-dev <jdk-updates-dev at openjdk.java.net>, Marcus Hirt
> <marcus.hirt at datadoghq.com>
> Subject: RE: [11u] RFR: 8258414: OldObjectSample events too expensive
>
> Great, thanks!
>
> Reviewed!
>
> -JB-
>
> On Fri, Apr 9, 2021 at 11:49 PM Florian David
> <florian.david at datadoghq.com> wrote:
> >
> > After receiving feedback from Markus stating that:
> > > There is no need to attempt a lock replacement, because in 11, there is
> no concurrent class unloading. There, unloading only happens at a safepoint.
> > > With concurrent class unloading, there is a need to protect this list, but in
> 11 it is mutually exclusive and cannot be modified concurrently with the JFR
> Recorder thread calling "install_stack_traces(sampler").
> >
> > I removed the module lock and added an is_at_safepoint() assert.
> >
> > Update webrev link:
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8258414
> > webrev: http://cr.openjdk.java.net/~fdavid/8258414/webrev.02/
> > Original PR: https://github.com/openjdk/jdk/pull/2780
> >
> > Florian
> >
> > On Sun, Apr 4, 2021 at 8:33 PM Florian David
> <florian.david at datadoghq.com> wrote:
> >>
> >> Hi Jaroslav,
> >>
> >> Thanks for the review.
> >>
> >>>
> >>> -
> src/hotspot/share/jfr/leakprofiler/checkpoint/objectSampleCheckpoint.cpp
> >>> L287 - IMO, the TODO is not necessary
> >>
> >> It's removed.
> >>
> >>>
> >>> L293 - I think the comment `// caller needs ResourceMark` should
> >>> not be removed
> >>
> >> I added it back.
> >>
> >>> L303 - Not sure if using Module_lock instead of
> >>> ClassLoaderDataGraph_lock is correct. Also, this change seems to be
> >>> bringing in changes unrelated to the patch.
> >>> From what is happening in other places it would seem
> >>> that a safepoint should be asserted instead (or nothing should be
> >>> done).
> >>> I will let Markus weigh in on this.
> >>
> >> No change for the moment.
> >>
> >>>
> >>> - src/hotspot/share/jfr/support/jfrAllocationTracer.cpp
> >>> L38 - can this be completely removed?
> >>
> >> It's removed.
> >>
> >>>
> >>> - src/hotspot/share/jfr/support/jfrAllocationTracer.hpp
> >>> L30 - I think `class JfrThreadLocal;` can also be removed
> >>>
> >> It's removed.
> >>
> >> Update webrev link:
> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8258414
> >> webrev: http://cr.openjdk.java.net/~fdavid/8258414/webrev.01/
> >> Original PR: https://github.com/openjdk/jdk/pull/2780
> >>
> >> Florian
> >>
> >> On Mon, Mar 29, 2021 at 12:22 PM Jaroslav Bachorík
> <jaroslav.bachorik at datadoghq.com> wrote:
> >>>
> >>> Hi Florian,
> >>>
> >>> a few nits:
> >>> -
> src/hotspot/share/jfr/leakprofiler/checkpoint/objectSampleCheckpoint.cpp
> >>> L287 - IMO, the TODO is not necessary
> >>> L293 - I think the comment `// caller needs ResourceMark` should
> >>> not be removed
> >>> L303 - Not sure if using Module_lock instead of
> >>> ClassLoaderDataGraph_lock is correct. Also, this change seems to be
> >>> bringing in changes unrelated to the patch.
> >>> From what is happening in other places it would seem
> >>> that a safepoint should be asserted instead (or nothing should be
> >>> done).
> >>> I will let Markus weigh in on this.
> >>> - src/hotspot/share/jfr/support/jfrAllocationTracer.cpp
> >>> L38 - can this be completely removed?
> >>> - src/hotspot/share/jfr/support/jfrAllocationTracer.hpp
> >>> L30 - I think `class JfrThreadLocal;` can also be removed
> >>>
> >>> Cheers,
> >>>
> >>> -JB-
> >>>
> >>>
> >>> On Wed, Mar 24, 2021 at 8:42 PM Florian David
> >>> <florian.david at datadoghq.com> wrote:
> >>> >
> >>> > Hi,
> >>> >
> >>> > Please review this 11u backport. It's very similar to the original patch,
> >>> > except for the class loader data graph lock and
> JfrKlassUnloading::sort()
> >>> > method which are not available in 11u.
> >>> >
> >>> > The missing lock has been replaced by locking the module_lock
> instead,
> >>> > as it is done in jfrcheckpointManager.cpp:363.
> >>> > JfrKlassUnloading class does not exist and thus sort() method is not
> replaced,
> >>> > I added a TODO instead.
> >>> >
> >>> > Bug: https://bugs.openjdk.java.net/browse/JDK-8258414
> >>> > webrev: http://cr.openjdk.java.net/~fdavid/8258414/webrev.00/
> >>> > Original PR: https://github.com/openjdk/jdk/pull/2780
> >>> >
> >>> > Testing:
> >>> > - Linux x86_64 tier1 tests
> >>> > - SPECvm 2008 on a 96 cores/192 Gib server. JFR recording size is 75.12
> MB before the patch and 3.08 MB after the patch.
> >>> >
> >>> > Let me know what you think.
> >>> >
> >>> > Thanks,
> >>> > Florian
More information about the jdk-updates-dev
mailing list