Crash when dumping cds map file
Igor Veresov
igor.veresov at oracle.com
Sat Jul 6 06:02:27 UTC 2024
Yes, with the changes I’m working on there will be no symbolic refs, at least for now. We can re-add them later if necessary.
For now if a quick fix is necessary we can zero these fields in MethodTrainingData::cleanup() at the same place we currently do “_holder = nullptr”.
igor
> On Jun 28, 2024, at 4:19 PM, Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote:
>
> Probably, it's time to reconsider current behavior.
>
> Igor is refactoring TrainingData to get rid of symbolic lookups. In such case, there's no reason to keep stale training data around and it's better to simply prune it.
>
> Igor, what's your take on it?
>
> Best regards,
> Vladimir Ivanov
>
> On 6/28/24 15:54, Ashutosh Mehra wrote:
>> > What's the exact reason of the crash? Is it due to dereferencing invalid
>> > metadata pointer or simply encountering a nullptr?
>> It's nullptr. The klass pointer of the method is null.
>> > *TrainingData::cleanup() was intended to clear stale metadata pointers,
>> > but keep the training data around linked in symbolic form (holder ==
>> > null).
>> Okay, so if we do want to keep the training data but just prune the stale pointers,
>> then just setting the _holder to null is not enough, because the method belonging
>> to an excluded class can be reached through MethodTrainingData::_final_profile->method() or
>> MethodTrainingData::_final_counters->method().
>> So probably MethodTrainingData::cleanup() should be clearing the _method field in
>> MethodCounters and MethodData as well, and link them back in MethodTrainingData::refresh_from(),
>> just like it is done for MethodTrainingData.
>> Does that make sense?
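The proposal above can be sketched with a toy model. This is only an illustration under stated assumptions: the structs below are simplified stand-ins, not the real HotSpot classes, the `refresh_from(Method*)` signature is assumed, and `is_pruned()` is a hypothetical helper added for the example.

```cpp
#include <cassert>

// Simplified stand-ins for the HotSpot types named above. These are NOT the
// real definitions; only the pointers under discussion are modeled.
struct Method         { const char* _name; };
struct MethodData     { Method* _method; };  // profile, points back at its method
struct MethodCounters { Method* _method; };  // counters, points back at its method

struct MethodTrainingData {
  Method*         _holder;
  MethodData*     _final_profile;
  MethodCounters* _final_counters;

  // Prune every path that can still reach the (possibly excluded) Method,
  // not just _holder.
  void cleanup() {
    _holder = nullptr;
    if (_final_profile  != nullptr) _final_profile->_method  = nullptr;
    if (_final_counters != nullptr) _final_counters->_method = nullptr;
  }

  // Assumed signature: re-link all three pointers once the method is
  // available again.
  void refresh_from(Method* m) {
    _holder = m;
    if (_final_profile  != nullptr) _final_profile->_method  = m;
    if (_final_counters != nullptr) _final_counters->_method = m;
  }

  // Hypothetical helper: true if no path from this record reaches a Method.
  bool is_pruned() const {
    return _holder == nullptr
        && (_final_profile  == nullptr || _final_profile->_method  == nullptr)
        && (_final_counters == nullptr || _final_counters->_method == nullptr);
  }
};
```

In this model, a record that has gone through `cleanup()` no longer exposes the method through `_final_profile` or `_final_counters`, which is the path the map logger followed in the crash.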
>> Thanks,
>> - Ashutosh Mehra
>> On Fri, Jun 28, 2024 at 5:33 PM Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote:
>> What's the exact reason of the crash? Is it due to dereferencing invalid
>> metadata pointer or simply encountering a nullptr?
>> *TrainingData::cleanup() was intended to clear stale metadata pointers,
>> but keep the training data around linked in symbolic form (holder ==
>> null).
>> Best regards,
>> Vladimir Ivanov
>> On 6/27/24 20:14, Ashutosh Mehra wrote:
>> > I encountered a crash when dumping the cds map with 1-step workflow.
>> > The crash happens in the forked JVM during the assembly phase of the
>> > training run.
>> > To recreate the crash, execute the training run with
>> > -Xlog:cds+map=trace:file=cds.map:none:filesize=0 option.
>> >
>> > #
>> > # A fatal error has been detected by the Java Runtime Environment:
>> > #
>> > # SIGSEGV (0xb) at pc=0x00007f4e8a209cb6, pid=152509, tid=152510
>> > #
>> > # JRE version: OpenJDK Runtime Environment (23.0) (slowdebug build 23-internal-adhoc.asmehra.leyden)
>> > # Java VM: OpenJDK 64-Bit Server VM (slowdebug 23-internal-adhoc.asmehra.leyden, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
>> > # Problematic frame:
>> > # V [libjvm.so+0x409cb6] Klass::is_instance_klass() const+0x10
>> > #
>> > # Core dump will be written. Default location: /home/asmehra/data/ashu-mehra/leyden/test/hotspot/jtreg/premain/quarkus-getting-started/core.152509
>> > #
>> > # An error report file with more information is saved as:
>> > # /home/asmehra/data/ashu-mehra/leyden/test/hotspot/jtreg/premain/quarkus-getting-started/hs_err_pid152509.log
>> > #
>> > # If you would like to submit a bug report, please visit:
>> > # https://bugreport.java.com/bugreport/crash.jsp
>> > #
>> > [75.250s][error ][cds] Child process finished; status = 134
>> >
>> > Backtrace for the crashing thread:
>> >
>> > #11 0x00007f4e8a209cb6 in Klass::is_instance_klass (this=0x0) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/oops/klass.hpp:683
>> > #12 0x00007f4e8afa8894 in Klass::external_name (this=0x0) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/oops/klass.cpp:905
>> > #13 0x00007f4e8b126447 in Method::print_external_name (os=0x7f4e89dfd130, klass=0x0, method_name=0x8011e8588, signature=0x8011ab858) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/oops/method.cpp:228
>> > #14 0x00007f4e8b1263b6 in Method::external_name (klass=0x0, method_name=0x8011e8588, signature=0x8011ab858) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/oops/method.cpp:222
>> > #15 0x00007f4e8b1262e1 in Method::external_name (this=0x800fd1920) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/oops/method.cpp:213
>> > #16 0x00007f4e8a492b0d in ArchiveBuilder::CDSMapLogger::log_method (m=0x800fd1920, runtime_dest=0x801039cd8 "", type_name=0x7f4e8b7d40fc "Method", bytes=128, current=0x7f4e8401d900) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/cds/archiveBuilder.cpp:1194
>> > #17 0x00007f4e8a492d66 in ArchiveBuilder::CDSMapLogger::log_metaspace_objects (region=0x7f4e89dfe740, src_objs=0x7f4e89dfe860) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/cds/archiveBuilder.cpp:1228
>> > #18 0x00007f4e8a492a2b in ArchiveBuilder::CDSMapLogger::log_metaspace_region (name=0x7f4e8b7d8af0 "rw region", region=0x7f4e89dfe740, src_objs=0x7f4e89dfe860) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/cds/archiveBuilder.cpp:1182
>> > #19 0x00007f4e8a4940f4 in ArchiveBuilder::CDSMapLogger::log (builder=0x7f4e89dfe630, mapinfo=0x7f4e85017bb0, heap_info=0x7f4e89dfd4f0, bitmap=0x7f4e857bf850 "\t\222\004I\222$\t\210\210\210\001\b\200", bitmap_size_in_bytes=655824) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/cds/archiveBuilder.cpp:1502
>> > #20 0x00007f4e8a48f2a5 in ArchiveBuilder::write_archive (this=0x7f4e89dfe630, mapinfo=0x7f4e85017bb0, heap_info=0x7f4e89dfd4f0) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/cds/archiveBuilder.cpp:1560
>> > #21 0x00007f4e8b11d249 in MetaspaceShared::write_static_archive (builder=0x7f4e89dfe630, mapinfo=0x7f4e85017bb0, heap_info=0x7f4e89dfd4f0) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/cds/metaspaceShared.cpp:988
>> > #22 0x00007f4e8b11d1ac in MetaspaceShared::preload_and_dump_impl (builder=..., __the_thread__=0x7f4e8401d900) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/cds/metaspaceShared.cpp:976
>> > #23 0x00007f4e8b11c5fd in MetaspaceShared::preload_and_dump () at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/cds/metaspaceShared.cpp:767
>> > #24 0x00007f4e8b53bca2 in Threads::create_vm (args=0x7f4e89dfedd0, canTryAgain=0x7f4e89dfecd3) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/runtime/threads.cpp:900
>> > #25 0x00007f4e8ada2821 in JNI_CreateJavaVM_inner (vm=0x7f4e89dfee20, penv=0x7f4e89dfee28, args=0x7f4e89dfedd0) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/prims/jni.cpp:3581
>> > #26 0x00007f4e8ada2c81 in JNI_CreateJavaVM (vm=0x7f4e89dfee20, penv=0x7f4e89dfee28, args=0x7f4e89dfedd0) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/prims/jni.cpp:3672
>> > #27 0x00007f4e8ce0f84f in InitializeJVM (pvm=0x7f4e89dfee20, penv=0x7f4e89dfee28, ifn=0x7f4e89dfee70) at /home/asmehra/data/ashu-mehra/leyden/src/java.base/share/native/libjli/java.c:1550
>> >
>> > Checking the CDS map generated for the cds preimage shows some
>> > methods whose InstanceKlass is null.
>> > This results in the crash seen above when such methods are printed as
>> > part of the CDS map file during the assembly phase.
>> >
>> > These methods are of the form:
>> >
>> > java.lang.Object java.lang.invoke.LambdaForm$MH/0x800000090.invoke(java.lang.Object, java.lang.Object)
>> >
>> > Interestingly -Xlog:cds=info shows such classes are skipped when
>> > generating the preimage as they are hidden classes:
>> >
>> > Skipping java/lang/invoke/LambdaForm$MH+0x800000090: Hidden class
>> >
>> > In the CDS map file for the preimage I also noticed that such methods
>> > are only referenced through MethodTrainingData -> _final_profile -> _method.
>> > So it looks like although we excluded such classes from the CDS archive,
>> > we don't exclude their training data.
>> > There is code for cleaning up the training data [0], but it doesn't
>> > remove the training data for classes that have been excluded, unless I
>> > misunderstood the code.
>> > Not sure if it is intentional or a bug.
>> > If we do need to keep the training data for such methods, then we would
>> > need to handle the case of null InstanceKlass in the CDSMapLogger to
>> > avoid crashing.
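A null-tolerant logging path of the kind mentioned above might look roughly like the sketch below. The types are heavily simplified models rather than the real HotSpot classes, and `describe_method_for_map` and the placeholder strings are hypothetical names chosen for illustration; the actual CDSMapLogger code is not shown in this thread.

```cpp
#include <cassert>
#include <string>

// Hypothetical, heavily simplified models; not the real HotSpot classes.
struct Klass {
  std::string _name;
  std::string external_name() const { return _name; }
};

struct Method {
  Klass*      _holder;     // null when the holder class was excluded
  std::string _name;
  std::string _signature;
};

// Hypothetical helper: build the text a map logger would print for a method,
// without ever calling external_name() on a null holder.
std::string describe_method_for_map(const Method* m) {
  if (m == nullptr) {
    return "<null method>";
  }
  std::string holder = (m->_holder != nullptr)
                           ? m->_holder->external_name()
                           : std::string("<excluded holder>");
  return holder + "." + m->_name + m->_signature;
}
```

The idea is only that the null check happens before the holder is dereferenced, which is the step missing in frames #11 to #15 of the backtrace, where `Klass::external_name` is entered with `this=0x0`.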
>> >
>> > [0] https://github.com/openjdk/leyden/blob/8716f47ef49c829e2384474577ff468a732b9c66/src/hotspot/share/oops/trainingData.cpp#L573
>> >
>> > Thanks,
>> > - Ashutosh Mehra