Crash when dumping cds map file

Igor Veresov igor.veresov at oracle.com
Sat Jul 6 06:02:27 UTC 2024


Yes, with the changes I’m working on there will be no symbolic refs, at least for now. We can re-add them later if necessary.
For now, if a quick fix is necessary, we can zero the _method fields of the final profile and final counters in MethodTrainingData::cleanup(), at the same place where we currently do “_holder = nullptr”.
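
A minimal sketch of that quick fix, using simplified stand-in types rather than the real HotSpot classes. The field names _holder, _final_profile, _final_counters and _method come from this thread; the struct layouts, the "excluded" flag and the refresh_from() signature are assumptions made only for illustration.

```cpp
#include <cassert>

// Simplified stand-ins for the HotSpot types; layouts are hypothetical.
struct Klass  { bool excluded; };            // models a class skipped from the archive
struct Method { Klass* klass; };
struct MethodData     { Method* _method; };  // profile back-pointer
struct MethodCounters { Method* _method; };  // counters back-pointer

struct MethodTrainingData {
  Method*         _holder         = nullptr;
  MethodData*     _final_profile  = nullptr;
  MethodCounters* _final_counters = nullptr;

  // Drop every pointer through which a method of an excluded class
  // could still be reached, not only _holder.
  void cleanup() {
    if (_holder == nullptr) return;
    Klass* k = _holder->klass;
    if (k == nullptr || !k->excluded) return;  // holder stays in the archive
    if (_final_profile  != nullptr) _final_profile->_method  = nullptr;
    if (_final_counters != nullptr) _final_counters->_method = nullptr;
    _holder = nullptr;  // the step that already exists today
  }

  // Hypothetical counterpart: link the back-pointers again once a live
  // Method is known (assumed signature, not the real one).
  void refresh_from(Method* m) {
    assert(m != nullptr);
    _holder = m;
    if (_final_profile  != nullptr) _final_profile->_method  = m;
    if (_final_counters != nullptr) _final_counters->_method = m;
  }
};
```

With this shape, a map dump that walks _final_profile or _final_counters after cleanup() sees a null _method instead of a Method whose klass has been cleared.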

igor

> On Jun 28, 2024, at 4:19 PM, Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote:
> 
> Probably, it's time to reconsider current behavior.
> 
> Igor is refactoring TrainingData to get rid of symbolic lookups. In such case, there's no reason to keep stale training data around and it's better to simply prune it.
> 
> Igor, what's your take on it?
> 
> Best regards,
> Vladimir Ivanov
> 
> On 6/28/24 15:54, Ashutosh Mehra wrote:
>>    What's the exact reason for the crash? Is it due to dereferencing an invalid
>>    metadata pointer or simply encountering a nullptr?
>> It's nullptr. The klass pointer of the method is null.
>>    *TrainingData::cleanup() was intended to clear stale metadata pointers,
>>    but keep the training data around linked in symbolic form (holder ==
>>    null).
>> Okay, so if we do want to keep the training data but just prune the stale pointers,
>> then just setting the _holder to null is not enough, because the method belonging
>> to an excluded class can be reached through MethodTrainingData::_final_profile->method() or
>> MethodTrainingData::_final_counters->method().
>> So probably MethodTrainingData::cleanup() should be clearing the _method field in
>> MethodCounters and MethodData as well, and link them back in MethodTrainingData::refresh_from(),
>> just like it is done for MethodTrainingData.
>> Does that make sense?
>> Thanks,
>> - Ashutosh Mehra
>> On Fri, Jun 28, 2024 at 5:33 PM Vladimir Ivanov <vladimir.x.ivanov at oracle.com <mailto:vladimir.x.ivanov at oracle.com>> wrote:
>>    What's the exact reason for the crash? Is it due to dereferencing an
>>    invalid metadata pointer or simply encountering a nullptr?
>>    *TrainingData::cleanup() was intended to clear stale metadata pointers,
>>    but keep the training data around linked in symbolic form (holder ==
>>    null).
>>    Best regards,
>>    Vladimir Ivanov
>>    On 6/27/24 20:14, Ashutosh Mehra wrote:
>>     > I encountered a crash when dumping the cds map with 1-step workflow.
>>     > The crash happens in the forked JVM during the assembly phase of the
>>     > training run.
>>     > To recreate the crash, execute the training run with
>>     > -Xlog:cds+map=trace:file=cds.map:none:filesize=0 option.
>>     >
>>     > #
>>     > # A fatal error has been detected by the Java Runtime Environment:
>>     > #
>>     > #  SIGSEGV (0xb) at pc=0x00007f4e8a209cb6, pid=152509, tid=152510
>>     > #
>>     > # JRE version: OpenJDK Runtime Environment (23.0) (slowdebug build 23-internal-adhoc.asmehra.leyden)
>>     > # Java VM: OpenJDK 64-Bit Server VM (slowdebug 23-internal-adhoc.asmehra.leyden, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
>>     > # Problematic frame:
>>     > # V  [libjvm.so+0x409cb6]  Klass::is_instance_klass() const+0x10
>>     > #
>>     > # Core dump will be written. Default location: /home/asmehra/data/ashu-mehra/leyden/test/hotspot/jtreg/premain/quarkus-getting-started/core.152509
>>     > #
>>     > # An error report file with more information is saved as:
>>     > # /home/asmehra/data/ashu-mehra/leyden/test/hotspot/jtreg/premain/quarkus-getting-started/hs_err_pid152509.log
>>     > #
>>     > # If you would like to submit a bug report, please visit:
>>     > # https://bugreport.java.com/bugreport/crash.jsp
>>     > #
>>     > [75.250s][error  ][cds] Child process finished; status = 134
>>     >
>>     > Backtrace for the crashing thread:
>>     >
>>     > #11 0x00007f4e8a209cb6 in Klass::is_instance_klass (this=0x0) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/oops/klass.hpp:683
>>     > #12 0x00007f4e8afa8894 in Klass::external_name (this=0x0) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/oops/klass.cpp:905
>>     > #13 0x00007f4e8b126447 in Method::print_external_name (os=0x7f4e89dfd130, klass=0x0, method_name=0x8011e8588, signature=0x8011ab858) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/oops/method.cpp:228
>>     > #14 0x00007f4e8b1263b6 in Method::external_name (klass=0x0, method_name=0x8011e8588, signature=0x8011ab858) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/oops/method.cpp:222
>>     > #15 0x00007f4e8b1262e1 in Method::external_name (this=0x800fd1920) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/oops/method.cpp:213
>>     > #16 0x00007f4e8a492b0d in ArchiveBuilder::CDSMapLogger::log_method (m=0x800fd1920, runtime_dest=0x801039cd8 "", type_name=0x7f4e8b7d40fc "Method", bytes=128, current=0x7f4e8401d900) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/cds/archiveBuilder.cpp:1194
>>     > #17 0x00007f4e8a492d66 in ArchiveBuilder::CDSMapLogger::log_metaspace_objects (region=0x7f4e89dfe740, src_objs=0x7f4e89dfe860) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/cds/archiveBuilder.cpp:1228
>>     > #18 0x00007f4e8a492a2b in ArchiveBuilder::CDSMapLogger::log_metaspace_region (name=0x7f4e8b7d8af0 "rw region", region=0x7f4e89dfe740, src_objs=0x7f4e89dfe860) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/cds/archiveBuilder.cpp:1182
>>     > #19 0x00007f4e8a4940f4 in ArchiveBuilder::CDSMapLogger::log (builder=0x7f4e89dfe630, mapinfo=0x7f4e85017bb0, heap_info=0x7f4e89dfd4f0, bitmap=0x7f4e857bf850 "\t\222\004I\222$\t\210\210\210\001\b\200", bitmap_size_in_bytes=655824) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/cds/archiveBuilder.cpp:1502
>>     > #20 0x00007f4e8a48f2a5 in ArchiveBuilder::write_archive (this=0x7f4e89dfe630, mapinfo=0x7f4e85017bb0, heap_info=0x7f4e89dfd4f0) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/cds/archiveBuilder.cpp:1560
>>     > #21 0x00007f4e8b11d249 in MetaspaceShared::write_static_archive (builder=0x7f4e89dfe630, mapinfo=0x7f4e85017bb0, heap_info=0x7f4e89dfd4f0) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/cds/metaspaceShared.cpp:988
>>     > #22 0x00007f4e8b11d1ac in MetaspaceShared::preload_and_dump_impl (builder=..., __the_thread__=0x7f4e8401d900) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/cds/metaspaceShared.cpp:976
>>     > #23 0x00007f4e8b11c5fd in MetaspaceShared::preload_and_dump () at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/cds/metaspaceShared.cpp:767
>>     > #24 0x00007f4e8b53bca2 in Threads::create_vm (args=0x7f4e89dfedd0, canTryAgain=0x7f4e89dfecd3) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/runtime/threads.cpp:900
>>     > #25 0x00007f4e8ada2821 in JNI_CreateJavaVM_inner (vm=0x7f4e89dfee20, penv=0x7f4e89dfee28, args=0x7f4e89dfedd0) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/prims/jni.cpp:3581
>>     > #26 0x00007f4e8ada2c81 in JNI_CreateJavaVM (vm=0x7f4e89dfee20, penv=0x7f4e89dfee28, args=0x7f4e89dfedd0) at /home/asmehra/data/ashu-mehra/leyden/src/hotspot/share/prims/jni.cpp:3672
>>     > #27 0x00007f4e8ce0f84f in InitializeJVM (pvm=0x7f4e89dfee20, penv=0x7f4e89dfee28, ifn=0x7f4e89dfee70) at /home/asmehra/data/ashu-mehra/leyden/src/java.base/share/native/libjli/java.c:1550
>>     >
>>     > Inspecting the CDS map generated for the CDS preimage shows some
>>     > methods whose InstanceKlass is null.
>>     > This results in the crash seen above when such methods are printed
>>     > as part of the CDS map file during the assembly phase.
>>     >
>>     > These methods are of the form:
>>     >
>>     > java.lang.Object java.lang.invoke.LambdaForm$MH/0x800000090.invoke(java.lang.Object, java.lang.Object)
>>     >
>>     > Interestingly -Xlog:cds=info shows such classes are skipped when
>>     > generating the preimage as they are hidden classes:
>>     >
>>     > Skipping java/lang/invoke/LambdaForm$MH+0x800000090: Hidden class
>>     >
>>     > In the CDS map file for the preimage I also noticed that such methods
>>     > are only referenced through MethodTrainingData -> _final_profile -> _method.
>>     > So it looks like although we excluded such classes from the CDS archive,
>>     > we don't exclude their training data.
>>     > There is code for cleaning up the training data [0], but it doesn't
>>     > remove the training data for classes that have been excluded, unless I
>>     > misunderstood the code.
>>     > I'm not sure whether this is intentional or a bug.
>>     > If we do need to keep the training data for such methods, then we would
>>     > need to handle the case of a null InstanceKlass in the CDSMapLogger to
>>     > avoid crashing.
>>     >
>>     > [0] https://github.com/openjdk/leyden/blob/8716f47ef49c829e2384474577ff468a732b9c66/src/hotspot/share/oops/trainingData.cpp#L573
>>     >
>>     > Thanks,
>>     > - Ashutosh Mehra
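
For the alternative raised in the original report (keeping the training data and making the map logger tolerate a null InstanceKlass), the guard could look roughly like the sketch below. It uses stand-in types, not the real Method/Klass classes, and safe_external_name() is a hypothetical helper name; the actual change would sit in or near ArchiveBuilder::CDSMapLogger::log_method and may well look different.

```cpp
#include <cassert>
#include <string>

// Stand-in types for illustration only; not the HotSpot definitions.
struct Klass  { std::string name; };
struct Method { Klass* klass; std::string name; };

// Hypothetical helper: build a printable name without dereferencing a
// null holder class, which is what crashed in Klass::is_instance_klass.
std::string safe_external_name(const Method* m) {
  if (m == nullptr)        return "<null method>";
  if (m->klass == nullptr) return "<unknown class>." + m->name;
  return m->klass->name + "." + m->name;
}

// Small usage example covering the three cases.
void check_examples() {
  Klass  k{"java.lang.Object"};
  Method live{&k, "hashCode"};
  Method stale{nullptr, "invoke"};  // holder was cleared, as in the crash
  assert(safe_external_name(&live)  == "java.lang.Object.hashCode");
  assert(safe_external_name(&stale) == "<unknown class>.invoke");
  assert(safe_external_name(nullptr) == "<null method>");
}
```

This only hides the symptom in the logger; the stale Method would still be written to the archive, so it is a fallback rather than a replacement for cleaning up the training data.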


