RFR: 8274196: Crashes in VM_HeapDumper::work after JDK-8252842 [v2]

Per Liden pliden at openjdk.java.net
Mon Sep 27 09:43:12 UTC 2021


On Sun, 26 Sep 2021 08:02:33 GMT, Lin Zang <lzang at openjdk.org> wrote:

>> The root cause of the crash in ZGC is that the JNIHandles are processed before object iteration, and ZGC updates the JNIHandles during object iteration via its read barrier. The crash is therefore caused by accessing an invalid address, which can contain dummy data after ZGC has run.
>> 
>> The lock rank issue can be fixed because the related mutexes are acquired at a safepoint, so the safepoint_check_required could be safepoint_check_always.
>> 
>> The Epsilon issue is caused by a wrong _num_dumper_thread value being calculated when gang == NULL.
>
> Lin Zang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits:
> 
>  - un-ProblemList BasicJMapTest.java
>  - Merge branch 'master' into pd-fix
>  - 8274196: Crashes in VM_HeapDumper::work after JDK-8252842

> The root cause of the crash in ZGC is that the JNIHandles are processed before object iteration, and ZGC updates the JNIHandles during object iteration via its read barrier. The crash is therefore caused by accessing an invalid address, which can contain dummy data after ZGC has run.

The fix here should not be to change the order of things so that heap iteration happens first; that will just hide the real bug. The real bug is that `JNIGlobalsDumper::do_oop()` is missing a load barrier. In other words, keep the order and just add a load barrier, like this:


void JNIGlobalsDumper::do_oop(oop* obj_p) {
  oop o = NativeAccess<AS_NO_KEEPALIVE>::oop_load(obj_p);
  ...
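
To illustrate why the barrier matters more than the ordering, here is a minimal, self-contained toy model. It is only a sketch: `Obj`, `ToyHeap`, `load_raw` and `load_with_barrier` are hypothetical names, not HotSpot APIs, and the forwarding table is a simplification of what ZGC really does (colored pointers). It shows a handle slot that still points at an object's old location. A raw load sees the stale copy, while a barriered load returns the relocated object and heals the slot, regardless of when the slot is visited.

```cpp
#include <cassert>
#include <unordered_map>

// Toy model only: none of these types or functions are HotSpot APIs.
struct Obj { int payload; };

struct ToyHeap {
    // Hypothetical forwarding table: old location -> new location.
    std::unordered_map<Obj*, Obj*> forwarding;

    // Hypothetical load barrier: load the slot and, if the referent has been
    // relocated, return the new location and heal the slot in place.
    Obj* load_with_barrier(Obj** slot) {
        Obj* o = *slot;
        auto it = forwarding.find(o);
        if (it != forwarding.end()) {
            o = it->second;
            *slot = o;  // self-heal so later loads take the fast path
        }
        return o;
    }
};

// A raw load, analogous to dereferencing the slot without any barrier.
Obj* load_raw(Obj** slot) { return *slot; }

// Returns true when the raw load sees stale data, the barriered load sees
// the relocated object, and the slot has been healed afterwards.
bool barrier_heals_stale_slot() {
    Obj old_copy{-1};       // stands in for the stale, pre-relocation location
    Obj new_copy{42};       // the relocated object
    Obj* slot = &old_copy;  // a handle slot nobody has updated yet

    ToyHeap heap;
    heap.forwarding[&old_copy] = &new_copy;

    bool raw_sees_stale   = load_raw(&slot)->payload == -1;
    bool barrier_sees_new = heap.load_with_barrier(&slot)->payload == 42;
    bool slot_healed      = slot == &new_copy;
    return raw_sees_stale && barrier_sees_new && slot_healed;
}
```

In this model, reordering the work so that something else heals the slot first would make the raw load appear to work, which is the sense in which reordering only hides the bug.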

-------------

Changes requested by pliden (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/5681

