[14] RFR(S): 8231501: VM crash in MethodData::clean_extra_data(CleanExtraDataClosure*): fatal error: unexpected tag 99

Christian Hagedorn christian.hagedorn at oracle.com
Wed Nov 20 14:14:31 UTC 2019


Hi

Please review the following patch:
https://bugs.openjdk.java.net/browse/JDK-8231501
http://cr.openjdk.java.net/~chagedorn/8231501/webrev.00/

The bug could be traced back to the concurrent cleaning of method data 
with its extra data in MethodData::clean_method_data() and the 
loading/copying of extra data for the ci method data in 
ciMethodData::load_extra_data(). I reproduced the bug by using the test 
[1] which extensively cleans method data by using the whitebox API [2].

Before loading and copying the extra data from the MDO to the ciMDO in 
ciMethodData::load_extra_data(), the metadata is prepared in a 
fixed-point iteration by cleaning all SpeculativeTrapData entries of 
methods whose klasses are unloaded [3]. If it encounters such a dead 
entry it releases the extra data lock (due to ranking issues) and tries 
again later [4]. This release of the lock triggers the bug: There can be 
cases where one thread A is waiting in the whitebox API method to get 
the extra data lock [2] to clean the extra data for the very same MDO 
for which another thread B just released the lock at [4]. If that MDO 
actually contained SpeculativeTrapData entries, then thread A cleans 
those, but the ciMDO, which thread B is preparing, still contains the 
uncleaned old MDO extra data (because thread B only made a snapshot of 
the MDO earlier at [5]). Things then go wrong when thread B reacquires 
the lock after thread A. It tries to load the now cleaned extra data and 
immediately finishes at [6] since there are no SpeculativeTrapData 
entries anymore. It has copied only a single entry with tag 
DataLayout::no_tag [7] to the ciMDO, over a slot that actually contained 
a SpeculativeTrapData entry. This results in a half-cleared entry (since 
a SpeculativeTrapData entry has an additional cell for the method) and 
possibly other remaining SpeculativeTrapData entries:


Let's assume a little-endian ordering and that both 0x00007fff... 
addresses are real pointers to methods. Tag 13 (0x0d) is used for 
SpeculativeTrapData and dp points to the first extra data entry:

ciMDO extra data before thread B releases the lock at [4] (same extra 
data for MDO and ciMDO):
0x800000040011000d 0x00007fffd4993c63 0x800000040011000d 
0x00007fffd49b1a68 0x0000000000000000
dp: tag = 13 -> next entry = dp+16; dp+8: method 0x00007fffd4993c63
dp+16: tag = 13 -> next entry = dp+32; dp+24: method 0x00007fffd49b1a68
dp+32: tag = 0 -> end of extra data
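
To make the reading of this dump explicit, here is a minimal standalone 
sketch (not HotSpot code) of how the tag and the entry size follow from a 
header word under the little-endian assumption. The names tag_of and 
entry_size_in_bytes are hypothetical helpers introduced only for this 
illustration, and only the two entry kinds from the dump are modeled:

```cpp
#include <cassert>
#include <cstdint>

// Tag values taken from the mail; the helper names are hypothetical.
constexpr uint8_t kNoTag = 0;
constexpr uint8_t kSpeculativeTrapDataTag = 13;  // 0x0d

// With little-endian ordering the tag is the lowest-addressed byte of the
// 8-byte header word, i.e. the least significant byte of its value.
inline uint8_t tag_of(uint64_t header_word) {
  return static_cast<uint8_t>(header_word & 0xff);
}

// Size of an extra data entry, restricted to the two kinds in the dump:
// no_tag is a bare header (8 bytes), SpeculativeTrapData adds one cell
// for the method (16 bytes).
inline int entry_size_in_bytes(uint8_t tag) {
  assert(tag == kNoTag || tag == kSpeculativeTrapDataTag);
  return tag == kSpeculativeTrapDataTag ? 16 : 8;
}
```

Applied to 0x800000040011000d this yields tag 13 and a 16-byte entry, 
which matches the "next entry = dp+16" steps in the dump.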

MDO extra data after thread B reacquires the lock and thread A cleaned 
the MDO (ciMDO extra data is unchanged):
0x0000000000000000 0x0000000000000000 0x0000000000000000 
0x0000000000000000 0x0000000000000000
dp: tag = 0 -> end of extra data


Returning at [6] when the extra data loading from MDO to ciMDO is finished:
MDO extra data:
0x0000000000000000 0x0000000000000000 0x0000000000000000 
0x0000000000000000 0x0000000000000000
dp: tag = 0 -> end of extra data

ciMDO extra data, only copied the first no_tag entry from MDO at [7] (8 
bytes):
0x0000000000000000 0x00007fffd4993c63 0x800000040011000d 
0x00007fffd49b1a68 0x0000000000000000
dp: tag = 0 -> next entry = dp+8
dp+8: tag = 0x63 = 99 -> there is no tag 99 -> fatal...


The next time the ciMDO extra data is iterated, for example with 
MethodData::next_extra(), the iteration processes the first no_tag 
entry, advances to offset 8 and reads the low byte of the stale method 
pointer as tag 99. Since there is no tag 99, the VM crashes.
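
The walk over the corrupted ciMDO can be reproduced with a small 
standalone model. This is only a sketch under simplifying assumptions: 
it is not the HotSpot iteration code, it knows just the two entry kinds 
from the dumps, and it walks the whole five-word area instead of stopping 
at the first no_tag. The function first_unexpected_tag and the array 
names are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr uint8_t kNoTag = 0;
constexpr uint8_t kSpeculativeTrapDataTag = 13;

// Walks `count` 8-byte words and returns the first tag that is neither
// no_tag nor speculative_trap_data_tag, or -1 if the walk is clean.
int first_unexpected_tag(const uint64_t* words, std::size_t count) {
  assert(words != nullptr);
  std::size_t i = 0;
  while (i < count) {
    const uint8_t tag = static_cast<uint8_t>(words[i] & 0xff);
    if (tag == kNoTag) {
      i += 1;  // header only: next entry = dp+8
    } else if (tag == kSpeculativeTrapDataTag) {
      i += 2;  // header plus method cell: next entry = dp+16
    } else {
      return tag;  // the real VM reports "unexpected tag" here
    }
  }
  return -1;
}

// ciMDO extra data before thread B releases the lock.
constexpr uint64_t kBeforeRelease[5] = {
    0x800000040011000dULL, 0x00007fffd4993c63ULL, 0x800000040011000dULL,
    0x00007fffd49b1a68ULL, 0x0000000000000000ULL};

// ciMDO extra data after only the first no_tag header was copied.
constexpr uint64_t kAfterPartialCopy[5] = {
    0x0000000000000000ULL, 0x00007fffd4993c63ULL, 0x800000040011000dULL,
    0x00007fffd49b1a68ULL, 0x0000000000000000ULL};
```

In this model the first array walks cleanly, while the second one stops 
at the word at offset 8 with tag 0x63 = 99.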


The fix is to completely zero out the current and all following 
SpeculativeTrapData entries if we encounter a no_tag in the MDO but a 
speculative_trap_data_tag in the ciMDO. Method data is also cleaned in 
other places, so the bug is not limited to whitebox API usage, but it 
occurs very rarely.
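
The idea of the fix can be sketched on plain 8-byte words as follows. 
This is a simplified model and not the code in the webrev: the function 
names are hypothetical, the method cell is copied verbatim instead of 
being translated for the compiler interface, and only no_tag and 
SpeculativeTrapData entries are handled:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr uint8_t kNoTag = 0;
constexpr uint8_t kSpeculativeTrapDataTag = 13;

inline uint8_t tag_of(uint64_t word) {
  return static_cast<uint8_t>(word & 0xff);
}

// Zeroes the SpeculativeTrapData entry starting at index i and every
// SpeculativeTrapData entry that directly follows it.
void clear_stale_trap_entries(uint64_t* ci, std::size_t i, std::size_t count) {
  while (i + 1 < count && tag_of(ci[i]) == kSpeculativeTrapDataTag) {
    ci[i] = 0;      // header
    ci[i + 1] = 0;  // method cell
    i += 2;
  }
}

// Simplified model of loading extra data from the MDO into the ciMDO.
void load_extra_data_model(const uint64_t* mdo, uint64_t* ci, std::size_t count) {
  assert(mdo != nullptr && ci != nullptr);
  std::size_t i = 0;
  while (i < count) {
    const uint8_t src_tag = tag_of(mdo[i]);
    if (src_tag == kNoTag) {
      if (tag_of(ci[i]) == kSpeculativeTrapDataTag) {
        // The MDO was cleaned concurrently: wipe the stale snapshot
        // entries instead of overwriting only the first header.
        clear_stale_trap_entries(ci, i, count);
      } else {
        ci[i] = mdo[i];
      }
      return;  // end of extra data in the MDO
    }
    if (src_tag != kSpeculativeTrapDataTag || i + 1 >= count) {
      return;  // other entry kinds are not modeled in this sketch
    }
    ci[i] = mdo[i];
    ci[i + 1] = mdo[i + 1];
    i += 2;
  }
}
```

With the cleaned MDO and the stale ciMDO snapshot from the example 
above, all five ciMDO words end up zero, so a later iteration sees only 
no_tag entries.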

Thank you!

Best regards,
Christian


[1] 
http://hg.openjdk.java.net/jdk/jdk/file/580fb715b29d/test/hotspot/jtreg/compiler/types/correctness/CorrectnessTest.java
[2] 
http://hg.openjdk.java.net/jdk/jdk/file/580fb715b29d/src/hotspot/share/prims/whitebox.cpp#l1137
[3] 
http://hg.openjdk.java.net/jdk/jdk/file/580fb715b29d/src/hotspot/share/ci/ciMethodData.cpp#l137
[4] 
http://hg.openjdk.java.net/jdk/jdk/file/580fb715b29d/src/hotspot/share/ci/ciMethodData.cpp#l115
[5] 
http://hg.openjdk.java.net/jdk/jdk/file/580fb715b29d/src/hotspot/share/ci/ciMethodData.cpp#l219
[6] 
http://hg.openjdk.java.net/jdk/jdk/file/580fb715b29d/src/hotspot/share/ci/ciMethodData.cpp#l191
[7] 
http://hg.openjdk.java.net/jdk/jdk/file/580fb715b29d/src/hotspot/share/ci/ciMethodData.cpp#l176

