[14] RFR(S): 8231501: VM crash in MethodData::clean_extra_data(CleanExtraDataClosure*): fatal error: unexpected tag 99
Tobias Hartmann
tobias.hartmann at oracle.com
Mon Dec 2 13:07:34 UTC 2019
Hi Christian,
looks reasonable to me.
Best regards,
Tobias
On 20.11.19 15:14, Christian Hagedorn wrote:
> Hi
>
> Please review the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8231501
> http://cr.openjdk.java.net/~chagedorn/8231501/webrev.00/
>
> The bug could be traced back to the concurrent cleaning of method data with its extra data in
> MethodData::clean_method_data() and the loading/copying of extra data for the ci method data in
> ciMethodData::load_extra_data(). I reproduced the bug by using the test [1] which extensively cleans
> method data by using the whitebox API [2].
>
> Before loading and copying the extra data from the MDO to the ciMDO in
> ciMethodData::load_extra_data(), the metadata is prepared in a fixed-point iteration by cleaning all
> SpeculativeTrapData entries of methods whose klasses are unloaded [3]. If it encounters such a dead
> entry it releases the extra data lock (due to ranking issues) and tries again later [4]. This
> release of the lock triggers the bug: There can be cases where one thread A is waiting in the
> whitebox API method to get the extra data lock [2] to clean the extra data for the very same MDO for
> which another thread B just released the lock at [4]. If that MDO actually contains
> SpeculativeTrapData entries, thread A cleans them, but the ciMDO that thread B is preparing
> still contains the old, uncleaned MDO extra data (because thread B only took a snapshot of the MDO
> earlier at [5]). Things then go wrong when thread B reacquires the lock after thread A. It tries
> to load the now-cleaned extra data and immediately finishes at [6] since there are no
> SpeculativeTrapData entries anymore. By then it has copied a single entry with tag
> DataLayout::no_tag [7] over a ciMDO slot that actually held a SpeculativeTrapData entry. This
> leaves a half-cleared entry (since a SpeculativeTrapData entry has an additional cell for the
> method) and possibly further stale SpeculativeTrapData entries:
>
>
> Let's assume a little-endian ordering and that both 0x00007fff... addresses are real pointers to
> methods. Tag 13 (0x0d) is used for SpeculativeTrapData and dp points to the first extra data entry:
>
> ciMDO extra data before thread B releases the lock at [4] (same extra data for MDO and ciMDO):
> 0x800000040011000d 0x00007fffd4993c63 0x800000040011000d 0x00007fffd49b1a68 0x0000000000000000
> dp: tag = 13 -> next entry = dp+16; dp+8: method 0x00007fffd4993c63
> dp+16: tag = 13 -> next entry = dp+32; dp+24: method 0x00007fffd49b1a68
> dp+32: tag = 0 -> end of extra data
>
> MDO extra data after thread B reacquires the lock and thread A cleaned the MDO (ciMDO extra data is
> unchanged):
> 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000
> dp: tag = 0 -> end of extra data
>
>
> Returning at [6] when the extra data loading from MDO to ciMDO is finished:
> MDO extra data:
> 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000
> dp: tag = 0 -> end of extra data
>
> ciMDO extra data; only the first no_tag entry (8 bytes) was copied from the MDO at [7]:
> 0x0000000000000000 0x00007fffd4993c63 0x800000040011000d 0x00007fffd49b1a68 0x0000000000000000
> dp: tag = 0 -> next entry = dp+8
> dp+8: tag = 0x63 = 99 -> there is no tag 99 -> fatal...
>
>
> The next time the ciMDO extra data is iterated, for example by using MethodData::next_extra(), the
> walk processes the first no_tag entry, advances to the value at offset 8 and reads tag 99 there.
> This causes the crash since there is no tag 99.
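
The dumps above can be replayed with a small toy model. This is only a sketch and not HotSpot code: it assumes the extra data is a flat array of five 8-byte words, that the tag is the low byte of a header word (little-endian, as in the example), and that a walker steps over a no_tag entry by one word and over a tag 13 entry by two words. The names ExtraData, load_without_fix and first_unexpected_tag are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>

// Toy model of the extra data area as 8-byte words; not HotSpot code.
constexpr std::uint8_t kNoTag = 0;
constexpr std::uint8_t kSpeculativeTrapTag = 13;
using ExtraData = std::array<std::uint64_t, 5>;

// Little-endian layout: the tag is the lowest byte of the header word.
inline std::uint8_t tag_of(std::uint64_t header) {
  return static_cast<std::uint8_t>(header & 0xff);
}

// Copy in the way the mail describes for the unfixed code: trap entries are
// copied with both words, and the first other entry in the MDO ends the load
// after only its 8-byte header has been copied into the ciMDO.
inline void load_without_fix(const ExtraData& mdo, ExtraData& ci_mdo) {
  for (std::size_t i = 0; i < mdo.size();) {
    if (tag_of(mdo[i]) == kSpeculativeTrapTag && i + 1 < mdo.size()) {
      ci_mdo[i] = mdo[i];          // header word
      ci_mdo[i + 1] = mdo[i + 1];  // method cell
      i += 2;
    } else {
      ci_mdo[i] = mdo[i];  // single header word, method cell left untouched
      return;
    }
  }
}

// Walk the words and return the first tag the walker does not know,
// or std::nullopt if the whole area is well-formed.
inline std::optional<std::uint8_t> first_unexpected_tag(const ExtraData& words) {
  for (std::size_t i = 0; i < words.size();) {
    const std::uint8_t tag = tag_of(words[i]);
    if (tag == kNoTag) {
      i += 1;
    } else if (tag == kSpeculativeTrapTag) {
      i += 2;
    } else {
      return tag;
    }
  }
  return std::nullopt;
}
```

Starting from the snapshot shown before the lock release and loading from an all-zero MDO, load_without_fix leaves the stale method cell at offset 8 in place, and first_unexpected_tag then reports 0x63 = 99.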
>
>
> The fix is to completely zero out the current and all following SpeculativeTrapData entries if we
> encounter a no_tag in the MDO but a speculative_trap_data_tag in the ciMDO. There are also other
> cases where the method data is cleaned, so the bug is not specific to the whitebox API usage, but
> it occurs very rarely.
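
The idea of the fix can be sketched on a toy model. This is not the code from the webrev, only an illustration under assumptions: the extra data is modelled as five 8-byte words, the tag is the low byte of a header word, a tag 13 entry occupies two words, and the names ExtraData and load_with_fix are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Toy model of the extra data area as 8-byte words; not HotSpot code.
constexpr std::uint8_t kSpeculativeTrapTag = 13;
using ExtraData = std::array<std::uint64_t, 5>;

// Little-endian layout: the tag is the lowest byte of the header word.
inline std::uint8_t tag_of(std::uint64_t header) {
  return static_cast<std::uint8_t>(header & 0xff);
}

// Load MDO entries into the ciMDO snapshot. When the MDO has no further trap
// entry but the snapshot still has one at the same position, zero out that
// entry and every trap entry that follows it in the snapshot.
inline void load_with_fix(const ExtraData& mdo, ExtraData& ci_mdo) {
  std::size_t i = 0;
  while (i + 1 < mdo.size() && tag_of(mdo[i]) == kSpeculativeTrapTag) {
    ci_mdo[i] = mdo[i];          // header word
    ci_mdo[i + 1] = mdo[i + 1];  // method cell
    i += 2;
  }
  // End of the live entries in the MDO: clear stale trap entries, both words.
  while (i + 1 < ci_mdo.size() && tag_of(ci_mdo[i]) == kSpeculativeTrapTag) {
    ci_mdo[i] = 0;
    ci_mdo[i + 1] = 0;
    i += 2;
  }
}
```

With the snapshot from the example and an all-zero MDO, every word of the ciMDO model ends up zero, so a later walk only sees no_tag entries.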
>
> Thank you!
>
> Best regards,
> Christian
>
>
> [1]
> http://hg.openjdk.java.net/jdk/jdk/file/580fb715b29d/test/hotspot/jtreg/compiler/types/correctness/CorrectnessTest.java
>
> [2] http://hg.openjdk.java.net/jdk/jdk/file/580fb715b29d/src/hotspot/share/prims/whitebox.cpp#l1137
> [3] http://hg.openjdk.java.net/jdk/jdk/file/580fb715b29d/src/hotspot/share/ci/ciMethodData.cpp#l137
> [4] http://hg.openjdk.java.net/jdk/jdk/file/580fb715b29d/src/hotspot/share/ci/ciMethodData.cpp#l115
> [5] http://hg.openjdk.java.net/jdk/jdk/file/580fb715b29d/src/hotspot/share/ci/ciMethodData.cpp#l219
> [6] http://hg.openjdk.java.net/jdk/jdk/file/580fb715b29d/src/hotspot/share/ci/ciMethodData.cpp#l191
> [7] http://hg.openjdk.java.net/jdk/jdk/file/580fb715b29d/src/hotspot/share/ci/ciMethodData.cpp#l176