[EXTERNAL]Mismatched ciMethodData in replay file.

Vladimir Kozlov vladimir.kozlov at oracle.com
Fri May 20 15:31:51 UTC 2022


On 5/19/22 10:14 PM, Liu, Xin wrote:
> hi, Vladimir,
> 
> Thank you for taking a look at this.
> 
> 
> On 5/19/22 1:00 PM, Vladimir Kozlov wrote:
>> Narrowed to hotspot-compiler list.
>>
>> You are right that it is weird. The dump is done from ciMethodData, which is a local (per compiler thread) clone of the MDO
>> and should not be updated during compilation, unless there is a place where we still go into the VM to get some numbers.
>
> Oh, I see! The Compile constructor calls ciMethod::ensure_method_data(), which
> creates a snapshot of the MDO in the compiler's arena.
> 
>> One explanation is that during dump we hit safepoint and we lost part of output.
>>
> As you said, the data are local. The C2 thread sweeps profileData in two rounds.
>
> I am lost here. We write the data to a fileStream, and it looks like we don't
> yield for safepoint synchronization between the two rounds. How could we
> lose data here?
> 
>> I think we need a verification mode for the replay dump to catch such cases (a separate count to detect such a mismatch).
>>
>> Thanks,
>> Vladimir K
>>
> 
> Do you mean that this mismatched replay data is useless? I am still trying
> to extract what went wrong in C2CompilerThread.

No. I am trying to suggest how we can improve the replay dump code to catch such cases or avoid them.
Maybe we can do one pass instead of two by collecting the output in a local buffer while counting.
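
The single-pass idea can be sketched as follows. This is a minimal Python illustration, not HotSpot code: the function name `dump_oops_single_pass`, the `(cell_index, klass_name)` pairs, and the use of `None` for a receiver that cannot be dumped are all assumptions made for the example. Because the count is derived from the entries actually buffered, the declared count cannot disagree with what is written.

```python
import io

def dump_oops_single_pass(out, receivers):
    """Write ' oops <count> <index> <name>...' while walking receivers once.

    `receivers` is an iterable of (cell_index, klass_name) pairs. Entries
    whose name is None are skipped (a hypothetical stand-in for a receiver
    that cannot be dumped).
    """
    buf = io.StringIO()   # local buffer filled while counting
    count = 0
    for index, name in receivers:
        if name is None:
            continue
        buf.write(f" {index} {name}")
        count += 1
    # The count is written first, but it comes from the same pass that
    # produced the buffered entries, so the two always agree.
    out.write(f" oops {count}{buf.getvalue()}")
    return count
```

The trade-off is extra memory for the buffer in exchange for never walking the data twice.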

And I agree with your suggestion in the filed RFE.

Regards,
Vladimir K

> 
> Among 26 data fields, ciReplay manages to decode this sequence as
> "ciVirtualCallData" because header 0x70005 denotes bci=7 and tag =
> virtual_call_data_tag(5).
> 
> 
> 0x70005 0x4d55 0x0 0x7f6a5841c3c0 0xa3 0x7f6a5841c470
> 
> 
> _data->_cells[4] = 0x7f6a5841c470 is the second profiled receiver's
> ciKlass. It is an unmapped address.
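
The header decoding quoted here can be checked with a few lines of Python. This is a sketch that assumes the layout implied by the message, with the tag in the low 8 bits and the bci starting at bit 16 of the header cell; `decode_header` is a hypothetical helper and not part of ciReplay.

```python
def decode_header(cell):
    """Split a profile-data header cell into (bci, tag).

    Assumed layout: bits 0-7 hold the tag, bits 16-31 hold the bci.
    """
    tag = cell & 0xFF
    bci = (cell >> 16) & 0xFFFF
    return bci, tag

VIRTUAL_CALL_DATA_TAG = 5  # value quoted in the message

bci, tag = decode_header(0x70005)
print(bci, tag == VIRTUAL_CALL_DATA_TAG)
```
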
> 
> This may lead us to the culprit of the C2 thread crash. If we discard
> ill-formed ciReplay files, we may miss the clue.
> 
> thanks,
> --lx
> 
> 
> 
>> On 5/18/22 11:14 PM, Liu, Xin wrote:
>>> hi,
>>>
>>> I got a weird replay file, which was generated by 17.0.3+6-LTS. I don't see
>>> that the relevant code has changed since then, so I think it still applies
>>> to the tip of HotSpot.
>>>
>>> A customer shared the replay file
>>> (https://github.com/corretto/corretto-17/issues/57#issuecomment-1130042063)
>>> and I am trying to reproduce his failure. It was written from
>>> VMError::report_and_die().
>>>
>>> One obstacle is the weird ciMethodData entries. E.g., line
>>> 14130 declares that 2 non-null oops will follow (see the '2'
>>> after the tag 'oops'). However, only one is recorded.
>>>
>>> ciMethodData kotlin/coroutines/jvm/internal/ContinuationImpl <init>
>>> (Lkotlin/coroutines/Continuation;)V 2 21538 orig 80 0 0 0 0 0 0 0 0 0 0
>>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 data
>>> 26 0x40007 0x402 0x70 0x4e9c 0x70005 0x4d55 0x0 0x7f6a5841c3c0 0xa3
>>> 0x7f6a5841c470 0xa4 0xc0003 0x4e9c 0x18 0x110002 0x529e 0x0 0x0 0x0 0x0
>>> 0x0 0x0 0x9 0x2 0x6 0x0 oops 2 7
>>> com/example/ProductAttRouter$withRequestLoggingContext$1 methods 0
>>>
>>> Another mismatched entry is at line 14203. It says there are 11
>>> oops, but only 6 are there.
>>>
>>> Those mismatched entries leave elements of rec->_classes uninitialized
>>> and eventually crash ciReplay::initialize(). Have you seen them before?
>>> I can patch HotSpot to handle this mismatch, but I wonder how it
>>> happens.
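
A mismatch like the one at line 14130 can be detected before the replay file is loaded. The following Python sketch assumes the format visible in the quoted entry: `oops <count>` followed by `<index> <klass>` pairs and terminated by the `methods` keyword. The helper name `check_oops_section` is hypothetical.

```python
def check_oops_section(tokens):
    """Return (declared, found) for the 'oops' section of a ciMethodData line.

    Assumed format: 'oops <count>' followed by '<index> <klass>' pairs,
    terminated by the 'methods' keyword.
    """
    start = tokens.index("oops")
    end = tokens.index("methods", start)
    declared = int(tokens[start + 1])
    body = tokens[start + 2:end]
    found = len(body) // 2   # each recorded oop is an (index, klass) pair
    return declared, found

# Tail of the entry quoted in the message.
line = ("oops 2 7 com/example/ProductAttRouter$withRequestLoggingContext$1 "
        "methods 0")
declared, found = check_oops_section(line.split())
print(declared, found)
```

For the quoted entry the two numbers differ, which is the condition a verification mode would need to flag.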
>>>
>>> ciMethodData::dump_replay_data() iterates over
>>> _data in two rounds. The first round counts the entries and the second round dumps them.
>>> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/ci/ciMethodData.cpp#L728
>>>
>>> Is it possible that the underlying data get updated on the fly? This
>>> case involves Kotlin coroutines. I am not
>>> sure whether they have the same threading environment as classic Java.
>>>
>>> thanks,
>>> --lx
>> >


More information about the hotspot-compiler-dev mailing list