RFR 8151546: nsk/jvmti/RedefineClasses/StressRedefine fails in hs nightly

Wed Apr 13 01:43:44 UTC 2016

Hi Serguei,

On 4/12/16 9:35 PM, serguei.spitsyn at oracle.com wrote:
> On 4/12/16 12:29, Coleen Phillimore wrote:
>>
>> Hi Serguei,
>>
>> Thank you for looking at this change.
>>
>> On 4/11/16 9:17 PM, serguei.spitsyn at oracle.com wrote:
>>> Coleen,
>>>
>>> src/share/vm/prims/jvmtiRedefineClasses.cpp
>>> - // Update the version number of the constant pool
>>> + // Update the version number of the constant pools (may keep 
>>> scratch_cp)
>>>     merge_cp->increment_and_save_version(old_cp->version());
>>> + scratch_cp->increment_and_save_version(old_cp->version());
>>>
>>> Not sure, I understand the change above.
>>> Could you, please, explain why this change is needed?
>>> I suspect, the scratch_cp->version() is never used.
>>
>> scratch_cp is used if it's equivalent to the old constant pool (see 
>> the code below this with the comments).  But it could add entries.  
>> In this case, we want scratch_cp to have a new version number because 
>> scratch_class->_source_file_name_index may be an appended entry 
>> (old_cp->length() + n) which a parallel constant pool merge might 
>> append a different entry and be set to the constant pool after the 
>> safepoint.   So source_file_name_index won't point to the first 
>> appended entry. So I need to update the version also in scratch_cp to 
>> detect this.
>
> Thank you for the explanation.
> I'm Ok with this change, just wanted to understand.
>
> I think, we have to prevent multiple class redefinitions (prologues) 
> of the same class at the same time.
> Otherwise, it is hard to isolate and fix all potential issues in this 
> scenario.
> I doubt, the original goal was to allow this.
>
> I've never investigated this corner case.
> It is not clear what happens with two merged constant pools prepared 
> concurrently.
> We do not merge them again, right?
> Most likely,  the last redefinition wins with some side effects.
> If so, then there has to be a way to detect and prevent this kind of 
> concurrency.

Yes, if there are two constant pools prepared concurrently, the last one 
wins and will not contain any entries added by the other constant pool.

I tried a change to detect concurrency and give an error.  It's quite 
easy to detect with the constant pool versioning fixed.   It failed a 
lot of our tests which do the same redefinition concurrently.  I then 
tried a change to detect that the concurrent redefinition is 
incompatible.  This change worked and I almost sent it out but since I 
couldn't write a test to provoke a failure in this case (because the old 
methods running are using their older constant pools), I decided to 
stick with the simplest fix to not depend on the constant pool index to 
be correct for the InstanceKlass::_source_file_name_index.

So that's what happened.

The ultimate answer is to stop merging constant pools.  I have a 
repository with this change that passes all our tests but it's too big 
and risky of a change, ie needs more testing and verification, for this 
late in jdk9.  I'd like to prepare it once the next release opens.
>
>>
>> Actually, I made this change because I was going to make a bigger 
>> change that compared constant pool entries if they were the same 
>> version (ie both old_cp->version + 1), indicating parallel constant 
>> pool merging.  I decided this change was too much.
>>
>>>
>>>
>>> + // NOTE: this doesn't work because you can redefine the same class 
>>> in two
>>> + // threads, each getting their own constant pool data appended to the
>>> + // original constant pool. In order for the new methods to work 
>>> when they
>>> + // become old methods, they need to keep their updated copy of the 
>>> constant pool.
>>> +
>>> It feels like the statement in this note is too strong, and as such, 
>>> confusing.
>>> Would it be better to tell something like "not always work"?
>>> Otherwise, the question is: why do we need this block of code if it 
>>> doesn't work?
>>>
>>
>> The block of code is #if 0'ed out.   In my debugging I figured out 
>> why it wouldn't work, so I thought I'd comment it.
>
> Oh, I see.
> In this particular case, I looked at the Udiff that does not show all 
> the context.
>
> Please, consider it reviewed.

Thank you.  It's great that you know this code so well to review this!

Coleen

>
> Thanks,
> Serguei
>
>
>>
>> Thanks,
>> Coleen
>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 4/11/16 13:06, Coleen Phillimore wrote:
>>>> Summary: Constant pool merging is not thread safe for 
>>>> source_file_name.
>>>>
>>>> This change includes the change for the following bug because they 
>>>> are tested together.
>>>>
>>>> 8148772: VM crash in nsk/jvmti/RedefineClasses/StressRedefine: 
>>>> assert failed: Corrupted constant pool
>>>> Summary: ConstantPool::resolve_constant_at_impl() isn't thread safe 
>>>> for MethodHandleInError and MethodTypeInError.
>>>>
>>>> The parallel constant pool merges are mostly harmless because the 
>>>> old methods constant pool pointers aren't updated.  The only case I 
>>>> found where it isn't harmless is that we rely on finding the 
>>>> source_file_name_index from the final merged constant pool, which 
>>>> could be any of the parallel merged constant pools.  The code to 
>>>> attempt to dig out the name from redefined classes is removed.
>>>>
>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8151546.01/webrev
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8151546
>>>>
>>>> Tested with rbt, java/lang/instrument tests, com/sun/jdi tests.  I 
>>>> tried to write a test with all the conditions of the failure but 
>>>> couldn't make it fail (so noreg-hard).
>>>>
>>>> Thanks,
>>>> Coleen
>>>
>>
>