RFR: 8208278: [mlvm] Deadlocked threads are not always detected

Sat Feb 23 05:29:50 UTC 2019

Hi Igor,

The .jmpp file has been updated and the .java file regenerated:

http://cr.openjdk.java.net/~dholmes/8208278/webrev.v2/

The _locks[i].isLocked() loop behaviour has also been restored.
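
For reference, the restored shape of that check is roughly as follows: report every lock that is still held, then fail once after the loop. This is a standalone sketch, not the webrev code. The class name LockSanityCheck, the countLocked helper, the THREAD_NUM value, the use of plain ReentrantLock, and System.err in place of Env.getLog() are all assumptions made for illustration.

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockSanityCheck {
    // Assumed value for illustration; the real test defines its own THREAD_NUM.
    static final int THREAD_NUM = 4;

    // Reports every lock that is still held and returns how many were found.
    static int countLocked(ReentrantLock[] locks) {
        int locked = 0;
        for (int i = 0; i < locks.length; i++) {
            if (locks[i].isLocked()) {
                // Stand-in for Env.getLog().complain(...) in the real test.
                System.err.println("Lock " + i + " is still locked!");
                locked++;
            }
        }
        return locked;
    }

    // Fails only after the whole array has been scanned, so every
    // offending lock index is logged before the exception is thrown.
    static void checkAllUnlocked(ReentrantLock[] locks) throws Exception {
        if (countLocked(locks) > 0) {
            throw new Exception("Some locks are still locked");
        }
    }

    public static void main(String[] args) throws Exception {
        ReentrantLock[] locks = new ReentrantLock[THREAD_NUM];
        for (int i = 0; i < THREAD_NUM; i++) {
            locks[i] = new ReentrantLock();
        }
        checkAllUnlocked(locks); // passes: nothing is held yet
    }
}
```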

Thanks,
David

On 23/02/2019 9:16 am, David Holmes wrote:
> Hi Igor,
> 
> On 23/02/2019 5:04 am, Igor Ignatyev wrote:
>> Hi David,
>>
>> Overall looks reasonable. I have a couple of comments:
>>   - this INDIFY_Test.java was generated from INDIFY_Test.jmpp, so it'd 
>> be better to make changes in .jmpp file and regenerate .java
> 
> Oh! I did not realize that. How do you regenerate the .java file?
> 
>>   - near L#18218, you changed the for-loop to throw an exception as 
>> soon as we get the first locked thread; this reduces the amount of 
>> diagnostic information we would get in such failure scenarios, so I 
>> prefer checking _testFailed (or another bool) after the loop.
> 
> Note this loop is just a sanity check that the locks (not the threads) 
> are not locked before we start - which should always be the case if we 
> correctly join()'d all threads on the previous iteration. There's very 
> little diagnostic information here (it doesn't even try to tell you 
> which thread owns the lock!). But I can move it back to after the loop.
> 
> Thanks,
> David
> 
>>> + // Sanity check that all the locks are available.
>>> + for (int i = 0; i < THREAD_NUM; i++ ) {
>>> + if (_locks[i].isLocked()) {
>>>                   Env.getLog().complain("Lock " + i + " is still locked!");
>>> - _testFailed = true;
>>> + throw new Exception("Some locks are still locked");
>>>               }
>>>           }
>>> - if ( _testFailed )
>>> - throw new Exception("Some locks are still locked");
>>
>> Thanks,
>> -- Igor
>>
>>> On Feb 22, 2019, at 5:13 AM, David Holmes <david.holmes at oracle.com> wrote:
>>>
>>> webrev: http://cr.openjdk.java.net/~dholmes/8208278/webrev/
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8208278
>>>
>>> The test was failing to find the expected deadlocks on OS X, but it 
>>> turns out that the test was simply racy and the race was always lost 
>>> on OS X. With logging enabled the test started failing on different 
>>> platforms and in different ways.
>>>
>>> The main logic of the test is restructured so that we don't assume 
>>> things will happen within a certain time but instead we loop (or 
>>> block) until they do and rely on the overall test timeout to detect 
>>> there may be a problem.
>>>
>>> Thanks,
>>> David
>>>
>>>
>>
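
The restructuring described in the original RFR above (loop until the expected state is reached instead of assuming it happens within a fixed time) can be sketched as below. This is not the code from the webrev: the class name WaitUntilBlocked, the awaitQueued helper, the 10 ms poll interval, and the use of ReentrantLock.hasQueuedThreads() as the observed condition are assumptions chosen for illustration.

```java
import java.util.concurrent.locks.ReentrantLock;

public class WaitUntilBlocked {
    // Polls until some thread is queued on the lock. There is deliberately
    // no upper bound here: if the condition never becomes true, the overall
    // test timeout is what reports the problem.
    static void awaitQueued(ReentrantLock lock) throws InterruptedException {
        while (!lock.hasQueuedThreads()) {
            Thread.sleep(10); // assumed poll interval
        }
    }

    // Holds a lock, starts a contender, and waits until it is really blocked.
    static boolean demo() throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();
        Thread contender = new Thread(() -> {
            lock.lock();   // blocks until the main thread unlocks
            lock.unlock();
        });
        contender.setDaemon(true);
        contender.start();

        awaitQueued(lock); // returns only once the contender is waiting
        boolean queued = lock.hasQueuedThread(contender);

        lock.unlock();     // let the contender finish
        contender.join();
        return queued;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("contender was queued: " + demo());
    }
}
```

Compared with sleeping for a fixed interval and then checking, this removes the race between the checker and the threads under test, at the cost of hanging (until the harness timeout) when something is actually wrong.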