[9] RFR(S): 8075214: SIGSEGV in nmethod sweeping

Fri Mar 27 09:30:09 UTC 2015

David, thanks again for looking into this.

Best,
Tobias

On 27.03.2015 09:51, David Holmes wrote:
> Correction ...
> 
> On 27/03/2015 3:25 PM, David Holmes wrote:
>> Hi Tobias,
>>
>> On 26/03/2015 10:52 PM, Tobias Hartmann wrote:
>>> Hi Vladimir,
>>>
>>> On 25.03.2015 18:10, Vladimir Kozlov wrote:
>>>> This is messy. Tobias, can you investigate if we can have the same
>>>> functionality (sweeping on request) in already running sweeper thread
>>>> without crating new thread for that? It may require additional
>>>> synchronization which should be fine since it is testing - no need to
>>>> worry about performance.
>>>
>>> I agree. I removed creation of a new sweeper thread and added the
>>> method NMethodSweeper::force_sweep() to enforce a sweep by setting
>>> '_force_sweep', notifying the sweeper and waiting for completion.
>>> Synchronization is done by using the CodeCache_lock.
>>>
>>> Here are the new webrevs:
>>> http://cr.openjdk.java.net/~thartmann/8075214/top/webrev.00/
>>> http://cr.openjdk.java.net/~thartmann/8075214/webrev.01/
>>
>> This won't quite work. Suppose the sweeper thread is active in
>> possible_sweep, _should_sweep is true and forced/_force_sweep are false,
>> and sweeping is in progress. Now a force_sweep request comes in: sets
>> _force_sweep to true, performs a notify (which does nothing) and then
>> does a wait(1000). The sweeper thread completes the sweep, sees
>> forced==false and so doesn't do a notify, and itself goes back to
>> waiting until it either times out (after 24 hours!) or is notified again
>> (when codecache fills up enough). The force_sweep caller will keep
>> looping checking _forced_sweep, which is still true, and doing a wait
>> for 1 second until the next actual sweep occurs.
> 
> I overlooked the fact that when it wakes up after 1 second it will notify the sweeper thread again. So no problem. Thanks for pointing that out Tobias.
> 
> David
> 
>> I think you need the sweeper_thread to check _forced_sweep before
>> waiting ie:
>>
>>   251 void NMethodSweeper::sweeper_loop() {
>>   252   bool timeout;
>>   253   while (true) {
>>   254     {
>>   255       ThreadBlockInVM tbivm(JavaThread::current());
>>   256       MutexLockerEx waiter(CodeCache_lock,
>> Mutex::_no_safepoint_check_flag);
>> +          if (!_forced_sweep) {
>>     257       const long wait_time = 60*60*24 * 1000;
>>     258       timeout =
>> CodeCache_lock->wait(Mutex::_no_safepoint_check_flag, wait_time);
>> +          }
>>   259     }
>> ! 260     if (!timeout || _forced_sweep) {
>>   261       possibly_sweep();
>>   262     }
>>   263   }
>>   264 }
>>
>> That would trigger two back-to-back sweeps. Alternatively, if two
>> back-to-backs sweeps is not ideal then in possibly_sweep you'd need to
>> unconditionally grab the lock and check _forced_sweep and clear it and
>> notify. But you must check _forced_sweep with the lock hold.
>>
>> Also under the current approach concurrent calls to force_sweep wouldn't
>> force a sweep per call, but would simply block until A forced sweep had
>> occurred (or just a sweep depending on what you do above). That's
>> probably what you want but I wanted to flag it just in case.
>>
>> Cheers,
>> David
>>
>>> I executed 1k runs of the failing test.
>>>
>>> Thanks,
>>> Tobias
>>>
>>>> I looked in reviews of original changes for 8064669 and we did not
>>>> ask about that.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 3/25/15 6:53 AM, Tobias Hartmann wrote:
>>>>> Hi,
>>>>>
>>>>> please review the following patch.
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8075214
>>>>> http://cr.openjdk.java.net/~thartmann/8075214/webrev.00/
>>>>>
>>>>> Problem:
>>>>> The test uses the Whitebox API to enforce sweeping by creating and
>>>>> starting a 'CodeCacheSweeperThread'. During creation of the thread,
>>>>> the interpreter crashes in j.l.ThreadGroup.add(Thread t) [1] while
>>>>> executing a subtype check to validate that 't' is a subtype of
>>>>> j.l.Thread [2]. The problem is that we pass
>>>>> 'JavaThread->threadObj()' to 'ThreadGroup.add' which is invalid due
>>>>> to a GC that moved the object. The GC does not know about the thread
>>>>> because it was not yet added to the threads list and therefore does
>>>>> not update the oop.
>>>>>
>>>>> Solution:
>>>>> Instead of calling 'JavaThread::allocate_threadObj', the
>>>>> initialization is moved to the caller to make sure that setting the
>>>>> thread oop is done together with adding the thread to the threads
>>>>> list. I also fixed the missing oom handling described as one of the
>>>>> problems in JDK-8072377 [3].
>>>>>
>>>>> Testing:
>>>>> - 1k runs of failing testcase
>>>>> - JPRT
>>>>>
>>>>> Thanks,
>>>>> Tobias
>>>>>
>>>>> [1]
>>>>> http://hg.openjdk.java.net/jdk9/hs-comp/jdk/file/tip/src/java.base/share/classes/java/lang/ThreadGroup.java#l896
>>>>>
>>>>> [2] see '__ gen_subtype_check' in 'TemplateTable::aastore'
>>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8072377
>>>>>