[9] RFR(S): 8075214: SIGSEGV in nmethod sweeping

Fri Mar 27 08:51:38 UTC 2015

Correction ...

On 27/03/2015 3:25 PM, David Holmes wrote:
> Hi Tobias,
>
> On 26/03/2015 10:52 PM, Tobias Hartmann wrote:
>> Hi Vladimir,
>>
>> On 25.03.2015 18:10, Vladimir Kozlov wrote:
>>> This is messy. Tobias, can you investigate if we can have the same
>>> functionality (sweeping on request) in already running sweeper thread
>>> without crating new thread for that? It may require additional
>>> synchronization which should be fine since it is testing - no need to
>>> worry about performance.
>>
>> I agree. I removed creation of a new sweeper thread and added the
>> method NMethodSweeper::force_sweep() to enforce a sweep by setting
>> '_force_sweep', notifying the sweeper and waiting for completion.
>> Synchronization is done by using the CodeCache_lock.
>>
>> Here are the new webrevs:
>> http://cr.openjdk.java.net/~thartmann/8075214/top/webrev.00/
>> http://cr.openjdk.java.net/~thartmann/8075214/webrev.01/
>
> This won't quite work. Suppose the sweeper thread is active in
> possible_sweep, _should_sweep is true and forced/_force_sweep are false,
> and sweeping is in progress. Now a force_sweep request comes in: sets
> _force_sweep to true, performs a notify (which does nothing) and then
> does a wait(1000). The sweeper thread completes the sweep, sees
> forced==false and so doesn't do a notify, and itself goes back to
> waiting until it either times out (after 24 hours!) or is notified again
> (when codecache fills up enough). The force_sweep caller will keep
> looping checking _forced_sweep, which is still true, and doing a wait
> for 1 second until the next actual sweep occurs.

I overlooked the fact that when it wakes up after 1 second it will 
notify the sweeper thread again. So no problem. Thanks for pointing that 
out Tobias.

David

> I think you need the sweeper_thread to check _forced_sweep before
> waiting ie:
>
>   251 void NMethodSweeper::sweeper_loop() {
>   252   bool timeout;
>   253   while (true) {
>   254     {
>   255       ThreadBlockInVM tbivm(JavaThread::current());
>   256       MutexLockerEx waiter(CodeCache_lock,
> Mutex::_no_safepoint_check_flag);
> +          if (!_forced_sweep) {
>     257       const long wait_time = 60*60*24 * 1000;
>     258       timeout =
> CodeCache_lock->wait(Mutex::_no_safepoint_check_flag, wait_time);
> +          }
>   259     }
> ! 260     if (!timeout || _forced_sweep) {
>   261       possibly_sweep();
>   262     }
>   263   }
>   264 }
>
> That would trigger two back-to-back sweeps. Alternatively, if two
> back-to-backs sweeps is not ideal then in possibly_sweep you'd need to
> unconditionally grab the lock and check _forced_sweep and clear it and
> notify. But you must check _forced_sweep with the lock hold.
>
> Also under the current approach concurrent calls to force_sweep wouldn't
> force a sweep per call, but would simply block until A forced sweep had
> occurred (or just a sweep depending on what you do above). That's
> probably what you want but I wanted to flag it just in case.
>
> Cheers,
> David
>
>> I executed 1k runs of the failing test.
>>
>> Thanks,
>> Tobias
>>
>>> I looked in reviews of original changes for 8064669 and we did not
>>> ask about that.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 3/25/15 6:53 AM, Tobias Hartmann wrote:
>>>> Hi,
>>>>
>>>> please review the following patch.
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8075214
>>>> http://cr.openjdk.java.net/~thartmann/8075214/webrev.00/
>>>>
>>>> Problem:
>>>> The test uses the Whitebox API to enforce sweeping by creating and
>>>> starting a 'CodeCacheSweeperThread'. During creation of the thread,
>>>> the interpreter crashes in j.l.ThreadGroup.add(Thread t) [1] while
>>>> executing a subtype check to validate that 't' is a subtype of
>>>> j.l.Thread [2]. The problem is that we pass
>>>> 'JavaThread->threadObj()' to 'ThreadGroup.add' which is invalid due
>>>> to a GC that moved the object. The GC does not know about the thread
>>>> because it was not yet added to the threads list and therefore does
>>>> not update the oop.
>>>>
>>>> Solution:
>>>> Instead of calling 'JavaThread::allocate_threadObj', the
>>>> initialization is moved to the caller to make sure that setting the
>>>> thread oop is done together with adding the thread to the threads
>>>> list. I also fixed the missing oom handling described as one of the
>>>> problems in JDK-8072377 [3].
>>>>
>>>> Testing:
>>>> - 1k runs of failing testcase
>>>> - JPRT
>>>>
>>>> Thanks,
>>>> Tobias
>>>>
>>>> [1]
>>>> http://hg.openjdk.java.net/jdk9/hs-comp/jdk/file/tip/src/java.base/share/classes/java/lang/ThreadGroup.java#l896
>>>>
>>>> [2] see '__ gen_subtype_check' in 'TemplateTable::aastore'
>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8072377
>>>>