[9] RFR(S): 8075214: SIGSEGV in nmethod sweeping

David Holmes david.holmes at oracle.com
Fri Mar 27 05:25:15 UTC 2015


Hi Tobias,

On 26/03/2015 10:52 PM, Tobias Hartmann wrote:
> Hi Vladimir,
>
> On 25.03.2015 18:10, Vladimir Kozlov wrote:
>> This is messy. Tobias, can you investigate if we can have the same functionality (sweeping on request) in already running sweeper thread without crating new thread for that? It may require additional synchronization which should be fine since it is testing - no need to worry about performance.
>
> I agree. I removed creation of a new sweeper thread and added the method NMethodSweeper::force_sweep() to enforce a sweep by setting '_force_sweep', notifying the sweeper and waiting for completion. Synchronization is done by using the CodeCache_lock.
>
> Here are the new webrevs:
> http://cr.openjdk.java.net/~thartmann/8075214/top/webrev.00/
> http://cr.openjdk.java.net/~thartmann/8075214/webrev.01/

This won't quite work. Suppose the sweeper thread is active in 
possible_sweep, _should_sweep is true and forced/_force_sweep are false, 
and sweeping is in progress. Now a force_sweep request comes in: sets 
_force_sweep to true, performs a notify (which does nothing) and then 
does a wait(1000). The sweeper thread completes the sweep, sees 
forced==false and so doesn't do a notify, and itself goes back to 
waiting until it either times out (after 24 hours!) or is notified again 
(when codecache fills up enough). The force_sweep caller will keep 
looping checking _forced_sweep, which is still true, and doing a wait 
for 1 second until the next actual sweep occurs.

I think you need the sweeper_thread to check _forced_sweep before 
waiting ie:

  251 void NMethodSweeper::sweeper_loop() {
  252   bool timeout;
  253   while (true) {
  254     {
  255       ThreadBlockInVM tbivm(JavaThread::current());
  256       MutexLockerEx waiter(CodeCache_lock, 
Mutex::_no_safepoint_check_flag);
+          if (!_forced_sweep) {
    257       const long wait_time = 60*60*24 * 1000;
    258       timeout = 
CodeCache_lock->wait(Mutex::_no_safepoint_check_flag, wait_time);
+          }
  259     }
! 260     if (!timeout || _forced_sweep) {
  261       possibly_sweep();
  262     }
  263   }
  264 }

That would trigger two back-to-back sweeps. Alternatively, if two 
back-to-backs sweeps is not ideal then in possibly_sweep you'd need to 
unconditionally grab the lock and check _forced_sweep and clear it and 
notify. But you must check _forced_sweep with the lock hold.

Also under the current approach concurrent calls to force_sweep wouldn't 
force a sweep per call, but would simply block until A forced sweep had 
occurred (or just a sweep depending on what you do above). That's 
probably what you want but I wanted to flag it just in case.

Cheers,
David

> I executed 1k runs of the failing test.
>
> Thanks,
> Tobias
>
>> I looked in reviews of original changes for 8064669 and we did not ask about that.
>>
>> Thanks,
>> Vladimir
>>
>> On 3/25/15 6:53 AM, Tobias Hartmann wrote:
>>> Hi,
>>>
>>> please review the following patch.
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8075214
>>> http://cr.openjdk.java.net/~thartmann/8075214/webrev.00/
>>>
>>> Problem:
>>> The test uses the Whitebox API to enforce sweeping by creating and starting a 'CodeCacheSweeperThread'. During creation of the thread, the interpreter crashes in j.l.ThreadGroup.add(Thread t) [1] while executing a subtype check to validate that 't' is a subtype of j.l.Thread [2]. The problem is that we pass 'JavaThread->threadObj()' to 'ThreadGroup.add' which is invalid due to a GC that moved the object. The GC does not know about the thread because it was not yet added to the threads list and therefore does not update the oop.
>>>
>>> Solution:
>>> Instead of calling 'JavaThread::allocate_threadObj', the initialization is moved to the caller to make sure that setting the thread oop is done together with adding the thread to the threads list. I also fixed the missing oom handling described as one of the problems in JDK-8072377 [3].
>>>
>>> Testing:
>>> - 1k runs of failing testcase
>>> - JPRT
>>>
>>> Thanks,
>>> Tobias
>>>
>>> [1] http://hg.openjdk.java.net/jdk9/hs-comp/jdk/file/tip/src/java.base/share/classes/java/lang/ThreadGroup.java#l896
>>> [2] see '__ gen_subtype_check' in 'TemplateTable::aastore'
>>> [3] https://bugs.openjdk.java.net/browse/JDK-8072377
>>>


More information about the hotspot-compiler-dev mailing list