RFR(s): 8185278: TestGreyReclaimedHumongousObjects.java fails guarantee(index != trim_index(_head_index + 1)) failed: should not go past head

sangheon.kim sangheon.kim at oracle.com
Thu Oct 12 23:19:01 UTC 2017


Hi all,

Could I have some reviews for a fix to a G1 MMU concurrency problem?

* Background
As this bug occurred with the DetGC option enabled, it was filed as a 
confidential bug.
The short description is:
------------------------------------
gc/g1/TestGreyReclaimedHumongousObjects.java
     Test failed the following assert on Linux X64 64-bit Server VM:

     # Internal Error (g1MMUTracker.cpp:155), pid=9433, tid=9449
     # guarantee(index != trim_index(_head_index + 1)) failed: should not go past head
------------------------------------

* Analysis
G1MMUTrackerQueue::_head_index and _tail_index can never be equal while 
_no_entries == QueueLength, so the queue data must have been corrupted.
G1MMUTrackerQueue::add_pause() is only called from the VMThread, but 
G1MMUTrackerQueue::when_sec(), which internally calls 
remove_expired_entries(), can be called concurrently from the 
ConcurrentMarkThread. when_sec() is guarded by MMUTracker_lock, while 
add_pause() is not guarded at all.
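
To illustrate the invariant behind the guarantee, here is a simplified 
single-threaded model of the queue. It is only a sketch: the names follow 
this mail, but ToyMMUQueue, ToyPause, the queue length of 64, and 
invariant_holds() are assumptions made for illustration and are not the 
HotSpot source. It shows that when both operations run without 
interleaving, a full queue always has _tail_index == trim_index(_head_index + 1), 
so head and tail being equal while the queue is full points at an 
unsynchronized update.

```cpp
#include <cassert>

// Simplified single-threaded model. Names follow the mail; the bodies are
// illustrative assumptions, not the HotSpot implementation.
struct ToyPause {
    double start_time;
    double end_time;
};

class ToyMMUQueue {
public:
    enum { QueueLength = 64 };  // assumed size, for illustration only

    ToyMMUQueue()
        : _array(),
          _head_index(0),
          _tail_index(trim_index(_head_index + 1)),
          _no_entries(0) {}

    // Wrap an index into [0, QueueLength).
    static int trim_index(int index) { return index % QueueLength; }

    // Writer side: record one pause, overwriting the oldest entry when full.
    void add_pause(double start, double end) {
        _head_index = trim_index(_head_index + 1);
        if (_no_entries == QueueLength) {
            _tail_index = trim_index(_tail_index + 1);  // drop the oldest
        } else {
            ++_no_entries;
        }
        _array[_head_index] = ToyPause{start, end};
    }

    // Reader side: drop entries that ended at or before `limit`.
    void remove_expired_entries(double limit) {
        while (_no_entries > 0 && _array[_tail_index].end_time <= limit) {
            _tail_index = trim_index(_tail_index + 1);
            --_no_entries;
        }
    }

    // Head is always (_no_entries - 1) slots ahead of tail, modulo
    // QueueLength. For a full queue this means tail == trim_index(head + 1).
    bool invariant_holds() const {
        return trim_index(_tail_index + _no_entries + QueueLength - 1)
               == _head_index;
    }

    int head_index() const { return _head_index; }
    int tail_index() const { return _tail_index; }
    int no_entries() const { return _no_entries; }

private:
    ToyPause _array[QueueLength];
    int _head_index;  // most recent entry
    int _tail_index;  // oldest entry
    int _no_entries;
};
```

Both add_pause() and remove_expired_entries() update _tail_index and 
_no_entries in separate steps, so if the two run concurrently without a 
common guard, the relation checked by invariant_holds() can be broken.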

* Proposal
Instead of taking MMUTracker_lock in add_pause(), it would be better to 
use SuspendibleThreadSetJoiner, as this has 2 additional benefits.
1. If a young gc is running but has not yet recorded its gc time for MMU, 
that gc time will be reflected in this MMU calculation, as STS will 
suspend the ConcurrentMarkThread.
2. The ConcurrentMarkThread will not sleep while a young gc is in 
progress, which makes MMU more accurate.
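
The idea of the proposal can be sketched with a toy stand-in for a 
suspendible thread set. This is an assumption-laden illustration, not 
HotSpot's SuspendibleThreadSet: ToySuspendibleSet and ToyJoiner are 
hypothetical names, only a single pausing thread is assumed, and the 
pausing side here simply waits for joined threads to leave their scope, 
whereas in HotSpot joined threads yield at explicit points.

```cpp
#include <condition_variable>
#include <mutex>

// Toy stand-in for a suspendible thread set (hypothetical, not HotSpot code).
// Assumes exactly one pausing thread calls synchronize()/desynchronize().
class ToySuspendibleSet {
public:
    ToySuspendibleSet() : _joined(0), _pause_active(false) {}

    // Concurrent thread enters a joined scope; blocks while a pause is active.
    void join() {
        std::unique_lock<std::mutex> l(_m);
        _cv.wait(l, [this] { return !_pause_active; });
        ++_joined;
    }

    // Concurrent thread leaves its joined scope.
    void leave() {
        std::lock_guard<std::mutex> l(_m);
        --_joined;
        _cv.notify_all();
    }

    // Pausing thread: block new joiners, then wait until nobody is joined.
    void synchronize() {
        std::unique_lock<std::mutex> l(_m);
        _pause_active = true;
        _cv.wait(l, [this] { return _joined == 0; });
    }

    // Pausing thread: the pause is over, let joiners in again.
    void desynchronize() {
        std::lock_guard<std::mutex> l(_m);
        _pause_active = false;
        _cv.notify_all();
    }

    int joined() {
        std::lock_guard<std::mutex> l(_m);
        return _joined;
    }

    bool pause_active() {
        std::lock_guard<std::mutex> l(_m);
        return _pause_active;
    }

private:
    std::mutex _m;
    std::condition_variable _cv;
    int _joined;
    bool _pause_active;
};

// RAII helper: joins on construction, leaves on destruction.
class ToyJoiner {
public:
    explicit ToyJoiner(ToySuspendibleSet& s) : _s(s) { _s.join(); }
    ~ToyJoiner() { _s.leave(); }

private:
    ToyJoiner(const ToyJoiner&);             // not copyable
    ToyJoiner& operator=(const ToyJoiner&);  // not assignable
    ToySuspendibleSet& _s;
};
```

In this model the concurrent thread would wrap its when_sec() call in a 
ToyJoiner scope, and the pausing thread would bracket the whole pause, 
including add_pause(), with synchronize() and desynchronize(). The two 
can then never touch the queue at the same time, and the concurrent 
thread only computes its delay after the pause time has been recorded.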

CR: https://bugs.openjdk.java.net/browse/JDK-8185278
Webrev: http://cr.openjdk.java.net/~sangheki/8185278/webrev.0/
Testing: JPRT

Thanks,
Sangheon

[Bonus]
** Core dump analysis
I guess it happened near index 35 as G1MMUTrackerQueue::_array[35] is 
recorded for "Pause Cleanup".
In this case, there are 2 gcs which meet the condition of 'end time > 
limit' at G1MMUTrackerQueue::when_internal(). i.e.
(gdb) p _array[35] <= _mark_cleanup_start_sec
$106 = {_start_time = 48.060288561, _end_time = 48.060476459}
(gdb) p _array[36]
$107 = {_start_time = 48.061733350000004, _end_time = 48.062563427000001}
(gdb) p _array[37] <= _head_index == _tail_index == 37
$108 = {_start_time = 48.062946299000004, _end_time = 48.063457630000002}
(gdb) p _array[38] <= The oldest gc record
$109 = {_start_time = 47.888210037, _end_time = 47.888929198}

I think _tail_index should be 38, as _array[38] holds the oldest gc record.


