RFR(s): 8185278: TestGreyReclaimedHumongousObjects.java fails guarantee(index != trim_index(_head_index + 1)) failed: should not go past head
sangheon.kim
sangheon.kim at oracle.com
Thu Oct 12 23:19:01 UTC 2017
Hi all,
Could I have some reviews for a fix to a G1 MMU concurrency problem?
* Background
As this bug occurred with the DetGC option enabled, it became a
confidential bug.
The short description is:
------------------------------------
gc/g1/TestGreyReclaimedHumongousObjects.java
Test failed the following assert on Linux X64 64-bit Server VM:
# Internal Error (g1MMUTracker.cpp:155), pid=9433, tid=9449
# guarantee(index != trim_index(_head_index + 1)) failed: should not go past head
------------------------------------
* Analysis
G1MMUTrackerQueue::_head_index and _tail_index should never be equal
when _no_entries == QueueLength, so the queue data is corrupted.
G1MMUTrackerQueue::add_pause() is only called from the VMThread, but
G1MMUTrackerQueue::when_sec(), which internally calls
remove_expired_entries(), can be called concurrently from the
ConcurrentMarkThread. when_sec() is guarded by MMUTracker_lock while
add_pause() is not, so the two can modify the queue at the same time.
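To make the invariant concrete, here is a minimal single-threaded sketch
of a circular pause queue. It is a simplified model written for this
mail, not the HotSpot code: PauseQueueModel, kQueueLength and
indices_consistent() are hypothetical names, and the expiry test is
reduced to a plain comparison. It only shows which fields add_pause()
and remove_expired_entries() both modify, and that a full queue must
have tail == trim_index(head + 1).

```cpp
#include <cassert>

// Hypothetical, simplified model of a fixed-size circular pause queue.
struct Pause { double start; double end; };

class PauseQueueModel {
public:
  enum { kQueueLength = 64 };  // assumed length, for illustration only

  // Empty queue: head at 0, tail one slot ahead of head.
  PauseQueueModel() : _array(), _head_index(0), _tail_index(1), _no_entries(0) {}

  static int trim_index(int index) { return index % kQueueLength; }

  // tail must always sit (_no_entries - 1) slots behind head, modulo length.
  static bool indices_consistent(int head, int tail, int no_entries) {
    return tail == trim_index(head - no_entries + 1 + kQueueLength);
  }
  bool is_consistent() const {
    return indices_consistent(_head_index, _tail_index, _no_entries);
  }

  // Writer side: modifies head, and also tail when the queue is full.
  void add_pause(double start, double end, double time_slice) {
    remove_expired_entries(end, time_slice);
    _head_index = trim_index(_head_index + 1);
    if (_no_entries == kQueueLength) {
      // Full: the oldest entry is overwritten, so tail moves with head.
      _tail_index = trim_index(_tail_index + 1);
    } else {
      ++_no_entries;
    }
    _array[_head_index] = Pause{start, end};
    assert(is_consistent());
  }

  // Reader side: modifies tail and the entry count.
  void remove_expired_entries(double now, double time_slice) {
    double limit = now - time_slice;
    while (_no_entries > 0 && _array[_tail_index].end <= limit) {
      _tail_index = trim_index(_tail_index + 1);
      --_no_entries;
    }
  }

  int head_index() const { return _head_index; }
  int tail_index() const { return _tail_index; }
  int no_entries() const { return _no_entries; }

private:
  Pause _array[kQueueLength];
  int _head_index;
  int _tail_index;
  int _no_entries;
};
```

In this model, indices_consistent(37, 37, 64) is false and
indices_consistent(37, 38, 64) is true. If the two methods run on
different threads without a common guard, their updates to _tail_index
and _no_entries can interleave and break the invariant.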
* Proposal
Instead of taking MMUTracker_lock in add_pause(), it would be better to
use SuspendibleThreadSetJoiner, as it has 2 additional benefits.
1. If a young gc is running but has not yet updated its gc time for MMU,
its gc time will be reflected in this MMU calculation, as STS will
suspend the ConcurrentMarkThread.
2. The ConcurrentMarkThread will not sleep if there is a young gc, which
makes MMU more accurate.
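The joiner idea can be sketched as a scoped (RAII) guard. This is a
simplified model built on std::mutex and std::condition_variable, not
the HotSpot implementation: SuspendibleSetModel and JoinerModel are
hypothetical names, and the model has no yield operation, so it assumes
joined sections are short. It only illustrates that a joined concurrent
thread and a pause cannot overlap.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical, simplified model of a suspendible thread set.
class SuspendibleSetModel {
public:
  SuspendibleSetModel() : _joined(0), _suspend_all(false) {}

  // Concurrent thread: blocks while a pause is in progress, then joins.
  void join() {
    std::unique_lock<std::mutex> lock(_mutex);
    _cv.wait(lock, [this] { return !_suspend_all; });
    ++_joined;
  }

  // Concurrent thread: leaves the set and wakes a waiting pause.
  void leave() {
    std::lock_guard<std::mutex> lock(_mutex);
    assert(_joined > 0);
    --_joined;
    _cv.notify_all();
  }

  // Pausing thread: blocks new joiners, waits for current ones to leave.
  void synchronize() {
    std::unique_lock<std::mutex> lock(_mutex);
    _suspend_all = true;
    _cv.wait(lock, [this] { return _joined == 0; });
  }

  // Pausing thread: ends the pause and lets blocked joiners proceed.
  void desynchronize() {
    std::lock_guard<std::mutex> lock(_mutex);
    _suspend_all = false;
    _cv.notify_all();
  }

  int joined() const {
    std::lock_guard<std::mutex> lock(_mutex);
    return _joined;
  }

private:
  mutable std::mutex _mutex;
  std::condition_variable _cv;
  int _joined;
  bool _suspend_all;
};

// Scoped guard: joins on construction, leaves on destruction.
class JoinerModel {
public:
  explicit JoinerModel(SuspendibleSetModel& set) : _set(set) { _set.join(); }
  ~JoinerModel() { _set.leave(); }
  JoinerModel(const JoinerModel&) = delete;
  JoinerModel& operator=(const JoinerModel&) = delete;

private:
  SuspendibleSetModel& _set;
};
```

In this model the concurrent thread would wrap its MMU query in a
JoinerModel scope, and the pausing thread would call synchronize()
before recording the pause and desynchronize() afterwards, so the query
either runs before the pause starts or waits until the pause has been
recorded.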
CR: https://bugs.openjdk.java.net/browse/JDK-8185278
Webrev: http://cr.openjdk.java.net/~sangheki/8185278/webrev.0/
Testing: JPRT
Thanks,
Sangheon
[Bonus]
** Core dump analysis
I guess the corruption happened near index 35, as
G1MMUTrackerQueue::_array[35] is the record for "Pause Cleanup".
In this case, there are 2 gcs which meet the condition of 'end time >
limit' in G1MMUTrackerQueue::when_internal(), i.e.
(gdb) p _array[35] <= _mark_cleanup_start_sec
$106 = {_start_time = 48.060288561, _end_time = 48.060476459}
(gdb) p _array[36]
$107 = {_start_time = 48.061733350000004, _end_time = 48.062563427000001}
(gdb) p _array[37] <= _head_index == _tail_index == 37
$108 = {_start_time = 48.062946299000004, _end_time = 48.063457630000002}
(gdb) p _array[38] <= The oldest gc record
$109 = {_start_time = 47.888210037, _end_time = 47.888929198}
I think _tail_index should be 38.