RFR: 8139424: SIGSEGV, Problematic frame: # V [libjvm.so+0xd0c0cc] void InstanceKlass::oop_oop_iterate_oop_maps_specialized<true, oopDesc*, MarkAndPushClosure>

Thomas Schatzl thomas.schatzl at oracle.com
Thu Nov 12 08:28:00 UTC 2015


Hi Stefan,

On Wed, 2015-11-11 at 16:41 +0100, Stefan Johansson wrote:
> Hi,
> 
> Please review this fix for:
> https://bugs.openjdk.java.net/browse/JDK-8139424
> 
> Webrev:
> http://cr.openjdk.java.net/~sjohanss/8139424/hotspot.00/
> 
> Summary:
> The crash was caused by a faulty eager humongous reclaim. The reason for 
> reclaiming a live humongous object was an overlooked remembered set 
> entry when the object was treated as a candidate for humongous 
> reclamation. If the remembered set was expanded during the previous GC, 
> the code handling reclaim candidates would look at the old view of the 
> remembered set and due to that miss some entries. This was caused by 
> checking for eager reclaim candidates before calling cleanupHRRS, which 
> takes care of updating the remembered sets to be ready for iteration.
> 
> The fix was to simply move the call to rem_set()->cleanupHRRS() to 
> before register_humongous_regions_with_cset(), which is were we check 
> for reclaim candidates.
> 
> I also added a test that provokes this and asserts when run with a 
> fastdebug build, the test is not 100% deterministic and requires to be 
> run with some special parameters set in the JTREG header. The test will 
> never fail intermittently, and could possibly pass even though there are 
> problems in the code. I still think it is worth adding, to avoid doing 
> the same error again in the future.
> 
> Testing:
> Failure was reproducable in 1-2 hours on the sparc-host where it 
> occurred, and with this fix a 24 hour run was fine. The JTREG test also 
> fails without the fix but passes when it is applied.

Good catch!

As for the added guarantee, please either remove or make it an assert:
HeapRegionRemSet::occupied() can be very slow. Doing that calculation
twice adds insult to injury :)

Others already commented about the additional "s" in the added comment.
Could you also clarify in the comment why we need an up-to-date
remembered set in register_..., i.e. that that method iterates over the
remembered sets? That seems to be missing in the reasoning chain...

There is also a typo in the changeset comment, "obejct" instead of
"object".

Now cleanupHRRS() is executed twice during a single gc for no apparent
reason to me. Could you either remove it from
G1RemSet::prepare_for_oops_into_collection_set, or (I think) better move
that call up?

Actually I think that now with
G1CollectedHeap::pre_evacuate_collection_set(), both the call to
register_humongous_regions_with_cset() and
prepare_for_oops_into_collection_set() are better placed there. Both
methods seem to be preparation for the actual evacuation.

Did you check the runtime of this test on the very slow machines?

I also think we should make sure this change also ends up in the next 8u
(whatever kind of) update. Can you take care of this?

Thanks a lot for looking into this,
  Thomas





More information about the hotspot-gc-dev mailing list