RFR(S): 7192128: G1: Extend fix for 6948537 to G1's BOT

John Cuthbertson john.cuthbertson at oracle.com
Thu Aug 16 23:09:51 UTC 2012


Hi Everyone,

Can I have a couple of volunteers to review the fix for this CR? The 
webrev can be found at: http://cr.openjdk.java.net/~johnc/7192128/webrev.0/

Summary:
A while back, Ramki discovered an issue on Niagara systems where 
concurrent readers of the block offset table could see spurious zero 
entries as a result of the implementation of memset using the SPARC BIS 
instruction. At the time it was thought that G1 was not affected by this 
issue.
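
For illustration only, here is a minimal sketch of a fill that cannot expose a transient zero to concurrent readers. It is not the code in the webrev. It assumes the table is modelled as an array of `std::atomic<unsigned char>`. The names `fill_entries` and `read_entry` are hypothetical and do not exist in HotSpot.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical helper, not taken from the webrev: fill `count` table entries
// with `value`, one byte-sized store per entry. Each entry is written by
// exactly one store, so a concurrent reader sees either the old value or the
// new one, never an intermediate zero of the kind a block-initializing
// memset can expose.
inline void fill_entries(std::atomic<unsigned char>* entries,
                         std::size_t count,
                         unsigned char value) {
  assert(entries != nullptr || count == 0);
  for (std::size_t i = 0; i < count; ++i) {
    entries[i].store(value, std::memory_order_relaxed);
  }
}

// Reader side: a single relaxed load of one entry.
inline unsigned char read_entry(const std::atomic<unsigned char>* entries,
                                std::size_t index) {
  return entries[index].load(std::memory_order_relaxed);
}
```

The trade-off in this sketch is speed: a per-entry store loop gives up whatever block-store optimization the platform memset uses in exchange for entries that readers can always trust.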

During testing of the perm-gen removal changes, the development 
engineers started to see assertion failures and crashes in G1's block 
offset table. The assertions were about erroneous contents of the offset 
array. To investigate, the assertion was changed to print the values it 
checks, and the value displayed should not have failed the assertion:

> --- a/src/share/vm/gc_implementation/g1/g1BlockOffsetTable.cpp
> +++ b/src/share/vm/gc_implementation/g1/g1BlockOffsetTable.cpp
> @@ -546,7 +546,10 @@
>      assert(_array->offset_array(j) > 0 &&
>             _array->offset_array(j) <=
>               (u_char) (N_words+BlockOffsetArray::N_powers-1),
> -           "offset array should have been set");
> +           err_msg("offset array should have been set "
> +           SIZE_FORMAT " not > 0 OR " SIZE_FORMAT " not <= "
> +           SIZE_FORMAT, _array->offset_array(j), _array->offset_array(j),
> +           (N_words+BlockOffsetArray::N_powers-1)));
>    }
>  #endif
>  }

# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc:  SuppressErrorAt=/g1BlockOffsetTable.cpp:552
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/tmp/jprt/P1/173804.cphillim/s/src/share/vm/gc_implementation/g1/g1BlockOffsetTable.cpp:552), pid=5210, tid=27
#  assert(_array->offset_array(j) > 0 && _array->offset_array(j) <= (u_char) (N_words+BlockOffsetArray::N_powers-1)) failed: offset array should have been set 65 not > 0 OR 65 not <= 77
#

So the entry failed the assertion check, but when it was re-read for 
the error message it held a value (65) that should have passed the 
check (0 < 65 <= 77).
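
The mismatch arises because the assertion and its message each read the table entry separately. As a sketch only, and not part of the proposed fix, a diagnostic check can take a single snapshot of the entry so that the value reported is the value that was tested. `check_entry` is a hypothetical helper, not a HotSpot function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper, not HotSpot code: validate a block offset table entry
// that the caller has read exactly once. Returns an empty string when the
// entry lies in (0, max_entry]; otherwise returns a message describing the
// very value that failed the check.
inline std::string check_entry(unsigned char entry, std::size_t max_entry) {
  if (entry > 0 && entry <= max_entry) {
    return std::string();
  }
  char buf[96];
  std::snprintf(buf, sizeof(buf),
                "offset array should have been set: %u not in (0, %zu]",
                static_cast<unsigned>(entry), max_entry);
  return std::string(buf);
}
```

A caller would read the entry into a local variable once and pass that local to `check_entry`. A transient zero would then show up in the message as 0 rather than as the value written afterwards.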

It seems that G1 is not immune to the problem seen in 6948537 and I 
believe that the recent PLAB resizing change has increased the 
likelihood of hitting the issue as it can increase the amount of 
concurrent refinement of BlockOffsetTable entries.

Testing: jprt runs with the perm-gen removal changes, command line tests 
on SPARC and x86 systems, GCOld (which was the failing test) on SPARC 
systems.

Thanks,

JohnC


