Request for review (S): 7102044 G1: VM crashes with assert(old_end != new_end) failed: don't call this otherwise

Bengt Rutisson bengt.rutisson at oracle.com
Fri Oct 28 00:06:16 PDT 2011


Hi David,

Thanks for looking at this!

On 2011-10-28 08:53, David Holmes wrote:
> Hi Bengt,
>
> On 27/10/2011 10:37 PM, Bengt Rutisson wrote:
>>
>> Hi again,
>>
>> An updated webrev based on comments from Stefan and Vladimir
>> http://cr.openjdk.java.net/~brutisso/7102044/webrev.02/
>
> It took me a little while to get my head around the calculations here. 
> I think this:
>
> 113     const size_t max_words_per_size_t = 
> align_size_down((SIZE_MAX/HeapWordSize - header_size(type)), 
> MinObjAlignment);
>  114     const size_t max_elements_per_size_t = HeapWordSize * 
> max_words_per_size_t / type2aelembytes(type);
>  115     if ((size_t)max_jint < max_elements_per_size_t) {
>  116       return max_jint;
>  117     }
>  118     return (int32_t)max_elements_per_size_t;
>
> would be a little clearer if that first variable was called 
> max_element_words_per_size_t in the style of the original code.

Sure. I'll fix that.
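[Editor's note: the calculation quoted above can be sketched as a self-contained program. The VM constants and type queries are replaced by plain parameters here, so the parameter-based interface and the `k`-prefixed names are illustrative only, not the actual HotSpot code.]

```cpp
#include <cstdint>
#include <cstddef>

const size_t kHeapWordSize    = sizeof(void*); // 4 on 32-bit, 8 on 64-bit
const size_t kMinObjAlignment = 2;             // in heap words (illustrative)
const int32_t kMaxJint        = 2147483647;

// Round down to a multiple of 'alignment' (must be a power of two).
static size_t align_size_down(size_t size, size_t alignment) {
  return size & ~(alignment - 1);
}

// Maximum array length whose byte size still fits in a size_t,
// clamped to max_jint (the Java array length limit).
int32_t max_array_length(size_t header_size_in_words,
                         size_t bytes_per_element) {
  const size_t max_element_words_per_size_t =
      align_size_down(SIZE_MAX / kHeapWordSize - header_size_in_words,
                      kMinObjAlignment);
  const size_t max_elements_per_size_t =
      kHeapWordSize * max_element_words_per_size_t / bytes_per_element;
  if ((size_t)kMaxJint < max_elements_per_size_t) {
    return kMaxJint;
  }
  return (int32_t)max_elements_per_size_t;
}
```

On a 64-bit build this yields max_jint for every element size; on a 32-bit build it is smaller for wide element types, which is exactly the clamping under discussion.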

> 121 // for unit testing
>  122 #ifndef PRODUCT
>  123   static bool check_max_length_overflow(BasicType type);
>  124   static int32_t old_max_array_length(BasicType type);
>  125   static bool test_max_array_length();
>  126 #endif
>
> After 6 years on Hotspot I was not aware of the existence of 
> ExecuteInternalVMTests. :) I wonder when that existing quickSort test 
> was last run? That said, does this really warrant these internal tests 
> over a simple assert in the function itself?

Actually, I added ExecuteInternalVMTests earlier this summer when I 
added the quickSort implementation. But at the time I was not sure of 
the best way to get the tests run automatically. Now that I wanted to 
add more unit tests I thought I'd make sure they are run, so I asked 
around for how to do it.

> That said, does this really warrant these internal tests over a simple 
> assert in the function itself?

I think so, since these are purely static calculations. There is really 
no need to execute an assert (in debug builds) every time we do the 
calculation. I prefer to have this as a separate test. That also makes 
it more obvious that all types are tested.
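[Editor's note: a run-once test in this spirit might look like the sketch below. The element sizes and header value are illustrative stand-ins for type2aelembytes() and header_size(type), and the tested length is a trivial placeholder; the point is the shape (one loop, run once at startup) rather than the values.]

```cpp
#include <cstdint>
#include <cstddef>

// Illustrative element sizes (boolean through double) and header size;
// in the VM these come from type2aelembytes() and header_size(type).
static const uint64_t kElemSizes[] = {1, 1, 2, 2, 4, 4, 8, 8};
static const uint64_t kHeaderBytes = 16;

// True iff 'length' elements of the given size, plus the header, fit in
// a size_t byte count. The arithmetic is done in 64 bits so that the
// overflow being tested for cannot corrupt the test itself.
bool fits_in_size_t(uint64_t length, uint64_t bytes_per_element) {
  uint64_t bytes = length * bytes_per_element + kHeaderBytes;
  return (uint64_t)(size_t)bytes == bytes;
}

// One pass over every element size: whatever max length is computed for
// a type must yield a byte size that fits in size_t. A small stand-in
// length is used here, so the check trivially passes.
bool test_max_array_length() {
  for (size_t i = 0; i < sizeof(kElemSizes) / sizeof(kElemSizes[0]); i++) {
    if (!fits_in_size_t(1024, kElemSizes[i])) {
      return false;
    }
  }
  return true;
}
```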

> test/Makefile
>
> With regard to only executing the tests for debug/fastdebug, I think 
> JPRT may set JPRT_BUILD_FLAVOR which you could then check for the 
> presence of "debug". I'm not certain - it might only do that when 
> building, not when running the tests.

I'm not quite sure how you mean I can make use of JPRT_BUILD_FLAVOR. 
Did you mean that I could avoid having a separate build target if I 
could check whether or not it is a debug build and then piggyback the 
test execution on the server or client targets?

Thanks,
Bengt
>
> Cheers,
> David
>
>> Bengt
>>
>> On 2011-10-27 13:44, Bengt Rutisson wrote:
>>>
>>> Hi Vladimir,
>>>
>>> Thanks for looking at this change!
>>>
>>> On 2011-10-26 17:44, Vladimir Kozlov wrote:
>>>> Hi Bengt,
>>>>
>>>> I thought the VM should throw an OOM exception in such cases (as in
>>>> your example) since you can't allocate such arrays in a 32-bit VM.
>>>> Does the size check for OOM happen after the heap resize calculation?
>>>
>>> You are right. The VM will throw an OOME both before and after my
>>> change. But for different reasons.
>>>
>>> With my change we will detect already in
>>> typeArrayKlass::allocate_common() that the length <= max_length() check
>>> fails and throw an OOME. Before this change we could pass that test and
>>> go on along the allocation path. This would result in a size_t overflow
>>> when we want to expand the heap. We would do a lot of extra work to
>>> expand to the wrong size. Then the allocation would fail and we would
>>> throw an OOME.
>>>
>>> I think it is better to detect this early and avoid the extra work.
>>> This way the assert that G1 does (which I think makes sense) will
>>> hold. We already have the length check so I assume that this was the
>>> intention from the beginning.
>>>
>>>> Is SIZE_MAX defined on all platforms?
>>>
>>> I did try to look around the HotSpot code for any existing uses of the
>>> maximum size_t value, but I could not find one. So I resorted to
>>> standard ways of expressing it. SIZE_MAX seems to be defined on all our
>>> platforms, since my code passes JPRT. And it seems to be part of the
>>> C99 standard:
>>>
>>> http://en.wikipedia.org/wiki/Size_t
>>>
>>> I thought it would be better to rely on this than on, for example,
>>> "(size_t)-1". But I am happy to use any other define.
>>>
>>>> You still have a problem with short[] and char[] since
>>>> 2*max_jint+header_size will overflow.
>>>
>>> Right. I did add some unit tests to catch this (but as you point out
>>> below those tests did not detect this overflow). Thanks for catching
>>> it. I will fix it and send out an updated webrev.
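[Editor's note: the overflow Vladimir points out is easy to see with 64-bit arithmetic: for 2-byte elements, the byte size of a max_jint-length array no longer fits in a 32-bit size_t. A small sketch, with an illustrative header size:]

```cpp
#include <cstdint>

// Byte size of a char[]/short[] of the given length, computed in 64 bits
// so the true value is visible. The header size is illustrative.
uint64_t char_array_bytes(uint64_t length) {
  const uint64_t kHeaderBytes = 16;
  return 2 * length + kHeaderBytes;
}

// True iff the value survives a round trip through a 32-bit size_t,
// i.e. a 32-bit VM could represent this byte size without wrapping.
bool fits_in_32_bit_size_t(uint64_t bytes) {
  return (uint64_t)(uint32_t)bytes == bytes;
}
```

For length = max_jint the true byte size is 2*2147483647 + 16 = 4294967310, just past the 32-bit limit of 4294967295.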
>>>
>>>> Also max_array_length() puts an unneeded limitation on double[] and
>>>> long[] max length in a 64-bit VM; it should be max_jint. I think we
>>>> can do something like the following:
>>>>
>>>> const size_t max_words_per_size_t =
>>>> align_size_down((SIZE_MAX/HeapWordSize - header_size(type)),
>>>> MinObjAlignment);
>>>> const size_t max_elements_per_size_t = HeapWordSize *
>>>> max_words_per_size_t / type2aelembytes(type);
>>>> if ((size_t)max_jint < max_elements_per_size_t)
>>>>   return max_jint;
>>>> return (int32_t)max_elements_per_size_t;
>>>
>>> I did not intend for my change to limit anything for 64 bits. That's
>>> why I added the MIN2((size_t)max_jint, max_words_per_size_t) statement
>>> and included the old calculation in the unit tests to verify that
>>> nothing has changed on 64 bits.
>>>
>>> I can change the code as you suggest, but I would like to understand
>>> what the problem is with my code first. I think it handles 64 bits
>>> correctly.
>>>
>>>> check_overflow() should use julong type for expressions instead of
>>>> size_t to catch overflow:
>>>>
>>>> +bool arrayOopDesc::check_overflow(BasicType type) {
>>>> + julong length = (julong)max_array_length(type);
>>>> + julong bytes_per_element = (julong)type2aelembytes(type);
>>>> + julong bytes = length * bytes_per_element +
>>>> (julong)header_size_in_bytes();
>>>> + return (julong)(size_t)bytes == bytes;
>>>> +}
>>>
>>> Right. This is why I didn't catch the problem above. Thanks. I'll fix.
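[Editor's note: Vladimir's check, rewritten as a self-contained sketch. julong is modeled as uint64_t, and the VM queries are replaced by plain parameters, so the interface is illustrative only.]

```cpp
#include <cstdint>
#include <cstddef>

typedef uint64_t julong;

// Computes the byte size entirely in julong so that a 32-bit overflow
// cannot hide inside the computation, then checks whether the result
// survives a round trip through size_t on this platform.
bool check_max_length_overflow(julong length,
                               julong bytes_per_element,
                               julong header_size_in_bytes) {
  julong bytes = length * bytes_per_element + header_size_in_bytes;
  return (julong)(size_t)bytes == bytes;
}
```

This is the key difference from a size_t-based check: done in size_t, the multiplication itself would wrap on 32-bit and the comparison would vacuously succeed.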
>>>
>>>> Rename check_overflow() to check_max_length_overflow() or something.
>>>
>>> Done.
>>>
>>>> I don't think we need incorrect old_max_array_length() method and
>>>> corresponding checks for LP64.
>>>
>>> Ok. I'll keep it for some more testing to make sure that I don't
>>> unintentionally change anything for 64 bit. But I'll remove it before
>>> I push.
>>>
>>>> Initialization and execution of additional JPRT tests will increase
>>>> job times. I think you could add InternalVMTests to the existing
>>>> client and server tests:
>>>>
>>>> < $(PRODUCT_HOME)/bin/java $(JAVA_OPTIONS) -version
>>>> ---
>>>> > $(PRODUCT_HOME)/bin/java $(JAVA_OPTIONS)
>>>> -XX:+IgnoreUnrecognizedVMOptions -XX:+ExecuteInternalVMTests -version
>>>>
>>>
>>> I see your point regarding the increased job times. But I think it is
>>> much clearer to have the internal unit tests as a separate target.
>>> That way they can be included and excluded with a JPRT command line.
>>>
>>> Also, I don't really like the IgnoreUnrecognizedVMOptions solution. I
>>> would like to be more explicit about the fact that this is only a
>>> (fast)debug build flag.
>>>
>>> Thanks,
>>> Bengt
>>>
>>>> regards,
>>>> Vladimir
>>>>
>>>> On 10/26/11 6:18 AM, Bengt Rutisson wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> Can I get a couple of reviews to this fairly small change?
>>>>> http://cr.openjdk.java.net/~brutisso/7102044/webrev.01/
>>>>>
>>>>> Sorry for the wide distribution, but the bug is reported against GC,
>>>>> I think the actual change (in arrayOop.hpp) is a runtime
>>>>> responsibility, and the method being changed is used in a few
>>>>> places in C2. So, I figured I'd like everyone to
>>>>> know that this is about to go in.
>>>>>
>>>>> CR:
>>>>> 7102044 G1: VM crashes with assert(old_end != new_end) failed: don't
>>>>> call this otherwise
>>>>> http://monaco.sfbay.sun.com/detail.jsf?cr=7102044
>>>>>
>>>>> Background:
>>>>> arrayOopDesc::max_array_length() returns the maximum length an array
>>>>> can have based on the type of the array elements.
>>>>> This is passed on through the allocation path as a word size. When
>>>>> we need to expand the heap we have to convert the
>>>>> word size to a byte size. The problem is that on 32-bit platforms
>>>>> this conversion may overflow a size_t, which results in
>>>>> expand() being called with a much smaller size than the
>>>>> required size, or even with a size of 0.
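[Editor's note: the word-to-byte conversion overflow can be sketched numerically. With 4-byte heap words, a request near max_jint words converts to a byte size above 2^32, so a 32-bit size_t wraps. The header size of 4 words is illustrative.]

```cpp
#include <cstdint>

// Converts a heap-word count to bytes the way a 32-bit VM would: the
// multiplication is truncated to 32 bits, modeling a 32-bit size_t.
uint32_t words_to_bytes_32(uint32_t word_size) {
  const uint32_t kHeapWordSize = 4;
  return word_size * kHeapWordSize; // wraps when the true product >= 2^32
}

// The true byte size, computed without truncation, for comparison.
uint64_t words_to_bytes_64(uint64_t word_size) {
  return word_size * 4;
}
```

For the reproducer below (2147483639 int elements plus an assumed 4 header words, i.e. 2147483643 words), the true byte size is 8589934572, but the 32-bit conversion yields 4294967276 — so expand() would be asked for roughly half the required memory.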
>>>>>
>>>>> Here is a small reproducer that will trigger the assert mentioned in
>>>>> the CR if we run it with -XX:+UseG1GC:
>>>>>
>>>>> public class LargeObj {
>>>>>     public static void main(String[] args) {
>>>>>         int[] a = new int[2147483639];
>>>>>     }
>>>>> }
>>>>>
>>>>> It is not only G1 that has this problem. ParallelGC has the same
>>>>> issue; it just doesn't have an assert to detect it. It
>>>>> can be argued that the GCs should check for overflows, but since
>>>>> arrayOopDesc::max_array_length() will catch the problem earlier
>>>>> and solve it for all GCs, I think it is the better
>>>>> place to fix the problem.
>>>>>
>>>>> I added some unit tests to test the new implementation. To make sure
>>>>> that these tests are run I had to update some more
>>>>> files. The actual change is in arrayOop.hpp. The rest of the files
>>>>> have just been changed to allow the unit tests to run.
>>>>>
>>>>> A few questions:
>>>>>
>>>>> (1) In arrayOopDesc::max_array_length() I use MIN2 to make sure that
>>>>> we don't overflow a size_t. This is only needed on
>>>>> 32-bit platforms, but I prefer this general code to an #ifndef
>>>>> _LP64. Any thoughts?
>>>>>
>>>>> (2) I made the unit tests run as JPRT make targets. I guess it could
>>>>> also be done as jtreg tests, but then they would
>>>>> only be executed for PIT etc. Any thoughts on this?
>>>>>
>>>>> Thanks,
>>>>> Bengt
>>>
>>



More information about the hotspot-dev mailing list