Request for review (S): 7102044 G1: VM crashes with assert(old_end != new_end) failed: don't call this otherwise

Vladimir Kozlov vladimir.kozlov at oracle.com
Thu Oct 27 09:04:10 PDT 2011


Bengt,

Please send to the hotspot-dev alias only. I'm sending this mail there.

You need to use a cast to the narrow type (size_t) in the final compare in check_max_length_overflow(), as I suggested 
before. Using the reverse calculation will not catch the integer overflow, since all the values are longs.
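
For illustration, a minimal standalone sketch of the difference (julong stands in for HotSpot's 64-bit unsigned type; 
the names and values are illustrative, not the webrev code):

  #include <stddef.h> // size_t
  typedef unsigned long long julong; // 64-bit unsigned, like HotSpot's julong

  bool fits_in_size_t(julong length, julong elem_bytes, julong header_bytes) {
    julong bytes = length * elem_bytes + header_bytes;
    // Narrowing cast: truncate to size_t and compare back. On a 32-bit
    // platform this is false whenever bytes needs more than 32 bits,
    // which is exactly the overflow we want to catch.
    return (julong)(size_t)bytes == bytes;
  }

  bool reverse_check(julong length, julong elem_bytes, julong header_bytes) {
    julong bytes = length * elem_bytes + header_bytes;
    // Reverse calculation: all operands are 64-bit, so no 64-bit overflow
    // happened and dividing back always reproduces the inputs. This is
    // always true and catches nothing.
    return (bytes - header_bytes) / elem_bytes == length;
  }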

 > This actually changes the implementation for 64 bits. We will always use max_jint (even for double, long, etc.) where 
we used to use an aligned value with the header subtracted. I think this is more correct, but it is different than before.

Yes, it is different than before. But I don't think we should treat long[] and double[] differently (in a 64-bit VM) than 
the other types, for which we return max_jint.

 > Also, an alternative to your implementation would be:

In general we prefer to avoid platform-specific (64-bit) ifdefs.

 >> Is SIZE_MAX defined on all platforms?

I am fine with SIZE_MAX if it passed JPRT (including embedded) as you said. If you can, please verify that it exists on 
BSD and Mac. Maybe Dan or Christian can help.

 > I see your point regarding the increased job times. But I think it is much clearer to have the internal unit tests as 
a separate target. That way they can be included and excluded with a JPRT command line.

Did you measure the additional total time they add to the JPRT job? If it is around 5 minutes I am fine with it.

Thanks,
Vladimir

On 10/27/11 5:37 AM, Bengt Rutisson wrote:
>
> Hi again,
>
> An updated webrev based on comments from Stefan and Vladimir
> http://cr.openjdk.java.net/~brutisso/7102044/webrev.02/
>
> Bengt
>
> On 2011-10-27 13:44, Bengt Rutisson wrote:
>>
>> Hi Vladimir,
>>
>> Thanks for looking at this change!
>>
>> On 2011-10-26 17:44, Vladimir Kozlov wrote:
>>> Hi Bengt,
>>>
>>> I thought the VM should throw an OOM exception in such cases (as in your example), since you can't allocate such
>>> arrays in a 32-bit VM. Does the size check for OOM happen after the heap resize calculation?
>>
>> You are right. The VM will throw an OOME both before and after my change, but for different reasons.
>>
>> With my change we will already detect in typeArrayKlass::allocate_common() that the length check (length <=
>> max_length()) fails, and throw an OOME. Before this change we could pass that test and continue down the allocation
>> path. This would result in a size_t overflow when we want to expand the heap. We would do a lot of extra work to
>> expand to the wrong size. Then the allocation would fail and we would throw an OOME.
>>
>> I think it is better to detect this early and avoid the extra work. This way the assert that G1 does (which I think
>> makes sense) will hold. We already have the length check, so I assume that this was the intention from the beginning.
>>
>>> Is SIZE_MAX defined on all platforms?
>>
>> I did try to look around the hotspot code for any uses of max size_t, but I could not find any. So I resorted to
>> looking at standard ways of doing it. SIZE_MAX seems to be defined on all our platforms, since my code passes JPRT.
>> And it seems to be part of the C99 standard:
>>
>> http://en.wikipedia.org/wiki/Size_t
>>
>> I thought it would be better to rely on this than on, for example, "(size_t)-1". But I am happy to use any other define.
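>>
>> Just to illustrate the two spellings (a trivial standalone sketch, not part of the webrev):
>>
>> #include <stddef.h> // size_t
>> #include <stdint.h> // C99 header that defines SIZE_MAX
>> #include <assert.h>
>>
>> int main() {
>>   // Both expressions denote the largest value a size_t can hold;
>>   // SIZE_MAX just says so by name.
>>   assert(SIZE_MAX == (size_t)-1);
>>   return 0;
>> }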
>>
>>> You still have a problem with short[] and char[], since 2*max_jint+header_size will overflow.
>>
>> Right. I did add some unit tests to catch this (but as you point out below those tests did not detect this overflow).
>> Thanks for catching it. I will fix it and send out an updated webrev.
>>
>>> Also, max_array_length() puts an unneeded limitation on the double[] and long[] max length in a 64-bit VM; it should
>>> be max_jint. I think we can do something like the following:
>>>
>>> const size_t max_words_per_size_t = align_size_down((SIZE_MAX/HeapWordSize - header_size(type)), MinObjAlignment);
>>> const size_t max_elements_per_size_t = HeapWordSize * max_words_per_size_t / type2aelembytes(type);
>>> if ((size_t)max_jint < max_elements_per_size_t)
>>>   return max_jint;
>>> return (int32_t)max_elements_per_size_t;
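>>>
>>> (To see what this evaluates to, here is a standalone adaptation with stand-in constants for an int[] array; the
>>> header and element sizes are illustrative, not the actual HotSpot values:)
>>>
>>> #include <stddef.h>
>>> #include <stdint.h>
>>> #include <stdio.h>
>>>
>>> int main() {
>>>   const size_t heap_word_size = sizeof(void*); // 4 on 32-bit, 8 on 64-bit
>>>   const size_t header_words   = 2;             // stand-in for header_size(type)
>>>   const size_t elem_bytes     = 4;             // stand-in for type2aelembytes(T_INT)
>>>   const size_t max_jint       = 2147483647;
>>>   // Divide SIZE_MAX by the word size first so the later multiply cannot overflow.
>>>   size_t max_words    = SIZE_MAX / heap_word_size - header_words; // alignment rounding omitted
>>>   size_t max_elements = heap_word_size * max_words / elem_bytes;
>>>   // On 64-bit this clamps to max_jint; on 32-bit the size_t bound is smaller.
>>>   size_t max_length   = max_elements < max_jint ? max_elements : max_jint;
>>>   printf("max length = %zu\n", max_length);
>>>   return 0;
>>> }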
>>
>> I did not intend for my change to limit anything for 64 bits. That's why I added the MIN2((size_t)max_jint,
>> max_words_per_size_t) statement and included the old calculation in the unit tests to verify that nothing has changed
>> on 64 bits.
>>
>> I can change the code as you suggest, but I would like to understand what the problem is with my code first. I think
>> it handles 64 bits correctly.
>>
>>> check_overflow() should use the julong type for its expressions instead of size_t to catch the overflow:
>>>
>>> +bool arrayOopDesc::check_overflow(BasicType type) {
>>> + julong length = (julong)max_array_length(type);
>>> + julong bytes_per_element = (julong)type2aelembytes(type);
>>> + julong bytes = length * bytes_per_element + (julong)header_size_in_bytes();
>>> + return (julong)(size_t)bytes == bytes;
>>> +}
>>
>> Right. This is why I didn't catch the problem above. Thanks. I'll fix.
>>
>>> Rename check_overflow() to check_max_length_overflow() or something.
>>
>> Done.
>>
>>> I don't think we need the incorrect old_max_array_length() method and the corresponding checks for LP64.
>>
>> Ok. I'll keep it for some more testing, to make sure that I don't unintentionally change anything for 64-bit. But I'll
>> remove it before I push.
>>
>>> Initialization and execution of additional JPRT tests will increase jobs time. I think you could add InternalVMTests
>>> to existing client and server tests:
>>>
>>> < $(PRODUCT_HOME)/bin/java $(JAVA_OPTIONS) -version
>>> ---
>>> > $(PRODUCT_HOME)/bin/java $(JAVA_OPTIONS) -XX:+IgnoreUnrecognizedVMOptions -XX:+ExecuteInternalVMTests -version
>>>
>>
>> I see your point regarding the increased job times. But I think it is much clearer to have the internal unit tests as
>> a separate target. That way they can be included and excluded with a JPRT command line.
>>
>> Also, I don't really like the IgnoreUnrecognizedVMOptions solution. I would like to be more explicit about the fact
>> that this is only a (fast)debug build flag.
>>
>> Thanks,
>> Bengt
>>
>>> regards,
>>> Vladimir
>>>
>>> On 10/26/11 6:18 AM, Bengt Rutisson wrote:
>>>>
>>>> Hi all,
>>>>
>>>> Can I get a couple of reviews to this fairly small change?
>>>> http://cr.openjdk.java.net/~brutisso/7102044/webrev.01/
>>>>
>>>> Sorry for the wide distribution, but the bug is reported against GC, I think the actual change (in arrayOop.hpp) is a
>>>> runtime responsibility, and the method that is being changed is used on a few occasions in C2. So, I figured I'd let
>>>> everyone know that this is about to go in.
>>>>
>>>> CR:
>>>> 7102044 G1: VM crashes with assert(old_end != new_end) failed: don't call this otherwise
>>>> http://monaco.sfbay.sun.com/detail.jsf?cr=7102044
>>>>
>>>> Background
>>>> arrayOopDesc::max_array_length() returns the maximum length an array can have based on the type of the array elements.
>>>> This is passed on through the allocation path as a word size. When we need to expand the heap we have to convert the
>>>> word size to a byte size. The problem is that on 32-bit platforms this conversion may overflow a size_t, which will
>>>> result in us calling expand() with a much smaller size than required, or even with a size of 0.
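>>>>
>>>> A standalone sketch of the wraparound (uint32_t stands in for a 32-bit size_t; the constants are illustrative):
>>>>
>>>> #include <stdint.h>
>>>> #include <stdio.h>
>>>>
>>>> int main() {
>>>>   uint32_t heap_word_size = 4;     // bytes per word on a 32-bit VM
>>>>   uint32_t word_size = 2147483641; // roughly the reproducer's array in words
>>>>   // Unsigned multiply wraps modulo 2^32: about 4.3GB instead of the
>>>>   // ~8.6GB actually required, and other lengths can wrap to near 0.
>>>>   uint32_t byte_size = word_size * heap_word_size;
>>>>   printf("byte_size = %u\n", (unsigned)byte_size);
>>>>   return 0;
>>>> }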
>>>>
>>>> Here is a small reproducer that will trigger the assert mentioned in the CR if we run it with -XX:+UseG1GC:
>>>>
>>>> public class LargeObj {
>>>>     public static void main(String[] args) {
>>>>         int[] a = new int[2147483639];
>>>>     }
>>>> }
>>>>
>>>> It is not only G1 that has this problem. ParallelGC has the same issue, it just doesn't have an assert to detect it.
>>>> It can be argued that the GCs should check for overflows, but since arrayOopDesc::max_array_length() will catch the
>>>> problem earlier and solve it for all GCs, I think it is a better place to fix the problem.
>>>>
>>>> I added some unit tests to test the new implementation. To make sure that these tests are run I had to update some more
>>>> files. The actual change is in arrayOop.hpp. The rest of the files have just been changed to allow the unit tests to
>>>> run.
>>>>
>>>> A few questions:
>>>>
>>>> (1) In arrayOopDesc::max_array_length() I use MIN2 to make sure that we don't overflow a size_t. This is only needed
>>>> on 32-bit platforms. I prefer this general code to an #ifndef _LP64. Any thoughts?
>>>>
>>>> (2) I made the unit tests run as JPRT make targets. I guess it could also be done as jtreg tests, but then they would
>>>> only be executed for PIT etc. Any thoughts on this?
>>>>
>>>> Thanks,
>>>> Bengt
>>
>

