Request for review (S): 7103665 HeapWord*ParallelScavengeHeap::failed_mem_allocate(unsigned long, bool)+0x97

Jon Masamitsu jon.masamitsu at oracle.com
Fri Mar 23 17:29:18 UTC 2012


Bengt,

Replies in-line.


On 3/23/2012 10:11 AM, Bengt Rutisson wrote:
>
> Jon,
>
> On 2012-03-23 16:57, Jon Masamitsu wrote:
>> Bengt,
>>
>> The changes you made are correct and an improvement, so if this
>> is time critical, I'm good with this version.  I do still have some
>> questions below.
>
> OK. Thanks.
>
>>
>> On 3/23/2012 8:31 AM, Bengt Rutisson wrote:
>>> Jon,
>>>
>>> Thanks for looking at this!
>>>
>>> 23 mar 2012 kl. 15:46 skrev Jon Masamitsu<jon.masamitsu at oracle.com>:
>>>
>>>> Bengt,
>>>>
>>>> Is it correct that there is still the possibility that during the 
>>>> final
>>>> iteration of the loop to do the filling, that the space to be filled
>>>> can be too small to fit an object?  I note the assertion
>>>>
>>>>>    98           assert(words_to_fill>= 
>>>>> CollectedHeap::min_fill_size(),
>>>>>    99             err_msg("Remaining siz ("SIZE_FORMAT ") is too 
>>>>> small to fill (based on " SIZE_FORMAT " and " SIZE_FORMAT ")",
>>>>>   100             words_to_fill, words_left_to_fill, 
>>>>> CollectedHeap::filler_array_max_size()));
>>>> You could fix that by checking for the situation in the next to last
>>>> iteration and shortening the filler for the next to last iteration?
>>> I think this should be safe. The heap size is object aligned and we 
>>> allocate aligned objects, so the hole should be object aligned, right?
>>>
>>> The max array size is also aligned. So I think we are safe. But to 
>>> be sure I added the assert.
>>>
>> This should be safe in 32bit but I'm not sure about 64bit.  The hole 
>> can be objected aligned
>> but still an odd number of words?
>
> Good point. But I think it is unlikely that you get holes that are 
> larger than what can be covered with one Integer array on 32 bit 
> plaforms, right?

Extremely unlikely but probably 5-10 lines of code to fix it?  I'll let your
conscience be your guide :-)  But you don't have to do anything  right now.

>
> Can it be an issue with compressed oops on 64 bit? I think we are safe 
> as long as we still don't have compressed headers. But please correct 
> me if I am wrong. My head is spinning a bit from all the 
> word/byte/jint conversions.
>
>>
>>>> Also a question below.
>>>>
>>>>
>>>> On 3/23/2012 5:56 AM, Bengt Rutisson wrote:
>>>>> Hi all,
>>>>>
>>>>> Could I have a couple of reviews for this change? This is on the 
>>>>> critical-watch list for hs23, so it would be great if I could get 
>>>>> some reviews already today.
>>>>>
>>>>> http://cr.openjdk.java.net/~brutisso/7103665/webrev.00/
>>>>>
>>>>> Some background:
>>>>>
>>>>> There are two problems that this fix solves:
>>>>>
>>>>> (1) When we run with -XX:+UseNUMA we have to make the heap 
>>>>> parsable since we leave some holes in the heap. We do this by 
>>>>> faking Integer arrays in those places. However, on a large heap 
>>>>> those holes can be larger than what can be covered by a single 
>>>>> Integer array. The fix is to allocate several arrays in those cases.
>>>>>
>>>>> The easy fix would have been to call 
>>>>> CollectedHeap::fill_with_objects() instead of 
>>>>> CollectedHeap::fill_with_object(). Note the extra "s" in the 
>>>>> method name. But MutableNUMASpace::ensure_parsability(), where the 
>>>>> change is needed, need to know about all the objects that are 
>>>>> allocated so I saw no better solution than to implement my own 
>>>>> loop. Any ideas on how I could re-use fill_with_objects() instead 
>>>>> are welcome.
>>>> Do you need to know all the objects that are allocated so that you can
>>>> do the invalid region detection?
>>>>
>>>> 113             if (crossing_start != crossing_end) {
>>>> 114               // If object header crossed a small page boundary 
>>>> we mark the area
>>>> 115               // as invalid rounding it to a page_size().
>>>>
>>> Yes, that's why.
>>
>> If you used fill_with_objects()  (with an "s"), you do know where 
>> first filler object is so
>> you could step through the objects using the object sizes to find the 
>> next filler object.
>> Then it seems like you could do this check (which I can't say I 
>> understand what's going
>> on there) in a second loop.
>
> Right. That's probably a nicer way of doing it. Especially since it is 
> only needed on Solaris (I think).
>
> However, I'd like to get this in to have a theoretical chance of 
> porting it to hs23. So, since you stated that you are ok with it as it 
> is I'll go ahead and push what I have now.

I have absolutely no problem with this.

Jon
>
> Thanks again for the review and the good comments!
> Bengt
>
>>
>> Jon
>>> Bengt
>>>
>>>
>>>> Jon
>>>>> (2) CollectedHeap::_filler_array_max_size should be the maximum 
>>>>> size that can be covered by a fake Integer array. This value is in 
>>>>> words, but since the word size is different on different platforms 
>>>>> but the Integer size is always the same we need to convert between 
>>>>> word and sizeof(jint) at some point. Unfortunately we do the 
>>>>> conversion in the wrong direction, which means that on 64 bit 
>>>>> platforms we end up with a max size that is 4 times larger than 
>>>>> intended. This in turn makes us overflow an int when we convert 
>>>>> from a word size to the length of the array that we are setting up.
>>>>>
>>>>> Here is the code that overflows:
>>>>>
>>>>> 328 void
>>>>> 329 CollectedHeap::fill_with_array(HeapWord* start, size_t words, 
>>>>> bool zap)
>>>>> 330 {
>>>>> 331   assert(words>= filler_array_min_size(), "too small for an 
>>>>> array");
>>>>> 332   assert(words<= filler_array_max_size(), "too big for a 
>>>>> single object");
>>>>> 333
>>>>> 334   const size_t payload_size = words - filler_array_hdr_size();
>>>>> 335   const size_t len = payload_size * HeapWordSize / sizeof(jint);
>>>>> 336
>>>>> 337   // Set the length first for concurrent GC.
>>>>> 338   ((arrayOop)start)->set_length((int)len);
>>>>> 339   post_allocation_setup_common(Universe::intArrayKlassObj(), 
>>>>> start, words);
>>>>> 340   DEBUG_ONLY(zap_filler_array(start, words, zap);)
>>>>> 341 }
>>>>>
>>>>> If filler_array_max_size() on line 332, which returns 
>>>>> CollectedHeap::_filler_array_max_size, has a too large value we 
>>>>> will overflow the int conversation on line 338.
>>>>>
>>>>> Testing:
>>>>> This fix solves the issue that was found in the reproducer that I 
>>>>> could set up on a Linux x64 machine. I have asked the one who 
>>>>> initially reported the issue to run on their Solaris amd64 setup. 
>>>>> I will also try to set up a reproducer on a Solaris machine.
>>>>>
>>>>> I also ran the fix through JPRT. It did not fail, but there are no 
>>>>> NUMA tests in there as far as I know. But the change for issue (2) 
>>>>> was hopefully tested.
>>>>>
>>>>> Thanks,
>>>>> Bengt
>



More information about the hotspot-gc-dev mailing list