Request for review (S): 7103665 HeapWord*ParallelScavengeHeap::failed_mem_allocate(unsigned long,bool)+0x97

Bengt Rutisson bengt.rutisson at oracle.com
Fri Mar 23 12:56:11 UTC 2012


Hi all,

Could I have a couple of reviews for this change? This is on the 
critical-watch list for hs23, so it would be great if I could get some 
reviews already today.

http://cr.openjdk.java.net/~brutisso/7103665/webrev.00/

Some background:

There are two problems that this fix solves:

(1) When we run with -XX:+UseNUMA we have to make the heap parsable 
since we leave some holes in the heap. We do this by faking Integer 
arrays in those places. However, on a large heap those holes can be 
larger than what can be covered by a single Integer array. The fix is to 
allocate several arrays in those cases.

The easy fix would have been to call CollectedHeap::fill_with_objects() 
instead of CollectedHeap::fill_with_object(). Note the extra "s" in the 
method name. But MutableNUMASpace::ensure_parsability(), where the 
change is needed, needs to know about all the objects that are 
allocated, so I saw no better solution than to implement my own loop. 
Any ideas on how I could re-use fill_with_objects() instead are welcome.
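To illustrate the shape of the fix, here is a minimal standalone sketch 
of the chunking loop (split_into_fillers is a hypothetical name, not 
HotSpot code): it splits a hole into pieces no larger than the maximum 
filler-array size, one fake array per piece. The real code also has to 
respect a minimum filler size and actually allocate the arrays, which 
this sketch ignores.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: split a hole of `words` heap words into chunks,
// each no larger than `max_words` (the max size a single filler array
// can cover). Each chunk would get its own fake Integer array.
static std::vector<size_t> split_into_fillers(size_t words,
                                              size_t max_words) {
    std::vector<size_t> chunks;
    while (words > 0) {
        size_t chunk = words < max_words ? words : max_words;
        chunks.push_back(chunk);
        words -= chunk;
    }
    return chunks;
}
```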

(2) CollectedHeap::_filler_array_max_size should be the maximum size 
that can be covered by a fake Integer array. This value is in words, but 
since the word size differs between platforms while the Integer size is 
always the same, we need to convert between words and sizeof(jint) at 
some point. Unfortunately we do the conversion in the wrong direction, 
which means that on 64-bit platforms we end up with a max size that is 
4 times larger than intended. This in turn makes us overflow an int when 
we convert from a word size to the length of the array that we are 
setting up.
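The direction matters because a heap word (8 bytes on 64-bit) holds two 
jints, so converting the wrong way inflates the limit by that factor 
squared. A sketch of the arithmetic, with illustrative constants and 
hypothetical names rather than HotSpot's actual ones:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

const size_t kHeapWordSize    = 8;           // word size on a 64-bit platform
const size_t kMaxJintArrayLen = 0x7FFFFFFF;  // roughly the max array length

// Correct direction: array length (in jints) -> heap words covered.
inline size_t correct_max_filler_words() {
    return kMaxJintArrayLen * sizeof(int32_t) / kHeapWordSize;
}

// Buggy direction: multiplies by the word size instead of dividing,
// inflating the limit by (kHeapWordSize / sizeof(jint))^2 == 4x here.
inline size_t buggy_max_filler_words() {
    return kMaxJintArrayLen * kHeapWordSize / sizeof(int32_t);
}
```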

Here is the code that overflows:

  328 void
  329 CollectedHeap::fill_with_array(HeapWord* start, size_t words, bool zap)
  330 {
  331   assert(words >= filler_array_min_size(), "too small for an array");
  332   assert(words <= filler_array_max_size(), "too big for a single object");
  333
  334   const size_t payload_size = words - filler_array_hdr_size();
  335   const size_t len = payload_size * HeapWordSize / sizeof(jint);
  336
  337   // Set the length first for concurrent GC.
  338   ((arrayOop)start)->set_length((int)len);
  339   post_allocation_setup_common(Universe::intArrayKlassObj(), start, words);
  340   DEBUG_ONLY(zap_filler_array(start, words, zap);)
  341 }

If filler_array_max_size() on line 332, which returns 
CollectedHeap::_filler_array_max_size, is too large, we will overflow 
the int conversion on line 338.
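To make the overflow concrete, here is a small sketch of the computation 
on line 335 (constants and names are illustrative): with the inflated 
limit, a payload around 2^31 words slips past the assert on line 332, 
and the resulting length no longer fits in an int.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

const size_t kHeapWordSize64 = 8;  // word size on a 64-bit platform

// Mirrors line 335: payload size in words -> array length in jints.
inline size_t filler_len_in_jints(size_t payload_words) {
    return payload_words * kHeapWordSize64 / sizeof(int32_t);
}
```

For payload_words = 2^31, this yields 2^32 jints, so the (int)len cast 
on line 338 cannot represent the value.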

Testing:
This fix solves the issue in a reproducer I was able to set up on a 
Linux x64 machine. I have asked the original reporter to verify it on 
their Solaris amd64 setup, and I will also try to set up a reproducer on 
a Solaris machine.

I also ran the fix through JPRT without failures. As far as I know there 
are no NUMA tests in there, so issue (1) is not really covered, but the 
change for issue (2) should at least have been exercised.

Thanks,
Bengt
