RFR (S): G1: Use ArrayAllocator for BitMaps

Bengt Rutisson bengt.rutisson at oracle.com
Thu Jun 13 07:53:32 PDT 2013


Hi everyone,

Could I have a couple of reviews for this small change?
http://cr.openjdk.java.net/~brutisso/8016556/webrev.00/

I am sending this request to the broader hotspot-dev mailing list 
since it touches code in vm/utilities.

Background:

In the constructor for the ConcurrentMark class in G1 we set up one bit 
map per worker thread:

   for (uint i = 0; i < _max_worker_id; ++i) {
     ...
     _count_card_bitmaps[i] = BitMap(card_bm_size, false);
     ...
   }

Each of these bitmaps is malloced, which means that the amount of 
C-heap we require grows with the number of GC worker threads. On a 
machine with many CPUs we get many worker threads by default; for 
example, on scaaa007 I get 79 GC worker threads. The size of each 
bitmap also scales with the heap size, and since this large machine 
has a lot of memory we get a large default heap as well.
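
As a rough back-of-the-envelope, assuming G1's default 512-byte card 
size with one bit per card (the 10 GB heap is only an illustrative 
figure, not the actual default heap on scaaa007):

   per-worker bitmap ~ heap_size / 512 / 8 = 10 GB / 4096 ~ 2.5 MB
   all workers       ~ 2.5 MB * 79         ~ 200 MB

which is in the same ballpark as the 221MB measured below.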

Here is the output from just running java -version with G1 on scaaa007:

$ java -d64 -XX:+UseG1GC -XX:+PrintMallocStatistics -version
java version "1.8.0-ea-fastdebug"
Java(TM) SE Runtime Environment (build 1.8.0-ea-fastdebug-b92)
Java HotSpot(TM) 64-Bit Server VM (build 25.0-b34-fastdebug, mixed mode)
allocation stats: 35199 mallocs (221MB), 13989 frees (0MB), 35MB resrc

We malloc 221MB just by starting the VM, and most of the large 
allocations are due to the BitMaps. My patch changes the BitMap 
allocations to use the ArrayAllocator instead. On Solaris that class 
uses mmap when the requested size is larger than 64K.
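
For reviewers not familiar with the technique, here is a minimal 
standalone sketch of the mmap-above-a-threshold strategy. This is 
illustrative only: the class name and interface are my own, not the 
real ArrayAllocator in vm/utilities; the 64K threshold is taken from 
the description above.

   #include <cstddef>
   #include <cstdlib>
   #include <sys/mman.h>

   // Sketch only, not HotSpot's ArrayAllocator: small requests go to
   // malloc, large ones go to mmap so they stay out of the C-heap.
   class LargeArrayAlloc {
     static const size_t MmapThreshold = 64 * 1024; // 64K, as above
     void*  _addr;
     size_t _size;
     bool   _mmapped;
    public:
     LargeArrayAlloc() : _addr(NULL), _size(0), _mmapped(false) {}

     void* allocate(size_t size) {
       _size = size;
       _mmapped = size > MmapThreshold;
       if (_mmapped) {
         _addr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANON, -1, 0);
         if (_addr == MAP_FAILED) _addr = NULL;
       } else {
         _addr = malloc(size);
       }
       return _addr;
     }

     void free_memory() {
       if (_addr == NULL) return;
       if (_mmapped) { munmap(_addr, _size); } else { free(_addr); }
       _addr = NULL;
     }
   };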

With this patch the output looks like this:

$ java -d64 -XX:+UseG1GC -XX:+PrintMallocStatistics -version
java version "1.8.0-ea-fastdebug"
Java(TM) SE Runtime Environment (build 1.8.0-ea-fastdebug-b93)
Java HotSpot(TM) 64-Bit Server VM (build 
25.0-b34-internal-201306130943.brutisso.hs-gc-g1-mmap-fastdebug, mixed mode)
allocation stats: 35217 mallocs (31MB), 14031 frees (0MB), 35MB resrc

We are down to 31MB.

Note that the ArrayAllocator only has this effect on Solaris machines. 
Also note that I have not reduced the total amount of memory, just moved 
it from the C-heap to mapped memory.

One complication with the fix is that the BitMap data structures get 
copied around quite a bit. The copies are shallow, so we don't risk 
re-doing the allocation. But since I am now embedding an 
ArrayAllocator in the BitMap structure, the destructor of each 
ArrayAllocator copy gets called every now and then. The BitMap 
explicitly frees the allocated memory when it thinks it is necessary. 
So, rather than trying to refactor the code to avoid the copying, I 
made it optional whether the ArrayAllocator destructor frees the 
allocated memory.
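
To make that interaction concrete, here is a minimal sketch of the 
optional-free pattern. Again illustrative only: the name 
free_in_destructor is mine and the real patch may spell this 
differently.

   #include <cstddef>
   #include <cstdlib>

   // Sketch only: the destructor frees the block only when asked to,
   // so shallow copies going out of scope do not release memory that
   // the original owner still references.
   class OptionalFreeAlloc {
     void* _addr;
     bool  _free_in_destructor;
    public:
     explicit OptionalFreeAlloc(bool free_in_destructor)
       : _addr(NULL), _free_in_destructor(free_in_destructor) {}

     ~OptionalFreeAlloc() {
       if (_free_in_destructor) {
         free_memory();
       }
     }

     void* allocate(size_t size) {
       _addr = malloc(size);
       return _addr;
     }

     // The owner (here: the BitMap) calls this explicitly when it
     // knows the memory is really no longer needed.
     void free_memory() {
       if (_addr != NULL) {
         free(_addr);
         _addr = NULL;
       }
     }
   };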

I do think it would be good to review the BitMap code. It seems a bit 
fishy that we pass around shallow copies. But I think my current change 
keeps the same behavior as before.

Thanks,
Bengt

