RFR: JDK-8061259: ParNew promotion failed is serialized on a lock

Jungwoo Ha jwha at google.com
Wed Oct 22 21:03:46 UTC 2014


Bug: https://bugs.openjdk.java.net/browse/JDK-8061259

hotspot code: http://cr.openjdk.java.net/~rasbold/8061259/webrev.00/

I created the patch for JDK9, but I can also observe this on JDK7 and 8.

We are seeing several cases where GC worker threads are serialized on
GCRareEvent_lock, causing double-digit-second pauses on a moderately
sized heap.
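
As a rough illustration of what "serialized on a lock" means here, the
sketch below is a toy model and not HotSpot code. The class name, the
lock object and the counters are all hypothetical, and it is unrelated
to the reproducer that follows. Several worker threads take one shared
lock on their slow path, and the code records how many of them are ever
inside that path at the same time.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

class SerializedWorkers {
  // Hypothetical stand-in for a single global lock (assumption, not HotSpot's lock).
  static final ReentrantLock rareEventLock = new ReentrantLock();

  // Starts `workers` threads that each enter the locked slow path `steps` times.
  // Returns the highest number of threads seen inside the slow path at once.
  static int maxConcurrentInSlowPath(int workers, int steps) {
    AtomicInteger inside = new AtomicInteger();
    AtomicInteger maxInside = new AtomicInteger();
    CountDownLatch start = new CountDownLatch(1);
    Thread[] threads = new Thread[workers];
    for (int w = 0; w < workers; w++) {
      threads[w] = new Thread(() -> {
        try {
          start.await(); // release all workers together
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        for (int s = 0; s < steps; s++) {
          rareEventLock.lock();
          try {
            int now = inside.incrementAndGet();
            maxInside.accumulateAndGet(now, Math::max);
            inside.decrementAndGet();
          } finally {
            rareEventLock.unlock();
          }
        }
      });
      threads[w].start();
    }
    start.countDown();
    try {
      for (Thread t : threads) {
        t.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return maxInside.get();
  }

  public static void main(String[] args) {
    System.out.println("max threads in slow path: " + maxConcurrentInSlowPath(6, 100000));
  }
}
```

However many workers are started, the recorded maximum stays at one, so
any work done while holding such a lock runs sequentially even though
the collector is nominally parallel.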

I have test code that reproduces the problem and shows that the patch
fixes it.

import java.util.LinkedList;
class PromoFail {

  static class Container {

    Container p;
    byte[] a;

    public Container(int size) {
      if (size > 0) {
        p = new Container(size / 2);
      } else {
        p = null;
      }
      a = new byte[size];
    }
  }

  public static void main(String args[]) {
    if (args.length < 1) {
      System.err.println("@ 1st argument must be size in MB.");
      System.exit(1);
    }
    int size = 0;
    try {
      size = Integer.parseInt(args[0]) * 1024 * 1024;
    } catch (NumberFormatException e) {
      System.err.println("@ Cannot parse the size(=" + args[0] + ")");
      System.exit(1);
    }

    // LinkedList will have more unbalanced workload.
    LinkedList<Container> list = new LinkedList<Container>();

    // 1st iteration adds element without removal.
    // These are all live objects.
    for (int i = 0; i < size / 4; i++) {
      list.add(new Container(1));
    }
    // Promote to the old gen.
    System.gc();

    for (int containerSize = 2; containerSize < 512; containerSize *= 3) {
      for (int i = 0; i < size / 4; i++) {
        // Most likely removing an old object due to System.gc() from
        // the previous iteration.
        // This will cause fragmentation.
        list.remove();
        list.add(new Container(containerSize));
      }

      {
        System.gc();
        Runtime runtime = Runtime.getRuntime();
        System.out.println("@ Current Used: "
            + (runtime.totalMemory() - runtime.freeMemory()) / 1024 / 1024);
      }
    }
  }
}

You can run it with the following parameters.

$ java -Xmx2g -Xms2g -Xmn1g -XX:+UseCMSFastPromotionFailure \
    -XX:+PrintGCDetails -XX:+UseConcMarkSweepGC -XX:ParallelGCThreads=6 \
    PromoFail 4

Without UseCMSFastPromotionFailure
#7: [GC (Allocation Failure) #7: [ParNew#6:
[CMS-concurrent-abortable-preclean: 0.003/0.203 secs] [Times: user=0.20
sys=0.20 real=0.20 secs]
 (promotion failed): 838912K->943744K(943744K), 62.0419534 secs]#8: [CMS
(concurrent mode failure): 1048441K->1048575K(1048576K), 1.7731336 secs]
1609551K->1170596K(1992320K), [Metaspace: 3547K->3547K(1056768K)],
*63.8151607*secs] [Times: user=93.50 sys=22.12 real=63.82 secs]

With UseCMSFastPromotionFailure
#7: [GC (Allocation Failure) #7: [ParNew#6:
[CMS-concurrent-abortable-preclean: 0.004/0.204 secs] [Times: user=0.30
sys=0.02 real=0.20 secs]
 (promotion failed): 838912K->943744K(943744K), 2.0949545 secs]#8: [CMS
(concurrent mode failure): 1048363K->1048575K(1048576K), 1.7517250 secs]
1609551K->1170595K(1992320K), [Metaspace: 3546K->3546K(1056768K)],
*3.8467384*secs] [Times: user=10.85 sys=1.04 real=3.85 secs]
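
To put the two log excerpts side by side, here is a small sketch that
divides the pause times reported above. The class and constant names
are only illustrative; the numbers are copied from the logs.

```java
class PauseSpeedup {
  // Pause times in seconds, copied from the two log excerpts above.
  static final double PARNEW_WITHOUT = 62.0419534; // ParNew (promotion failed), flag off
  static final double PARNEW_WITH = 2.0949545;     // ParNew (promotion failed), flag on
  static final double TOTAL_WITHOUT = 63.8151607;  // whole pause, flag off
  static final double TOTAL_WITH = 3.8467384;      // whole pause, flag on

  // How many times shorter the pause is with the flag enabled.
  static double ratio(double without, double with) {
    return without / with;
  }

  public static void main(String[] args) {
    System.out.printf("ParNew (promotion failed): %.1fx shorter%n",
        ratio(PARNEW_WITHOUT, PARNEW_WITH));
    System.out.printf("Whole pause: %.1fx shorter%n",
        ratio(TOTAL_WITHOUT, TOTAL_WITH));
  }
}
```

For this single run the ParNew phase is roughly 30 times shorter and the
whole pause roughly 17 times shorter. The CMS (concurrent mode failure)
part is about the same in both logs (1.77 vs. 1.75 secs), so the
difference comes from the ParNew promotion-failed phase.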

I also ran the DaCapo benchmarks; please see the attached results.
They cover the subset of DaCapo benchmarks that show any promotion-failed
pause. There are some speedups and no performance regressions.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cms-fast-promotion.png
Type: image/png
Size: 29019 bytes
Desc: not available
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20141022/4fd33ade/cms-fast-promotion.png>

