RFR: 8007806: Need a Throwables performance counter

Peter Levart peter.levart at gmail.com
Sat Feb 23 19:39:23 UTC 2013


Hi Nils,

If the counters are updated frequently from multiple threads, there
might be contention/scalability issues. Instead of synchronizing on
every update, you might consider the atomic operations provided by
sun.misc.Unsafe, for example:


Index: jdk/src/share/classes/sun/misc/PerfCounter.java
===================================================================
--- jdk/src/share/classes/sun/misc/PerfCounter.java
+++ jdk/src/share/classes/sun/misc/PerfCounter.java
@@ -25,6 +25,8 @@

  package sun.misc;

+import sun.nio.ch.DirectBuffer;
+
  import java.nio.ByteBuffer;
  import java.nio.ByteOrder;
  import java.nio.LongBuffer;
@@ -50,6 +52,8 @@
  public class PerfCounter {
      private static final Perf perf =
          AccessController.doPrivileged(new Perf.GetPerfAction());
+    private static final Unsafe unsafe =
+        Unsafe.getUnsafe();

      // Must match values defined in hotspot/src/share/vm/runtime/perfdata.hpp
      private final static int V_Constant  = 1;
@@ -59,12 +63,14 @@

      private final String name;
      private final LongBuffer lb;
+    private final DirectBuffer db;

      private PerfCounter(String name, int type) {
          this.name = name;
          ByteBuffer bb = perf.createLong(name, U_None, type, 0L);
          bb.order(ByteOrder.nativeOrder());
          this.lb = bb.asLongBuffer();
+        this.db = bb instanceof DirectBuffer ? (DirectBuffer) bb : null;
      }

      static PerfCounter newPerfCounter(String name) {
@@ -79,23 +85,44 @@
      /**
       * Returns the current value of the perf counter.
       */
-    public synchronized long get() {
+    public long get() {
+        if (db != null) {
+            return unsafe.getLongVolatile(null, db.address());
+        }
+        else {
+            synchronized (this) {
-        return lb.get(0);
-    }
+                return lb.get(0);
+            }
+        }
+    }

      /**
       * Sets the value of the perf counter to the given newValue.
       */
-    public synchronized void set(long newValue) {
+    public void set(long newValue) {
+        if (db != null) {
+            unsafe.putOrderedLong(null, db.address(), newValue);
+        }
+        else {
+            synchronized (this) {
-        lb.put(0, newValue);
-    }
+                lb.put(0, newValue);
+            }
+        }
+    }

      /**
       * Adds the given value to the perf counter.
       */
-    public synchronized void add(long value) {
-        long res = get() + value;
+    public void add(long value) {
+        if (db != null) {
+            unsafe.getAndAddLong(null, db.address(), value);
+        }
+        else {
+            synchronized (this) {
+                long res = lb.get(0) + value;
-        lb.put(0, res);
+                lb.put(0, res);
+            }
+        }
      }

      /**
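
For readers without access to sun.misc.Unsafe, here is a minimal sketch of the idea behind the atomic add() in the patch, using java.util.concurrent.atomic.AtomicLong as a stand-in for the native memory slot. The class name AtomicAddSketch and its methods are hypothetical and are not part of the patch. The retry loop shows the compare-and-set pattern that an atomic get-and-add can fall back to; on many platforms the JIT may replace it with a single hardware instruction.

```java
import java.util.concurrent.atomic.AtomicLong;

public class AtomicAddSketch {
    // Hypothetical stand-in for the counter's native memory slot.
    private final AtomicLong value = new AtomicLong();

    // Lock-free add: retry compare-and-set until no other thread interfered.
    public long getAndAdd(long delta) {
        while (true) {
            long current = value.get();
            if (value.compareAndSet(current, current + delta)) {
                return current; // value observed just before our update
            }
        }
    }

    public long get() {
        return value.get();
    }

    public static void main(String[] args) throws InterruptedException {
        final AtomicAddSketch counter = new AtomicAddSketch();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int k = 0; k < 100_000; k++) {
                    counter.getAndAdd(1);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(counter.get()); // prints 400000
    }
}
```

No update is lost even though no thread ever blocks, which is the property the patch relies on when it drops the synchronized keyword.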



With the current synchronized implementation, calling
PerfCounter.increment() in a loop from multiple threads sharing the same
PerfCounter instance produces the following results on a 4-core Intel i7
machine:

#
# PerfCounter_increment: run duration:  5,000 ms, #of logical CPUS: 8
#
            1 threads, Tavg =     19.02 ns/op (σ =   0.00 ns/op)
            2 threads, Tavg =    109.93 ns/op (σ =   6.17 ns/op)
            3 threads, Tavg =    136.64 ns/op (σ =   2.99 ns/op)
            4 threads, Tavg =    293.26 ns/op (σ =   5.30 ns/op)
            5 threads, Tavg =    316.94 ns/op (σ =   6.28 ns/op)
            6 threads, Tavg =    686.96 ns/op (σ =   7.09 ns/op)
            7 threads, Tavg =    793.28 ns/op (σ =  10.57 ns/op)
            8 threads, Tavg =    898.15 ns/op (σ =  14.63 ns/op)


With the presented patch, each operation is roughly 2.5 to 6 times faster:

#
# PerfCounter_increment: run duration:  5,000 ms, #of logical CPUS: 8
#
# Measure:
            1 threads, Tavg =      5.22 ns/op (σ =   0.00 ns/op)
            2 threads, Tavg =     34.51 ns/op (σ =   0.60 ns/op)
            3 threads, Tavg =     54.85 ns/op (σ =   1.42 ns/op)
            4 threads, Tavg =     74.67 ns/op (σ =   1.71 ns/op)
            5 threads, Tavg =     94.71 ns/op (σ =  41.68 ns/op)
            6 threads, Tavg =    114.80 ns/op (σ =  32.10 ns/op)
            7 threads, Tavg =    136.70 ns/op (σ =  26.80 ns/op)
            8 threads, Tavg =    158.48 ns/op (σ =   9.93 ns/op)
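
A measurement of this kind could be set up roughly as follows. This is only a sketch, not the harness used for the numbers above: IncrementBench and SyncCounter are hypothetical names, SyncCounter merely stands in for a shared PerfCounter, and reading Tavg as the mean per-thread time per operation and σ as its standard deviation across threads is an assumption.

```java
import java.util.concurrent.CountDownLatch;

public class IncrementBench {
    // Hypothetical stand-in for a shared counter with synchronized updates.
    static final class SyncCounter {
        private long value;
        synchronized void increment() { value++; }
    }

    // Returns {mean ns/op, standard deviation of ns/op} across worker threads.
    static double[] measure(int nThreads, final long durationMillis) {
        final SyncCounter counter = new SyncCounter();
        final long[] ops = new long[nThreads];
        final long[] nanos = new long[nThreads];
        final CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // release all workers at the same moment
                } catch (InterruptedException e) {
                    throw new IllegalStateException(e);
                }
                long begin = System.nanoTime();
                long deadline = begin + durationMillis * 1_000_000L;
                long now = begin;
                long n = 0;
                while (now < deadline) {
                    // Batch the calls so the clock is read only once per 1000 ops.
                    for (int k = 0; k < 1000; k++) {
                        counter.increment();
                    }
                    n += 1000;
                    now = System.nanoTime();
                }
                ops[id] = n;
                nanos[id] = now - begin;
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread t : workers) {
                t.join(); // join makes the workers' array writes visible here
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        double mean = 0.0;
        double[] perOp = new double[nThreads];
        for (int i = 0; i < nThreads; i++) {
            perOp[i] = (double) nanos[i] / ops[i];
            mean += perOp[i] / nThreads;
        }
        double variance = 0.0;
        for (double x : perOp) {
            variance += (x - mean) * (x - mean) / nThreads;
        }
        return new double[] { mean, Math.sqrt(variance) };
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        for (int n = 1; n <= cpus; n++) {
            double[] r = measure(n, 200); // short run; the runs above used 5,000 ms
            System.out.printf("%2d threads, Tavg = %9.2f ns/op (sigma = %6.2f ns/op)%n",
                    n, r[0], r[1]);
        }
    }
}
```

A sketch like this has no warm-up phase, so its absolute numbers will be noisier than those of a proper benchmark harness.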


Scalability is not much better, but raw speed is, so it might
cause less contention when used in real-world code. If you want even
better scalability, JDK 8 adds a new class,
java.util.concurrent.atomic.LongAdder. It offers no atomic "set()",
only "add()", and it can't update native-memory variables, so it could
only be used for add-only counters, in conjunction with a background
thread that periodically flushes the sum to native memory.
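
The LongAdder-plus-flusher idea could look roughly like the sketch below. Everything here apart from LongAdder itself is an assumption made for illustration: FlushingCounter is a hypothetical class, and a small direct ByteBuffer stands in for the perf-data memory that Perf.createLong() would return.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.LongBuffer;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

public class FlushingCounter {
    private final LongAdder adder = new LongAdder();
    // Hypothetical stand-in for the native perf-data slot.
    private final LongBuffer slot;
    private final ScheduledExecutorService flusher;

    public FlushingCounter(long periodMillis) {
        ByteBuffer bb = ByteBuffer.allocateDirect(8).order(ByteOrder.nativeOrder());
        this.slot = bb.asLongBuffer();
        this.flusher = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "counter-flusher");
            t.setDaemon(true); // do not keep the JVM alive just for flushing
            return t;
        });
        flusher.scheduleAtFixedRate(this::flush, periodMillis, periodMillis,
                TimeUnit.MILLISECONDS);
    }

    // Hot path: contended threads update separate cells inside the LongAdder.
    public void add(long value) {
        adder.add(value);
    }

    // Publishes the current sum to the native slot.
    synchronized void flush() {
        slot.put(0, adder.sum());
    }

    // Last value written to the native slot; may lag behind the true sum.
    synchronized long published() {
        return slot.get(0);
    }

    // Stops the background thread and publishes a final value.
    public void stop() {
        flusher.shutdown();
        try {
            flusher.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        flush();
    }

    public static void main(String[] args) throws InterruptedException {
        FlushingCounter counter = new FlushingCounter(100);
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int k = 0; k < 100_000; k++) {
                    counter.add(1);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        counter.stop();
        System.out.println(counter.published()); // prints 400000
    }
}
```

The trade-off is that external readers of the native slot see a value that can be up to one flush period stale.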

Regards, Peter


On 02/08/2013 06:10 PM, Nils Loodin wrote:
> It would be interesting to know the number of thrown throwables in the 
> JVM, to be able to do some high level application diagnostics / 
> statistics. A good way to put this number would be a performance 
> counter, since it is accessible both from Java and from the VM.
>
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8007806
> http://cr.openjdk.java.net/~nloodin/8007806/webrev.00/
>
> Regards,
> Nils Loodin




More information about the core-libs-dev mailing list