RFR: 8007806: Need a Throwables performance counter
David Holmes
david.holmes at oracle.com
Sat Feb 23 23:39:54 UTC 2013
Peter,
In your use of Unsafe you pass "null" as the object. I'm pretty certain
you can't pass null here. Unsafe operates on fields or array elements.
David
On 24/02/2013 5:39 AM, Peter Levart wrote:
> Hi Nils,
>
> If the counters are updated frequently from multiple threads, there
> might be contention/scalability issues. Instead of synchronization on
> updates, you might consider using atomic updates provided by
> sun.misc.Unsafe, like for example:
>
>
> Index: jdk/src/share/classes/sun/misc/PerfCounter.java
> ===================================================================
> --- jdk/src/share/classes/sun/misc/PerfCounter.java
> +++ jdk/src/share/classes/sun/misc/PerfCounter.java
> @@ -25,6 +25,8 @@
>
> package sun.misc;
>
> +import sun.nio.ch.DirectBuffer;
> +
> import java.nio.ByteBuffer;
> import java.nio.ByteOrder;
> import java.nio.LongBuffer;
> @@ -50,6 +52,8 @@
> public class PerfCounter {
> private static final Perf perf =
> AccessController.doPrivileged(new Perf.GetPerfAction());
> + private static final Unsafe unsafe =
> + Unsafe.getUnsafe();
>
> // Must match values defined in hotspot/src/share/vm/runtime/perfdata.hpp
> private final static int V_Constant = 1;
> @@ -59,12 +63,14 @@
>
> private final String name;
> private final LongBuffer lb;
> + private final DirectBuffer db;
>
> private PerfCounter(String name, int type) {
> this.name = name;
> ByteBuffer bb = perf.createLong(name, U_None, type, 0L);
> bb.order(ByteOrder.nativeOrder());
> this.lb = bb.asLongBuffer();
> + this.db = bb instanceof DirectBuffer ? (DirectBuffer) bb : null;
> }
>
> static PerfCounter newPerfCounter(String name) {
> @@ -79,23 +85,44 @@
> /**
> * Returns the current value of the perf counter.
> */
> - public synchronized long get() {
> + public long get() {
> + if (db != null) {
> + return unsafe.getLongVolatile(null, db.address());
> + }
> + else {
> + synchronized (this) {
> - return lb.get(0);
> - }
> + return lb.get(0);
> + }
> + }
> + }
>
> /**
> * Sets the value of the perf counter to the given newValue.
> */
> - public synchronized void set(long newValue) {
> + public void set(long newValue) {
> + if (db != null) {
> + unsafe.putOrderedLong(null, db.address(), newValue);
> + }
> + else {
> + synchronized (this) {
> - lb.put(0, newValue);
> - }
> + lb.put(0, newValue);
> + }
> + }
> + }
>
> /**
> * Adds the given value to the perf counter.
> */
> - public synchronized void add(long value) {
> - long res = get() + value;
> + public void add(long value) {
> + if (db != null) {
> + unsafe.getAndAddLong(null, db.address(), value);
> + }
> + else {
> + synchronized (this) {
> + long res = lb.get(0) + value;
> - lb.put(0, res);
> + lb.put(0, res);
> + }
> + }
> }
>
> /**
>
>
>
> Testing the PerfCounter.increment() method in a loop on multiple threads
> sharing the same PerfCounter instance, for example, on a 4-core Intel i7
> machine produces the following results:
>
> #
> # PerfCounter_increment: run duration: 5,000 ms, #of logical CPUS: 8
> #
> 1 threads, Tavg = 19.02 ns/op (σ = 0.00 ns/op)
> 2 threads, Tavg = 109.93 ns/op (σ = 6.17 ns/op)
> 3 threads, Tavg = 136.64 ns/op (σ = 2.99 ns/op)
> 4 threads, Tavg = 293.26 ns/op (σ = 5.30 ns/op)
> 5 threads, Tavg = 316.94 ns/op (σ = 6.28 ns/op)
> 6 threads, Tavg = 686.96 ns/op (σ = 7.09 ns/op)
> 7 threads, Tavg = 793.28 ns/op (σ = 10.57 ns/op)
> 8 threads, Tavg = 898.15 ns/op (σ = 14.63 ns/op)
>
>
> With the presented patch, the results are a little better:
>
> #
> # PerfCounter_increment: run duration: 5,000 ms, #of logical CPUS: 8
> #
> # Measure:
> 1 threads, Tavg = 5.22 ns/op (σ = 0.00 ns/op)
> 2 threads, Tavg = 34.51 ns/op (σ = 0.60 ns/op)
> 3 threads, Tavg = 54.85 ns/op (σ = 1.42 ns/op)
> 4 threads, Tavg = 74.67 ns/op (σ = 1.71 ns/op)
> 5 threads, Tavg = 94.71 ns/op (σ = 41.68 ns/op)
> 6 threads, Tavg = 114.80 ns/op (σ = 32.10 ns/op)
> 7 threads, Tavg = 136.70 ns/op (σ = 26.80 ns/op)
> 8 threads, Tavg = 158.48 ns/op (σ = 9.93 ns/op)
>
>
> The scalability is not much better, but the raw speed is, so it might
> present less contention when used in real-world code. If you wanted even
> better scalability, there is a new class in JDK8, the
> java.util.concurrent.atomic.LongAdder. But that doesn't buy atomic "set()" -
> only "add()". And it can't update native-memory variables, so it could
> only be used for add-only counters and in conjunction with a background
> thread that would periodically flush the sum to the native memory....
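For illustration only, the LongAdder-plus-flush idea described above might
look roughly like the sketch below. The class name FlushedCounter and the
AtomicLong standing in for the native perf-data memory are assumptions made
for this example; they are not part of the proposed patch. A single
background thread is assumed to call flush() periodically.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch of an add-only counter; not code from the patch.
final class FlushedCounter {
    // Striped accumulator: concurrent add() calls rarely contend.
    private final LongAdder adder = new LongAdder();
    // Stand-in (assumption) for the native-memory slot of the perf counter.
    private final AtomicLong published = new AtomicLong();

    // Hot path, callable from any thread.
    public void add(long value) {
        adder.add(value);
    }

    // Intended to be called by one background thread; copies the current
    // running total to the published slot.
    public void flush() {
        published.set(adder.sum());
    }

    // Value as of the most recent flush.
    public long published() {
        return published.get();
    }

    public static void main(String[] args) throws InterruptedException {
        FlushedCounter counter = new FlushedCounter();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    counter.add(1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        counter.flush();
        System.out.println(counter.published()); // prints 4000
    }
}
```

The sketch publishes the running sum rather than resetting the adder, so an
update that races with a flush is simply picked up by the next flush. The
cost is that readers of the published value can lag by up to one flush
interval.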
>
> Regards, Peter
>
>
> On 02/08/2013 06:10 PM, Nils Loodin wrote:
>> It would be interesting to know the number of thrown throwables in the
>> JVM, to be able to do some high level application diagnostics /
>> statistics. A good way to put this number would be a performance
>> counter, since it is accessible both from Java and from the VM.
>>
>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8007806
>> http://cr.openjdk.java.net/~nloodin/8007806/webrev.00/
>>
>> Regards,
>> Nils Loodin
>