RFR: 8007806: Need a Throwables performance counter
David Holmes
david.holmes at oracle.com
Sun Feb 24 10:31:01 UTC 2013
On 24/02/2013 6:50 PM, Peter Levart wrote:
> Hi David,
>
> I thought it was ok to pass null, but I don't know the "portability"
> issues in-depth. The javadoc for Unsafe says:
>
> /"This method refers to a variable by means of two parameters, and so it
> provides (in effect) a double-register addressing mode for Java
> variables. When the object reference is null, this method uses its
> offset as an absolute address. This is similar in operation to methods
> such as getInt(long), which provide (in effect) a single-register
> addressing mode for non-Java variables. However, because Java variables
> may have a different layout in memory from non-Java variables,
> programmers should not assume that these two addressing modes are ever
> equivalent. Also, programmers should remember that offsets from the
> double-register addressing mode cannot be portably confused with longs
> used in the single-register addressing mode."/
That is the doc for getXXX but not for getAndAddXXX or
compareAndSwapXXX. You can't have null here:
UNSAFE_ENTRY(jboolean, Unsafe_CompareAndSwapLong(JNIEnv *env, jobject unsafe, jobject obj, jlong offset, jlong e, jlong x))
  UnsafeWrapper("Unsafe_CompareAndSwapLong");
  Handle p (THREAD, JNIHandles::resolve(obj));
  jlong* addr = (jlong*)(index_oop_from_field_offset_long(p(), offset));
  if (VM_Version::supports_cx8())
    return (jlong)(Atomic::cmpxchg(x, addr, e)) == e;
  else {
    jboolean success = false;
    ObjectLocker ol(p, THREAD);
    if (*addr == e) { *addr = x; success = true; }
    return success;
  }
UNSAFE_END
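
For illustration only, here is a minimal sketch of the field-based form, where the atomic update targets a field of a non-null object. It uses the supported java.util.concurrent.atomic.AtomicLongFieldUpdater rather than Unsafe. The FieldCounter class and its "value" field are hypothetical names chosen for the example; they are not part of PerfCounter or the proposed patch.

```java
import java.util.concurrent.atomic.AtomicLongFieldUpdater;

// Hypothetical example class: a counter whose state lives in an ordinary Java field.
class FieldCounter {
    // The updater requires the target field to be a volatile long.
    private volatile long value;

    // Resolves the field once; every update then names an (object, field) pair.
    private static final AtomicLongFieldUpdater<FieldCounter> UPDATER =
        AtomicLongFieldUpdater.newUpdater(FieldCounter.class, "value");

    long get() {
        return value;
    }

    // Atomic add without synchronizing on the counter instance.
    void add(long delta) {
        UPDATER.getAndAdd(this, delta);
    }

    // Atomic compare-and-set on the field of this (non-null) object.
    boolean compareAndSet(long expect, long update) {
        return UPDATER.compareAndSet(this, expect, update);
    }
}
```

This only covers counters held in Java heap fields. It says nothing about updating the native memory behind a direct buffer, which is the case under discussion.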
David
-----
> Does anybody know the in-depth interpretation of the above? Is it only
> the particular Java/native type differences (for example, endianness of
> variables) that these two addressing modes might interpret differently
> or something else too?
>
> Regards, Peter
>
>
> On 02/24/2013 12:39 AM, David Holmes wrote:
>> Peter,
>>
>> In your use of Unsafe you pass "null" as the object. I'm pretty
>> certain you can't pass null here. Unsafe operates on fields or array
>> elements.
>>
>> David
>>
>> On 24/02/2013 5:39 AM, Peter Levart wrote:
>>> Hi Nils,
>>>
>>> If the counters are updated frequently from multiple threads, there
>>> might be contention/scalability issues. Instead of synchronization on
>>> updates, you might consider using atomic updates provided by
>>> sun.misc.Unsafe, like for example:
>>>
>>>
>>> Index: jdk/src/share/classes/sun/misc/PerfCounter.java
>>> ===================================================================
>>> --- jdk/src/share/classes/sun/misc/PerfCounter.java
>>> +++ jdk/src/share/classes/sun/misc/PerfCounter.java
>>> @@ -25,6 +25,8 @@
>>>
>>> package sun.misc;
>>>
>>> +import sun.nio.ch.DirectBuffer;
>>> +
>>> import java.nio.ByteBuffer;
>>> import java.nio.ByteOrder;
>>> import java.nio.LongBuffer;
>>> @@ -50,6 +52,8 @@
>>> public class PerfCounter {
>>> private static final Perf perf =
>>> AccessController.doPrivileged(new Perf.GetPerfAction());
>>> + private static final Unsafe unsafe =
>>> + Unsafe.getUnsafe();
>>>
>>> // Must match values defined in
>>> hotspot/src/share/vm/runtime/perfdata.hpp
>>> private final static int V_Constant = 1;
>>> @@ -59,12 +63,14 @@
>>>
>>> private final String name;
>>> private final LongBuffer lb;
>>> + private final DirectBuffer db;
>>>
>>> private PerfCounter(String name, int type) {
>>> this.name = name;
>>> ByteBuffer bb = perf.createLong(name, U_None, type, 0L);
>>> bb.order(ByteOrder.nativeOrder());
>>> this.lb = bb.asLongBuffer();
>>> + this.db = bb instanceof DirectBuffer ? (DirectBuffer) bb :
>>> null;
>>> }
>>>
>>> static PerfCounter newPerfCounter(String name) {
>>> @@ -79,23 +85,44 @@
>>> /**
>>> * Returns the current value of the perf counter.
>>> */
>>> - public synchronized long get() {
>>> + public long get() {
>>> + if (db != null) {
>>> + return unsafe.getLongVolatile(null, db.address());
>>> + }
>>> + else {
>>> + synchronized (this) {
>>> - return lb.get(0);
>>> - }
>>> + return lb.get(0);
>>> + }
>>> + }
>>> + }
>>>
>>> /**
>>> * Sets the value of the perf counter to the given newValue.
>>> */
>>> - public synchronized void set(long newValue) {
>>> + public void set(long newValue) {
>>> + if (db != null) {
>>> + unsafe.putOrderedLong(null, db.address(), newValue);
>>> + }
>>> + else {
>>> + synchronized (this) {
>>> - lb.put(0, newValue);
>>> - }
>>> + lb.put(0, newValue);
>>> + }
>>> + }
>>> + }
>>>
>>> /**
>>> * Adds the given value to the perf counter.
>>> */
>>> - public synchronized void add(long value) {
>>> - long res = get() + value;
>>> + public void add(long value) {
>>> + if (db != null) {
>>> + unsafe.getAndAddLong(null, db.address(), value);
>>> + }
>>> + else {
>>> + synchronized (this) {
>>> + long res = lb.get(0) + value;
>>> - lb.put(0, res);
>>> + lb.put(0, res);
>>> + }
>>> + }
>>> }
>>>
>>> /**
>>>
>>>
>>>
>>> Testing the PerfCounter.increment() method in a loop on multiple threads
>>> sharing the same PerfCounter instance, for example, on a 4-core Intel i7
>>> machine produces the following results:
>>>
>>> #
>>> # PerfCounter_increment: run duration: 5,000 ms, #of logical CPUS: 8
>>> #
>>> 1 threads, Tavg = 19.02 ns/op (σ = 0.00 ns/op)
>>> 2 threads, Tavg = 109.93 ns/op (σ = 6.17 ns/op)
>>> 3 threads, Tavg = 136.64 ns/op (σ = 2.99 ns/op)
>>> 4 threads, Tavg = 293.26 ns/op (σ = 5.30 ns/op)
>>> 5 threads, Tavg = 316.94 ns/op (σ = 6.28 ns/op)
>>> 6 threads, Tavg = 686.96 ns/op (σ = 7.09 ns/op)
>>> 7 threads, Tavg = 793.28 ns/op (σ = 10.57 ns/op)
>>> 8 threads, Tavg = 898.15 ns/op (σ = 14.63 ns/op)
>>>
>>>
>>> With the presented patch, the results are a little better:
>>>
>>> #
>>> # PerfCounter_increment: run duration: 5,000 ms, #of logical CPUS: 8
>>> #
>>> # Measure:
>>> 1 threads, Tavg = 5.22 ns/op (σ = 0.00 ns/op)
>>> 2 threads, Tavg = 34.51 ns/op (σ = 0.60 ns/op)
>>> 3 threads, Tavg = 54.85 ns/op (σ = 1.42 ns/op)
>>> 4 threads, Tavg = 74.67 ns/op (σ = 1.71 ns/op)
>>> 5 threads, Tavg = 94.71 ns/op (σ = 41.68 ns/op)
>>> 6 threads, Tavg = 114.80 ns/op (σ = 32.10 ns/op)
>>> 7 threads, Tavg = 136.70 ns/op (σ = 26.80 ns/op)
>>> 8 threads, Tavg = 158.48 ns/op (σ = 9.93 ns/op)
>>>
>>>
>>> The scalability is not much better, but the raw speed is, so it might
>>> present less contention when used in real-world code. If you wanted even
>>> better scalability, there is a new class in JDK8, the
>>> java.util.concurrent.atomic.LongAdder. But that doesn't buy atomic "set()" -
>>> only "add()". And it can't update native-memory variables, so it could
>>> only be used for add-only counters and in conjunction with a background
>>> thread that would periodically flush the sum to the native memory....
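
The LongAdder-plus-background-flush idea described above could be sketched roughly as follows. This is an illustration under stated assumptions, not code from the patch: FlushingCounter, its flush period, and the "counter-flusher" thread name are hypothetical, and a plain LongBuffer stands in for the native memory behind a perf counter.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.LongBuffer;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical add-only counter: updates go to a LongAdder, and a daemon
// thread periodically copies the running sum into the target buffer.
class FlushingCounter implements AutoCloseable {
    private final LongAdder adder = new LongAdder();
    private final LongBuffer target; // stand-in for the counter's native memory
    private final ScheduledExecutorService flusher;

    FlushingCounter(LongBuffer target, long periodMillis) {
        this.target = target;
        this.flusher = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "counter-flusher");
            t.setDaemon(true); // do not keep the JVM alive just for flushing
            return t;
        });
        flusher.scheduleAtFixedRate(this::flush, periodMillis, periodMillis,
                                    TimeUnit.MILLISECONDS);
    }

    // Hot path: no lock and no shared CAS target under contention.
    void add(long value) {
        adder.add(value);
    }

    // Publishes the current sum; synchronized so flushes do not interleave.
    synchronized void flush() {
        target.put(0, adder.sum());
    }

    // Stops the periodic task and publishes one final sum.
    @Override
    public void close() {
        flusher.shutdown();
        flush();
    }
}
```

The trade-off is the one already noted: there is no atomic set(), and an external reader of the buffer sees a value that lags the true count by up to one flush period.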
>>>
>>> Regards, Peter
>>>
>>>
>>> On 02/08/2013 06:10 PM, Nils Loodin wrote:
>>>> It would be interesting to know the number of thrown throwables in the
>>>> JVM, to be able to do some high level application diagnostics /
>>>> statistics. A good way to put this number would be a performance
>>>> counter, since it is accessible both from Java and from the VM.
>>>>
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8007806
>>>> http://cr.openjdk.java.net/~nloodin/8007806/webrev.00/
>>>>
>>>> Regards,
>>>> Nils Loodin
>>>
>