RFR: 8187033: [PPC] Improve performance of ObjectStreamClass.getClassDataLayout()
Kazunori Ogata
OGATAK at jp.ibm.com
Fri Sep 1 06:53:23 UTC 2017
Hi Aleksey and Hans,
Thank you for your comments. I'll try to see how much the Unsafe approach
improves performance.

I now think the approach of using final, which Aleksey mentioned in the
first reply, is a good one. I checked the JIT-generated code. It puts
MEMBAR-store-store and MEMBAR-release before storing the ClassDataSlot array
to the final field "slots", and there is no MEMBAR when reading dataLayout
or dataLayout.slots. Since dataLayout is read far more often than it is
written, I think it is preferable to put all the overhead on the writing
side. But I'll also see how performance changes with lwsync on the read.

    private DataLayout dataLayout;

    /**
     * Class to ensure elements of ClassDataSlot[] are visible to other
     * threads. The "final" qualifier of the variable slots is necessary.
     */
    private static class DataLayout {
        final ClassDataSlot[] slots;

        DataLayout(ClassDataSlot[] s) {
            slots = s;
        }
    }

    ClassDataSlot[] getClassDataLayout() throws InvalidClassException {
        // REMIND: synchronize instead of relying on volatile?
        if (dataLayout == null) {
            ClassDataSlot[] slots = getClassDataLayout0();
            dataLayout = new DataLayout(slots);
        }
        return dataLayout.slots;
    }

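For readers who want to try the final-field idiom in isolation, here is a
minimal, self-contained sketch. It is not the proposed patch: the class name
FinalWrapperCache, the int[] payload, the computeSlots() stand-in for
getClassDataLayout0(), and the computations counter are all hypothetical
names chosen for illustration. The sketch also reads the racy field only
once, into a local, which is an assumption about how one might tighten the
code rather than something stated above.

```java
import java.util.concurrent.atomic.AtomicInteger;

class FinalWrapperCache {
    /** Immutable holder: the final field is what makes racy publication safe. */
    private static final class Layout {
        final int[] slots;

        Layout(int[] s) {
            slots = s;
        }
    }

    // Plain (non-volatile) field; reads and writes of it race by design.
    private Layout layout;

    // Hypothetical counter, only here to observe how often we compute.
    final AtomicInteger computations = new AtomicInteger();

    // Hypothetical stand-in for getClassDataLayout0().
    private int[] computeSlots() {
        computations.incrementAndGet();
        return new int[] {1, 2, 3};
    }

    int[] getSlots() {
        Layout l = layout;              // single read of the racy field
        if (l == null) {
            l = new Layout(computeSlots());
            layout = l;                 // benign race: another thread may
                                        // publish its own equivalent copy
        }
        return l.slots;                 // read through the final field
    }

    public static void main(String[] args) {
        FinalWrapperCache c = new FinalWrapperCache();
        int[] first = c.getSlots();
        int[] second = c.getSlots();
        System.out.println(first == second);       // same cached array
        System.out.println(c.computations.get());  // computed once
    }
}
```

Under contention, several threads may each compute and publish their own
array, so this only fits when the computed value is idempotent and callers
do not depend on array identity.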
Regards,
Ogata
From: Aleksey Shipilev <shade at redhat.com>
To: Hans Boehm <hboehm at google.com>
Cc: Kazunori Ogata <OGATAK at jp.ibm.com>, core-libs-dev
<core-libs-dev at openjdk.java.net>
Date: 2017/09/01 15:14
Subject: Re: RFR: 8187033: [PPC] Improve performance of
ObjectStreamClass.getClassDataLayout()
On 08/31/2017 09:39 PM, Hans Boehm wrote:
>> I guess you can make VarHandle.fullFence() between getClassDataLayout0()
>> and storing it to the non-volatile field...
>
> ... with the usual warning about this sort of thing:
>
> According to the JMM, it's not guaranteed to work, because the reader-side
> guarantees are not there. In practice, you're relying on dependency-based
> ordering, which the compiler is currently unlikely to break in this case.
> But future implementations may.
Right.
> I presume the real concern here is the cost on the reader side? Presumably
> that could be reduced with a VarHandle getAcquire(). I believe that
> eliminates the heavy-weight sync, and just keeps an lwsync. Imperfect, but
> better.
Oh right! This is exactly why acq/rel exist. Since OSC is a heavily used
class, the Unsafe counterparts might be better:

    private static final Unsafe U = ...;
    private static final long CDS_OFFSET = U.objectFieldOffset(...);

    // volatile for safety of naked reads
    private volatile ClassDataSlot[] dataLayout;

    ClassDataSlot[] getClassDataLayout() throws InvalidClassException {
        // getObjectAcquire returns Object, so a cast is needed
        ClassDataSlot[] slots =
            (ClassDataSlot[]) U.getObjectAcquire(this, CDS_OFFSET);
        if (slots == null) {
            slots = getClassDataLayout0();
            U.putObjectRelease(this, CDS_OFFSET, slots);
        }
        return slots;
    }

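For comparison, the same acquire/release pattern can be sketched with the
public VarHandle API that Hans mentions, which needs Java 9 or later. This
is an illustrative sketch only, not code from this thread: the class name
AcquireReleaseCache, the int[] payload, the computeSlots() stand-in for
getClassDataLayout0(), and the computations counter are hypothetical.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

class AcquireReleaseCache {
    private static final VarHandle SLOTS;

    static {
        try {
            // Look up a VarHandle for the private instance field "slots".
            SLOTS = MethodHandles.lookup()
                    .findVarHandle(AcquireReleaseCache.class, "slots", int[].class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Accessed only through SLOTS in this sketch.
    private int[] slots;

    // Hypothetical counter, only here to observe how often we compute.
    int computations;

    // Hypothetical stand-in for getClassDataLayout0().
    private int[] computeSlots() {
        computations++;
        return new int[] {1, 2, 3};
    }

    int[] getSlots() {
        // Acquire load: later reads of the array cannot move before it.
        int[] s = (int[]) SLOTS.getAcquire(this);
        if (s == null) {
            s = computeSlots();
            // Release store: the array contents are written before the
            // reference becomes visible to acquiring readers.
            SLOTS.setRelease(this, s);
        }
        return s;
    }

    public static void main(String[] args) {
        AcquireReleaseCache c = new AcquireReleaseCache();
        int[] first = c.getSlots();
        int[] second = c.getSlots();
        System.out.println(first == second);   // same cached array
        System.out.println(c.computations);    // computed once
    }
}
```

As with the Unsafe version, two threads may both see null and compute the
value, so the computation has to be safe to repeat.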
Ogata, could you please check whether that indeed helps on POWER?
Thanks,
-Aleksey