RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java
Peter Levart
peter.levart at gmail.com
Wed Jun 29 11:22:55 UTC 2016
Hi Kim,
Let me chime-in although it's a bit late...
I think this is a good change to finally get rid of OOME problems in
this area.
On 06/28/2016 07:45 PM, Kim Barrett wrote:
>> On Jun 28, 2016, at 5:33 AM, Per Liden <per.liden at oracle.com> wrote:
>> Patch looks good. The only thing I don't feel qualified to review is the initialization order change in thread.cpp, so I'll let others comments on that.
> Thanks. I’ll be following up on that area.
>
>> I like the pop-one-reference-at-a-time semantics, which simplifies things a lot and keeps the interface nice and clean. I was previously afraid that it might cause a noticeable performance degradation compared to lifting the whole list into Java in one go, but your testing seem to prove that's not the case.
> I was concerned about that too, and had tried a different approach that also still supported the existing "some callers wait and others don’t" API, but it was a bit messy. Coleen convinced me to try this (since it was easy) and do the measurement, and it worked out well.
>
The repeated JNI invocations do not present a big overhead, but it is
measurable. For example, the following benchmark measures the time it
takes after a GC cycle for a bunch of references to be transfered to
Java side, enqueued into ReferenceQueue and dequeued from it:
http://cr.openjdk.java.net/~plevart/misc/PendingReferenceHandling/ReferenceEnqueueBench.java
The results on my i7-4771 PC:
Original JDK 9:
Benchmark (refCount) Mode Cnt
Score Error Units
ReferenceEnqueueBench.dequeueReferences 100000 ss 100 38410.515
± 1011.769 us/op
Patched (by Kim):
Benchmark (refCount) Mode Cnt
Score Error Units
ReferenceEnqueueBench.dequeueReferences 100000 ss 100 42197.522
± 1161.451 us/op
So, about 10% worse for this benchmark.
Transfering the whole list in one JNI invocation has the potential for
further optimizations on the Java side (like handling the whole popped
list privately without additional synchronization - if we ever find a
way for java.nio.Bits to wait for it reliably - or even enqueue-ing a
chunk of consecutive references destined for the same queue using a
single synchronized action on the queue, etc...)
If the JNI API was something like the following:
/* Atomically pop the pending reference list if wholeList is true,
* or just next pending reference if wholeList is false.
* If wait is true and the pending reference list is empty, blocks
until
* it becomes non-empty, or returns null if wait is false.
*/
private static native Reference<?> popReferencePendingList(boolean
wholeList, boolean wait);
Then not too much complication is needed in Reference.java (diff to
current JDK 9 sources) to consume this API and already have some benefit
from it:
http://cr.openjdk.java.net/~plevart/misc/PendingReferenceHandling/webrev/
There is a possible race here between ReferenceHandler calling
tryHandlePending(true) and java.nio.Bits calling tryHandlePending(false)
that can make the method return false to java.nio.Bits when
ReferenceHandler has just popped the whole list and not yet installed it
in the private pendingList field, but java.nio.Bits makes several
retries in this case with exponentially increasing delays, so that does
not currently present a problem.
java/nio/Buffer/DirectBufferAllocTest.java test passes with this change.
And the above benchmark shows improvement instead of regression:
Proposed (my Peter):
Benchmark (refCount) Mode Cnt
Score Error Units
ReferenceEnqueueBench.dequeueReferences 100000 ss 100 34134.977
± 1274.753 us/op
So what do you think? Is it worth the little additional logic on the
Java side?
Regards, Peter
More information about the core-libs-dev
mailing list