Hi Kim,

On 06/29/2016 01:22 PM, Peter Levart wrote:
Transferring the whole list in one JNI invocation has the potential for further optimizations on the Java side (like handling the whole popped list privately without additional synchronization, if we ever find a way for java.nio.Bits to wait for it reliably, or even enqueuing a chunk of consecutive references destined for the same queue using a single synchronized action on the queue, etc.)
Just to show what I mean, here's a simple optimization that doesn't use a private pendingList of references shared among callers of tryHandlePending(true/false). Instead, it uses coarse-grained synchronization: callers of tryHandlePending(false) wait while the ReferenceHandler privately processes the whole list of pending references:

http://cr.openjdk.java.net/~plevart/misc/PendingReferenceHandling/webrev.02/

This further improves benchmark results and it still passes the DirectBufferAllocTest:

Original JDK 9:

Benchmark                                (refCount)  Mode  Cnt      Score      Error  Units
ReferenceEnqueueBench.dequeueReferences      100000    ss  100  38410.515 ± 1011.769  us/op

Patched (by Kim):

Benchmark                                (refCount)  Mode  Cnt      Score      Error  Units
ReferenceEnqueueBench.dequeueReferences      100000    ss  100  42197.522 ± 1161.451  us/op

Proposed (by Peter, webrev):

Benchmark                                (refCount)  Mode  Cnt      Score      Error  Units
ReferenceEnqueueBench.dequeueReferences      100000    ss  100  34134.977 ± 1274.753  us/op

Proposed (by Peter, webrev.02):

Benchmark                                (refCount)  Mode  Cnt      Score      Error  Units
ReferenceEnqueueBench.dequeueReferences      100000    ss  100  27935.929 ± 1128.678  us/op

Regards, Peter
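
P.S. For anyone who hasn't opened the webrev, here is a minimal sketch of the coarse-grained scheme described above. It is not the code from webrev.02: the class PendingListSketch, its method names, and the in-memory list standing in for the VM's pending list (which the real code would obtain through JNI) are all assumptions made for illustration. It also assumes a single handler thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; names and structure are not from the webrev. */
final class PendingListSketch {
    private final Object lock = new Object();
    // Stand-in for the VM's pending list (assumption: the real list is popped via JNI).
    private List<String> pending = new ArrayList<>();
    // True while the handler thread owns a popped list that it has not finished yet.
    private boolean processing;
    // Results of the handler's private processing; touched only by the handler thread.
    final List<String> processed = new ArrayList<>();

    void addPending(String ref) {
        synchronized (lock) {
            pending.add(ref);
        }
    }

    /** Handler side: pop the whole list, process it without the lock, then wake waiters. */
    int handleAll() {
        List<String> popped;
        synchronized (lock) {
            popped = pending;
            pending = new ArrayList<>();
            processing = true;
        }
        try {
            for (String ref : popped) {
                processed.add(ref); // private work, no per-reference synchronization
            }
        } finally {
            synchronized (lock) {
                processing = false;
                lock.notifyAll();
            }
        }
        return popped.size();
    }

    /** Caller side, in the spirit of tryHandlePending(false): wait while the handler is busy. */
    boolean awaitHandler() {
        synchronized (lock) {
            boolean waited = false;
            while (processing) {
                try {
                    lock.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return waited;
                }
                waited = true;
            }
            return waited;
        }
    }
}
```

The point of the sketch is that the lock is taken twice per batch rather than once per reference, and that a caller arriving mid-batch blocks until the whole batch is done instead of racing the handler for individual references.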