Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

Peter Levart peter.levart at gmail.com
Thu Mar 24 16:17:07 UTC 2016


Hi Kim,

On 03/23/2016 09:40 PM, Kim Barrett wrote:
> I don't think there's any throughput penalty for a long timeout.  The
> proper response to waitForCleanups returning false (assuming the epoch
> was obtained early and passed as an argument) is OOME.  I really doubt
> the latency for reporting OOME is of critical importance.

The above assumption is not entirely correct. The correct response to
waitForCleanups returning false should be at least one attempt to
trigger GC reference discovery first, and only after that should it be
OOME. Suppose a program tries to allocate direct memory above the limit.
The wait for cleanups could be very long if there is no heap memory
pressure, even though there may already be lots of unreachable direct
buffers on the heap.
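
A minimal sketch of that ordering, in Java. The names ReserveSketch,
tryReserve, waitForCleanups and triggerGc are hypothetical stand-ins
chosen for illustration, not the code in the webrev. The point is only
that a failed wait leads to one GC request before it is allowed to end
in OOME:

```java
import java.util.function.BooleanSupplier;

final class ReserveSketch {
    /**
     * tryReserve:      attempts the reservation once (hypothetical).
     * waitForCleanups: blocks until some cleanup ran (true) or the
     *                  wait timed out without progress (false).
     * triggerGc:       requests reference discovery, e.g. System::gc.
     */
    static void reserve(BooleanSupplier tryReserve,
                        BooleanSupplier waitForCleanups,
                        Runnable triggerGc) {
        boolean gcTriggered = false;
        while (true) {
            if (tryReserve.getAsBoolean()) {
                return;                  // reservation succeeded
            }
            if (waitForCleanups.getAsBoolean()) {
                continue;                // cleanups made progress, retry
            }
            if (!gcTriggered) {
                triggerGc.run();         // no progress: ask for a GC once
                gcTriggered = true;
                continue;
            }
            // No progress even after a GC was requested.
            throw new OutOfMemoryError("Direct buffer memory");
        }
    }
}
```

A caller would pass System::gc as triggerGc. Whether the flag should be
reset after later progress is a policy choice this sketch leaves out.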

So guessing the right timeout before attempting to trigger GC is not
trivial. If you make it too small, there will be excessive GCs triggered
and throughput will suffer. If you make it too long, throughput will
suffer again.

Nevertheless, I managed to create a variant that self-adjusts the timeout
based on the last successful wait time. At least with the
DirectBufferAllocTest using 16 or 32 allocating threads (on a 4-core CPU),
the throughput is comparable to before and, more importantly, the test
passes:

java -XX:MaxDirectMemorySize=128m -cp out DirectBufferAllocTest -r 600 -t 16 -p 5000
Allocating direct ByteBuffers with capacity 1048576 bytes, using 16 threads for 600 seconds, printing the average per-thread latency of 5000 consecutive allocations...
Thread 11:  1.94 ms/allocation
Thread  6:  1.97 ms/allocation
Thread 12:  2.05 ms/allocation
Thread  0:  2.10 ms/allocation
Thread  7:  2.15 ms/allocation
Thread  3:  2.16 ms/allocation
Thread  1:  2.26 ms/allocation
Thread  5:  2.32 ms/allocation
Thread  2:  2.33 ms/allocation
Thread  4:  2.34 ms/allocation
Thread 13:  2.36 ms/allocation
Thread  9:  2.38 ms/allocation
Thread 14:  2.40 ms/allocation
Thread 10:  2.40 ms/allocation
Thread  8:  2.42 ms/allocation
Thread 15:  2.44 ms/allocation
Thread  6:  1.72 ms/allocation
Thread 11:  1.75 ms/allocation
Thread 12:  1.86 ms/allocation
Thread  0:  1.86 ms/allocation
Thread  3:  1.94 ms/allocation
Thread  7:  2.07 ms/allocation
Thread  1:  2.08 ms/allocation
Thread  2:  2.12 ms/allocation
Thread  4:  2.14 ms/allocation
Thread  5:  2.16 ms/allocation
Thread  9:  2.13 ms/allocation
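
For illustration only, here is one way a timeout that adjusts itself to
the last successful wait could look. The doubling policy, the clamping
bounds and the class name AdaptiveTimeout are assumptions made for this
sketch; the webrev may well do it differently.

```java
final class AdaptiveTimeout {
    private final long minNanos;
    private final long maxNanos;
    private long timeoutNanos;

    AdaptiveTimeout(long minNanos, long maxNanos) {
        this.minNanos = minNanos;
        this.maxNanos = maxNanos;
        this.timeoutNanos = minNanos;   // start with the shortest wait
    }

    /** Timeout to use for the next wait before requesting a GC. */
    synchronized long currentNanos() {
        return timeoutNanos;
    }

    /**
     * Record a wait that ended with cleanup progress after waitedNanos.
     * Assumed policy: allow twice the last successful wait, clamped to
     * [minNanos, maxNanos].
     */
    synchronized void recordSuccess(long waitedNanos) {
        // Avoid overflow when doubling very large values.
        long next = waitedNanos > maxNanos / 2 ? maxNanos : waitedNanos * 2;
        timeoutNanos = Math.max(minNanos, Math.min(maxNanos, next));
    }
}
```

With this policy the timeout follows how long cleanups actually take,
so a GC is only requested when a wait runs well past the last one that
made progress.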

Here's the webrev:

http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.10.part2/

So what do you think?

Regards, Peter

>
> That is, the caller looks something like (not even pretending to write
> Java)
>
>    alloc = tryAllocation(allocSize)
>    if alloc != NULL
>      return alloc
>    endif
>    // Maybe add a retry+wait with a short timeout here,
>    // to allow existing cleanups to run before requesting
>    // another gc.  Not clear that's really worthwhile, as
>    // it only comes up when we get here just after a gc
>    // and the resulting cleanups are not yet all processed.
>    System.gc()
>    while true
>      epoch = getEpoch()
>      alloc = tryAllocation(allocSize)
>      if alloc != NULL
>        return alloc
>      elif !waitForCleanup(epoch)
>        throw OOME  // No cleanup progress for a while
>      endif
>    end
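
The epoch-based wait in the quoted pseudocode could be backed by a
simple monitor. A sketch, assuming a hypothetical CleanupEpoch class
whose cleanupsProcessed() is called by whatever thread runs the
cleanups, and an explicit timeout parameter that the pseudocode leaves
implicit:

```java
import java.util.concurrent.TimeUnit;

final class CleanupEpoch {
    private long epoch;

    /** Snapshot taken by the allocating thread before it retries. */
    synchronized long getEpoch() {
        return epoch;
    }

    /** Called after a batch of cleanups has been processed. */
    synchronized void cleanupsProcessed() {
        epoch++;
        notifyAll();
    }

    /**
     * Waits until the epoch has moved past seenEpoch.
     * Returns false if timeoutMillis elapsed without any progress.
     */
    synchronized boolean waitForCleanup(long seenEpoch, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.nanoTime()
                + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (epoch == seenEpoch) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                return false;            // no cleanup progress in time
            }
            // wait(0) would block forever, so wait at least 1 ms.
            wait(Math.max(1, TimeUnit.NANOSECONDS.toMillis(remaining)));
        }
        return true;
    }
}
```

Because the epoch is read before the allocation retry, a cleanup that
completes between the retry and the wait is not missed: the wait then
returns true immediately.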



