Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently
Peter Levart
peter.levart at gmail.com
Thu Mar 24 16:17:07 UTC 2016
Hi Kim,
On 03/23/2016 09:40 PM, Kim Barrett wrote:
> I don't think there's any throughput penalty for a long timeout. The
> proper response to waitForCleanups returning false (assuming the epoch
> was obtained early and passed as an argument) is OOME. I really doubt
> the latency for reporting OOME is of critical importance.
The above assumption is not entirely correct. The correct response to
waitForCleanups returning false should be at least one attempt to
trigger GC reference discovery first, and only after that should it be
OOME. Suppose a program tries to allocate direct memory above the limit.
The wait for cleanups might be very long if there's no heap memory
pressure, even though there might already be lots of unreachable
direct buffers on the heap.
So guessing the right timeout before attempting to trigger GC is not
trivial. If you make it too small, there will be excessive GCs triggered
and throughput will suffer. If you make it too long, throughput will
suffer again.
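To illustrate the order of responses described above (wait for cleanups,
then request GC once, and only then give up with OOME), here is a minimal
sketch. It is not the code from the webrev: the class DirectMemoryBudget
and all of its methods, fields and timeouts are hypothetical names chosen
for the example.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names and structure are assumptions, not the webrev's code.
class DirectMemoryBudget {
    private final long limit;   // maximum number of bytes that may be reserved
    private long used;          // bytes currently reserved (guarded by this)
    private long epoch;         // incremented on every release (guarded by this)

    DirectMemoryBudget(long limit) { this.limit = limit; }

    synchronized long getEpoch() { return epoch; }

    synchronized boolean tryReserve(long size) {
        if (size > limit - used) return false;
        used += size;
        return true;
    }

    // Called by cleanup code once a buffer's memory has been freed.
    synchronized void release(long size) {
        used -= size;
        epoch++;
        notifyAll();
    }

    // Returns true if a release happened after seenEpoch was read,
    // false if the timeout elapsed or the thread was interrupted.
    synchronized boolean waitForCleanups(long seenEpoch, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            while (epoch == seenEpoch) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) return false;
                TimeUnit.NANOSECONDS.timedWait(this, remaining);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    void reserve(long size, long timeoutMillis) {
        boolean gcRequested = false;
        while (true) {
            long seenEpoch = getEpoch();   // read the epoch before trying
            if (tryReserve(size)) return;
            if (waitForCleanups(seenEpoch, timeoutMillis)) continue; // progress, retry
            if (gcRequested) throw new OutOfMemoryError("Direct buffer memory");
            System.gc();                   // one attempt to trigger reference discovery
            gcRequested = true;
        }
    }
}
```

In this sketch a reservation requests GC at most once, so OOME is thrown
only after a timed-out wait that follows the GC request.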
Nevertheless, I managed to create a variant that self-adjusts the timeout
based on the last successful wait time. At least with the
DirectBufferAllocTest using 16 or 32 allocating threads (on a 4-core CPU),
the throughput is comparable to what it was before and, more importantly,
the test passes:
java -XX:MaxDirectMemorySize=128m -cp out DirectBufferAllocTest -r 600 -t 16 -p 5000
Allocating direct ByteBuffers with capacity 1048576 bytes, using 16 threads for 600 seconds, printing the average per-thread latency of 5000 consecutive allocations...
Thread 11: 1.94 ms/allocation
Thread 6: 1.97 ms/allocation
Thread 12: 2.05 ms/allocation
Thread 0: 2.10 ms/allocation
Thread 7: 2.15 ms/allocation
Thread 3: 2.16 ms/allocation
Thread 1: 2.26 ms/allocation
Thread 5: 2.32 ms/allocation
Thread 2: 2.33 ms/allocation
Thread 4: 2.34 ms/allocation
Thread 13: 2.36 ms/allocation
Thread 9: 2.38 ms/allocation
Thread 14: 2.40 ms/allocation
Thread 10: 2.40 ms/allocation
Thread 8: 2.42 ms/allocation
Thread 15: 2.44 ms/allocation
Thread 6: 1.72 ms/allocation
Thread 11: 1.75 ms/allocation
Thread 12: 1.86 ms/allocation
Thread 0: 1.86 ms/allocation
Thread 3: 1.94 ms/allocation
Thread 7: 2.07 ms/allocation
Thread 1: 2.08 ms/allocation
Thread 2: 2.12 ms/allocation
Thread 4: 2.14 ms/allocation
Thread 5: 2.16 ms/allocation
Thread 9: 2.13 ms/allocation
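To give an idea of what "self-adjusting based on the last successful wait
time" can look like, here is a small sketch. It is only an illustration
under assumed policy: the class AdaptiveTimeout, the factor of 2 and the
1 ms / 1 s bounds are all hypothetical and are not taken from the webrev.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; the policy and constants are assumptions for illustration.
final class AdaptiveTimeout {
    private static final long MIN_NANOS = 1_000_000L;       // 1 ms floor (assumed)
    private static final long MAX_NANOS = 1_000_000_000L;   // 1 s ceiling (assumed)

    // Duration of the most recent wait that ended with a cleanup being observed.
    private final AtomicLong lastSuccessfulWaitNanos = new AtomicLong(MIN_NANOS);

    // Timeout to use for the next wait: twice the last successful wait,
    // clamped to [MIN_NANOS, MAX_NANOS].
    long currentTimeoutNanos() {
        long candidate = 2 * lastSuccessfulWaitNanos.get();
        return Math.max(MIN_NANOS, Math.min(MAX_NANOS, candidate));
    }

    // Record how long a wait took when it ended because cleanups made progress.
    void recordSuccessfulWait(long waitedNanos) {
        long clamped = Math.max(0L, Math.min(MAX_NANOS, waitedNanos));
        lastSuccessfulWaitNanos.set(clamped);
    }
}
```

The idea is that when cleanups arrive quickly the timeout stays short, so
GC is requested soon after progress stops, and when cleanups are slow the
timeout grows, so GC is not requested needlessly.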
Here's the webrev:
http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.10.part2/
So what do you think?
Regards, Peter
>
> That is, the caller looks something like (not even pretending to write
> Java)
>
> alloc = tryAllocation(allocSize)
> if alloc != NULL
>   return alloc
> endif
> // Maybe add a retry+wait with a short timeout here,
> // to allow existing cleanups to run before requesting
> // another gc. Not clear that's really worthwhile, as
> // it only comes up when we get here just after a gc
> // and the resulting cleanups are not yet all processed.
> System.gc()
> while true
>   epoch = getEpoch()
>   alloc = tryAllocation(allocSize)
>   if alloc != NULL
>     return alloc
>   elif !waitForCleanups(epoch)
>     throw OOME // No cleanup progress for a while
>   endif
> end