RFR (S) CR 6857566: (bf) DirectByteBuffer garbage creation can outpace reclamation

Fri Oct 4 07:54:50 UTC 2013

Hi Aleksey,

I played with the reference handling code and got the following idea:
instead of iterating over the set of active Cleaners looking for those
that were cleared by the VM, make ReferenceQueue.poll/remove help the
ReferenceHandler thread enqueue the references. This assumes the VM
links the References into the discovered list at the same time as it
clears them. Here's a prototype of this approach:

http://cr.openjdk.java.net/~plevart/jdk8-tl/Cleaners/webrev.01/
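
For readers without the webrev at hand, here is a toy model of the idea, not the code from the prototype. The class HelpingQueue and its methods discover() and helpEnqueue() are made-up names, and the "pending" queue is only a stand-in for the VM's discovered list (which in the real JDK is shared by all queues, not owned by one):

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy model (hypothetical names): poll() helps move items from a
// "pending" list into the queue before it looks at the queue itself.
public class HelpingQueue<T> {
    // Stand-in for the VM's discovered/pending list (an assumption of this sketch).
    private final Queue<T> pending = new ArrayDeque<>();
    // Items that have actually been enqueued and are visible to poll().
    private final Queue<T> enqueued = new ArrayDeque<>();
    private final Object lock = new Object();

    // Models the VM clearing a reference and linking it into the pending list.
    public void discover(T item) {
        synchronized (pending) {
            pending.add(item);
        }
    }

    // Moves pending items into the queue. In this model both a dedicated
    // handler thread and any polling thread may call it.
    public int helpEnqueue() {
        int moved = 0;
        for (;;) {
            T item;
            synchronized (pending) {
                item = pending.poll();
            }
            if (item == null) {
                return moved;
            }
            synchronized (lock) {
                enqueued.add(item);
            }
            moved++;
        }
    }

    // Helps with enqueueing first (outside the queue lock), then polls.
    public T poll() {
        helpEnqueue();
        synchronized (lock) {
            return enqueued.poll();
        }
    }
}
```

Because the helping step runs outside the queue lock, another thread may have taken an item off the pending list but not yet added it to the queue when poll() looks, so a null result does not mean nothing is pending.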

It is maybe too aggressive to hook the helping code into the public
ReferenceQueue.poll/remove methods, since that affects other code too,
but that could be changed (use a package-private API between
ReferenceQueue and Cleaner). With this variant, I was not able to make
the DirectBufferTest fail on my machine (4-core i7) with 1, 2, 4, 8, 16,
32 or 64 threads and -XX:MaxDirectMemorySize=100m. At 128 threads it
sometimes fails quickly and sometimes runs for 60 seconds without
failure. There's certainly room for improvement. Without the patch it
fails after ~500 iterations as soon as 2 threads are used.

So what do you think of the approach in general? You see, I tried to
avoid Thread.sleep() calls to show that the approach is quite
predictable even without them. The help-enqueue-references code is
executed outside of the ReferenceQueue.poll/remove synchronized blocks,
so there is no guarantee that all pending Cleaners have been processed
before giving up with OOME. Adding a short Thread.sleep() in the Bits loop:

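             System.gc();
             try {
                 // let pending references get processed before retrying
                 Thread.sleep(100L);
             }
             catch (InterruptedException x) {
                 // restore interrupt status instead of swallowing it
                 Thread.currentThread().interrupt();
             }
             cleans = Cleaner.assistCleanup();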
might help. It could even use exponential backoff.
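
A minimal sketch of what an exponential backoff variant could look like, outside of Bits. Everything here is an assumption for illustration: the class BackoffReserve, the method reserveWithBackoff(), the tryReserve callback (standing in for "try to reserve direct memory") and the 1 ms starting delay are not part of the patch:

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of the patch: retries a reservation
// attempt with sleeps of 1, 2, 4, ... ms between attempts.
public class BackoffReserve {

    // tryReserve stands in for "try to reserve direct memory";
    // maxSleeps bounds the number of sleeps before giving up.
    public static boolean reserveWithBackoff(BooleanSupplier tryReserve, int maxSleeps) {
        if (tryReserve.getAsBoolean()) {
            return true;
        }
        System.gc(); // ask the VM to discover unreachable buffers
        long sleepMillis = 1L;
        boolean interrupted = false;
        try {
            for (int i = 0; i < maxSleeps; i++) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException x) {
                    interrupted = true; // remember it, keep retrying
                }
                if (tryReserve.getAsBoolean()) {
                    return true;
                }
                sleepMillis <<= 1; // double the delay for the next round
            }
            return false; // the caller would throw OOME at this point
        } finally {
            if (interrupted) {
                Thread.currentThread().interrupt(); // restore interrupt status
            }
        }
    }
}
```

With maxSleeps = 9 the total sleep time is bounded by 1 + 2 + ... + 256 = 511 ms, while a reservation that succeeds early pays only a few milliseconds.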

Regards, Peter

On 10/03/2013 02:40 PM, Aleksey Shipilev wrote:
> On 10/03/2013 04:32 PM, Paul Sandoz wrote:
>> Aleksey, what do you observe if you revert back Cleaner to a
>> PhantomReference and retain QUEUE/CLEANERS but not
>> assistCleanupSlow?
> I observed the minuscule probability (my estimate is <0.1%) we hit the
> OOME with the original test. This is literally the very aggressive
> fallback strategy.
>
> -Aleksey.