RFR: JDK-8149925 We don't need jdk.internal.ref.Cleaner any more

Mon Mar 28 17:18:48 UTC 2016

Hi Mandy, Kim, Per and Roger

I'd like to continue the discussion about the 2nd part of removing 
jdk.internal.ref.Cleaner in this discussion thread.

There was some discussion about whether to synchronize with 
ReferenceHandler thread and wait for it to enqueue the Reference(s) or 
simply detect that there are no more pending Reference(s) by timing out 
on waiting for cleanup actions in discussion thread: "Re: Analysis on 
JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails 
intermittently". Based on that discussion, I have prepared a webrev that 
uses an approach where the detection is performed using timeout:

http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.10.part2/

While this webrev passes the DirectBufferAllocTest, I don't have a good 
feeling about this approach since it is not very robust. I can imagine 
situations where it would not behave optimally - it would either trigger 
reference discovery (System.gc()) more frequently that necessary or it 
would cause delays in execution. So I still prefer the approach where 
allocating thread(s) explicitly synchronize with ReferenceHandler thread 
and wait for it to enqueue pending Reference(s). Luckily this can be 
performed in an easy way (as I will show you shortly). Waiting on 
discovery of pending references by ReferenceHandler thread and handing 
them to it could be moved to native code so that no notification would 
have to be performed in native code from the ReferenceHandler thread to 
the allocating thread(s).

But first, let me reply to Mandy's comments...

On 03/25/2016 11:20 PM, Mandy Chung wrote:
>> On Mar 19, 2016, at 7:00 AM, Peter Levart <peter.levart at gmail.com> wrote:
>>
>> Here's the webrev:
>>
>>      http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.08.part2/
>>
>>> On 03/07/2016 07:35 PM, Mandy Chung wrote:
>>>> I studied webrev.06priv and the history of JDK-6857566.
>>>>
>>>> I’m not comfortable for any arbitrary thread to handle the enqueuing of the pending references (this change is more about the fix for JDK-6857566).
>>>>
>>> Why? A Thread is a Thread is a Thread... When legacy Cleaner is removed, ReferenceHandler thread will be left with swapping pointers only - no custom code will be involved. The only things I can think of against using arbitrary thread are:
>>> :
> My uncomfort was the fix for JDK-6857566 - both enqueuing pending ref and invoking the cleaning code in an arbitrary thread.
>
> Looking at it again - enqueuing the pending reference is not so much of a concern (simply updating the link) but the common cleaner could be used by other code that may only expect to be invoked in system thread that’s still my concern (thinking of thread locals).

As you'll see in the webrev below, enqueueing is performed solely be 
ReferenceHandler thread. Allocating thread(s) just wait for it to do its 
job. There's a little synchronization action performed at the end of 
enqueueing a chunk of pending references that notifies waiters 
(allocating threads) so that they can continue. This actually improves 
throughput (compared to helping enqueue Reference(s) one by one) because 
there's not much actual work to be done (just swapping pointers) so 
synchronization dominates. The goal here is to minimize synchronization 
among threads and by executing enqueuing of the whole bunch of pending 
references in private by a single thread achieves a reduction in 
synchronization when lots of Reference(s) are discovered at once - 
precisely the situation when it matters.

OTOH helping the Cleaner thread is beneficial as cleanup actions take 
time to execute and this is the easiest way to retry allocation while 
there's still chance it will succeed. As the common Cleaner is using 
InnocuousThread, cleanup actions can't rely on any thread locals to be 
preserved from invocation to invocation anyway - they are cleared after 
each cleanup action so each action gets empty thread locals. We could 
simulate this in threads that help execute cleanup actions by saving 
thread-locals to local variables, clearing thread-locals, executing 
cleanup action and then restoring thread-locals from local variables. 
Mandy, if you think this is important I'll add such save/clear/restore 
code to appropriate place.

>    On the other hand, invoking Deallocator::run (deallocating the native memory) in arbitrary threads has no problem.  Consider me being paranoid of the fix for JDK-6857566.  The current list of clients using CleanerFactory::cleaner may be safe being called from arbitrary threads but I can’t say what will be added in the future.

Right, save/clear/restore thread locals then (left for next webrev)...

>>>> The allocating thread may do a System.gc() that may discover phantom reachable references.  All it’s interested is only the direct byte buffer ones so that it can deallocate the native memory.  What is the downside of having a dedicated Cleaner for direct byte buffer that could special case for it?
>>> A dedicated Cleaner for direct buffers might be a good idea if other uses of shared Cleaner in JDK become heavy. So that helping process Cleanable(s) does not involve other unrelated Cleanable(s). But it comes with a price of another dedicated background thread.
>>>
> Perhaps provide one Cleaner specific for native memory deallocation or anything safe to be called in arbitrary thread.  It could provide the entry point for the allocating thread to assist the cleaning (i.e. Bits::reserveMemory could call it).  That will make it explicit that this cleaner provides explicit control for other threads to assist the cleaning action (and JavaLangRefAccess would only be used by this special cleaner and not in NIO).
>
> All clients of Unsafe.freeMemory could use that special cleaner for native memory deallocation use such as IOVecWrapper, DirectByteBuffer, Marlin’s OffHeapArray.
>
> The common cleaner would be kept for other things to use and it should be lazily created to avoid another thread.
>
> Does this sound reasonable?
>
> Mandy
>

Of course. Having specialized Cleaner(s) with additional capability 
requires extension to the Cleaner API for some cleaners. Unfortunately 
java.lang.ref.Cleaner is a final class.

Here's what I propose: by transforming java.lang.ref.Cleaner into an 
interface implemented by a class in a concealed package 
(jdk.internal.ref.CleanerImpl) the public API can be left unchanged 
while the implementation is actually simplified (there's no injection of 
Cleaner.impl access function into CleanerImpl class needed any more). 
The result of that transformation is also the ability to specify an 
extension interface (ExtendedCleaner)  located in a concealed package so 
it can only be used by system code (java.base and modules to which 
jdk.internal.ref is explicitly exported) and the ability to extend the 
functionality of implementation by subclassing it 
(CleanerImpl.ExtendedImpl). The guts of previous CleanerImpl are simply 
moved into a private nested class CleanerImpl.Task:

http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.11.part2/

I'm interested in what Roger has to say about this transformation. It is 
source compatible, but not binary compatible (invokevirtual vs. 
invokeinterface). So it can be safely performed only before JDK 9 ships.

I packed the entire retry-while-helping mechanics into the 
implementation of this ExtendedCleaner interface. java.nio.Bits is 
consequently much simplified. The common cleaner is now ExtendedCleaner 
as other usages besides handling deallocation of native memory are minor 
and are not problematic from the standpoint of arbitrary threads helping 
with cleanup, especially when saving/clearing/restoring of thread-locals 
is implemented. It would not be a problem to provide another instance, 
simple java.lang.ref.Cleaner this time, for other usages if needed.

And now a few words about ReferenceHandler thread and synchronization 
with it (for Kim and Per mostly). I think it should not be a problem to 
move the following two java.lang.ref.Reference methods to native code if 
desired:

     static Reference<?> getPendingReferences(int[] discoveryPhaseHolder)
     static int getDiscoveryPhase()

The 1st one is only invoked by a ReferenceHandler thread while the 2nd 
is invoked by arbitrary thread. The difference between this and 
webrev.09.part2 is that there's no need any more for ReferenceHandler 
thread to notify the thread executing the 2nd method and that there's no 
need for the 2nd method to perform any waiting. It just needs to obtain 
the lock briefly so that it can read the consistent state of two 
fields.  Those two fields are Java static fields currently: 
Reference.pending & Reference.discoveryPhase and those two methods are 
Java methods, but they could be moved to native code if desired to make 
the protocol between VM and Java code more robust.

So Kim, Per, what do you think of supporting those 2 methods in native 
code? Would that present any problem?

With webrev.11.part2 I get a 40% improvement in throughput vs. 
webrev.10.part2 executing DirectBufferAllocTest in 16 allocating threads 
on a 4-core i7 CPU.

Regards, Peter