tutorial on using Cleaner-based finalization

Peter Levart peter.levart at gmail.com
Sun May 7 19:41:28 UTC 2017

Hi Rick,

On 05/07/2017 08:59 PM, Rick Hillegas wrote:
> Thanks for investigating that sample case, Peter, and for providing a 
> candidate solution. It is much appreciated. Your expert experience 
> confirms my amateur hunch that this migration is a non-trivial 
> re-factorization which merits its own mini-project. I have logged a 
> new technical debt issue to track that effort: 
> https://issues.apache.org/jira/browse/DERBY-6932
> If I correctly understand what has been said so far, then I think that 
> we should be able to get away with a single Cleaner instance for each 
> component, essentially, one for each jar file which we build today. 
> After we convert Derby into jigsaw modules, that might translate into 
> one Cleaner instance for each module. I don't see any advantage to the 
> extra complexity of a separate Cleaner instance for each class which 
> currently implements finalize().
> Does that sound reasonable to you?

Each Cleaner instance means a separate thread which processes cleanup 
actions. If you create an instance in some common module and expose it 
to other modules, the same instance can be used in other modules that 
depend on it.

Regards, Peter

> Thanks,
> -Rick
> On 5/6/17 2:19 PM, Peter Levart wrote:
>> Thinking of this for some more time...
>> Although this is a nice exercise in converting finalize() to Cleaner 
>> API, I strongly suspect that ClientConnection.finalize() is doing 
>> unnecessary things. What it does is it:
>> - prints some trace message to logWriter
>> - closes logWriter (whatever it is)
>> - closes raw socket input and output streams
>> - closes the socket
>> - notifies listeners
>> I believe all this is unnecessary (apart from notifying listeners 
>> perhaps?) as those objects already have their own cleanup mechanism 
>> when they are left behind. I believe there would be no resource leaks 
>> if ClientSocket.finalize() was simply removed. Before doing the 
>> conversion from finalize() to Cleaner API one should always ask 
>> himself whether finalize() method is actually needed. All 3rd party 
>> code (with notable exceptions which use JNI) are based on Java SE 
>> APIs and these APIs already take care of resources held by objects 
>> that are left behind. There's usually no need to do the same in the 
>> higher layers of 3rd party code.
>> What do you think?
>> Regards, Peter
>> On 05/06/2017 10:01 PM, Peter Levart wrote:
>>> Hi Rick and others,
>>> On 05/04/2017 06:48 PM, Lance Andersen wrote:
>>>> Here are a few examples I believe:
>>>> http://svn.apache.org/repos/asf/db/derby/code/trunk/java/client/org/apache/derby/client/am/ClientConnection.java
>>>> http://svn.apache.org/repos/asf/db/derby/code/trunk/java/client/org/apache/derby/client/ClientPooledConnection.java
>>>> http://svn.apache.org/repos/asf/db/derby/code/trunk/java/engine/org/apache/derby/impl/jdbc/EmbedPreparedStatement.java
>>> I took a bite at the 1st one (ClientConnection). Here's the result:
>>> http://cr.openjdk.java.net/~plevart/misc/Cleaner/derby/ClientConnection_finalize2cleaner.patch
>>> I haven't tested this, but I believe it should work. It was quite a 
>>> challenge, because of the way current ClientConnection code is 
>>> structured. I tried to make the patch not incompatibly change public 
>>> API of ClientConnection and related classes and I almost succeeded. 
>>> The problematic part was the protected boolean 
>>> ClientConnection.closed_ flag. If there is any sublclass of 
>>> ClientConnection (apart from NetConnection which is derby code) that 
>>> modifies this field, you are out of luck as changing (not only 
>>> reading) this field directly my have an undesirable consequence (or 
>>> it may not, since the only thing that changing these field to false 
>>> would do is it would redundantly force performing the cleanup 
>>> action. If the cleanup is idempotent, then all is OK).
>>> Further complication with ClientConnection is that it maintains a 
>>> split state - some of it resides in ClientConnection and subclasses 
>>> (such as NetConnection) and some of it in embedded object of Agent 
>>> class and subclasses (such as NetAgent). Both - some of this state 
>>> from connection object and some from the agent state are needed to 
>>> perform the cleanup that is currently executed from the connection 
>>> finalize() method. When using Cleaner API, we have to capture this 
>>> state from both places (or more since each class has a hierarchy) 
>>> and then arrange for cleanup action to use this state. Captured 
>>> state can not reference the tracked object (ClientConnection in this 
>>> case) either directly or indirectly since then it will never be 
>>> GCed. When cleanup action is run, the tracked object is already 
>>> unreachable - this is the main difference from finalize() where 
>>> there is a phase in object's life-cycle where it is still reachable, 
>>> albeit guaranteed only from the thread executing finalize() method. 
>>> We can not capture the Agent object either, since it maintains a 
>>> reference back to the ClientConnection object. All this is further 
>>> complicated by the fact that captured state is mutable and we have 
>>> to arrange for it to be mutated in both places. If the mutable state 
>>> is captured by reference and the instance containing it never 
>>> changes during the lifetime of the tracked container object, then it 
>>> is easy - we just capture the object after the tracked container 
>>> instance is constructed. If the captured state includes mutable 
>>> fields directly in the tracked container object, then we must 
>>> arrange for them to be synchronously mutated in two places. Such 
>>> fields are:
>>> - ClientConnection.open_ (replicated in 
>>> ClientConnection.CleanupAction.open)
>>> - Agent.logWriter_ (replicated in Agent.CleanupAction.logWriter)
>>> - NetAgent.rawSocketInputStream_ (replicated in 
>>> NetAgent.NetCleanupAction.rawSocketInputStream)
>>> - NetAgent.rawSocketOutputStream_ (replicated in 
>>> NetAgent.NetCleanupAction.rawSocketOutputStream)
>>> Fortunately all of this state is encapsulated with protected field 
>>> ClientConnection.open_ being an exception.
>>> Note that Cleaner API also allows for cleanup action to be triggered 
>>> explicitly, which then de-registers it. This is one of its 
>>> advantages over finalize() where you can not deregister an object 
>>> when it is already explicitly closed for example. finalize() will 
>>> always be called even if closed explicitly. If you create lots of 
>>> finalizable objects (such as connections, statements, etc...) and 
>>> promptly close() them, they still wait for finalization and use 
>>> resources (heap, CPU when GC searches for them, ReferenceHandler 
>>> enqueues them, and finally finalize() method which is executed after 
>>> the fact). Explicit triggering and de-registration of the cleanup 
>>> action is performed in the ClientConnection.closeResourcesX() 
>>> (called from public close() and closeResources()) after the 
>>> connection has already being marked as closed. Cleanup action will 
>>> be a no-op at this point, but it will also be de-registered. This is 
>>> important to not bother GC with reference processing when it is not 
>>> needed any more. In situations whre cleanup action logic is the same 
>>> as explicit closing logic (in the case of ClientConnection it is 
>>> not), close() method could just invoke cleanable.clean() and 
>>> delegate the meat of processing to the cleanup action.
>>> Hope this non trivial example helps illustrate what is needed when 
>>> converting finalize() to Cleanup API.
>>> Regards, Peter

More information about the core-libs-dev mailing list