RFR 9: 8138696 : java.lang.ref.Cleaner - an easy to use alternative to finalization

Tue Dec 8 15:47:29 UTC 2015

Hi,

On 12/08/2015 08:41 AM, Peter Levart wrote:
> (once again, for the list - I can't seem to hit the right button :-)
>
> On 12/08/2015 01:51 PM, David M. Lloyd wrote:
>> Yeah I think that replacing finalize is a bad example.  With 
>> Reference.reachabilityFence() around the corner, if you want finalize 
>> you can (and should, I guess) just use finalize.
>>
>> IMO Cleaners are better when you do not want the instance of an 
>> object to be accessible at all before you clean it - because it 
>> corresponds to mapped memory, a file descriptor, or some other native 
>> resource, usually, and any invocation at all after it is cleaned 
>> would result in Major Problems.  In this case you would put the FD, 
>> memory address/size/etc. on the cleaner, so that the clean method can 
>> do the cleanup.
>
> Cleaners are also more efficient in that:
>
> - they register cleanup actions manually from java code after the 
> referent is constructed. finalize() uses a call-back from VM into Java 
> to register a Finalizer on finalizable referent construction.
> - they can invoke and de-register cleanup actions prematurely by user 
> explicitly invoking Cleanable.clean() and therefore avoid being 
> processed by reference processing machinery (in the VM and in Java). 
> For example, FileInputStream, using this API instead of finalize(), 
> would in majority of cases invoke and de-register the cleanup 
> prematurely as most code makes sure it closes the file after use. 
> finalize() OTOH has no premature de-registration feature.
> - with a fix for "8071507: (ref) Clear phantom reference as soft and 
> weak references do", the opportunity for GC to collect the referent in 
> the same phase as discovering it phantom-reachable opens up. Finalizer 
> OTOH keeps the referent reachable in order to invoke finalize() on it 
> and then clear()s the reference, delaying the collection of referent 
> to the next GC round.
+3

Removing the reliance on finalizers reduces the overhead the VM incurs 
to maintain and process
finalized objects.  It removes the possibility of resurrecting 
references to objects that have been
finalized, most likely in an unusable state, etc.

Cleaners are not the best tool for every cleanup, but are preferable in 
most cases.
>>
>> Also, one hopes that each framework/library/application would only 
>> register one cleaner for its usages, rather than (say) one per class...
>
> There was a debate about whether to provide a common cleaner instance: 
> Cleaner.commonCleaner() (like ForkJoinPool.commonPool()). It could be 
> created lazily, but then it would run forever...
I'll file a separate issue for a common cleaner.
If the common cleaner reference were kept has a weak reference then it 
would
stay alive only as long as any client package/class had a strong 
reference/use for it,
so not necessarily forever.

Roger

>
> Regards, Peter
>
>>
>> On 12/08/2015 06:44 AM, David Holmes wrote:
>>> But thinking more on this approach this is simply not scalable. A
>>> Cleaner per cleanable-class could result in hundreds of threads being
>>> active!
>>>
>>> And this certainly does not seem easier to use than finalization.
>>>
>>> What exactly are the advantages over finalization again?
>>>
>>> Thanks,
>>> David
>>>
>>> On 8/12/2015 10:29 PM, David Holmes wrote:
>>>> On 8/12/2015 10:25 PM, Peter Levart wrote:
>>>>>
>>>>>
>>>>> On 12/08/2015 09:22 AM, David Holmes wrote:
>>>>>> Actually I'm having more doubts about this API.
>>>>>>
>>>>>> Library writers use finalize() as a last ditch cleanup mechanism in
>>>>>> case the user doesn't explicitly call any "cleanup" method. So as a
>>>>>> library writer I would think I am now expected to register my
>>>>>> instances with a Cleaner and provide a Runnable that does what
>>>>>> finalize() would have done. But in that usage pattern the end 
>>>>>> user of
>>>>>> my objects never has any access to my Cleanables so can never call
>>>>>> clean() themselves - instead they should be calling the cleanup
>>>>>> function directly, just as they would previously. So the whole 
>>>>>> "invoke
>>>>>> at most once" for the clean() method seems somewhat unnecessary; and
>>>>>> the way we should write the cleanup method and the Runnable need 
>>>>>> to be
>>>>>> more cleary explained as the idempotentcy of the cleanup needs to be
>>>>>> handled in the library writers code not the Cleaner/Clenable
>>>>>> implementation.
>>>>>>
>>>>>> David
>>>>>
>>>>> Hi David, (once again for the list)
>>>>
>>>> Thanks Peter! :)
>>>>
>>>>> I agree that an example would be most helpful. Here's how a normal
>>>>> finalizable object is typically coded:
>>>>>
>>>>>      public class FinalizeExample implements AutoCloseable {
>>>>>
>>>>>          private boolean closed;
>>>>>
>>>>>          @Override
>>>>>          public synchronized void close() {
>>>>>              if (!closed) {
>>>>>                  closed = true;
>>>>>                  // cleanup actions accessing state of 
>>>>> FinalizeExample,
>>>>> executed at most once
>>>>>              }
>>>>>          }
>>>>>
>>>>>          @Override
>>>>>          protected void finalize() throws Throwable {
>>>>>              close();
>>>>>          }
>>>>>      }
>>>>>
>>>>>
>>>>> Re-factoring to use Cleaner is a process that extracts the state
>>>>> representing native resource from the user-facing class into a 
>>>>> private
>>>>> nested static class and makes the user-facing object just a facade 
>>>>> that
>>>>> has access to the state object and is registered with a Cleaner. The
>>>>> Cleaner.Cleanable instance is also made accessible from the 
>>>>> user-facing
>>>>> object, so it can provide the on-demand cleaning:
>>>>>
>>>>>      public static class CleanerExample implements AutoCloseable {
>>>>>
>>>>>          private static class State implements Runnable {
>>>>>              @Override
>>>>>              public void run() {
>>>>>                  // cleanup actions accessing State, executed at most
>>>>> once
>>>>>              }
>>>>>          }
>>>>>
>>>>>          private static final Cleaner cleaner = Cleaner.create();
>>>>>
>>>>>          private final State state = new State();
>>>>>          private final Cleaner.Cleanable cleanable =
>>>>> cleaner.register(this, state);
>>>>>
>>>>>          @Override
>>>>>          public void close() {
>>>>>              cleanable.clean();
>>>>>          }
>>>>>      }
>>>>
>>>> Thanks for clarifying that - the process is really quite different
>>>> compared to using the finalize() approach. This certainly needs to be
>>>> included in the documentation as an example, and also used to guide 
>>>> the
>>>> javadoc descriptions.
>>>>
>>>> Thanks again,
>>>> David
>>>>
>>>>>
>>>>> Regards, Peter
>>>>>
>>>>>>
>>>>>> On 8/12/2015 6:09 PM, David Holmes wrote:
>>>>>>> Hi Roger,
>>>>>>>
>>>>>>> Sorry I had no choice but to look at this more closely ... and
>>>>>>> apologies
>>>>>>> as this is very late feedback ... I only looked at the API not the
>>>>>>> details of the implementation.
>>>>>>>
>>>>>>> On 8/12/2015 4:50 AM, Roger Riggs wrote:
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>> Thanks for the comments,
>>>>>>>>
>>>>>>>> Updated the javadoc and webrev with editorial changes.
>>>>>>>>
>>>>>>>> [1]http://cr.openjdk.java.net/~rriggs/webrev-cleaner-8138696/
>>>>>>>> [2]http://cr.openjdk.java.net/~rriggs/cleaner-doc/index.html
>>>>>>>
>>>>>>> Should cleaning and cleanables be mentioned as part of the 
>>>>>>> package-doc
>>>>>>> for java.lang.ref? Else they seem to be an overlooked add-on not
>>>>>>> part of
>>>>>>> the core reference related functionality. Perhaps even state how 
>>>>>>> they
>>>>>>> are preferred to use of finalization?
>>>>>>>
>>>>>>> Cleaner.Cleanable:
>>>>>>>
>>>>>>> It was unclear to me what the usage model was for this. I'm 
>>>>>>> assuming
>>>>>>> that the intent is that rather than register a "thunk" (lets call
>>>>>>> it an
>>>>>>> "action") that can be invoked directly by user-code, the user 
>>>>>>> should
>>>>>>> invoke the action via the call to clean(). In which case I think it
>>>>>>> should be explained somewhat more clearly - see below.
>>>>>>>
>>>>>>> I would describe the Cleanable class as:
>>>>>>>
>>>>>>> Cleanable: Represents an object that has been registered for
>>>>>>> cleanup by
>>>>>>> a Cleaner. The object can be cleaned directly, by a call to
>>>>>>> clean(), if
>>>>>>> it is no longer to be used, else it will be cleaned 
>>>>>>> automatically when
>>>>>>> the object becomes phantom-reachable.
>>>>>>>
>>>>>>> Cleanable.clean: Unregisters this Cleanable and performs the 
>>>>>>> cleanup
>>>>>>> action that was associated with it. If this Cleanable has 
>>>>>>> already been
>>>>>>> unregistered nothing happens. The cleanup action is invoked at most
>>>>>>> once
>>>>>>> per registered Cleanable, regardless of the number of calls to
>>>>>>> clean().
>>>>>>>
>>>>>>> ---
>>>>>>>
>>>>>>> Looking at Cleaner ....
>>>>>>>
>>>>>>>
>>>>>>> "Cleaner manages a set of object references and corresponding 
>>>>>>> cleaning
>>>>>>> functions"
>>>>>>>
>>>>>>> I would say "cleaning actions" rather than functions as they 
>>>>>>> yield no
>>>>>>> value. This change needs to be made throughout.
>>>>>>>
>>>>>>> "The most efficient use is to explicitly invoke the clean method 
>>>>>>> when
>>>>>>> the object is closed or no longer needed. The cleaning function 
>>>>>>> is a
>>>>>>> Runnable to be invoked at most once when the object is no longer
>>>>>>> reachable unless it has already been explicitly cleaned."
>>>>>>>
>>>>>>> To me this doesn't quite capture the need to not use the Runnable
>>>>>>> directly. I would rephrase:
>>>>>>>
>>>>>>> "In normal use a object should be cleaned up when no longer
>>>>>>> required, by
>>>>>>> invoking the clean() method of the associated Cleanable. This
>>>>>>> guarantees
>>>>>>> that the cleaning action will be performed at most once per object:
>>>>>>> either explicitly, or automatically if it becomes
>>>>>>> phantom-reachable. If
>>>>>>> cleaned explicitly the object should not be used again. Note 
>>>>>>> that the
>>>>>>> cleaning action must not refer to the object ..."
>>>>>>>
>>>>>>> ---
>>>>>>>
>>>>>>> Question: what happens if an object is registered simultaneously 
>>>>>>> with
>>>>>>> multiple Cleaners? Do we need to warn the user against that?
>>>>>>>
>>>>>>> ---
>>>>>>>
>>>>>>> The phrase "process the unreachable objects and to invoke cleaning
>>>>>>> functions" doesn't quite seem right to me. The objects 
>>>>>>> themselves are
>>>>>>> never processed, or even touched - right? So really the thread is
>>>>>>> started to "invoke the cleanup actions for unreachable objects".
>>>>>>>
>>>>>>> create(): can also throw SecurityException if not allowed to
>>>>>>> create/start threads.
>>>>>>>
>>>>>>> register(Object obj, Runnable thunk): thunk -> action
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/6/15 7:46 PM, David Holmes wrote:
>>>>>>>>> Hi Roger,
>>>>>>>>>
>>>>>>>>> Sorry to be late here but was trying not to get involved :)
>>>>>>>>>
>>>>>>>>> It is already implicit that ThreadFactory.newThread should return
>>>>>>>>> unstarted threads - that is what a new Thread is - so I don't 
>>>>>>>>> think
>>>>>>>>> IllegalThreadStateException needs to be documented here as it is
>>>>>>>>> documenting behaviour of a broken ThreadFactory (and a broken
>>>>>>>>> ThreadFactory could throw anything) .
>>>>>>>> It does seem that new is fairly well understood but one can 
>>>>>>>> read of
>>>>>>>> ThreadFactory is as a bit ambiguous, lacking a direct reference to
>>>>>>>> the
>>>>>>>> Thread.State of the new thread
>>>>>>>> and since it allows various attributes of the thread to be 
>>>>>>>> modified
>>>>>>>> after the constructor.
>>>>>>>> Since the runnable is supplied as an argument it is ready to be
>>>>>>>> started,
>>>>>>>> why not.
>>>>>>>> It seemed useful to reinforce the salient points.
>>>>>>>>>
>>>>>>>>> Also the no-arg cleaner() can also throw SecurityException.
>>>>>>>> The thread construction is done in doPriv so it should not throw.
>>>>>>>> Did I miss some edge case?
>>>>>>>>>
>>>>>>>>> Also this:
>>>>>>>>>
>>>>>>>>> 127      * On each call the {@link 
>>>>>>>>> ThreadFactory#newThread(Runnable)
>>>>>>>>> thread factory}
>>>>>>>>>  128      * should return a {@link Thread.State#NEW new 
>>>>>>>>> thread} with
>>>>>>>>> an appropriate
>>>>>>>>>  129      * {@linkplain Thread#getContextClassLoader context 
>>>>>>>>> class
>>>>>>>>> loader},
>>>>>>>>>  130      * {@linkplain Thread#getName() name},
>>>>>>>>>  131      * {@linkplain Thread#getPriority() priority},
>>>>>>>>>  132      * permissions, etc.
>>>>>>>>>
>>>>>>>>> then begs the questions as to what is "appropriate". I think this
>>>>>>>>> can
>>>>>>>>> be said much more simply as: "The ThreadFactory must provide a
>>>>>>>>> Thread
>>>>>>>>> that is suitable for performing the cleaning work". Though 
>>>>>>>>> even that
>>>>>>>>> raises questions. I'm not sure why a ThreadFactory is actually
>>>>>>>>> needed
>>>>>>>>> here ?? Special security context? If so that could be mentioned,
>>>>>>>>> but I
>>>>>>>>> don't think name or priority need to be discussed.
>>>>>>>> It was intended to prod the client to be deliberate about the
>>>>>>>> threadFactory.
>>>>>>>> Since the client is integrating the Cleaner and respective 
>>>>>>>> cleaning
>>>>>>>> functions
>>>>>>>> with the client code, the ThreadFactory makes it possible for the
>>>>>>>> client to
>>>>>>>> initialize a suitable thread and the comments serve as a reminder.
>>>>>>>> I agree that the phrase 'suitable for performing the cleaning
>>>>>>>> work' is
>>>>>>>> the operative one.
>>>>>>>>
>>>>>>>> Thanks, Roger
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>
>>>>>
>>
>