RFR [9] 8148117: Move sun.misc.Cleaner to jdk.internal.ref

Peter Levart peter.levart at gmail.com
Thu Jan 28 19:14:23 UTC 2016


Hi,

On 01/25/2016 08:32 PM, Gil Tene wrote:
> I assume your goal here is to get the resources released with the next newgen collections (following a close()), rather than wait for an oldgen (if the resource was held by an old object). That's a cool thing.
>
> With that in mind, you can replace the repeated periodic polling/flipping/allocation and the external calls to changeGuard() with a simple internal GC-detector that would call changeGuard() and allocate a new guard only once per newgen GC cycle.
>
> This can take the form of adding a simple GCDetector inside your implementation:
>
> private class GCDetector {
>      @Override
>      protected void finalize() throws Throwable {
>          GCDetector detector = new GCDetector();
>          changeGuard();
>          Reference.reachabilityFence(detector);
>      }
> }
>
> // The reason to use finalize here instead of a phantom ref
> // based cleaner is that it would trigger immediately after the cycle,
> // rather than potentially take an extra cycle to trigger.
> // This can be done with a weakRef based cleaner instead
> // (but probably when one is added to the JDK, otherwise you'd
> // need your own polling thread and logic here...).

Good idea. This will adapt the rate of guard changing to the rate of GC 
and there's no need for a special executor. A phantom-ref cleaner would 
be just as prompt as finalize(). When there is a finalize() method on 
the referent, the finalize() method is invoked first, then the 
FinalReference to the referent is cleared, and only after that can the 
PhantomReference to the same referent be processed, in the next GC 
cycle. But when there is no finalize() method on the referent, the 
PhantomReference is processed right away.
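
For illustration, here is a minimal, self-contained sketch (not part of 
the webrev) of such a GC-cycle detector driven by a PhantomReference 
and a ReferenceQueue instead of finalize(). The class name and the 
explicit System.gc() calls are only there to make the demo run; a real 
detector would call changeGuard() where the comment indicates:

import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

public class PhantomGCDetector {

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        // hold the PhantomReference strongly, but drop the sentinel right away
        PhantomReference<Object> ref = new PhantomReference<>(new Object(), queue);

        for (int detected = 0; detected < 5; ) {
            System.gc();                        // provoke a cycle so the demo makes progress
            if (queue.remove(1000) != null) {   // enqueued promptly after the cycle
                detected++;
                System.out.println("GC detected");
                // a real detector would call changeGuard() here
                ref = new PhantomReference<>(new Object(), queue);   // re-arm
            }
        }
        Reference.reachabilityFence(ref);       // keep the last Reference object alive
    }
}

The sentinel has no finalize() method, so its PhantomReference is 
enqueued right after the GC cycle that finds it unreachable, and the 
detector re-arms with a fresh sentinel - one "tick" per cycle, with no 
finalization involved.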

Also, when a finalize() method creates a new finalizable object that 
has a finalize() method which creates a new finalizable object that has 
a finalize() method ...

public class Test {

    static class GCDetector {
        @Override
        protected void finalize() throws Throwable {
            System.out.println("GC detected: " + this);
            new GCDetector();   // each finalization starts the next link in the chain
        }
    }

    public static void main(String[] args) throws Exception {
        System.runFinalizersOnExit(true);   // deprecated, but forces finalization at VM exit
        new GCDetector();                   // start the chain; no reference is kept
    }
}

...and finalizers are set to run on exit, you get a never-ending loop 
and the VM doesn't exit:

...
GC detected: Test$GCDetector@1088249a
GC detected: Test$GCDetector@7e73ecc
GC detected: Test$GCDetector@4da49a96
GC detected: Test$GCDetector@667a971f
GC detected: Test$GCDetector@3787d3be
...


So a phantom-ref cleaner would work better here. The functions of the 
Deallocator and the guard-changer can even be merged into the same 
object, so that guard-changing and reference tracking are performed by 
the same Cleaner(s), as in:

http://cr.openjdk.java.net/~plevart/misc/CloseableMemory/v2/CloseableMemory.java
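
For readers without access to the webrev, here is a rough, hypothetical 
sketch of that idea (it is not the linked v2 code): every guard carries 
a cleanup that tracks reachability, a sentinel chain "ticks" once per 
GC cycle to keep the guard young, and the native memory is freed only 
after close(), once every guard that an in-flight access might still be 
fencing has been collected. The public java.lang.ref.Cleaner API 
proposed for JDK 9 stands in for the internal Cleaner discussed in this 
RFR, allocate()/peek()/free() are stubs, and the case of an instance 
dropped without calling close() is deliberately ignored:

import java.lang.ref.Cleaner;
import java.lang.ref.Reference;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class CloseableMemorySketch implements AutoCloseable {

    private static final Cleaner CLEANER = Cleaner.create();

    private final AtomicLong address;                        // 0 once freed
    private final AtomicInteger liveGuards = new AtomicInteger();
    private volatile Object guard;                           // read once per access
    private volatile boolean closed;

    public CloseableMemorySketch(long size) {
        address = new AtomicLong(allocate(size));
        installGuard();   // first guard
        armTick();        // start the once-per-GC-cycle "tick" chain
    }

    // Install a fresh guard; its cleanup runs after the guard has been collected.
    private synchronized void installGuard() {
        if (closed) return;
        Object g = new Object();
        liveGuards.incrementAndGet();
        guard = g;                                 // the previous guard becomes garbage
        CLEANER.register(g, this::guardReclaimed);
    }

    // The sentinel is not referenced anywhere, so the next GC cycle collects it
    // and its cleanup gives us one "tick" per cycle.
    private void armTick() {
        CLEANER.register(new Object(), this::tick);
    }

    private void tick() {
        if (!closed) {
            installGuard();   // a GC cycle happened: replace the guard with a young one
            armTick();        // re-arm for the next cycle
        }
    }

    private void guardReclaimed() {
        // Free only after close() and only once the last guard is gone, i.e. when
        // no access can still be keeping a guard reachable via reachabilityFence.
        if (liveGuards.decrementAndGet() == 0 && closed) {
            long a = address.getAndSet(0);
            if (a != 0) free(a);
        }
    }

    public byte getByte(long offset) {
        Object g = guard;                          // one volatile read per access
        if (g == null) throw new IllegalStateException("closed");
        try {
            return peek(address.get() + offset);   // hypothetical native read
        } finally {
            Reference.reachabilityFence(g);        // the memory stays valid until here
        }
    }

    @Override
    public synchronized void close() {
        closed = true;
        // New accesses now fail fast; the memory itself is freed by the cleanup
        // of the last reclaimed guard.
        guard = null;
    }

    // Stubs standing in for the native memory operations.
    private static long allocate(long size) { return 1L; }
    private static byte peek(long addr) { return 0; }
    private static void free(long addr) { /* release the native memory */ }
}

The liveGuards counter is just one way to make sure free() waits for 
every guard that an access might still be holding; the v2 file linked 
above may well arrange this differently.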



Regards, Peter


> You'll need to allocate a single instance of GCDetector during construction of a CloseableMemory, without retaining a reference to it after construction. This will start a finalize-triggering chain per instance, with the chain "ticking" once per newgen cycle.
>
> If you want to avoid having one of these (coming and going on each GC cycle) per CloseableMemory instance, you can use a common static detector and a registration mechanism (where each registered instance would have its changeGuard() method call…
>
> — Gil.
>
>
>> On Jan 24, 2016, at 9:10 AM, Peter Levart <peter.levart at gmail.com> wrote:
>>
>> Hi,
>>
>> I had an idea recently on how to expedite the collection of an object. It is simple - just don't let it live long.
>>
>> Here's a concept prototype:
>>
>> http://cr.openjdk.java.net/~plevart/misc/CloseableMemory/CloseableMemory.java
>>
>> The overhead of the check in access methods (getByte()/setByte()) amounts to one volatile read of an oop variable that changes only once every, say, 5 to 10 seconds. That's the period for which a particular guard object is alive. Its reachability is tracked by the GC and extends to the end of each access method (using Reference.reachabilityFence). Every few seconds the guard object is replaced with a fresh one, so the chance of the guard and its tracking Cleaner being promoted to the old generation is very low.
>>
>> Could something like that enable a low-overhead CloseableMappedByteBuffer?
>>
>> Regards, Peter
>>
>> On 01/23/2016 09:31 PM, Andrew Haley wrote:
>>> On 23/01/16 20:01, Uwe Schindler wrote:
>>>
>>>> It depends how small! If the speed is still somewhere between Java 8
>>>> ByteBuffer performance and the recent Hotspot improvements in Java
>>>> 9, I agree with trying it out. But some volatile memory access on
>>>> every access is a no-go. The code around ByteBufferIndexInput in
>>>> Lucene is the most performance-critical code, because on every
>>>> search query or sort all the work happens in there (millions of
>>>> iterations with positional ByteBuffer.get* calls). As ByteBuffers
>>>> are limited to 2 GiB, we also need lots of hairy code to work
>>>> around that limitation!
>>> Yes, I see that code.  It would be helpful if there were a
>>> self-contained but realistic benchmark using that code.  That way,
>>> some simple experiments would allow changes to be measured.
>>>
>>>> If you look at ByteBufferIndexInput's code you will see that we
>>>> simply do stuff like trying to read from one bytebuffer, and only if
>>>> we catch a BufferUnderflowException do we fall back to handling
>>>> buffer switches: instead of checking bounds on every access, we have
>>>> fallback code that only runs on exceptions. E.g. if you are 3 bytes
>>>> before the end of one buffer slice and read a long, it will throw
>>>> BufferUnderflowException. When this happens, the code falls back to
>>>> reading byte by byte from 2 different buffers and reassembling the long:
>>> I'm surprised you don't see painful deoptimization traps when that
>>> happens.  I suppose it's rare enough that you don't care.  There's a
>>> new group of methods in JDK 9, Objects.checkIndex(), which are
>>> intended to provide a very efficient way to do bounds checks. It might
>>> be interesting to see if they work well with ByteBufferIndexInput:
>>> that's an important use case.
>>>
>>> BTW, does anyone here know why we don't have humongous ByteBuffers
>>> with a long index?
>>>
>>> Andrew.



