RFR: JDK-8149925 We don't need jdk.internal.ref.Cleaner any more
Peter Levart
peter.levart at gmail.com
Tue Apr 5 14:41:26 UTC 2016
Hi Roger,
On 04/04/2016 11:50 PM, Roger Riggs wrote:
> Hi Peter,
>
> Stepping back just a bit.
Right, let's clear up.
>
> The old Cleaner running on the Reference processing thread had a few
> (2) very well controlled
> functions, reference processing and deallocating DirectByteBuffers.
> Maybe we can't do too much
> better than that.
...yes, at the beginning, until it was (re)used for other purposes too,
in: java.lang.ProcessImpl,
java.lang.invoke.MethodHandleNatives.CallSiteContext,
jdk.internal.perf.Perf, sun.nio.ch.IOVecWrapper, sun.nio.fs.NativeBuffer
and sun.java2d.marlin.OffHeapArray.
But those other usages have been converted to use new
java.lang.ref.Cleaner so old cleaner is now back to basics -
DirectByteBuffers. And with that, DirectByteBuffers allocating threads
only help ReferenceHandler thread enqueue References and execute
DirectByteBuffer deallocators, which is an improvement.
But should we keep that status quo? It's nothing wrong with it as it is,
except I think we can do better.
>
> The old worst case performance/latency wise was the reference
> processing thread
> did the work and the allocating thread did very little synchronizing
> and just did the retries.
The number of retries was exactly the same as the number of References
helped to be enqueued or in case of Cleaner(s), executed:
// retry while helping enqueue pending Reference objects
// which includes executing pending Cleaner(s) which includes
// Cleaner(s) that free direct buffer memory
while (jlra.tryHandlePendingReference()) {
if (tryReserveMemory(size, cap)) {
return;
}
}
If the share of pending References that are also Cleaners was high,
chances were higher that not much helping was needed as one cleaned
DBB.Deallocator could unreserve enough memory for next reservation
attempt to succeed. So allocating thread helped only until it succeeded
in reserving the native memory leaving the rest of work to another
allocating request/thread or to ReferenceHandler.
> In the best case, all the real work was done by the allocating thread,
> if the interactions with GC
> work out perfectly. But it was still the case that the buffer
> alloc/dealloc throughput was
> met with the division of work separating the reference processing
> thread and the allocating thread.
Yes, whichever thread was quicker. If ReferenceHandler thread had been
waking up from wait() for a long time, allocating thread could have
already processed all the References before ReferenceHandler finally
started to look around. If there were lots of new pending referenced
discovered, ReferenceHandler thread could finally join the party and
fight for the same lock...
>
> The function that can only be provided by CleanerImpl / Reference
> processing thread state is
> knowledge that the cleaning queue is empty.
...and that the discovered pending references have actually been
enqueued before that...
> The helping functions were/are a bit troublesome because of mixing
> execution
> environments of the thread allocating direct buffers and the
> cleanables and it seemed that
> more than a little complexity was needed to compensate.
I totally agree.
>
> If the bottleneck in processing is between the reference processing
> and cleanup
> then it should be ok (based on previous comments) for the CleanerImpl
> to help with
> reference processing (after it has an empty queue and before it blocks
> waiting or in every loop).
> Though if you already tried this combination, I don't recall the results.
I don't thing there is a problem because of any bottleneck. And if there
was a bottleneck we would only have a problem with
allocation/deallocation throughput and not with OOME(s). The problem is
because reference discovery is not triggered as a result of native
memory reservation approaching or reaching the limit. There is no heap
memory pressure from DirectByteBuffer(s) because they are small objects.
So a mechanism must be in place that triggers reference discovery and
waits for discovered references to be processed before failing the
native memory allocation. A mechanism that tries to simulate what
happens with GC when there is heap memory pressure. GC guarantees that
full-GC is executed and heap allocation retried after that before
finally giving up with OOME. We need a mechanism that attempts to do the
same for direct memory. Throughput is a nice property but we are not
directly seeking its improvement. We just not want to make things much
worse.
Helping the Cleaner thread to process cleanup functions is the easiest
way to wait for cleanup functions to be processed and for queue to
drain. Simply because of ReferenceQueue API. If you poll() next
Reference from the queue and get null, you know the queue is empty, but
if you get something, you have to execute it and not just ignore it.
Maybe we could patch into the ReferenceQueue implementation and extend
its API with an internal method that would not return next Reference but
just information that ReferenceHandler thread has done so or that the
queue is empty. I'll think about it.
>
> As you pointed out it would be more efficient if the allocating thread
> could be aware
> when it was known there was nothing ready to cleanup so it can retry
> and invoke GC or
> throw out of memory if appropriate.
> Adding a method that returned the count of completed cleaning cycles
> (or similar)
> to CleanerImpl could exist with a minimal of coupling and still provide
> the information needed without commingling the execution threads.
I'll think about how to surface this functionality in the CleanerImpl
most elegantly. The functionality of providing only the counter of
cleaning cycles as a getter might not be most appropriate. What we also
need is some mechanism to wait and be woken up to retry reservation only
at appropriate points in time otherwise allocating threads could just
spin eating CPU time. So my latest attempt was to encapsulate the entire
retry logic inside ExtendedCleaner with ByteBuffer/Bits only providing
allocation function to this logic, which in my view of API is pretty
decoupled and general.
>
> I don't see the need to change Cleaner to an interface to be able to
> provide
> an additional method on CleanerImpl or a subclass and a factory method
> could
> provide for a clean and very targeted interface to Bits/Direct buffer.
I would like this to be an instance method so it would naturally pertain
to a particular Cleaner instance. Or it could be a static method that
takes a Cleaner instance. One of my previous webrevs did have such
method on the CleanerImpl, but I was advised to move it to Cleaner as a
package-private method and expose it via SharedSecrets to internal code.
I feel such "camouflage" is very awkward now that we have modules and
other mechanisms exist. So I thought it would be most elegant to make
Cleaner an interface so it can be extended with an internal interface to
communicate intent in a type-safe and auto-discoverable way. The change
to make it interface:
http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.part2.1.rev01/
...actually simplifies implementation (33 lines removed in total) and
could be seen as an improvement in itself.
Are you afraid that if Cleaner was an interface, others would attempt to
make implementations of it? Now that we have default methods on
interfaces it is easy to compatibly extend the API even if it is an
interface so that no 3rd party implementations are immediately broken.
Are you thinking of security implications when some code is handed a
Cleaner instance that it doesn't trust? I don't think there is a utility
for Cleaner instances to be passed from untrusted to trusted code, do you?
In the end it doesn't really matter. We can do it one way or the other.
I just feel that using an interface is cleaner.
>
> I'm sorry I haven't had time to try out concretely what I have in mind.
> Please correct or remind me of missing salient considerations.
The bottom line is that we need a mechanism that:
- triggers reference discovery when native memory limit is approached or
reached
- retires native memory reservation at appropriate time slots until
succeeding or until all pending references have been processed and
Cleanables executed at which time native memory reservation can fail
with OOME.
- if possible, doesn't execute cleanup functions by the allocating
thread but just waits for system threads to do the job.
- when triggered, does not make native memory allocation a bottleneck.
I think that what I did in my latest webrevs with ReferenceHandler
thread is an improvement in minimizing contended synchronization and
interference of allocating thread(s) with Reference enqueue-ing. But
interaction of allocating thread(s) with Cleaner background thread could
be improved and I have a couple of ideas to explore.
>
> Thanks, Roger
>
Regards, Peter
>
> On 4/2/2016 7:24 AM, Peter Levart wrote:
>> Hi Roger,
>>
>> Thanks for looking at the patch.
>>
>> On 04/02/2016 01:31 AM, Roger Riggs wrote:
>>> Hi Peter,
>>>
>>> I overlooked the introduction of another nested class (Task) to
>>> handle the cleanup.
>>> But there are too many changes to see which ones solve a single
>>> problem.
>>>
>>> Sorry to make more work, but I think we need to go back to the
>>> minimum necessary
>>> change to make progress on this. Omit all of the little cleanups
>>> until the end
>>> or do them first and separately.
>>>
>>> Thanks, Roger
>>
>> No Problem. I understand. So let's proceed in stages. Since part1 is
>> already pushed, I'll call part2 stages with names: part2.1, part2.2,
>> ... and I'll start counting webrev revisions from 01 again, so webrev
>> names will be in the form: webrev.part2.1.rev01. Each part will be an
>> incremental change to the previous one.
>>
>> part2.1: This is preparation work to be able to have an extended
>> java.lang.ref.Cleaner type for internal use. Since
>> java.lang.ref.Cleaner is a final class, I propose to make it an
>> interface instead:
>>
>> http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.part2.1.rev01/
>>
>> This is a source-compatible change and it also simplifies
>> implementation (no injection of Cleaner.impl access function into
>> CleanerImpl needed any more). What used to be java.lang.ref.Cleaner
>> is renamed to jdk.internal.ref.CleanerImpl. What used to be
>> jdk.internal.ref.CleanerImpl is now a nested static class
>> jdk.internal.ref.CleanerImpl.Task (because it implements Runnable).
>> Otherwise nothing has changed in the overall architecture of the
>> Cleaner except that public-facing API is now an interface instead of
>> a final class. This allows specifying internal extension interface
>> and internal extension implementation.
>>
>> CleanerTest passes with this change.
>>
>> So what do you think?
>>
>> Regards, Peter
>>
>>>
>>>
>>>
>>>
>>> On 4/1/16 5:51 PM, Roger Riggs wrote:
>>>> Hi Peter,
>>>>
>>>> Thanks for the diffs to look at.
>>>>
>>>> Two observations on the changes.
>>>>
>>>> - The Cleaner instance was intentionally and necessarily different
>>>> than the CleanerImpl to enable
>>>> the CleanerImpl and its thread to terminate if the Cleaner is not
>>>> longer referenced.
>>>> Folding them into a single object breaks that.
>>>>
>>>> Perhaps it is not too bad for ExtendedCleaner to subclass
>>>> CleanerImpl with the cleanup helper/supplier behavior
>>>> and expose itself to Bits. There will be fewer moving parts. There
>>>> is no need for two factory methods for
>>>> ExtendedCleaner unless you are going to use a separate ThreadFactory.
>>>>
>>>> - The Deallocator (and now Allocator) nested classes are identical,
>>>> and there is a separate copy for each
>>>> type derived from the Direct-X-template. But it may not be worth
>>>> fixing until the rest of it is settled to avoid
>>>> more moving parts.
>>>>
>>>> I don't have an opinion on the code changes in Reference, that's
>>>> different kettle of fish.
>>>>
>>>> More next week.
>>>>
>>>> Have a good weekend, Roger
>>>>
>>>>
>>>> On 4/1/2016 12:46 PM, Peter Levart wrote:
>>>>>
>>>>>
>>>>> On 04/01/2016 06:08 PM, Peter Levart wrote:
>>>>>>
>>>>>>
>>>>>> On 04/01/2016 05:18 PM, Peter Levart wrote:
>>>>>>> @Roger:
>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>> About entanglement between nio Bits and
>>>>>>> ExtendedCleaner.retryWhileHelpingClean(). It is the same level
>>>>>>> of entanglement as between the DirectByteBuffer constructor and
>>>>>>> Cleaner.register(). In both occasions an action is provided to
>>>>>>> the Cleaner. Cleaner.register() takes a cleanup action and
>>>>>>> ExtendedCleaner.retryWhileHelpingClean() takes a retriable
>>>>>>> "allocating" or "reservation" action. "allocation" or
>>>>>>> "reservation" is the opposite of cleanup. Both methods are
>>>>>>> encapsulated in the same object because those two functions must
>>>>>>> be coordinated. So I think that collocating them together makes
>>>>>>> sense. What do you think?
>>>>>>
>>>>>> ...to illustrate what I mean, here's a variant that totally
>>>>>> untangles Bits from Cleaner and moves the whole Cleaner
>>>>>> interaction into the DirectByteBuffer itself:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.13.part2/
>>>>>>
>>>>>>
>>>>>> Notice the symmetry between Cleaner.retryWhileHelpingClean :
>>>>>> Cleaner.register and Allocator : Deallocator ?
>>>>>>
>>>>>>
>>>>>> Regards, Peter
>>>>>>
>>>>>
>>>>> And here's also a diff between webrev.12.part2 and webrev.13.part2:
>>>>>
>>>>> http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.diff.12to13.part2/
>>>>>
>>>>>
>>>>> Regards, Peter
>>>>>
>>>>
>>>
>>
>
More information about the core-libs-dev
mailing list