RFR: 8344332: (bf) Migrate DirectByteBuffer to use java.lang.ref.Cleaner [v11]

Tue Feb 18 14:33:17 UTC 2025

On Tue, 18 Feb 2025 11:01:51 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

>> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Revert waitForReferenceProcessing removals, see JDK-8305186
>
> There are several changes here that I think are of interest.
> 
> (1) Put the cleanables for DirectByteBuffer in a separate j.l.r.Cleaner.  I
> don't think that was done in previous experiments with eliminating the use of
> j.i.r.Cleaner.  That seems like a good thing to do, to avoid interleaving the
> processing of these short and potentially more performance-critical cleanables
> with others.
> 
> (2) Do a GC for each iteraton of the slow path (i.e. after each exponentially
> increasing sleep period).  Neither the existing code (which I had a hand in)
> nor its predecessor did that.  I'm currently doubtful of this being a good
> idea.  The exponential backoff in the current code was just retained from the
> prior code, and I think doesn't actually do much other than delay OOME to
> allow other threads to happen to close any direct buffers they might be using.
> In the current code (using j.i.r.Cleaner), once waitForReference returns false
> all Cleaners discovered by a prior GC will have been processed.  I wonder,
> with the current code, under what circumstances do we actually get into those
> sleeps, but don't ultimately OOME?
> 
> To provide something similar with j.l.r.Cleaner would require an operation
> on that class similar to waitForReferenceProcessing.  If there is pending
> cleanup work then wait for some and then return true.  If there isn't any
> pending cleanup work then return false.  I've spent some time thinking about
> how this could be done, and think it's feasible (working on a prototype that
> seems like it should work, but hasn't been tested at all yet), though not as
> simple as one might wish.
> 
> (3) Put the slow path under a lock.  That seems clearly essential with the
> change to iterate GCs.  It might be that it's a mistake in the old code to not
> do something similar.  The current code allows essentially back to back to
> back GCs as multiple threads hit the slow path more or less simultaneously.
> You could have several threads come in and all waitForReferenceProcessing, all
> finish that loop at the same time and gang-request GCs.  That doesn't seem
> good.
> 
> (4) Add the canary mechanism.  I think this is a poor replacement for what
> waitForReferenceProcessing does currently, since it is not even remotely
> reliable.  It might make the path to OOME less likely, but bad luck could make
> it do nothing useful at all.

Thanks for the comment, @kimbarrett!

I started replying, but then figured my reply on current canary mechanics would be better captured in the code comment, so I pushed it right there. Then I ran out of time. I would think about this a bit more and try to reply to the your comments sometime later this week.

So far, I still believe Cleaner canaries are the best way to balance the probability of success for the best effort attempt to allocate when short on memory and the implementation simplicity. Maybe filling the only theoretical gap I see at the moment (see code comments) would be marginally better with waiting for `Cleaner` directly, maybe not. Maybe waiting approaches strike a better balance, maybe not.

What I am sure, though, is that lots of theoretical arguments I made to myself, including re-using waiting infrastructure, failed empirical reliability/performance tests after my attempts at implementing them. Admittedly, I have not tried to go as far as coordinating _both_ reference processing and Cleaner threads, maybe, _maybe_ the answer is there. So I would reserve final judgement on whether GC+RefProc+Cleaner waiting approaches really work until I see them actually work :) In contrast, current PR both fits my theoretical understanding why it works, and passes the empirical tests. Looking forward to your attempt!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/22165#issuecomment-2665881712