RFR: 8339687: Rearrange reachabilityFence()s in jdk.test.lib.util.ForceGC

Brent Christian bchristi at openjdk.org
Tue Sep 10 20:14:09 UTC 2024


On Sat, 7 Sep 2024 00:59:25 GMT, Stuart Marks <smarks at openjdk.org> wrote:

>> From the bug description:
>> ForceGC would be improved by moving the Reference.reachabilityFence() calls for 'obj' and 'ref'.
>> 
>> Reference.reachabilityFence(obj) is currently placed after 'obj' has been set to null, so effectively does nothing. It should occur before obj = null;
>> 
>> For Reference.reachabilityFence(ref): 'ref' is a PhantomReference to 'obj', and is registered with 'queue'. ForceGC.waitFor() later remove()s the reference from the queue, as an indication that some GC and reference processing has taken place (hopefully causing the BooleanSupplier to return true).
>> 
>> The code expects the PhantomReference to be cleared and be put on the queue. But recall that a Reference refers to its queue, and not the other way around. If a Reference becomes unreachable and is garbage collected, it will never be enqueued.
>> 
>> I argue that the VM/GC could determine that 'ref' is not used by waitFor() and collect it before the call to queue.remove(). Moving Reference.reachabilityFence(ref) after the for() loop would prevent this scenario.
>> 
>> While this is only a very minor deficiency in ForceGC, I believe it would be good to ensure that the code behaves as expected.
>
> I added a couple specific comments on the code that I thought ought to be addressed in this PR.
> 
> There is a broader issue with the timeout logic that we should be aware of, however, we might or might not choose to address it in this PR. The main issue is that the caller has requested a particular amount of time as the timeout, and the timeout loop divides by 200ms to determine the maximum number of retries. This _assumes_ that each loop will take 200ms. However, this might not be true, because we don't know how long the booleanSupplier takes, we don't know how long System.gc() takes, and we don't know how long queue.remove() takes. This isn't an idle concern. Somebody might pass in a booleanSupplier that itself has a timeout (say of 1 second) which will cause this method to take about six times longer than expected to time out.
> 
> The usual approach for timeout logic is to take the initial System.nanoTime() value and compare subsequent return values of nanoTime() to the timeout duration, and exit the loop if the timeout duration has been exceeded. See the nanoTime() javadoc for an example.

> The sketchy timeout handling mentioned by @stuart-marks seems to me to be out of scope for this issue.

I also would prefer that updates to timeout handling be done as a separate issue.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20898#issuecomment-2341925965


More information about the core-libs-dev mailing list