OpenJDK 17: Loitering AbstractQueuedSynchronizer$ConditionNode instances (also on latest master branch)

Sun Feb 11 13:57:26 UTC 2024

Hello Core-Libs-Dev team,

may I ask you about your opinion about a tiny one-liner change in
AbstractQueuedSynchronizer, just as a suggestion how to make
ConditionObjects / Nodes even more garbage collector friendly?

Checked out
https://github.com/openjdk/jdk/blob/jdk-17%2B35/src/java.base/share/classes/java/util/concurrent/locks/AbstractQueuedSynchronizer.java
(the same on master branch with different line numbers near to line 1506):

@@ -1431,40 +1431,41 @@ public abstract class AbstractQueuedSynchronizer
      public class ConditionObject implements Condition,
java.io.Serializable {
          // ...
          private void doSignal(ConditionNode first, boolean all) {
              while (first != null) {
                  ConditionNode next = first.nextWaiter;
+                first.nextWaiter = null;  // GC-friendly: avoid chains
of dead ConditionNodes
                  if ((firstWaiter = next) == null)
                      lastWaiter = null;
                  if ((first.getAndUnsetStatus(COND) & COND) != 0) {
                      enqueue(first);
                  // ...

By setting the nextWaiter to null of the first condition node, which is
transferred from the condition queue to the sync queue in this method,
long chains of ConditionNode instances can be avoided. Though a single
ConditionNode is small, these chains of ConditionNodes become very huge
on the heap (I've seen more than 1GB on an application server over time)
if at least one node was promoted to the old generation for any reason.
They survive minor collections and are cleaned up only on mixed / full
collections, and thus put unnecessary pressure on G1 garbage collector.

The same change could also be applied to 'AbstractQueuedLongSynchronizer'.

I know premature optimization is the root of all evil, on the other hand
I could image that many applications benefit from GC-friendly
ConditionObjects, since they are frequently used in various classes like
PriorityBlockingQueue / LinkedBlockingDeque / LinkedBlockingQueue, the
latter one as default work queue for executor services like fixed thread
pools for processing asynchronous tasks.

Thank you all for your time and help!

Best regards
Frank

Am 08.02.2024 um 12:15 schrieb Frank Kretschmer:
> Hello Thomas, hello Core-Libs-Dev,
>
> thank you for cc'ing my email. In deed my idea/suggestion is to modify
> the AbstractQueuedSynchronizer$ConditionNode handling in such a way that
> it gets unlinked from the chain of condition nodes if it is not needed
> any more (it might be the "nextWaiter" node), in order to be more
> GC-friendly.
>
> @core-libs-dev: I've just attached the “G1LoiteringConditionNodes” demo
> class and "gc.log" again so that you can have a look if you like.
>
> Best regards
>
> Frank
>
>
> Am 08.02.2024 um 11:04 schrieb Thomas Schatzl:
>> Hi,
>>
>>   since this looks like a suggestion for a change to the libraries
>> similar to the mentioned JDK-6805775, and not actually GC, cc'ing the
>> core-libs-dev mailing list.
>>
>> Hth,
>>   Thomas
>>
>> On 07.02.24 15:20, Frank Kretschmer wrote:
>>> Hi Java GC-experts,
>>>
>>> I'm facing an interesting G1 garbage collector observation in OpenJDK
>>> 17.0.9+9, which I would like to share with you.
>>>
>>> My application runs many asynchronous tasks in a fixed thread pool,
>>> utilizing its standard LinkedBlockingQueue. Usually, it generates
>>> just a
>>> little garbage, but from time to time, I observed that the survivor
>>> spaces grow unexpectedly, and minor collection times increase.
>>>
>>> This being the case, many
>>> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode
>>> instances can be found on the heap. In fact, the whole heap (rank 1 as
>>> shown in jmap) was filled up with ConditionNode instances after a
>>> while.
>>>
>>> After some tests, I figured out that G1 seems to be able to collect
>>> “dead” ConditionNode instances during minor collections only if no
>>> formerly alive ConditionNode instances were promoted to the old
>>> generation and died there.
>>>
>>> To illustrate that, I've attached a “G1LoiteringConditionNodes” class
>>> that can be run for demo purposes, e.g. under Linux with OpenJDK
>>> 17.0.9+9 (VM options see comments within the class), and its gc-log
>>> output. It shows that during the first two minutes, everything is fine,
>>> but after a promotion to the old generation, survivors grow and minor
>>> pause time increase from 3 to 10ms.
>>>
>>> For me, it looks like an issue similar to
>>> https://bugs.openjdk.org/browse/JDK-6805775 “LinkedBlockingQueue Nodes
>>> should unlink themselves before becoming garbage”, which was fixed in
>>> OpenJDK 7.
>>>
>>> What’s your opinion about that? Wouldn’t it be worth to enable G1 to
>>> collect those AbstractQueuedSynchronizer$ConditionNode instances during
>>> minor collections, as it is done for LinkedBlockingQueue Nodes?
>>>
>>> Best regards
>>>
>>> Frank
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/core-libs-dev/attachments/20240211/0221f965/attachment.htm>