<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><p>Hi Viktor, Archie, and Daniel,</p><p>I hope everything is going well on your end. I really appreciate the thoughtful feedback you've provided so far, and I wanted to follow up on our previous discussion about the potential fairness issue in <code>ArrayBlockingQueue</code> when using <code>ReentrantLock</code> with <code>Condition.await()</code>.</p><p>In our previous conversation, I raised a concern regarding the separate waiting spaces for <code>ReentrantLock</code>'s AQS and <code>Condition.await()</code> in <code>ArrayBlockingQueue</code>. Specifically, my understanding is that when a thread is signaled from <code>Condition.await()</code>, it is re-enqueued at the <strong>end of the AQS queue</strong>, which could potentially allow newly arriving threads to acquire the lock before threads that have been waiting on <code>Condition.await()</code> for longer. This separation of the AQS queue and the condition waiting space seems to introduce the possibility of <strong>FIFO violations</strong> and <strong>starvation</strong>, even in fair mode.</p><p>If the potential starvation caused by this behavior is something that needs to be addressed, I’ve been considering a solution that could help. One approach could be tracking the number of threads waiting when the queue is full. Here’s an example of how this might be implemented:</p><pre><code class="language-java">private volatile long putWaitingCount = 0L;
public void put(E e) throws InterruptedException {
Objects.requireNonNull(e);
final ReentrantLock lock = this.lock;
// waitedBefore: A flag indicating whether the current thread has entered the await state in this iteration.
boolean waitedBefore = false;
lock.lockInterruptibly();
try {
/**
* If there are waiting threads, the current thread joins the wait queue.
* Threads entering this condition are implicitly guaranteed that count == capacity.
* Therefore, it's safe to use await() without wrapping it in a loop here.
*/
if (putWaitingCount > 0) {
++putWaitingCount;
waitedBefore = true;
notFull.await();
}
while (count == items.length) {
if(!waitedBefore) {
++putWaitingCount;
waitedBefore = true;
}
notFull.await();
}
enqueue(e);
} finally {
// If the thread had waited, decrement the waiting thread count.
if (waitedBefore) {
--putWaitingCount;
}
lock.unlock();
}
}
</code></pre><pre><code class="language-java"><br></code></pre><p>In this approach, the number of threads currently waiting on <code>Condition.await()</code> is tracked by a simple counter, <code>putWaitingCount</code>. This ensures that newly arriving threads do not bypass those that have already been waiting when the queue is full.</p><p>I’d love to hear your thoughts on whether the starvation issue caused by the separation of AQS and <code>Condition.await()</code>waiting spaces is something worth addressing, and if so, whether the approaches I’ve suggested could be effective in resolving it. If this direction seems promising, I’d be happy to continue refining the solution based on your feedback.</p><p>Thanks again for your time and consideration. I truly appreciate the opportunity to engage with you and learn through this process.</p><p>Best regards,</p><p>Minju Kim</p><div><br><blockquote type="cite"><div>2024. 9. 11. 오후 1:33, 김민주 <miiiinju00@gmail.com> 작성:</div><br class="Apple-interchange-newline"><div><meta http-equiv="content-type" content="text/html; charset=utf-8"><div style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><p>Hi Archie and Viktor,</p><p>I apologize for the delay in my response.</p><p><br></p><p>After further consideration, I believe my understanding aligns more closely with Interpretation B.</p><p>To clarify my understanding of Interpretation B, threads waiting on a <code>Condition</code> (linked through <code>ConditionNode</code> in <code>ConditionObject</code>) receive signals in the order in which they started waiting, but receiving a signal doesn't guarantee immediate reacquisition of the <code>ReentrantLock</code>. Instead, as far as I understand, the signal simply enqueues the thread at the end of the <code>AQS queue</code>(<code>ReentrantLock</code>).</p><p><br></p><p>As a result, threads that received a <code>signal</code> from <code>ConditionObject</code> do not have priority over the threads already waiting in the <code>AQS</code> <code>queue</code> and are simply added to the end of that queue.</p><p>From what I understand, the difference between <code>NonFairSync</code> and <code>FairSync</code> lies in whether a thread can acquire the lock when there are other threads already waiting in the <code>AQS queue</code>, rather than anything related to the threads signaled from a <code>Condition</code>.</p><p><br></p><p>Additionally, as I understand it, the <code>enqueue()</code> and <code>doSignal()</code> methods are declared <code>final</code> in <code>AQS</code>, and both <code>Sync</code>implementations use the <code>AQS</code> methods directly.</p><p><br></p><p>Therefore, I wanted to highlight that this isn't necessarily an issue with <code>AQS</code> or <code>FairSync</code>. Instead, I suspect that the behavior we're seeing could arise from the fact that threads signaled in a <code>BlockingQueue</code> implementation may not execute immediately, which is a characteristic of the implementation.</p><p><br></p><p>However, there may be aspects I’m misunderstanding, and I would appreciate any corrections or clarifications if that’s the case.</p><p><br></p><p>Thank you very much for your time and understanding.</p><p>Best regards,</p><p>Kim Minju</p><div><br><blockquote type="cite"><div>2024. 9. 8. 오후 11:39, Archie Cobbs <archie.cobbs@gmail.com> 작성:</div><br class="Apple-interchange-newline"><div><div dir="ltr"><div>Hi Kim,</div><div><br></div><div>Thanks for the details.</div><div><br></div><div>So to summarize:</div><div><ul><li>Kim is saying that "Interpretation B" is how it actually works.</li><li>Viktor is saying that "Interpretation A" is how it actually works.</li></ul></div><div>Do I have that right?</div><div><br></div><div>-Archie</div><div><br></div><div>P.S. Viktor: my apologies for misspelling your name before<br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Sep 7, 2024 at 8:04 PM 김민주 <<a href="mailto:miiiinju00@gmail.com">miiiinju00@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><p>Hi Arhice,</p><p><br></p><p>First of all, I want to apologize if I may not have explained things clearly since English is not my first language. I’m really sorry about that.</p><p><br></p><p>Even so, I deeply appreciate your response and taking the time to reply.</p><p><br></p><p>First, I would like to confirm whether my understanding is correct.</p><p>From what I know, <code>ReentrantLock</code> is based on <code>AQS</code>, and internally, threads are queued by being linked as <code>Node</code>.</p><p>When <code>ReentrantLock.newCondition</code> is called, a <code>ConditionObject</code> is created. When <code>Condition::await</code> is called, the thread waits based on a <code>ConditionNode</code>, and the <code>AQS</code> head is replaced with AQS::<code>signalNext</code>, allowing the lock to be released. At this point, I understand that the node disappears from the <code>AQS</code> queue and starts waiting within the <code>ConditionNode</code> of the <code>ConditionObject</code>.</p><p>As a result, I understand that the waiting places for <code>ReentrantLock</code> and <code>Condition</code> are different.</p><p>To summarize, when <code>Condition::await()</code> is called, the node that was previously waiting in <code>AQS</code> receives the lock, disappears from the <code>AQS</code> queue, and starts waiting within the <code>ConditionNode</code> of the <code>ConditionObject</code>.</p><p>Additionally, from what I understand, in <code>Condition::doSignal</code>, the first <code>ConditionNode</code> is removed, and then <code>AQS::enqueue</code> adds it to the tail of <code>AQS</code>.</p><pre><code>public class ConditionObject implements Condition, java.io.Serializable {
private void doSignal(ConditionNode first, boolean all) {
while (first != null) {
ConditionNode next = first.nextWaiter;
if ((firstWaiter = next) == null)
lastWaiter = null;
if ((first.getAndUnsetStatus(COND) & COND) != 0) {
enqueue(first);
if (!all)
break;
}
first = next;
}
}
</code></pre><p>When <code>enqueue</code> is called:</p><pre><code>public abstract class AbstractQueuedSynchronizer
extends AbstractOwnableSynchronizer
/*
* Tail of the wait queue. After initialization, modified only via casTail.
*/
private transient volatile Node tail;
final void enqueue(Node node) {
if (node != null) {
for (;;) {
Node t = tail;
node.setPrevRelaxed(t); // avoid unnecessary fence
if (t == null) // initialize
tryInitializeHead();
else if (casTail(t, node)) {
t.next = node;
if (t.status < 0) // wake up to clean link
LockSupport.unpark(node.waiter);
break;
}
}
}
}
</code></pre><p>From my understanding, this attaches the node to the tail of <code>AQS</code>.</p><p>To elaborate further on the situation:</p><ol><li><p>Threads T1 to T10 are waiting on <code>Condition::await()</code> because the queue is full.</p><p>(At this point, T1 to T10 are linked through <code>ConditionNode</code>.) [The <code>AQS</code> queue is empty, while <code>ConditionNode</code> holds T1 to T10.]</p></li><li><p>T11 calls <code>take()</code> and holds the lock via <code>lock.lockInterruptibly()</code>.</p><p>(Since no threads are waiting in the <code>AQS</code> queue, T11 will acquire the lock immediately.)</p><p>[Now, <code>AQS</code> holds T11 at its head, and <code>ConditionNode</code> holds T1 to T10.]</p></li><li><p>T12 calls <code>queue.put()</code> and enters the wait queue for <code>lock.lockInterruptibly()</code>. (Since T11 is holding the lock with <code>take()</code>, T12 will be queued behind it in <code>AQS</code>.)</p><p>[Now, <code>AQS</code> holds T11 and T12, while <code>ConditionNode</code> holds T1 to T10.]</p></li><li><p>T11 reduces the count and sends a signal, then releases the lock.</p></li><li><p>T1 receives the signal and moves to the lock queue. Since <code>ReentrantLock</code> is in fair mode,</p><p>(When T11 sends the signal, T1, the first thread linked in <code>ConditionNode</code>, will be enqueued via <code>AQS::enqueue</code>. Now, <code>AQS</code> holds T11, T12, and T1, while <code>ConditionNode</code> holds T2 to T10.)</p></li><li><p>T11 releases the lock and wakes up T12.</p><p>[Now, <code>AQS</code> holds T12 and T1, while <code>ConditionNode</code> holds T2 to T10.]</p></li><li><p>T12 acquires the lock and proceeds to enqueue in <code>ArrayBlockingQueue</code> without being blocked by <code>while(count==length)</code>.</p></li><li><p>T12 releases the lock, and the next node in <code>AQS</code> is unparked.</p><p>[Now, <code>AQS</code> holds T1, while <code>ConditionNode</code> holds T2 to T10.]</p></li><li><p>T1, having reacquired the lock after <code>Condition::await()</code>, fails to exit the <code>while</code> loop and waits again.</p><p>[Now, <code>ConditionNode</code> holds T1 and T2 to T10.]</p></li></ol><p><br></p><p>This is how I currently understand the situation.</p><p>If there are any mistakes in my understanding, I would greatly appreciate your clarification.</p><p>Best Regards,</p><p> Kim Minju</p><div><br><blockquote type="cite"><div>2024. 9. 8. 오전 3:34, Archie Cobbs <<a href="mailto:archie.cobbs@gmail.com" target="_blank">archie.cobbs@gmail.com</a>> 작성:</div><br><div><div dir="ltr"><div>Hi Kim,<br></div><div dir="ltr"><br></div><div dir="ltr">On Sat, Sep 7, 2024 at 10:36 AM 김민주 <<a href="mailto:miiiinju00@gmail.com" target="_blank">miiiinju00@gmail.com</a>> wrote:</div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><p>Here's a clearer outline of the scenario:</p><ul><li>Threads T1 to T10 are waiting on <code>Condition::await()</code> because the queue is full.</li><li>T11 calls <code>take()</code> and holds the lock via <code>lock.lockInterruptibly()</code>.</li><li>T12 calls <code>queue.put()</code> and enters the wait queue for <code>lock.lockInterruptibly()</code>. (As I understand, the wait queue for ReentrantLock and Condition are separate.)</li><li>T11 reduces the count and sends a signal, then releases the lock.</li><li>T1 receives the signal and moves to the lock queue. Since the ReentrantLock is in fair mode, T12 (which was already waiting) gets priority, and T1 will acquire the lock later.</li><li>T12 acquires the lock and successfully enqueues.</li></ul></div></blockquote><div>From one reading of the Javadoc, your step #5 ("T12 gets priority") is not supposed to happen that way. Instead, one of T1 through T10 should be guaranteed to acquire the lock.<br></div><div><br></div><div>Here it is again (from ReentrantLock.newCondition()):<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">The ordering of lock reacquisition for threads returning
from waiting methods is the same as for threads initially
acquiring the lock, which is in the default case not specified,
but for <i>fair</i> locks favors those threads that have been
waiting the longest.
</blockquote></div><div><br></div><div>But part of the problem here is that this documentation is ambiguous.<br></div><div><div><br></div><div>The ambiguity is: are ALL threads trying to acquire the lock, whether on an initial attempt or after a condition wakeup, ordered for fairness together in one big pool? → In this case step #5 can't happen. Call this Interpretation A.</div></div><div><br></div><div>Or is this merely saying that threads waiting on a condition reacquire the lock based on when they started waiting on the condition, but there are no ordering guarantees between those threads and any other unrelated threads trying to acquire the lock? → In this case step #5 can happen. Call this Interpretation B.</div><div><br></div>So I think we first need to clarify which interpretation is correct here, A or B.<br><div><div><br></div><div>On that point, Victor said this:<br></div><div><br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">I've re-read ReentrantLock and AQS, and from my understanding on the
logic the Condition's place in the wait queue should be maintained,
which means that T3 shouldn't be able to "barge".</blockquote></div></div><div><br></div><div>So it sounds like Victor is confirming interpretation A. Victor do you agree?<br></div><div><br></div><div>If so, then it seems like we need to do two things:</div><div><br></div><div>1. File a Jira ticket to clarify the Javadoc, e.g. to say something like this:<br></div><div><br></div><div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">The ordering of lock reacquisition for threads returning
from waiting methods is the same as for threads initially
acquiring the lock, which is in the default case not specified,
but for <i>fair</i> locks favors those threads that have been
waiting the longest. <b>In the latter case, the ordering consideration includes all threads attempting to acquire the lock, regardless of whether or not they were previously blocked on the condition.</b><br></blockquote></div><div><br></div><div>2. Understand why Kim's updated test case is still failing (it must be either a bug in the test or a bug in ArrayBlockingQueue).<br></div><div><br></div><div>-Archie<br></div><br><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature">Archie L. Cobbs<br></div></div>
</div></blockquote></div><br></div></blockquote></div><br clear="all"><br><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature">Archie L. Cobbs<br></div>
</div></blockquote></div><br></div></div></blockquote></div><br></body></html>