blocking peek for blocking queues

Thu Apr 27 05:22:55 UTC 2023

Hello Mark,

core-libs-dev is a better place for this discussion 
https://mail.openjdk.org/mailman/listinfo/core-libs-dev

-Jaikiran

On 27/04/23 8:21 am, mark.yagnatinsky at barclays.com wrote:
>
> I suspect this is the wrong list; if so I hope someone will redirect me.
>
> I’d like to make a case for adding a blocking variant of peek() to 
> BlockingQueue.
>
> This has apparently been suggested before:
>
> https://bugs.openjdk.org/browse/JDK-6653412
>
> At the time, the verdict was that the proposed use case for a blocking 
> peek was not very compelling.
>
> For one thing, that use case was inherently racy, and ideally the 
> standard library should not be encouraging races.
>
> And for another thing, this was before the invention of default 
> methods, so a new interface would be needed.
>
> I’d like to present what might be a more sympathetic use case, where a 
> blocking peek actually makes it easier to avoid races.
>
> (This is not hypothetical; I’ve spent all day trying to cope with the 
> lack of a blocking peek.  I think I succeeded, but it was tough.)
>
> Let’s start at the beginning.  The game known as “multithreaded 
> programming” can be played on 3 difficulty levels: easy, medium, and hard.
>
> Easy mode is when the JDK happens to have a high-level tool (e.g., a 
> Collection or an Executor or something) such as DelayQueue that does 
> exactly what you want.
>
> Sadly, the JDK is finite, and even Maven Central is finite, so this 
> not guaranteed to happen, though it’s nice when it does.
>
> At the other extreme is hard mode, when every nanosecond counts and 
> you must resort to the sorts of mental gymnastics that were involved 
> when String.hashCode was re-worked to avoid re-computing hash codes 
> equal to zero.
>
> This email is about the medium difficulty level.  In this mode, you 
> must glue together two or three collections to get what you want.
>
> To be concrete, suppose we want to glue a map together with a blocking 
> queue, since that’s a simplified version of what I was doing.  (In 
> particular I was using a DelayQueue and a HashMap.)
>
> Since we’re not playing on hard mode, we’re going to allow ourselves 
> to guard the entire data structure by one giant lock whenever it seems 
> convenient, without worrying about performance.
>
> In fact, let’s give a name to this lock. Since I’m not feeling very 
> creative right now, let’s call it LOCK.  Similarly, let’s call our 
> compound data structure STRUCT.
>
> My usual heuristic to “win” (that is, to make sure my code is correct, 
> or at least non-racy) on medium difficulty goes something like this.
>
> First, figure out whether there is any way for STRUCT to be “invalid”.
>
> For instance, maybe “it’s valid if and only if every entry in the 
> queue has a corresponding entry in the map”.
>
> One could then write a brief comment in the code explaining what 
> “valid” means, or perhaps even write an “isValid()” method for testing 
> purposes.
>
> Once we’ve decided what it means for STRUCT to be “valid” (aka, 
> non-corrupt), we can try to maintain the following invariant:
>
> EITHER some thread holds the LOCK, and no other thread is using STRUCT 
> right now,
>
> OR ELSE the LOCK  is not held by any thread, and thus STRUCT is 
> currently in a valid state.
>
> Sometimes the invariant above is a bit too strong, and it’s convenient 
> to weaken it slightly.  So we can instead consider the following rule:
>
> RULE: we must never mutate STRUCT unless we hold the LOCK.
>
> If we uphold the invariant, then we must be following the rule, but 
> not vice versa.
>
> (The rule doesn’t forbid queries without the lock; it only forbids 
> writes.)
>
> Finally, we come to the point.  The lack of a blocking peek in 
> BlockingQueue has forced me to break the above “weakened” rule.
>
> I had to invent a new much weaker rule, something like “please 
> carefully think through all possible interleaved executions; if they 
> all work, great!”.
>
> What went wrong?  Well, one of the methods I wanted on STRUCT was 
> basically “call take() on the queue, and then do things with the result”.
>
> The trouble is that you can’t usually afford to hold a lock while 
> calling take(), since it can block forever, and in the meantime you’ve 
> “locked out” anyone who could have added something and thus unblocked you.
>
> Thus, I split my method into two pieces: In PART_ONE we call take() 
> without the LOCK, and once take() returns I grab the lock right away 
> and run PART_TWO.
>
> But notice that this breaks the RULE from the previous paragraph: I 
> mutated the STRUCT without holding the LOCK.
>
> This means my method is not atomic: it’s possible for some other 
> thread to observe that PART_ONE is done but PART_TWO is not started.
>
> Typically this means that other threads can now observe the STRUCT in 
> an invalid (aka corrupt/broken) state.
>
> Now, suppose we had a blocking peek.  How does that help?  In that 
> case, we would still break our method into two pieces.
>
> To avoid confusion, let’s call them PART_A and PART_B.  In PART_A we 
> call blockingPeek() without holding the LOCK.
>
> Note that we completely ignore the return value from peek; for all we 
> care, the return type is “void” instead of “T”.
>
> This breaks the “invariant” from a few paragraphs ago since we’re 
> technically “using” the STRUCT but it respects the “rule” since we’re 
> not mutating it.
>
> Once peek returns, PART_B begins.  Here, we grab the lock, and do a 
> non-blocking poll() on the queue.
>
> It’s might return “null” if someone raced us to get the lock.  But 
> from our perspective that’s okay: it’s just like a spurious wakeup 
> from a raw wait() call.
>
> The reason it’s okay is that we haven’t done anything yet, so we can 
> just call blockingPeek() again and hope for better luck next time.
>
> Similarly we don’t care if poll() returns a different item than 
> blockingPeek() originally did, since we never looked at what peek() 
> returned.
>
> The rest of PART_B is basically PART_TWO from the “peek-less” world.
>
> Note that this way, our method appears atomic to other threads, since 
> PART_A doesn’t actually do anything; all the work happens in PART_B, 
> which is guarded by the LOCK.
>
> Thoughts?
>
> This message is for information purposes only. It is not a 
> recommendation, advice, offer or solicitation to buy or sell a product 
> or service, nor an official confirmation of any transaction. It is 
> directed at persons who are professionals and is intended for the 
> recipient(s) only. It is not directed at retail customers. This 
> message is subject to the terms at: 
> https://www.cib.barclays/disclosures/web-and-email-disclaimer.html.
>
> For important disclosures, please see: 
> https://www.cib.barclays/disclosures/sales-and-trading-disclaimer.html 
> regarding marketing commentary from Barclays Sales and/or Trading 
> desks, who are active market participants; 
> https://www.cib.barclays/disclosures/barclays-global-markets-disclosures.html 
> regarding our standard terms for Barclays Corporate and Investment 
> Bank where we trade with you in principal-to-principal wholesale 
> markets transactions; and in respect to Barclays Research, including 
> disclosures relating to specific issuers, see: 
> http://publicresearch.barclays.com.
> __________________________________________________________________________________ 
>
> If you are incorporated or operating in Australia, read these 
> important disclosures: 
> https://www.cib.barclays/disclosures/important-disclosures-asia-pacific.html.
> __________________________________________________________________________________
> For more details about how we use personal information, see our 
> privacy notice: 
> https://www.cib.barclays/disclosures/personal-information-use.html.
> __________________________________________________________________________________
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/jdk-dev/attachments/20230427/537d0ce4/attachment.htm>