RFR: 8305819: LogConfigurationTest intermittently fails on AArch64

Wed May 3 09:30:30 UTC 2023

On Wed, 3 May 2023 09:08:31 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> @navyxliu  thank you for looking at this. As you note the RCU code is intended to ensure that a reader can either see the old node, or the new node, and work correctly. In that case  we would logically expect to use release semantics so that if a reader sees the new node then it also sees the correct values for all fields of the node. The release barrier is a `storeStore|loadStore` barrier. Release semantics then require matching acquire semantics when we read the node from the list. You seem to be suggesting that RCU obviates the need for the acquire side of this - is that correct? If so then perhaps the raw storeStore barrier will suffice - as we state in orderAccess.hpp:
>>> If not used in such a pair, it is advised to use a membar instead: acquire/release only make sense as pairs.
>> 
>> Personally I would find the acquire/release easier to reason about as I'm not that familiar with the subtleties of RCU. That said I'm also concerned that this RCU implementation is again found to be flawed due to missing barriers and have to wonder if this is the last of them? If RCU is too difficult to get right then perhaps we should stop trying to be overly clever and use a simpler, potentially slower, lock-based solution?
>
>> @dholmes-ora
>> 
>> > You seem to be suggesting that RCU obviates the need for the acquire side of this - is that correct? If so then perhaps the raw storeStore barrier will suffice - as we state in orderAccess.hpp
>> 
>> yes, that's my understanding. Release has compound effect, but I still think we only need a storestore barrier. In the runtime, there's only one active writer. Other writers are all blocked by ConfigurationLock.
>> 
>> Even in weak memory model, we don't need to worry about loadStore order because they are either control dependency (line 8) or true data dependency (line 9).
> 
> That's not true, because load- and control-dependency ordering is not a feature of the C++ programming language. While it sometimes works, it sometimes doesn't. Data races are UB. Please read Boehm.
> 
> https://www.hboehm.info/c++mm/no_write_fences.html
> 
>> if we replace the current solution with read/write lock, I guess we won't lose much in terms of performance. but this seems a separated task.
> 
> It won't lose anything measurable.

> @theRealAph I didn't mean that current code is like linux kernel RCU, which is well-documented synchronization. I mean current solution in UnifiedLog borrow the idea.

Yes, it does borrow, the idea, but AFAICS without understanding just how tricky it is. And gets it all wrong as a consequence.

> Analysis so far leads us to believe this error manifests in weak memory model, right? it's the processor reorders some instructions but not compiler optimizations. Do we really need to consider that MACRO?

I don't think that's the real problem. While HotSpot will very probably never be standard C++, we usually try to write standard-correct code, going outside the standard guarantees when we really need to.  I don't think this is such a case.

Data races are undefined behaviour. With undefined behaviour, all bets are off. You cannot test for the absence of bugs, only their presence.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13421#discussion_r1183448372