RFR: 8305819: LogConfigurationTest intermittently fails on AArch64

David Holmes dholmes at openjdk.org
Thu May 4 21:34:07 UTC 2023


On Thu, 4 May 2023 18:05:13 GMT, Xin Liu <xliu at openjdk.org> wrote:

>> hi, @theRealAph 
>> 
>> I learn your comments and Boehm's. Actually, I found that a discussion on control dependency ordering took place in the PR of JDK-8288904 before. The knowledge is non-trivial but I think I got your point.
>> 
>> IMHO, you insist the general rule to a niche case. you are still right, but now the bar is too high to resolve this immediate error. 
>> 
>> 1. compiler may flip orders regardless control dependency -- check! you also makes reference of kernel RCU. However, UL configuration is much simpler than C++ standard libraries and Linux kernel. In this case, I think we should take the advice of "Recipient only reads x" from Boehm.  there's a single writer. it's manipulating linked-list. within a CPU, control and data dependency is guaranteed. 
>> 
>> 2. I believe that the error reported in JDK-8305819 is caused by hardware instead of compiler optimization.  Evidence: I was told by @gaogao-mem this intermittent crash disappeared after he integrated a storestore barrier. He has ran the tests millions of times and the error never occurs again.  Can we settle that a store-only barrier can practically resolve this error? 
>> 
>> We also provided reasoning why we need a storestore barrier in prior comments. we will put comments in PR. The process is same as JDK-8288904. Reason, merge and resolve the error practically. 
>> 
>> 3. I am not saying it's perfect. I agree the current solution is still "buggy" in presence of compiler optimizations.  we never enforce it to the C++ standard so it's "UB". 
>> 
>> That's why we should work on replacing RCU with RWL. Like David pointed out, we need more time to review, test a solution. To achieve that, we need a better baseline. now we know that the concurrent test is broken on the weak memory model. If this bandaid works, I think we will be in the better position for the development.
>
>> I assume that is the memory barrier that RCU is supposed to be implicitly implementing? If this is missing from this RCU-like thing can we just use release/acquire as suggested much earlier?
> 
> currently, it uses Atomic::add in LogOutputList::increase_readers/descrease_readers. The default order of Atomic::add is `memory_order_conservative` so it emits a full memory barrier.

Right but presumably the missing read-side barrier relates to the missing store barrier that we want to insert.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13421#discussion_r1185537858


More information about the hotspot-runtime-dev mailing list